From: aidotengineer
Introduction to RAG and GraphRAG
Retrieval Augmented Generation (RAG) is a technique that enhances large language models (LLMs) by providing them with access to external knowledge bases, helping to ground their responses and reduce hallucinations. A common question in the field is whether RAG is still relevant or if agent-based systems are taking over [00:00:07]. However, for problems solvable by RAG in production, agents may not be necessary [00:00:28]. There are many use cases where RAG has found successful application [00:00:45].
Graph RAG (or Graph-based Retrieval Augmented Generation) is an advanced form of RAG that leverages the structured nature of knowledge graphs to improve information retrieval and generation [01:50:00]. This approach aims to make LLMs “smarter” by integrating structured knowledge [02:24:52].
What is a Knowledge Graph?
A knowledge graph is a network that represents relationships between different entities [02:27:50]. These entities can be people, places, concepts, or events [02:37:37]. The “edge” or relationship between two entities is crucial, as it defines connections that only graph-based networks or knowledge graphs can effectively exploit [02:58:02]. This representation is vital for understanding complex relationships and organizing data from multiple sources [03:50:09].
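The entity-relationship structure described above can be sketched in a few lines of plain Python: entities become nodes and each labeled edge is one (subject, relation, object) triplet. This is a minimal illustrative sketch, not any particular graph database's API.

```python
from collections import defaultdict

class KnowledgeGraph:
    """A minimal in-memory knowledge graph: entities are nodes,
    and each edge is a labeled relationship (a triplet)."""

    def __init__(self):
        # adjacency: subject -> list of (relation, object) pairs
        self.edges = defaultdict(list)

    def add_triplet(self, subject, relation, obj):
        self.edges[subject].append((relation, obj))

    def neighbors(self, entity):
        """All (relation, object) pairs directly connected to an entity."""
        return self.edges.get(entity, [])

kg = KnowledgeGraph()
kg.add_triplet("Marie Curie", "born_in", "Warsaw")
kg.add_triplet("Marie Curie", "field", "Physics")
kg.add_triplet("Warsaw", "capital_of", "Poland")

print(kg.neighbors("Marie Curie"))
```

The labeled edges are what distinguish this from a flat store of facts: a query about "Marie Curie" can follow `born_in` to "Warsaw" and then `capital_of` to "Poland", a connection no single stored fact contains.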
Advantages of Graph-Based Systems in RAG
Knowledge graphs enhance RAG systems in several ways:
- Detailed Information Capture: They capture information between entities in much greater detail than traditional semantic RAG systems, providing a comprehensive view of knowledge [03:47:00].
- Relationship Exploitation: The ability to exploit the relationships between entities is a unique advantage [03:37:37]. Semantic vector databases do not exploit these relationships as well [09:40:02].
- Causality and Reasoning: Knowledge graphs can model the “why” and “when” behind changes, enabling temporal and relational reasoning that traditional RAG approaches lack [04:09:56]. This capability helps solve hallucinations and optimize hypothesis generation [01:38:44].
- Explainability: GraphRAG allows for explaining answers by exposing the thought process, relationships, and additional context, which is crucial for applications requiring transparency [03:43:03].
- Handling Complex Data: GraphRAG can handle complex data formats where answers might be split across multiple pages or involve similar but non-matching terms, leveraging the graph structure and relationships [03:46:00].
Building a Graph RAG / Hybrid System
Building a Graph RAG or hybrid system involves several key components [02:27:07]:
- Data Processing: The quality of processed data directly impacts the quality of the knowledge graph and subsequent retrieval [04:32:04]. This is a one-time, offline process [05:05:00].
- Graph Creation: This involves extracting entities and relationships (triplets) from unstructured documents [06:21:00]. LLMs are crucial for structuring this information, guided by prompt engineering and defining an ontology specific to the use case [07:25:00]. This step often requires significant iterative refinement to ensure accurate triplets, as noisy triplets lead to noisy retrieval [08:11:00].
- Semantic Vector Database Creation: For hybrid systems, documents are broken into chunks, embedded into vector representations, and stored in a vector database [08:50:00]. Overlap between chunks helps maintain context [09:20:00].
- Inferencing (Querying): This is the online phase where users ask questions, and the system retrieves information and generates responses [05:00:00].
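The chunking-with-overlap step described above can be sketched as follows. This toy version counts characters for simplicity; production systems usually count tokens, and the sizes here are arbitrary illustrative defaults.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping chunks so that context is preserved
    across chunk boundaries before embedding into a vector database."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap  # each chunk starts `step` chars after the last
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # final chunk reached the end of the text
    return chunks

document = "abcdefghij" * 50  # a 500-character stand-in document
chunks = chunk_text(document, chunk_size=200, overlap=50)
print(len(chunks), [len(c) for c in chunks])
```

Because consecutive chunks share their last and first 50 characters, a sentence that straddles a chunk boundary still appears whole in at least one chunk.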
Retrieval Strategies and Performance
In GraphRAG, retrieval is not limited to simple lookups. It involves traversing relationships between nodes [10:06:00].
- Multi-Hop Retrieval: Exploiting relationships through multiple nodes (multi-hop) is very important for comprehensive context [10:32:00].
- Depth vs. Latency: There’s a trade-off between going deeper into the graph for better context and increasing latency [11:02:00]. Finding the sweet spot is crucial for production environments [11:16:00].
- Acceleration: Libraries such as cuGraph (which integrates with NetworkX) can accelerate searches in large graphs, allowing deeper traversal while reducing latency [11:37:00].
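The multi-hop traversal and depth trade-off described above can be sketched as a breadth-first walk with a depth limit. This pure-Python version (a plain dict adjacency, no graph library) is only an illustration of the idea; the node names are made up.

```python
from collections import deque

def multi_hop_retrieve(graph, start, max_depth):
    """Breadth-first traversal up to max_depth hops from an entry node.
    graph maps node -> list of (relation, neighbor) edges.
    Returns the triplets encountered, which become retrieval context.
    A larger max_depth yields richer context at the cost of latency."""
    visited = {start}
    triplets = []
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # depth budget exhausted on this branch
        for relation, neighbor in graph.get(node, []):
            triplets.append((node, relation, neighbor))
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, depth + 1))
    return triplets

graph = {
    "A": [("knows", "B")],
    "B": [("works_at", "C")],
    "C": [("located_in", "D")],
}
# One hop reaches only A's direct edges; two hops also pulls in B's.
print(multi_hop_retrieve(graph, "A", max_depth=1))
print(multi_hop_retrieve(graph, "A", max_depth=2))
```

Tuning `max_depth` is exactly the depth-versus-latency sweet spot mentioned above: each extra hop widens the context but multiplies the number of edges visited.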
Evaluation and Improvement of GraphRAG Solutions
Evaluating the performance of GraphRAG systems involves assessing multiple factors: faithfulness, answer relevancy, precision, recall, helpfulness, correctness, coherence, and complexity [12:24:00].
- Ragas Library: Ragas is a Python library specifically designed to evaluate RAG workflows end-to-end, assessing the response, the retrieval, and the query interpretation [12:42:00]. It uses an LLM (defaulting to GPT, but configurable) for evaluation [13:31:00].
- Reward Models: Models such as Nemotron-4 340B Reward are trained to evaluate the responses of other LLMs across various parameters [14:04:00].
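To make the faithfulness metric concrete, here is a deliberately simplified stand-in: the fraction of answer tokens that appear in the retrieved context. Real evaluators such as Ragas use an LLM judge rather than word overlap; this toy function only illustrates the idea of scoring an answer against its supporting context.

```python
def toy_faithfulness(answer, retrieved_context):
    """Toy faithfulness score: fraction of answer tokens found in the
    retrieved context. 1.0 means every answer word is grounded; lower
    scores suggest content not supported by the retrieval."""
    answer_tokens = set(answer.lower().split())
    context_tokens = set(retrieved_context.lower().split())
    if not answer_tokens:
        return 0.0
    return len(answer_tokens & context_tokens) / len(answer_tokens)

context = "marie curie was born in warsaw and won two nobel prizes"
grounded = toy_faithfulness("curie was born in warsaw", context)
hallucinated = toy_faithfulness("curie was born in paris", context)
print(grounded, hallucinated)
```

The same skeleton extends to the other dimensions listed above (relevancy, precision, recall) by changing what is compared against what.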
- Improvement Strategies:
- Data Cleaning: Removing irrelevant characters (e.g., apostrophes) can improve triplet generation and overall results [16:31:00].
- Fine-tuning LLMs: Fine-tuning an LLM model can significantly improve the quality of generated triplets [15:38:00]. For example, fine-tuning LLaMA 3.1 with LoRA improved accuracy from 71% to 87% in one experiment [17:22:00].
- Optimizing Retrieval: Small tweaks, such as adjusting multi-hop depth or leveraging acceleration libraries, can drastically reduce latency and improve performance [18:44:00].
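The data-cleaning step above can be sketched as a small preprocessing function. Exactly which characters to strip is use-case specific; apostrophe removal is the example given in the talk, and the other patterns here are common additions.

```python
import re

def clean_for_triplets(text):
    """Strip characters that commonly confuse LLM triplet extraction:
    apostrophes, control characters, and repeated whitespace."""
    text = text.replace("'", "").replace("\u2019", "")  # straight and curly apostrophes
    text = re.sub(r"[\x00-\x1f]", " ", text)            # tabs, newlines, other control chars
    text = re.sub(r"\s+", " ", text)                    # collapse runs of whitespace
    return text.strip()

print(clean_for_triplets("Marie's  lab\twas in\nParis"))
```

Running this before graph creation reduces the chance that the extraction LLM emits malformed or duplicated entities (e.g. "Marie" and "Marie's" as two nodes), which is one source of the noisy triplets mentioned earlier.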
Deciding Between Semantic, Graph, or Hybrid RAG
The choice between a semantic, graph-based, or hybrid RAG system depends on two main factors [19:16:00]:
- Data Structure: If the data is traditionally structured (e.g., retail, FSI, employee databases), graph-based systems are often a good fit [19:36:00]. Even with unstructured data, if a high-quality knowledge graph can be created, it’s worth experimenting with the graph path [19:57:00].
- Application and Use Case: If the use case requires understanding complex relationships to extract information for responses, then a graph system makes sense [20:10:10]. However, it’s important to consider that graph-based systems can be compute-heavy [20:24:00].
Philosophical and Practical Applications / Insights
Knowledge Augmented Generation (KAG)
Knowledge Augmented Generation (KAG) is differentiated from RAG by its emphasis on integrating structured knowledge graphs for more accurate and insightful responses [02:50:00]. KAG aims not just to retrieve but to understand, synthesize, and advise [02:50:00]. This “wisdom” is actively guided by decision-making and fed by knowledge, experience, and insight [02:56:00].
Multi-Agent Systems and Knowledge Graphs
Knowledge graphs are particularly well-suited for building expert AI systems and multi-agent systems [02:59:00]. They provide a systematic method for preserving wisdom by connecting concepts and creating a network of interconnected relationships [02:45:00].
One practical application is in competitive analysis, where a “wisdom engine” (orchestration agent) leverages different types of data (market data, past campaign experience, industrial insight, competitor weaknesses) to generate strategies and advise clients [03:00:00]. Tools like Node-RED, with its AI agent nodes, can facilitate the prototyping of such complex state diagrams [03:17:00].
Benchmarking and Performance
Leveraging knowledge graphs with techniques like fusion-in-decoder can improve efficiency and reduce hallucination rates in RAG systems [03:31:00]. Benchmarking on datasets like Amazon’s Robust QA has shown that graph-enhanced retrieval systems can achieve superior accuracy and faster response times compared to vector-only search systems [03:40:00].
Agent Memory and Knowledge Graphs
Traditional vector database-based RAG approaches fall short for agent memory because they treat each fact as isolated and immutable, lacking native temporal and relational reasoning [03:42:00]. This can lead to generic or hallucinatory responses when agents forget important dynamic context about users [03:52:00].
Graphiti: A Temporal Graph Framework
Graphiti is an open-source framework for building real-time, dynamic, temporal graphs designed to address these memory problems [04:45:00].
- Temporal Awareness: Graphiti extracts and tracks multiple temporal dimensions for each fact, identifying when a fact becomes valid and when it becomes invalid [04:54:00]. This enables temporal reasoning, allowing agents to understand how preferences or traits change over time; historical facts are not deleted but marked as invalid [04:29:00].
- Relational Structure: It allows explicit relationships between facts, modeling causality [04:18:00].
- Hybrid Search: Graphiti uses semantic search and BM25 full-text retrieval to identify subgraphs, which can then be traversed for a richer understanding of memory [04:56:00].
- Domain-Aware Memory: Developers can model their business domain on the graph by building custom entities and edges, enabling targeted retrieval of relevant information and preventing irrelevant facts from polluting memory [04:57:00].
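The mark-invalid-instead-of-delete pattern for temporal facts can be sketched as follows. This is an illustrative data model under my own assumptions, not the framework's actual schema; the field names and integer timestamps are made up for the example.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TemporalFact:
    """A fact with a validity interval: superseded facts are marked
    invalid rather than deleted, so an agent can still reason about
    what was true at any point in time."""
    subject: str
    relation: str
    obj: str
    valid_from: int              # e.g. a Unix timestamp or event counter
    invalid_at: Optional[int] = None  # None means still valid

    def is_valid(self, at_time):
        return self.valid_from <= at_time and (
            self.invalid_at is None or at_time < self.invalid_at)

facts = [TemporalFact("user", "prefers", "email", valid_from=1)]
# At time 5 the user switches preference: invalidate the old fact, add the new one.
facts[0].invalid_at = 5
facts.append(TemporalFact("user", "prefers", "sms", valid_from=5))

print([f.obj for f in facts if f.is_valid(at_time=6)])  # preference now
print([f.obj for f in facts if f.is_valid(at_time=3)])  # preference back then
```

Because the "email" fact survives with its interval intact, a query like "what did the user prefer before?" remains answerable, which a delete-and-overwrite memory cannot do.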
Current Industry Trends and Applications
The industry is seeing a rise in GraphRAG adoption due to its ability to provide context-rich, grounded, and explainable answers [02:30:00]. Gartner’s 2024 hype cycle shows GraphRAG trending upward, bringing renewed momentum to the AI ecosystem [02:30:00].
Enterprise Applications
- LinkedIn Customer Support: Using knowledge graphs for customer support scenarios has shown improved results, including a 28.6% reduction in median per-issue resolution time [02:50:00].
- Data.world Study: A comparison of RAG on SQL vs. graph databases demonstrated a three-times improvement in LLM response accuracy with graphs [03:00:00].
- Network Analysis: Cisco’s Outshift group is developing a multi-agent framework for network analysis. This system uses a network knowledge graph (digital twin) to represent complex network environments, enabling agents to perform impact assessments, create test plans, and run tests in a predictive manner [02:47:00]. Fine-tuning query agents reduced token consumption and response time for interacting with the knowledge graph [02:57:00].
- Legal Industry: Companies like YHAR.AI use GraphRAG and multi-agent systems to find class action/mass cases, support legal discovery, and conduct case research from web-scraped data [03:06:00]. The ability to structure and query information about individuals, products, harms, and jurisdictions within a knowledge graph allows for precise filtering and reporting, even with natural language queries [03:13:00]. This approach addresses the legal industry’s need for high accuracy and explainability [03:09:00].
Building and Querying Knowledge Graphs for RAG
The construction of knowledge graphs for RAG typically involves three phases [02:31:00]:
- Lexical Graph Construction: Structuring unstructured information into a lexical graph representing documents, chunks, and their basic relationships (e.g., predecessor/successor, parent) [02:31:00].
- Entity Extraction: Using LLMs with a defined graph schema to extract entities and relationships from the lexical graph [02:31:00]. This can involve extracting new entities or recognizing and connecting to existing ground truth data in an existing knowledge graph [03:50:00].
- Graph Enrichment: Running graph algorithms (e.g., PageRank, community detection) to enrich the graph, identifying cross-document topics or clusters [03:19:00].
For retrieval, a GraphRAG retriever performs an initial index search (vector, full-text, etc.) to find entry points in the graph [02:37:00]. It then follows relationships to a certain depth or relevancy, fetching additional context, which can also incorporate external user context [02:37:00]. This richer, more complete subset of the contextual graph is then passed to the LLM for answer generation [02:38:00]. Modern LLMs are increasingly trained to process these structured patterns, such as node-relationship-node patterns [02:38:00].
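The retrieval flow just described can be sketched end-to-end. In this toy version a keyword match stands in for the vector or full-text index search, and the serialized node-relationship-node lines stand in for the structured context passed to the LLM; the graph contents are invented for the example.

```python
def graphrag_retrieve(query, graph, node_text, depth=2):
    """1) Find entry points via a simple index search (keyword overlap
    stands in for vector/full-text search). 2) Follow relationships to
    a fixed depth. 3) Serialize the subgraph as node-relationship-node
    lines for the LLM prompt."""
    query_words = set(query.lower().split())
    # 1. Entry points: nodes whose indexed text shares a word with the query.
    entry = [n for n, text in node_text.items()
             if query_words & set(text.lower().split())]
    # 2. Expand: follow edges outward for `depth` hops.
    frontier, seen, triplets = list(entry), set(entry), []
    for _ in range(depth):
        nxt = []
        for node in frontier:
            for rel, neighbor in graph.get(node, []):
                triplets.append(f"({node})-[{rel}]->({neighbor})")
                if neighbor not in seen:
                    seen.add(neighbor)
                    nxt.append(neighbor)
        frontier = nxt
    # 3. These lines become the structured context in the LLM prompt.
    return triplets

graph = {"Acme": [("sells", "Widgets")], "Widgets": [("made_in", "Ohio")]}
node_text = {"Acme": "Acme Corp", "Widgets": "premium widgets"}
print(graphrag_retrieve("where does acme sell from", graph, node_text))
```

The `(node)-[rel]->(node)` serialization mirrors the node-relationship-node patterns the text notes modern LLMs are increasingly trained to consume.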
Libraries and tools supporting these processes include Neo4j, ArangoDB, Cognify, Zep’s Graphiti, and the graph-rag Python package, offering functionality from knowledge graph construction to various retrieval strategies [03:56:00].