Document GraphRAG: Knowledge Graph Enhanced Retrieval Augmented Generation for Document Question Answering Within the Manufacturing Domain
Problem Statement
Standard RAG systems suffer from poor retrieval precision and suboptimal context selection, especially for complex multi-hop questions requiring cross-document reasoning. Existing approaches lack structural awareness of document content, leading to fragmented or irrelevant context chunks being passed to the LLM. This is particularly limiting in specialized domains like manufacturing where precision and technical accuracy are critical.
Key Novelty
- Knowledge Graph construction based on a document's intrinsic structure rather than external ontologies, enabling structure-aware retrieval
- Keyword-based semantic linking mechanism that connects related document chunks across the graph for improved context traversal
- A newly developed manufacturing-domain QA dataset used alongside SQuAD and HotpotQA for domain-specific evaluation
Evaluation Highlights
- Consistent improvement in Context Relevance metrics over naive RAG baseline across SQuAD, HotpotQA, and the manufacturing dataset
- Multi-hop questions (HotpotQA) showed the greatest performance gains, validating the graph-based retrieval strategy for complex reasoning tasks
Methodology
- Parse documents to extract intrinsic structural elements (sections, paragraphs, entities) and construct a Knowledge Graph that reflects document hierarchy and relationships
- Apply a keyword-based semantic linking mechanism to connect semantically related nodes across the graph, enabling multi-hop traversal during retrieval
- At query time, traverse the KG to retrieve contextually relevant and structurally coherent chunks, then pass them to an LLM for answer generation
- Tune chunk size, keyword density, and top-k parameters per task
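The first two steps can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the input format (a dict of section titles to chunk texts), the node-ID scheme, the `top_keywords` frequency heuristic, and the stopword list are all assumptions made for the example, and the sample maintenance text is invented.

```python
import re
from collections import Counter, defaultdict
from itertools import combinations

# Tiny illustrative stopword list (assumption; a real system would use a fuller one).
STOPWORDS = {"the", "a", "an", "of", "to", "and", "is", "in", "for",
             "on", "with", "before", "after", "each", "must"}

def top_keywords(text, n=3):
    """Pick the n most frequent non-stopword tokens (ties broken alphabetically).
    A stand-in for whatever keyword extractor is actually used."""
    tokens = [t for t in re.findall(r"[a-z]+", text.lower())
              if t not in STOPWORDS and len(t) > 2]
    counts = Counter(tokens)
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    return set(ranked[:n])

def build_graph(doc, keywords_per_chunk=3):
    """doc: {section_title: [chunk_text, ...]} -> (nodes, edges).
    Structural edges follow the document hierarchy; keyword edges link
    chunks that share an extracted keyword."""
    nodes = {"doc": {"type": "document", "text": ""}}
    edges = set()  # (source_id, target_id, relation)
    keyword_index = defaultdict(set)
    for s_idx, (title, chunks) in enumerate(doc.items()):
        sec_id = f"sec{s_idx}"
        nodes[sec_id] = {"type": "section", "text": title}
        edges.add(("doc", sec_id, "contains"))
        prev = None
        for c_idx, text in enumerate(chunks):
            chunk_id = f"{sec_id}/chunk{c_idx}"
            nodes[chunk_id] = {"type": "chunk", "text": text}
            edges.add((sec_id, chunk_id, "contains"))
            if prev is not None:
                edges.add((prev, chunk_id, "next"))  # reading order
            prev = chunk_id
            for kw in top_keywords(text, keywords_per_chunk):
                keyword_index[kw].add(chunk_id)
    # Semantic linking: connect every pair of chunks sharing a keyword.
    for kw, ids in keyword_index.items():
        for a, b in combinations(sorted(ids), 2):
            edges.add((a, b, f"keyword:{kw}"))
    return nodes, edges

doc = {
    "Spindle maintenance": [
        "Check spindle bearing temperature before each shift.",
        "Replace the coolant filter every 500 hours.",
    ],
    "Troubleshooting": [
        "High spindle bearing temperature indicates low lubrication.",
    ],
}
nodes, edges = build_graph(doc)
for edge in sorted(edges):
    print(edge)
```

In this toy document the two chunks that mention the spindle bearing sit in different sections, yet the keyword edge connects them, which is the kind of cross-section link that retrieval can later follow.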
System Components
- Parses documents and constructs a graph capturing intrinsic structural relationships (e.g., sections, subsections, entities) as nodes and edges
- Identifies and creates semantic edges between graph nodes sharing important keywords, enabling cross-section context linking without external ontologies
- Traverses the KG to find the most relevant and contextually connected chunks for a given query, replacing standard vector-similarity-only retrieval
- Feeds the graph-retrieved context into a standard LLM generation step, maintaining compatibility with existing RAG infrastructure
- A newly created domain-specific evaluation benchmark for testing QA systems on manufacturing documentation
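To make the retrieval and generation hand-off more concrete, here is a hedged sketch of graph-expanded retrieval. The source does not specify the traversal strategy, so everything here is an assumption: seeds are picked by simple token overlap (a stand-in for whatever relevance scoring is used), neighbors are followed for a fixed number of hops, and the `retrieve` and `build_prompt` helpers, the prompt wording, and the sample chunks are hypothetical.

```python
import re

def tokens(text):
    """Lowercased alphanumeric tokens as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, chunks, neighbors, top_k=1, hops=1):
    """chunks: {chunk_id: text}; neighbors: {chunk_id: set of linked chunk_ids}.
    Step 1: choose top_k seed chunks by token overlap with the query.
    Step 2: expand along graph edges for `hops` steps to pull in linked context."""
    q = tokens(query)
    ranked = sorted(chunks, key=lambda cid: (-len(q & tokens(chunks[cid])), cid))
    seeds = [cid for cid in ranked[:top_k] if q & tokens(chunks[cid])]
    selected = list(seeds)
    frontier = seeds
    for _ in range(hops):
        next_frontier = []
        for cid in frontier:
            for nb in sorted(neighbors.get(cid, ())):
                if nb not in selected:
                    selected.append(nb)
                    next_frontier.append(nb)
        frontier = next_frontier
    return selected

def build_prompt(query, chunks, selected):
    """Assemble a plain-text prompt that any LLM client could consume."""
    context = "\n".join(f"[{cid}] {chunks[cid]}" for cid in selected)
    return (f"Answer using only the context below.\n\n"
            f"Context:\n{context}\n\nQuestion: {query}")

# Invented toy data: the answer sits in c2, which shares no words with the query.
chunks = {
    "c1": "The pump housing is fastened with M8 flange bolts.",
    "c2": "M8 flange bolts are tightened to 25 Nm.",
    "c3": "Clean the work area after assembly.",
}
neighbors = {"c1": {"c2"}, "c2": {"c1"}}  # e.g. a keyword edge on "flange"

query = "What torque is needed for the pump housing?"
selected = retrieve(query, chunks, neighbors, top_k=1, hops=1)
print(build_prompt(query, chunks, selected))
```

With `top_k=1`, overlap scoring alone would return only `c1`; the one-hop expansion adds `c2`, which holds the torque value. Because the output is an ordinary prompt string, the generation step of an existing RAG pipeline would not need to change.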
Results
| Metric/Benchmark | Naive RAG Baseline | Document GraphRAG | Delta |
|---|---|---|---|
| Context Relevance (SQuAD) | Baseline | Improved | Positive gain |
| Context Relevance (HotpotQA) | Baseline | Notably improved | Largest gain (multi-hop) |
| Context Relevance (Manufacturing) | Baseline | Improved | Positive gain |
| Answer Generation Quality | Baseline | Improved | Task-dependent positive gain |
Key Takeaways
- Graph-based retrieval using document-intrinsic structure is particularly effective for multi-hop and complex reasoning questions, making it a strong choice over flat chunking for technical documentation QA.
- Chunk size, keyword density, and top-k retrieval are important hyperparameters in GraphRAG and should be tuned per task type rather than fixed globally.
- Building KGs from document structure (rather than requiring external ontologies) makes this approach practically deployable in specialized domains like manufacturing without heavy domain-expert involvement.
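Per-task tuning of the three hyperparameters named above could be organized as a small grid search, sketched below. The grid values, the parameter name `keywords_per_chunk` (standing in for keyword density), and the `toy_context_relevance` objective are placeholders invented for illustration; a real run would replace the objective with an evaluation of the full pipeline on the task's validation questions.

```python
from itertools import product

def grid_search(evaluate, grid):
    """Try every combination in `grid` and return (best_config, best_score).
    `evaluate` takes the hyperparameters as keyword arguments and returns a
    score where higher is better."""
    names = list(grid)
    best_cfg, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        cfg = dict(zip(names, values))
        score = evaluate(**cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

# Hypothetical search space; the source does not report the ranges explored.
GRID = {
    "chunk_size": [128, 256, 512],
    "keywords_per_chunk": [3, 5, 10],
    "top_k": [1, 3, 5],
}

def toy_context_relevance(chunk_size, keywords_per_chunk, top_k):
    """Placeholder objective, NOT a real metric: it simply peaks at
    (256, 5, 3) so the search has something to find."""
    return (-abs(chunk_size - 256) / 256
            - abs(keywords_per_chunk - 5) / 5
            - abs(top_k - 3) / 3)

best_cfg, best_score = grid_search(toy_context_relevance, GRID)
print(best_cfg)
```

Running `grid_search` once per dataset, each time with that dataset's own evaluation function, yields a separate configuration per task type instead of one global setting.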
Abstract
Retrieval-Augmented Generation (RAG) systems have shown significant potential for domain-specific Question Answering (QA) tasks, although persistent challenges in retrieval precision and context selection continue to hinder their effectiveness. This study introduces Document Graph RAG (GraphRAG), a novel framework that bolsters retrieval robustness and enhances answer generation by incorporating Knowledge Graphs (KGs) built upon a document’s intrinsic structure into the RAG pipeline. Through the application of the Design Science Research methodology, we systematically design, implement, and evaluate GraphRAG, leveraging graph-based document structuring and a keyword-based semantic linking mechanism to improve retrieval quality. The evaluation, conducted on well-established datasets including SQuAD, HotpotQA, and a newly developed manufacturing dataset, demonstrates consistent performance gains over a naive RAG baseline across both retrieval and generation metrics. The results indicate that GraphRAG improves Context Relevance metrics, with task-dependent optimizations for chunk size, keyword density, and top-k retrieval further enhancing performance. Notably, multi-hop questions benefit most from GraphRAG’s structured retrieval strategy, highlighting its advantages in complex reasoning tasks.