Claim Knowledge Graph Construction and GraphRAG-Based Question-Answering System
Problem Statement
Traditional construction claim management depends on manual expert analysis, leading to inefficiencies, information gaps, and increased dispute risk. Existing LLM-based QA systems lack structured domain knowledge, and generic Vector RAG approaches fail to capture the relational semantics inherent in claim management workflows. A structured, ontology-driven approach is needed to enable accurate, explainable, and reusable knowledge retrieval in this specialized domain.
Key Novelty
- Construction of a domain-specific ontology for construction engineering claims organized into five unified core classes, enabling structured knowledge sharing and reuse
- Integration of a Neo4j-stored knowledge graph with LLMs via GraphRAG, using graph traversal for context-aware retrieval instead of flat vector similarity
- Empirical comparison of GraphRAG vs. Vector RAG vs. base LLM specifically in the construction claims domain using multi-metric NLP evaluation (BLEU-4, BERT-Cosine, ROUGE-1, ROUGE-L)
Evaluation Highlights
- The GraphRAG-based QA system outperforms the base LLM on BLEU-4, BERT-Cosine similarity, ROUGE-1, and ROUGE-L
- The GraphRAG-based QA system also outperforms the Vector RAG approach on all four metrics, indicating that structured graph retrieval adds value over dense vector retrieval in this domain
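To make the overlap metrics concrete, here is a minimal sketch of ROUGE-1 and ROUGE-L in plain Python. It assumes lowercase whitespace tokenization, a single reference answer, and the F1 variant of each score; the paper's actual tokenizer and scoring toolkit are not stated, so the function names and example sentences are illustrative only.

```python
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenization (an assumption; real toolkits differ)."""
    return text.lower().split()


def f1(overlap, cand_len, ref_len):
    """Harmonic mean of precision and recall computed from an overlap count."""
    if overlap == 0 or cand_len == 0 or ref_len == 0:
        return 0.0
    precision = overlap / cand_len
    recall = overlap / ref_len
    return 2 * precision * recall / (precision + recall)


def rouge_1(candidate, reference):
    """ROUGE-1 F1: clipped unigram overlap between candidate and reference."""
    cand, ref = tokenize(candidate), tokenize(reference)
    overlap = sum((Counter(cand) & Counter(ref)).values())
    return f1(overlap, len(cand), len(ref))


def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1: based on the longest common subsequence of tokens."""
    cand, ref = tokenize(candidate), tokenize(reference)
    return f1(lcs_length(cand, ref), len(cand), len(ref))


# Hypothetical question-answer pair, not taken from the paper's test set.
reference = "the contractor may claim an extension of time"
candidate = "the contractor may claim extra time"
print(round(rouge_1(candidate, reference), 3), round(rouge_l(candidate, reference), 3))
```

ROUGE-1 rewards shared words regardless of order, while ROUGE-L rewards words that appear in the same order, which is why the two can diverge for answers that reorder the reference.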
Methodology
- Construct a domain-specific ontology for construction engineering claims through a five-step process, organizing knowledge into five core classes (e.g., claim types, parties, events, documents, regulations)
- Populate and store the resulting knowledge graph in Neo4j, encoding entities and their relational structure from construction claim documents
- Build a GraphRAG-based QA pipeline that queries the Neo4j knowledge graph for structured context to augment LLM generation, then evaluate it against the base LLM and Vector RAG baselines using BLEU-4, BERT-Cosine, ROUGE-1, and ROUGE-L
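The retrieval step of such a pipeline can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the triples, relation names, and helper functions are hypothetical, an in-memory list stands in for Neo4j, entity linking is a naive substring match, and the LLM call itself is omitted.

```python
from collections import defaultdict, deque

# Hypothetical triples; the paper's actual entities and relations are not given here.
TRIPLES = [
    ("Delay Claim", "TRIGGERED_BY", "Late Site Handover"),
    ("Delay Claim", "SUPPORTED_BY", "Site Diary"),
    ("Delay Claim", "GOVERNED_BY", "Extension of Time Clause"),
    ("Late Site Handover", "CAUSED_BY", "Employer"),
    ("Defect Claim", "SUPPORTED_BY", "Inspection Report"),
]


def build_adjacency(triples):
    """Index each triple under both endpoints so traversal ignores direction."""
    adjacency = defaultdict(list)
    for head, relation, tail in triples:
        adjacency[head].append((head, relation, tail))
        adjacency[tail].append((head, relation, tail))
    return adjacency


def find_seeds(question, adjacency):
    """Naive entity linking: entity names that appear verbatim in the question."""
    lowered = question.lower()
    return [name for name in adjacency if name.lower() in lowered]


def retrieve_subgraph(question, triples, max_hops=2):
    """Breadth-first traversal of at most max_hops edges from the seed entities."""
    adjacency = build_adjacency(triples)
    seeds = find_seeds(question, adjacency)
    queue = deque((seed, 0) for seed in seeds)
    visited = set(seeds)
    facts = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for triple in adjacency[node]:
            if triple not in facts:
                facts.append(triple)
            head, _, tail = triple
            neighbor = tail if head == node else head
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, depth + 1))
    return facts


def build_prompt(question, facts):
    """Serialize retrieved triples into a context block for the LLM."""
    lines = [f"- {h} {r} {t}" for h, r, t in facts]
    context = "\n".join(lines) if lines else "- (no matching facts)"
    return f"Answer using only these facts:\n{context}\n\nQuestion: {question}"


question = "What evidence supports a delay claim?"
print(build_prompt(question, retrieve_subgraph(question, TRIPLES)))
```

The contrast with Vector RAG is that the context here is selected by following explicit relations from the entities named in the question, so a fact two hops away (who caused the late handover) is retrieved even though it shares no wording with the question.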
System Components
- An ontology, built through a five-step process, whose five unified core classes organize construction claim knowledge for reuse and sharing
- A Neo4j graph database storing ontology-instantiated entities and relationships extracted from construction engineering claim documents
- A retrieval module that traverses the knowledge graph to extract structured, relational context relevant to user queries
- A large language model augmented with graph-retrieved context to generate grounded answers to construction claim questions
- A multi-metric NLP evaluation using BLEU-4, BERT-Cosine similarity, ROUGE-1, and ROUGE-L to benchmark response quality
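As a rough illustration of how ontology-instantiated facts might be written to the graph database, the sketch below turns one typed triple into a parameterized Cypher `MERGE` statement. The class labels reuse the example classes mentioned under Methodology and are assumptions, as are the `name` property, the relationship type, and the `merge_statement` helper; the paper's actual schema is not given here.

```python
import re

# Illustrative labels based on the example classes in the methodology; the
# paper's actual five core classes may be named differently.
CORE_CLASSES = {"ClaimType", "Party", "Event", "Document", "Regulation"}
RELATION_PATTERN = re.compile(r"^[A-Z][A-Z_]*$")


def merge_statement(head, relation, tail):
    """Return (cypher, params) that upserts two nodes and one relationship.

    head and tail are (label, name) pairs. Labels and relationship types cannot
    be passed as Cypher parameters, so they are validated before interpolation.
    """
    (head_label, head_name), (tail_label, tail_name) = head, tail
    for label in (head_label, tail_label):
        if label not in CORE_CLASSES:
            raise ValueError(f"unknown ontology class: {label}")
    if not RELATION_PATTERN.match(relation):
        raise ValueError(f"invalid relationship type: {relation}")
    cypher = (
        f"MERGE (h:{head_label} {{name: $head_name}}) "
        f"MERGE (t:{tail_label} {{name: $tail_name}}) "
        f"MERGE (h)-[:{relation}]->(t)"
    )
    return cypher, {"head_name": head_name, "tail_name": tail_name}


cypher, params = merge_statement(
    ("ClaimType", "Delay Claim"), "SUPPORTED_BY", ("Document", "Site Diary")
)
print(cypher)
print(params)
```

With the official Neo4j Python driver, a pair like this could be passed to `session.run(cypher, params)`. Using `MERGE` rather than `CREATE` keeps repeated loads from duplicating nodes, and restricting labels to the ontology's classes keeps the stored graph consistent with the ontology.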
Results (qualitative ranking; numeric scores are not given in this summary)
| Metric | Base LLM | Vector RAG | GraphRAG (This Paper) |
|---|---|---|---|
| BLEU-4 | Lower | Moderate | Highest |
| BERT-Cosine Similarity | Lower | Moderate | Highest |
| ROUGE-1 | Lower | Moderate | Highest |
| ROUGE-L | Lower | Moderate | Highest |
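For readers unfamiliar with the other two metrics in the table, the sketch below shows an unsmoothed, single-reference, sentence-level BLEU-4 and a plain cosine similarity. It is an assumption-laden illustration: the paper's tokenization, smoothing, and BERT encoder are not stated, and the cosine function simply takes two embedding vectors as given rather than computing them.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu_4(candidate, reference):
    """Unsmoothed sentence-level BLEU-4 against a single reference."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, 5):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        total = sum(cand_counts.values())
        clipped = sum((cand_counts & ref_counts).values())
        if total == 0 or clipped == 0:
            return 0.0  # without smoothing, any empty n-gram match zeroes the score
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    if len(cand) >= len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))
    return brevity_penalty * math.exp(sum(log_precisions) / 4)


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical answer pair; the embedding vectors are toy values, not BERT outputs.
reference = "the contractor may claim an extension of time"
candidate = "the contractor may claim extra time"
print(round(bleu_4(candidate, reference), 3))
print(round(cosine_similarity([0.2, 0.7, 0.1], [0.25, 0.6, 0.2]), 3))
```

BLEU-4 is strict about exact four-word matches, so it tends to be the lowest of the four scores for free-form answers, while embedding cosine similarity can remain high when an answer paraphrases the reference.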
Key Takeaways
- In this study, GraphRAG over a structured knowledge graph outperformed flat vector retrieval on construction claim QA; this suggests that ontology investment may also pay off in other domains with rich relational structure (legal, engineering, medical), although the paper evaluates only construction claims
- A five-step ontology construction process with unified core classes is a potentially replicable pattern for building domain KGs that support both knowledge reuse and downstream LLM augmentation
- Neo4j served as a workable store for the domain knowledge graph in this RAG pipeline, and the paper offers a concrete example of integrating it with an LLM QA system in a vertical industry
Abstract
Traditional claim management relies heavily on manual analysis and expert judgment, resulting in inefficiencies, information omissions, and heightened risks of disputes. To address these challenges, this paper constructs a domain-specific ontology for construction engineering claims through a five-step process, organizing the relevant knowledge into five unified core classes. Based on this ontology, a knowledge graph is built and stored in Neo4j. The resulting knowledge graph-enhanced LLM question-answering system, evaluated using BLEU-4, BERT-Cosine similarity, ROUGE-1, and ROUGE-L metrics, demonstrates superior performance compared to both the base LLM and Vector RAG approaches. The results indicate that the proposed ontology effectively serves the purpose of knowledge sharing and reuse while providing practical support for construction claim management.