
Claim Knowledge Graph Construction and GraphRAG-Based Question-Answering System

Xinxu Wang, Jun Fang
Buildings | 2026
This paper constructs a domain-specific knowledge graph for construction engineering claims using a five-step ontology design process and builds a GraphRAG-based QA system that outperforms both base LLMs and Vector RAG approaches on standard NLP metrics.

Problem Statement

Traditional construction claim management depends on manual expert analysis, leading to inefficiencies, information gaps, and increased dispute risk. Existing LLM-based QA systems lack structured domain knowledge, and generic Vector RAG approaches fail to capture the relational semantics inherent in claim management workflows. A structured, ontology-driven approach is needed to enable accurate, explainable, and reusable knowledge retrieval in this specialized domain.

Key Novelty

  • Construction of a domain-specific ontology for construction engineering claims organized into five unified core classes, enabling structured knowledge sharing and reuse
  • Integration of a Neo4j-stored knowledge graph with LLMs via GraphRAG, using graph traversal for context-aware retrieval rather than flat vector similarity
  • Empirical comparison of GraphRAG vs. Vector RAG vs. base LLM specifically in the construction claims domain using multi-metric NLP evaluation (BLEU-4, BERT-Cosine, ROUGE-1, ROUGE-L)

Evaluation Highlights

  • GraphRAG-based QA system outperforms base LLM on BLEU-4, BERT-Cosine similarity, ROUGE-1, and ROUGE-L metrics
  • GraphRAG-based QA system outperforms Vector RAG approach across all four evaluation metrics, demonstrating the added value of structured graph retrieval over dense vector retrieval

Breakthrough Assessment

4/10. The paper is a solid domain-specific application of GraphRAG with a well-structured ontology engineering contribution, but it applies existing techniques (knowledge graphs, RAG, Neo4j) to a new vertical rather than introducing fundamental methodological advances. Its value is primarily practical and domain-specific.

Methodology

  1. Design a domain-specific ontology for construction engineering claims through a five-step process, organizing knowledge into five unified core classes (the abstract does not name them; claim types, parties, events, documents, and regulations are illustrative examples)
  2. Populate and store the resulting knowledge graph in Neo4j, encoding entities and their relational structure from construction claim documents
  3. Build a GraphRAG-based QA pipeline that queries the Neo4j knowledge graph to retrieve structured context, augmenting LLM generation, and evaluate against base LLM and Vector RAG baselines using BLEU-4, BERT-Cosine, ROUGE-1, and ROUGE-L

System Components

Domain Ontology

An ontology built through a five-step process, with five unified core classes that organize construction claim knowledge for reuse and sharing
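
To make the idea of an ontology with a small set of core classes concrete, the following is a minimal sketch of a schema check. The class names and relation types are hypothetical placeholders, since this summary does not enumerate the paper's five core classes or its relations; the sketch only shows how a fixed class vocabulary can constrain which triples are admitted into the graph.

```python
from dataclasses import dataclass

# Hypothetical class names: the source does not enumerate the five core classes.
CORE_CLASSES = {"ClaimType", "Party", "Event", "Document", "Regulation"}

# Hypothetical relation schema: (subject class, predicate, object class).
ALLOWED_RELATIONS = {
    ("Party", "FILES", "ClaimType"),
    ("Event", "TRIGGERS", "ClaimType"),
    ("Document", "EVIDENCES", "Event"),
    ("Regulation", "GOVERNS", "ClaimType"),
}


@dataclass(frozen=True)
class Entity:
    name: str
    cls: str

    def __post_init__(self):
        # Reject entities whose class is outside the ontology.
        if self.cls not in CORE_CLASSES:
            raise ValueError(f"unknown ontology class: {self.cls}")


def is_valid_triple(subject: Entity, predicate: str, obj: Entity) -> bool:
    """Return True if the triple matches a relation allowed by the schema."""
    return (subject.cls, predicate, obj.cls) in ALLOWED_RELATIONS
```

For example, `is_valid_triple(Entity("Owner-caused delay", "Event"), "TRIGGERS", Entity("Schedule extension claim", "ClaimType"))` is accepted under this assumed schema, while the reversed direction is rejected.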

Neo4j Knowledge Graph

A graph database storing ontology-instantiated entities and relationships from construction engineering claim documents
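
As a rough illustration of how ontology-typed triples might be written to Neo4j, the sketch below builds a parameterized Cypher `MERGE` statement for one triple. The helper name, the `name` property, and the labels are assumptions for illustration, not the paper's loading code. Cypher does not allow labels or relationship types to be passed as parameters, so the sketch validates them before interpolating.

```python
import re

# Labels and relationship types are interpolated into the query text,
# so restrict them to plain identifiers to avoid Cypher injection.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def merge_triple_query(subj_label, subj_name, predicate, obj_label, obj_name):
    """Build an idempotent Cypher statement and its parameters for one triple."""
    for ident in (subj_label, predicate, obj_label):
        if not _IDENT.match(ident):
            raise ValueError(f"unsafe identifier: {ident!r}")
    query = (
        f"MERGE (s:{subj_label} {{name: $subj_name}}) "
        f"MERGE (o:{obj_label} {{name: $obj_name}}) "
        f"MERGE (s)-[:{predicate}]->(o)"
    )
    params = {"subj_name": subj_name, "obj_name": obj_name}
    return query, params


# Hypothetical example triple; with the official Neo4j Python driver the
# result could be executed as session.run(query, params).
query, params = merge_triple_query(
    "Event", "Owner-caused delay", "TRIGGERS", "ClaimType", "Schedule extension claim"
)
```

Using `MERGE` rather than `CREATE` keeps repeated loads from duplicating nodes or relationships, which matters when the same entity appears in several claim documents.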

GraphRAG Retriever

A retrieval module that traverses the knowledge graph to extract structured, relational context relevant to user queries
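
The traversal step can be sketched without a database as a breadth-first expansion from entities mentioned in the question. Everything below is an assumption for illustration: the sample triples, the substring-based entity matching, and the hop limit are not taken from the paper, whose retriever runs against Neo4j.

```python
from collections import deque

# Hypothetical sample facts as (subject, predicate, object) triples.
TRIPLES = [
    ("Owner-caused delay", "TRIGGERS", "Schedule extension claim"),
    ("Schedule extension claim", "GOVERNED_BY", "Contract clause on delays"),
    ("Site diary", "EVIDENCES", "Owner-caused delay"),
    ("Contract clause on delays", "PART_OF", "General conditions"),
]


def match_seeds(question, triples):
    """Naive entity linking: entities whose name appears in the question."""
    q = question.lower()
    entities = {e for s, _, o in triples for e in (s, o)}
    return sorted(e for e in entities if e.lower() in q)


def retrieve_subgraph(triples, seeds, max_hops=1):
    """Collect triples reachable within max_hops of the seed entities."""
    # Edges are followed in both directions during traversal.
    neighbors = {}
    for triple in triples:
        s, _, o = triple
        neighbors.setdefault(s, []).append((o, triple))
        neighbors.setdefault(o, []).append((s, triple))

    visited = set(seeds)
    queue = deque((seed, 0) for seed in seeds)
    found = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for other, triple in neighbors.get(node, []):
            if triple not in found:
                found.append(triple)
            if other not in visited:
                visited.add(other)
                queue.append((other, depth + 1))
    return found
```

The hop limit is the main design lever here: a larger value pulls in more relational context (for example, the governing contract clause) at the cost of a longer, noisier prompt.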

LLM QA Generator

A large language model augmented with graph-retrieved context to generate accurate, grounded answers to construction claim questions
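
One plausible way to augment the generator is to serialize the retrieved triples into plain-language facts and place them ahead of the question. The prompt wording and function names below are illustrative assumptions; the paper's actual prompt template is not given in this summary, and the call to the LLM itself is left out.

```python
def triples_to_context(triples):
    """Render (subject, predicate, object) triples as one fact per line."""
    return "\n".join(
        f"- {s} {p.replace('_', ' ').lower()} {o}" for s, p, o in triples
    )


def build_prompt(question, triples):
    """Assemble a grounded QA prompt from graph-retrieved facts."""
    context = triples_to_context(triples) if triples else "(no graph facts retrieved)"
    return (
        "Answer the construction claim question using only the facts below.\n"
        "If the facts are insufficient, say so.\n\n"
        f"Facts:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

Instructing the model to rely only on the listed facts is what makes the answer traceable back to specific graph edges, which supports the explainability goal stated in the problem statement.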

Evaluation Suite

Multi-metric NLP evaluation using BLEU-4, BERT-Cosine similarity, ROUGE-1, and ROUGE-L to benchmark response quality
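
For readers unfamiliar with these metrics, the sketch below implements simplified ROUGE-1 and ROUGE-L as F1 scores over whitespace tokens, plus the cosine similarity that a BERT-based score would apply to sentence embeddings. These are teaching versions under stated assumptions: no stemming or special tokenization, the embedding model is out of scope, and BLEU-4 is omitted. The paper's exact metric implementations are not specified in this summary.

```python
import math
from collections import Counter


def _f1(overlap, cand_len, ref_len):
    """Harmonic mean of precision and recall from an overlap count."""
    if overlap == 0:
        return 0.0
    precision = overlap / cand_len
    recall = overlap / ref_len
    return 2 * precision * recall / (precision + recall)


def rouge_1_f1(candidate, reference):
    """Unigram overlap F1 (clipped counts) between candidate and reference."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    overlap = sum((Counter(cand) & Counter(ref)).values())
    return _f1(overlap, len(cand), len(ref))


def _lcs_len(a, b):
    """Length of the longest common subsequence via row-wise DP."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """Longest-common-subsequence F1, sensitive to word order."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    return _f1(_lcs_len(cand, ref), len(cand), len(ref))


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (e.g. sentence embeddings)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0
```

The difference between the two ROUGE variants shows up when word order changes: "extension claim" against the reference "claim extension" scores 1.0 on ROUGE-1 but only 0.5 on ROUGE-L, because the longest common subsequence has length 1.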

Results

Metric                 | Base LLM | Vector RAG | GraphRAG (This Paper)
BLEU-4                 | Lower    | Moderate   | Highest
BERT-Cosine Similarity | Lower    | Moderate   | Highest
ROUGE-1                | Lower    | Moderate   | Highest
ROUGE-L                | Lower    | Moderate   | Highest

Key Takeaways

  • In this paper's construction claims evaluation, GraphRAG over a structured knowledge graph outperformed flat vector retrieval on all four metrics, which suggests that ontology investment may also pay off in other domains with rich relational structure (legal, engineering, medical)
  • A five-step ontology construction methodology with unified core classes is a replicable pattern for building domain KGs that support both knowledge reuse and downstream LLM augmentation
  • Neo4j serves as the storage and query layer for the domain knowledge graph in this pipeline, and the paper offers a worked example of integrating it with an LLM QA system in a vertical industry

Abstract

Traditional claim management relies heavily on manual analysis and expert judgment, resulting in inefficiencies, information omissions, and heightened risks of disputes. To address these challenges, this paper constructs a domain-specific ontology for construction engineering claims through a five-step process, organizing the relevant knowledge into five unified core classes. Based on this ontology, a knowledge graph is built and stored in Neo4j. The resulting knowledge graph-enhanced LLM question-answering system, evaluated using BLEU-4, BERT-Cosine similarity, ROUGE-1, and ROUGE-L metrics, demonstrates superior performance compared to both the base LLM and Vector RAG approaches. The results indicate that the proposed ontology effectively serves the purpose of knowledge sharing and reuse while providing practical support for construction claim management.

Generated on 2026-03-02 using Claude