Search-on-Graph: Iterative Informed Navigation for Large Language Model Reasoning on Knowledge Graphs
Problem Statement
LLMs struggle with knowledge-intensive multi-hop questions due to hallucinations, outdated knowledge, and long-tail fact gaps. Existing KGQA methods are brittle — SPARQL compilation fails without knowing available relations, large subgraph retrieval introduces noise, and agent-based parallel exploration causes exponential search space expansion. A principled, schema-agnostic navigation strategy is needed that balances precision and scalability.
Key Novelty
- A single unified 'Search' function that drives iterative, informed graph traversal without pre-planning full paths or retrieving entire subgraphs
- An 'observe-then-navigate' paradigm where the LLM inspects actual available relations from the current entity before deciding on the next hop, making reasoning grounded and adaptive
- Adaptive filtering for high-degree nodes that prevents search space explosion and seamlessly handles diverse KG schemas (Freebase and Wikidata) without fine-tuning
Evaluation Highlights
- +16% improvement over previous best methods on Wikidata benchmarks, demonstrating strong generalization to non-Freebase KG schemas
- State-of-the-art performance across six KGQA benchmarks spanning both Freebase and Wikidata without any fine-tuning of the LLM
Methodology
- At each reasoning step, the LLM is presented with the current entity and its actual available outgoing relations in the KG (observe phase), grounding the model in real graph structure
- The LLM selects the most relevant relation to follow based on the question context (navigate phase), executing a single-hop traversal via the Search function to retrieve the next entity/entities
- This observe-navigate loop repeats iteratively until the LLM determines it has gathered sufficient evidence to answer the question, with adaptive filtering applied at high-degree nodes to manage relation count
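The observe-navigate loop described above can be sketched in a few lines of Python over a toy in-memory graph. This is a minimal illustration, not the paper's implementation: the LLM decision points (`pick_relation`, `is_done`) are stubbed with simple heuristics, and the graph contents, entity names, and helper names are all hypothetical.

```python
# Toy KG: {entity: {relation: [target entities]}}
KG = {
    "Barack Obama": {"spouse": ["Michelle Obama"], "born_in": ["Honolulu"]},
    "Michelle Obama": {"born_in": ["Chicago"]},
    "Honolulu": {"located_in": ["Hawaii"]},
}

def search(entity):
    """Observe phase: return the relations actually available at `entity`."""
    return sorted(KG.get(entity, {}))

def traverse(entity, relation):
    """Navigate phase: follow a single hop along `relation`."""
    return KG.get(entity, {}).get(relation, [])

def answer(question, start_entity, pick_relation, is_done, max_hops=5):
    """Iterate observe -> navigate until the (stubbed) LLM stops."""
    frontier = [start_entity]
    for _ in range(max_hops):
        entity = frontier[0]
        relations = search(entity)        # observe the real options first
        if not relations or is_done(question, entity):
            return entity
        rel = pick_relation(question, entity, relations)
        frontier = traverse(entity, rel)  # commit to one grounded hop
    return frontier[0]

# Stub "LLM" for: where was Obama's spouse born?
pick = lambda q, e, rels: "spouse" if "spouse" in rels and "spouse" in q else "born_in"
done = lambda q, e: e in ("Chicago", "Hawaii")

print(answer("Where was Barack Obama's spouse born?", "Barack Obama", pick, done))
# -> Chicago
```

Because each hop is chosen only from relations that actually exist at the current node, the stub can never follow an invented edge, which is the core of the grounding argument.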
System Components
- Search function: a single, carefully designed function that retrieves the available relations from the current entity in the KG, serving as the unified interface between the LLM reasoning loop and the graph database
- Observe-then-navigate loop: an iterative reasoning loop in which the LLM first inspects the real relations available at the current graph node before committing to a traversal direction, preventing hallucinated or invalid path choices
- Adaptive filtering: a mechanism that filters and prioritizes relations at high-degree nodes, keeping the LLM context manageable and preventing exponential search-space expansion
- Schema-agnostic design: the framework operates without hard-coded schema assumptions, generalizing across heterogeneous KG schemas such as Freebase and Wikidata without modification
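The adaptive-filtering component can be illustrated with a simple stand-in: when a high-degree node exposes more relations than the LLM context can hold, rank them by lexical overlap with the question and keep the top-k. This keyword ranking is purely a sketch of the idea; the paper does not specify this scoring, and the function name, relation list, and cutoff below are hypothetical.

```python
def filter_relations(question, relations, k=5):
    """Keep the k relations most lexically relevant to the question.

    Ties preserve the original (graph) order, since Python's sort is stable.
    """
    q_tokens = set(question.lower().replace("?", "").split())

    def overlap(rel):
        # Score a relation by shared tokens with the question.
        return len(q_tokens & set(rel.lower().replace("_", " ").split()))

    return sorted(relations, key=overlap, reverse=True)[:k]

# A high-degree entity might expose dozens of relations; only a few matter.
relations = [
    "capital", "population", "head_of_state", "currency",
    "anthem", "borders", "official_language", "time_zone",
]
top = filter_relations("What is the official language of France?", relations, k=3)
print(top)  # -> ['official_language', 'capital', 'population']
```

The point of the sketch is the interface, not the scorer: whatever ranking is used, the LLM only ever sees a bounded, relevant slice of a high-degree node's relations, so the per-step search space stays constant rather than growing with node degree.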
Results
| Benchmark | Previous Best | SoG (This Paper) | Delta |
|---|---|---|---|
| Wikidata benchmarks (avg) | Prior SOTA | New SOTA | +16% |
| Freebase benchmarks (avg) | Prior SOTA | New SOTA | Consistent improvement |
| Overall (6 benchmarks) | Prior SOTA | State-of-the-art | Positive across all |
Key Takeaways
- Grounding each LLM reasoning step in observed real KG relations (rather than pre-planned queries) substantially reduces hallucination and invalid path errors in multi-hop KGQA
- A single well-designed retrieval function can outperform complex parallel agent frameworks by avoiding exponential search space expansion — simpler architectures with smarter step-wise decisions can win
- The approach works zero-shot (no fine-tuning) across Freebase and Wikidata, making it a practical drop-in enhancement for any LLM-based QA system backed by a structured knowledge graph
Abstract
Large language models (LLMs) have demonstrated impressive reasoning abilities yet remain unreliable on knowledge-intensive, multi-hop questions: they miss long-tail facts, hallucinate when uncertain, and their internal knowledge lags behind real-world change. Knowledge graphs (KGs) offer a structured source of relational evidence, but existing KGQA methods face fundamental trade-offs: compiling complete SPARQL queries without knowing the available relations proves brittle, retrieving large subgraphs introduces noise, and complex agent frameworks with parallel exploration exponentially expand the search space. To address these limitations, we propose Search-on-Graph (SoG), a simple yet effective framework that enables LLMs to perform iterative, informed graph navigation using a single, carefully designed Search function. Rather than pre-planning paths or retrieving large subgraphs, SoG follows an "observe-then-navigate" principle: at each step, the LLM examines the actual relations available from the current entity before deciding on the next hop. This approach adapts seamlessly to different KG schemas and handles high-degree nodes through adaptive filtering. Across six KGQA benchmarks spanning Freebase and Wikidata, SoG achieves state-of-the-art performance without fine-tuning. We demonstrate particularly strong gains on Wikidata benchmarks (+16% improvement over previous best methods) alongside consistent improvements on Freebase benchmarks.