AutoTool: Efficient Tool Selection for Large Language Model Agents
Problem Statement
Current LLM agent frameworks such as ReAct invoke the LLM at every step to decide which tool to use next, incurring inference costs that make them expensive and slow at scale. This repeated querying creates a bottleneck that limits practical deployment of agents in cost-sensitive or latency-sensitive environments, and existing frameworks offer no efficient mechanism for exploiting the predictable, sequential nature of tool usage observed in real agent trajectories.
Key Novelty
- Discovery and formalization of "tool usage inertia": the empirical observation that tool invocations follow predictable sequential patterns in agent trajectories
- A directed graph structure built from historical trajectories where nodes are tools and weighted edges capture transition probabilities, enabling statistical tool selection without LLM calls
- Integration of parameter-level information into the graph framework to refine tool input generation alongside tool selection
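The transition graph at the core of these ideas can be sketched in a few lines. The trajectory format below (a list of tool-name sequences) is an illustrative assumption, not the paper's actual log schema:

```python
from collections import defaultdict

def build_transition_graph(trajectories):
    """Count tool-to-tool transitions in historical trajectories and
    normalize the counts into empirical transition probabilities.

    `trajectories` is assumed to be a list of tool-name sequences,
    e.g. [["search", "parse", "summarize"], ...].
    """
    counts = defaultdict(lambda: defaultdict(int))
    for traj in trajectories:
        # Each adjacent pair (prev_tool, next_tool) is one observed edge.
        for prev_tool, next_tool in zip(traj, traj[1:]):
            counts[prev_tool][next_tool] += 1

    # Normalize outgoing counts per node into transition probabilities.
    graph = {}
    for tool, successors in counts.items():
        total = sum(successors.values())
        graph[tool] = {nxt: c / total for nxt, c in successors.items()}
    return graph
```

The resulting mapping `{tool: {next_tool: probability}}` is a plain-dict stand-in for the paper's weighted directed graph.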
Evaluation Highlights
- AutoTool reduces LLM inference costs by up to 30% compared to inference-heavy baselines like ReAct across diverse agent tasks
- Task completion rates remain competitive with full LLM-based tool selection baselines, demonstrating the efficiency-performance tradeoff is favorable
Methodology
- Step 1 - Trajectory Mining: Collect historical agent execution trajectories and extract tool call sequences to identify sequential patterns and transition frequencies between tools
- Step 2 - Graph Construction: Build a directed weighted graph where each node is a tool, edges represent observed tool-to-tool transitions, and edge weights encode empirical transition probabilities; also attach parameter-level templates or constraints to nodes
- Step 3 - Inference-Time Tool Selection: At runtime, use the current tool context to traverse the graph probabilistically, selecting the next tool (and generating its parameters) based on transition probabilities rather than invoking the LLM at each step
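Step 3 could look roughly like the sketch below. The confidence threshold and the `llm_fallback` hook are illustrative assumptions; the paper only specifies that selection relies on transition probabilities with minimal LLM calls:

```python
def select_next_tool(graph, current_tool, threshold=0.6, llm_fallback=None):
    """Pick the next tool from the transition graph when the empirical
    evidence is strong enough; otherwise defer to the LLM.

    `graph` maps each tool to a dict of successor-tool probabilities,
    as produced by mining historical trajectories.
    """
    successors = graph.get(current_tool, {})
    if successors:
        best_tool = max(successors, key=successors.get)
        if successors[best_tool] >= threshold:
            return best_tool, "graph"  # zero-cost statistical choice
    # Sparse or ambiguous history: fall back to a full LLM decision.
    if llm_fallback is not None:
        return llm_fallback(current_tool), "llm"
    return None, "none"
```

Returning the decision source ("graph" vs. "llm") alongside the tool makes it easy to measure how often the statistical path actually avoids an LLM call.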
System Components
- Tool transition graph: a directed graph constructed from historical agent trajectories, where nodes are tools and edges with probabilistic weights capture how frequently one tool follows another, encoding tool usage inertia
- Parameter-level augmentation: augments graph nodes with parameter templates and contextual constraints to guide input generation for selected tools without requiring a full LLM call
- Graph-based tool selector: the runtime component that uses the current agent state and the last-used tool to traverse the transition graph and select the next tool with minimal or zero LLM invocations
- Trajectory mining pipeline: a preprocessing pipeline that parses past agent execution logs to extract the tool usage sequences used to build and weight the transition graph
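One way to realize the parameter-level augmentation is to mine the most frequent parameter signature per tool from the same logs; the `(tool, params)` record format here is hypothetical:

```python
from collections import Counter, defaultdict

def extract_parameter_templates(trajectories):
    """Derive a lightweight parameter template for each tool from
    historical calls, so a selected tool's inputs can be scaffolded
    without a full LLM call.

    `trajectories` is assumed to be a list of (tool_name, params_dict)
    sequences; the template is simply the most common parameter key set.
    """
    seen = defaultdict(Counter)
    for traj in trajectories:
        for tool, params in traj:
            # Use the sorted key set as a simple "template" signature.
            seen[tool][tuple(sorted(params))] += 1
    return {tool: list(c.most_common(1)[0][0]) for tool, c in seen.items()}
```

A real system would likely store richer constraints (types, value patterns) on each node; the key-set signature is the minimal version of the idea.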
Results
| Metric | Baseline (ReAct) | AutoTool | Delta |
|---|---|---|---|
| LLM Inference Cost | High (1 LLM call/step) | Reduced | Up to -30% |
| Task Completion Rate | Full LLM-based selection | Competitive | Minimal degradation |
| Tool Selection Accuracy | LLM-driven (high quality) | Graph-driven (near-competitive) | Slight tradeoff for efficiency |
| Scalability | Degrades with more tools | Scales via graph structure | Improved |
Key Takeaways
- Practitioners deploying ReAct-style agents at scale can integrate AutoTool as a drop-in efficiency layer by pre-mining historical trajectories, potentially cutting inference costs by ~30% with minimal impact on task performance
- Tool usage inertia is a practically useful inductive bias: if your agent tasks follow recurring procedural patterns, statistical modeling of tool transitions is a lightweight and effective alternative to repeated LLM reasoning
- The approach requires sufficient historical trajectory data to build a reliable transition graph, making it most suitable for mature agent deployments with stable tool ecosystems rather than highly dynamic or novel task environments
Abstract
Large Language Model (LLM) agents have emerged as powerful tools for automating complex tasks by leveraging the reasoning and decision-making abilities of LLMs. However, a major bottleneck in current agent frameworks lies in the high inference cost of tool selection, especially in approaches like ReAct that repeatedly invoke the LLM to determine which tool to use at each step. In this work, we propose AutoTool, a novel graph-based framework that bypasses repeated LLM inference by exploiting a key empirical observation: tool usage inertia—the tendency of tool invocations to follow predictable sequential patterns. AutoTool constructs a directed graph from historical agent trajectories, where nodes represent tools and edges capture transition probabilities, effectively modeling the inertia in tool selection. It further integrates parameter-level information to refine tool input generation. By traversing this structured representation, AutoTool efficiently selects tools and their parameters with minimal reliance on LLM inference. Extensive experiments across diverse agent tasks demonstrate that AutoTool reduces inference costs by up to 30% while maintaining competitive task completion rates, offering a practical and scalable enhancement for inference-heavy frameworks. Our work highlights the promise of integrating statistical structure into LLM agent design for greater efficiency without sacrificing performance.