iCLP: Large Language Model Reasoning with Implicit Cognition Latent Planning
Problem Statement
Explicit textual planning in LLMs suffers from hallucinations and poor generalization across diverse task-specific questions, making step-by-step reasoning unreliable. Current chain-of-thought approaches require verbose intermediate plans that are difficult to generate accurately and add computational overhead. There is a gap between how humans leverage subconscious, generalized reasoning patterns and how LLMs perform deliberate, explicit planning.
Key Novelty
- Introduction of latent plans (LPs) as discrete, compact encodings of reasoning instructions learned via a vector-quantized autoencoder with a codebook, enabling planning in latent space while reasoning in language space
- A three-stage pipeline: distilling explicit plans from reasoning trajectories, learning discrete representations via VQ-autoencoder, and fine-tuning LLMs on paired latent plan–reasoning step data
- Strong cross-domain generalization of latent plans across different task types (math reasoning and code generation) while preserving chain-of-thought interpretability
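The core of the novelty is the vector-quantization step: snapping each continuous plan embedding to its nearest entry in a finite codebook, which yields the discrete latent-plan codes. A minimal sketch of that step, with a random codebook standing in for the learned one (all sizes here are illustrative, not the paper's):

```python
# Sketch of the vector-quantization step behind latent plans: each
# continuous plan embedding is mapped to its nearest codebook entry,
# yielding a discrete latent-plan code. The codebook is random here
# for illustration; in iCLP it is learned with the autoencoder.
import numpy as np

rng = np.random.default_rng(0)
codebook = rng.normal(size=(64, 16))   # 64 latent-plan codes, dim 16

def quantize(z):
    """Map continuous plan embeddings z of shape (n, 16) to discrete codes."""
    # Squared Euclidean distance from each embedding to each code vector
    d = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    idx = d.argmin(axis=1)             # discrete latent-plan indices
    return idx, codebook[idx]          # indices and quantized vectors

z = rng.normal(size=(3, 16))           # e.g. encoder outputs for 3 plans
idx, z_q = quantize(z)
```

Because the output is an index into a finite vocabulary, the LLM can later treat each plan as a single discrete token rather than a verbose text span.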
Evaluation Highlights
- Significant accuracy improvements over baselines on mathematical reasoning benchmarks, demonstrating that latent planning outperforms both no-planning and explicit textual planning approaches
- Improved efficiency (reduced token overhead) compared to explicit textual plan generation, with strong cross-domain transfer between mathematical reasoning and code generation tasks
Methodology
- Step 1 - Plan Distillation: Extract and distill explicit textual plans from existing step-by-step reasoning trajectories (e.g., chain-of-thought solutions) to create a supervised plan corpus
- Step 2 - Latent Representation Learning: Train a vector-quantized (VQ) autoencoder to encode explicit plans into discrete latent codes stored in a codebook, learning compact generalizable plan representations
- Step 3 - LLM Fine-tuning with Latent Plans: Fine-tune the LLM on paired data (latent plan tokens + corresponding reasoning steps), teaching the model to generate appropriate latent plan codes before producing reasoning chains
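The output of Step 3 is paired supervision in which a discrete plan code precedes the reasoning it guides. A hedged sketch of what one such training example might look like; the `<LP_k>` special-token naming and the exact pairing format are assumptions, not the paper's specification:

```python
# Illustrative sketch of Step 3's data pairing (the "<LP_k>" special
# tokens are an assumed format, not the paper's exact one): each
# distilled plan is replaced by its discrete codebook index, and the
# fine-tuning target interleaves the latent-plan token with the
# original reasoning steps.

def make_training_pair(question, plan_code, reasoning_steps):
    """Build one supervised example: latent plan token first, then reasoning."""
    lp_token = f"<LP_{plan_code}>"        # one special token per codebook entry
    target = lp_token + " " + " ".join(reasoning_steps)
    return {"prompt": question, "target": target}

pair = make_training_pair(
    "What is 12 * 7?",
    plan_code=5,                          # index produced by the VQ codebook
    reasoning_steps=["12 * 7 = 84.", "The answer is 84."],
)
```

Fine-tuning on such pairs teaches the model to emit a plan code before its chain of thought, so at inference time planning happens in latent space while the reasoning itself stays in language space.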
System Components
- Implicit Cognition (IC): conceptual framework borrowed from cognitive science, referring to subconscious, generalizable decision-guiding patterns that do not require explicit verbalization; motivates compact latent planning
- Plan distillation module: extracts structured, explicit plans from existing step-by-step reasoning trajectories to create training supervision for the latent space
- VQ-autoencoder: encodes explicit textual plans into discrete latent codes via a learned codebook, enabling compact and generalizable plan representations
- Codebook: a finite set of learned discrete latent vectors that serve as the vocabulary of latent plans, allowing the LLM to select and compose reasoning strategies
- Latent-plan-conditioned LLM: the LLM fine-tuned to emit latent plan tokens before generating reasoning steps, enabling implicit planning that guides chain-of-thought without explicit text plans
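How the components fit together can be shown with a toy end-to-end view; the tiny string "codebook", the lookup functions, and the exact-match encoding are all illustrative stand-ins for the learned VQ encoder and decoder, not the paper's artifacts:

```python
# Toy end-to-end view of the components above (all names and the tiny
# "codebook" are illustrative stand-ins): a plan is encoded to a
# discrete code, the fine-tuned LLM would emit that code as a special
# token before reasoning, and the code can be decoded back to plan
# text for inspection, which is what keeps latent planning interpretable.

PLAN_CODEBOOK = {                      # discrete code -> distilled plan text
    0: "identify knowns; set up equation; solve; verify",
    1: "write function signature; handle edge cases; implement; test",
}

def encode_plan(plan_text):
    """Exact-match lookup, standing in for the learned VQ encoder."""
    # The real encoder maps many similar plans onto one shared code.
    for code, text in PLAN_CODEBOOK.items():
        if text == plan_text:
            return code
    raise KeyError("plan not in codebook")

def decode_code(code):
    """Recover the plan a latent token stands for (interpretability)."""
    return PLAN_CODEBOOK[code]

code = encode_plan("identify knowns; set up equation; solve; verify")
```

The decode direction is what distinguishes this design from fully opaque latent reasoning: any emitted plan token can be mapped back to a human-readable plan.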
Results
| Metric/Benchmark | Baseline (Explicit Text Plan / CoT) | iCLP (Latent Plan) | Delta |
|---|---|---|---|
| Mathematical Reasoning Accuracy | Standard CoT baseline | Significant improvement | Positive, substantial |
| Code Generation Accuracy | Standard CoT baseline | Significant improvement | Positive, substantial |
| Cross-domain Generalization | Poor transfer | Strong transfer | Large qualitative gain |
| Planning Token Efficiency | High (verbose text plans) | Low (compact latent codes) | Reduced overhead |
Key Takeaways
- Replacing verbose textual plans with compact discrete latent codes (via VQ-autoencoders) is a viable and effective way to inject planning capability into LLMs while reducing hallucination risk in plan generation
- The three-stage pipeline (distill → encode → fine-tune) is modular and could be applied to other reasoning domains beyond math and code, making it a reusable recipe for practitioners looking to improve LLM reasoning
- Latent planning preserves chain-of-thought interpretability in the output while abstracting the planning process, offering a practical middle ground between opaque latent reasoning and fragile explicit textual planning
Abstract
Large language models (LLMs), when guided by explicit textual plans, can perform reliable step-by-step reasoning during problem-solving. However, generating accurate and effective textual plans remains challenging due to LLM hallucinations and the high diversity of task-specific questions. To address this, we draw inspiration from human Implicit Cognition (IC), the subconscious process by which decisions are guided by compact, generalized patterns learned from past experiences without requiring explicit verbalization. We propose iCLP, a novel framework that enables LLMs to adaptively generate latent plans (LPs), which are compact encodings of effective reasoning instructions. iCLP first distills explicit plans from existing step-by-step reasoning trajectories. It then learns discrete representations of these plans via a vector-quantized autoencoder coupled with a codebook. Finally, by fine-tuning LLMs on paired latent plans and corresponding reasoning steps, the models learn to perform implicit planning during reasoning. Experimental results on mathematical reasoning and code generation tasks demonstrate that, with iCLP, LLMs can plan in latent space while reasoning in language space. This approach yields significant improvements in both accuracy and efficiency and, crucially, demonstrates strong cross-domain generalization while preserving the interpretability of chain-of-thought reasoning.