Efficient Text Classification with Conformal In-Context Learning

Ippokratis Pantelidis, Korbinian Randl, Aron Henriksson
arXiv.org | 2025
Conformal In-Context Learning (CICLe) combines a lightweight base classifier with Conformal Prediction to adaptively reduce candidate classes before LLM prompting, achieving efficient and accurate text classification across diverse NLP benchmarks.

Problem Statement

LLM-based text classification is expensive due to long prompts with many candidate classes, and performance is highly sensitive to prompt design. Existing few-shot prompting approaches scale poorly with the number of classes and incur high computational costs. CICLe was previously validated only in a single domain, leaving its generalizability and efficiency benefits unproven.

Key Novelty

  • Comprehensive multi-benchmark evaluation of CICLe across diverse NLP classification tasks, establishing its generalizability beyond a single domain
  • Demonstration that CICLe reduces prompt length by up to 25.16% and number of shots by up to 34.45% through adaptive class-set reduction via Conformal Prediction
  • Empirical evidence that CICLe is especially advantageous for highly class-imbalanced datasets and enables competitive performance with smaller LLMs

Evaluation Highlights

  • CICLe reduces the number of in-context shots and prompt length by up to 34.45% and 25.16% respectively compared to full few-shot prompting baselines
  • CICLe consistently outperforms its base classifier and few-shot prompting baselines when sufficient training data is available, and matches them in low-data regimes

Breakthrough Assessment

4/10. The paper is a solid empirical contribution that broadens validation of an existing framework (CICLe) across multiple benchmarks and quantifies efficiency gains, but it does not introduce fundamentally new methods or paradigm-shifting ideas.

Methodology

  1. Train a lightweight base classifier (e.g., logistic regression or small neural model) on available labeled data for the target classification task
  2. Apply Conformal Prediction on the base classifier's output to construct a prediction set that contains the true class with high probability, reducing the candidate class set per input
  3. Use only the reduced candidate classes to construct a shorter, more focused few-shot LLM prompt, then query the LLM for the final classification decision
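The three steps above can be sketched end to end in pure Python. Everything here (class names, calibration probabilities, example texts, the `alpha` value) is invented for illustration; the paper trains a real base classifier and queries an actual LLM with the resulting prompt.

```python
import math

def conformal_threshold(calib_true_probs, alpha=0.1):
    """Split conformal prediction: nonconformity score is 1 - p(true class).
    Returns the finite-sample-corrected (1 - alpha)-quantile threshold."""
    scores = sorted(1.0 - p for p in calib_true_probs)
    n = len(scores)
    k = min(n - 1, math.ceil((n + 1) * (1 - alpha)) - 1)
    return scores[k]

def prediction_set(class_probs, qhat):
    """Step 2: keep every class whose nonconformity score is within qhat."""
    return [c for c, p in class_probs.items() if 1.0 - p <= qhat]

def build_prompt(text, candidates, shots_per_class):
    """Step 3: few-shot prompt restricted to the conformal candidate set."""
    lines = [f"Classify into one of: {', '.join(candidates)}."]
    for c in candidates:
        for ex in shots_per_class.get(c, []):
            lines.append(f"Text: {ex} -> Label: {c}")
    lines.append(f"Text: {text} -> Label:")
    return "\n".join(lines)

# Step 1 is assumed done: these are toy base-classifier probabilities
# assigned to the true class on a held-out calibration split.
calib = [0.9, 0.8, 0.3, 0.95, 0.7, 0.5, 0.92, 0.6, 0.75, 0.85]
qhat = conformal_threshold(calib, alpha=0.1)

# Toy test input: base-classifier probabilities over all four classes.
probs = {"sports": 0.05, "politics": 0.55, "tech": 0.35, "health": 0.05}
candidates = prediction_set(probs, qhat)
prompt = build_prompt("Parliament passed the new budget.", candidates,
                      {"politics": ["The senate voted today."],
                       "tech": ["A new chip was announced."]})
print(candidates)  # -> ['politics', 'tech']
```

Only two of the four classes survive the conformal filter, so the final prompt carries shots for two classes instead of four, which is exactly where the shot and prompt-length savings come from.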

System Components

Lightweight Base Classifier

A traditional or small ML classifier trained on labeled data that provides initial class probability estimates for each input

Conformal Prediction Module

A statistically grounded uncertainty quantification method that converts base classifier scores into prediction sets with a user-specified marginal coverage guarantee, filtering out unlikely classes
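The coverage guarantee can be checked numerically. In this sketch, uniform random draws stand in for real nonconformity scores (an assumption for illustration, not the paper's setup): calibrating the threshold on one split should yield roughly 1 - alpha coverage on fresh scores.

```python
import math
import random

# Toy numerical check of the split-conformal coverage guarantee.
# Uniform scores are a stand-in for real nonconformity scores.
random.seed(0)
alpha = 0.1
n_calib, n_test = 500, 2000

# Calibration: finite-sample-corrected (1 - alpha) quantile of the scores.
calib = sorted(random.random() for _ in range(n_calib))
k = min(n_calib - 1, math.ceil((n_calib + 1) * (1 - alpha)) - 1)
qhat = calib[k]

# Fresh test scores: the fraction falling under the threshold is the
# empirical coverage of the resulting prediction sets.
test_scores = [random.random() for _ in range(n_test)]
coverage = sum(s <= qhat for s in test_scores) / n_test
print(round(coverage, 3))  # should land near the 1 - alpha = 0.9 target
```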

Adaptive LLM Prompting

Constructs few-shot prompts using only the conformal prediction set of candidate classes, reducing prompt length and computational cost while maintaining accuracy
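A small sketch of why pruning classes shortens the prompt: with one shot per class (all texts and class names invented here), dropping classes from the conformal set removes their shots, and the prompt shrinks roughly in proportion.

```python
# Toy comparison of a full-class prompt vs. a conformal-reduced prompt.
SHOTS = {
    "sports":   ["The striker scored twice."],
    "politics": ["The senate voted today."],
    "tech":     ["A new chip was announced."],
    "health":   ["The vaccine trial succeeded."],
}

def make_prompt(classes, query):
    """One-shot-per-class few-shot prompt over the given class list."""
    lines = [f"Classify into one of: {', '.join(classes)}."]
    for c in classes:
        for ex in SHOTS[c]:
            lines.append(f"Text: {ex} -> Label: {c}")
    lines.append(f"Text: {query} -> Label:")
    return "\n".join(lines)

query = "Parliament passed the new budget."
full = make_prompt(list(SHOTS), query)
reduced = make_prompt(["politics", "tech"], query)  # toy conformal set
saving = 1 - len(reduced) / len(full)
print(f"{saving:.0%} shorter")  # fewer shots means a shorter prompt
```

The percentages reported in the paper (up to 34.45% fewer shots, up to 25.16% shorter prompts) are averages over real benchmarks, not this toy setup.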

Results

| Metric/Benchmark | Baseline (Few-shot LLM) | CICLe | Delta |
| --- | --- | --- | --- |
| Number of shots | Full shot set | Up to 34.45% fewer shots | -34.45% |
| Prompt length | Full prompt | Up to 25.16% shorter | -25.16% |
| Classification accuracy (sufficient data) | Few-shot baseline | Outperforms baseline | Positive |
| Classification accuracy (low-data regime) | Few-shot baseline | Comparable performance | ~0 |
| Class-imbalanced tasks | Few-shot baseline | Particularly advantageous | Positive |

Key Takeaways

  • CICLe is a practical drop-in framework for reducing LLM API costs in text classification by pruning irrelevant classes before prompting, making it appealing for production deployments with many classes
  • Practitioners with sufficient labeled data (enough to train a base classifier) should prefer CICLe over naive few-shot prompting, especially for tasks with many or imbalanced classes
  • CICLe enables the use of smaller, cheaper LLMs while maintaining competitive accuracy, offering a cost-quality tradeoff lever that is highly relevant for resource-constrained settings

Abstract

Large Language Models (LLMs) demonstrate strong in-context learning abilities, yet their effectiveness in text classification depends heavily on prompt design and incurs substantial computational cost. Conformal In-Context Learning (CICLe) has been proposed as a resource-efficient framework that integrates a lightweight base classifier with Conformal Prediction to guide LLM prompting by adaptively reducing the set of candidate classes. However, its broader applicability and efficiency benefits beyond a single domain have not yet been systematically explored. In this paper, we present a comprehensive evaluation of CICLe across diverse NLP classification benchmarks. The results show that CICLe consistently improves over its base classifier and outperforms few-shot prompting baselines when the sample size is sufficient for training the base classifier, and performs comparably in low-data regimes. In terms of efficiency, CICLe reduces the number of shots and prompt length by up to 34.45% and 25.16%, respectively, and enables the use of smaller models with competitive performance. CICLe is furthermore particularly advantageous for text classification tasks with high class imbalance. These findings highlight CICLe as a practical and scalable approach for efficient text classification, combining the robustness of traditional classifiers with the adaptability of LLMs, and achieving substantial gains in data and computational efficiency.