
Efficient Context Selection for Long-Context QA: No Tuning, No Iteration, Just Adaptive-$k$

10 June 2025


Main: 9 Pages

5 Figures

Bibliography: 2 Pages

19 Tables

Appendix: 15 Pages

Abstract

Retrieval-augmented generation (RAG) and long-context language models (LCLMs) both address the context limitations of LLMs in open-domain question answering (QA). However, how much external context to retrieve remains an open problem: fixing the retrieval size risks either wasting tokens or omitting key evidence. Existing adaptive methods such as Self-RAG and Self-Route rely on iterative LLM prompting and perform well on factoid QA, but struggle with aggregation QA, where the optimal context size is both unknown and variable. We present Adaptive-$k$ retrieval, a simple and effective single-pass method that adaptively selects the number of passages based on the distribution of similarity scores between the query and the candidate passages. It requires no model fine-tuning, extra LLM inferences, or changes to existing retriever-reader pipelines. On both factoid and aggregation QA benchmarks, Adaptive-$k$ matches or outperforms fixed-$k$ baselines while using up to 10x fewer tokens than full-context input, yet still retrieves 70% of relevant passages. It improves accuracy across five LCLMs and two embedding models, highlighting that dynamically adjusting context size leads to more efficient and accurate QA.
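To make the idea of choosing the number of passages from the score distribution concrete, here is a minimal sketch. The abstract does not state the exact selection rule, so the largest-gap heuristic below is an assumption used only for illustration, and the names `adaptive_k`, `select_passages`, and `min_k` are hypothetical rather than taken from the paper or its code.

```python
from typing import List, Sequence


def adaptive_k(scores: Sequence[float], min_k: int = 1) -> int:
    """Choose k by cutting at the largest drop between consecutive sorted scores.

    Assumed heuristic for illustration; not confirmed as the paper's rule.
    """
    if not scores:
        return 0
    ranked = sorted(scores, reverse=True)
    if len(ranked) <= min_k:
        return len(ranked)
    # gaps[i] is the drop from the i-th best score to the (i+1)-th best score.
    gaps = [ranked[i] - ranked[i + 1] for i in range(len(ranked) - 1)]
    # Only consider cut points that keep at least min_k passages.
    start = max(min_k - 1, 0)
    best = max(range(start, len(gaps)), key=lambda i: gaps[i])
    return best + 1


def select_passages(
    passages: Sequence[str], scores: Sequence[float], min_k: int = 1
) -> List[str]:
    """Return the top-k passages, with k chosen from the score distribution."""
    k = adaptive_k(scores, min_k)
    order = sorted(range(len(passages)), key=lambda i: scores[i], reverse=True)
    return [passages[i] for i in order[:k]]
```

For example, with similarity scores `[0.91, 0.88, 0.86, 0.41, 0.39, 0.35]` the largest drop falls between the third and fourth scores, so this sketch keeps three passages. The whole selection is a single pass over precomputed scores and involves no additional LLM call, which is the property the abstract emphasizes.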
