
FunnelRAG: A Coarse-to-Fine Progressive Retrieval Paradigm for RAG

Abstract

Retrieval-Augmented Generation (RAG) is widely used with Large Language Models. It mainly consists of retrieval and generation: retrieval modules (a.k.a. retrievers) find useful information to support the generation modules (a.k.a. generators). As such, generator performance largely depends on the effectiveness and efficiency of the retrievers. However, the widely used retrieval paradigm remains flat: it treats retrieval as a one-off procedure at a constant granularity. Despite its effectiveness, we argue that flat retrieval suffers from two limitations: (1) it places a significant burden on a single retriever; (2) constant granularity limits the ceiling of retrieval performance. In this work, we propose a progressive retrieval paradigm with coarse-to-fine granularity for RAG, termed FunnelRAG, to balance effectiveness and efficiency. Specifically, FunnelRAG establishes a progressive retrieval pipeline that couples coarse-to-fine granularity, large-to-small quantity, and low-to-high capacity, which relieves the burden on any single retriever and also raises the ceiling of retrieval performance. Extensive experiments show that FunnelRAG achieves comparable retrieval performance while reducing time overhead by nearly 40 percent.
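
To make the paradigm concrete, below is a minimal, runnable Python sketch of a coarse-to-fine retrieval funnel. The number of stages, the candidate budgets, and the names and scoring functions (Unit, coarse_score, fine_score, funnel_retrieve) are illustrative assumptions for this sketch, not FunnelRAG's actual retrievers or implementation.

from dataclasses import dataclass, field
from typing import Callable, List, Tuple

@dataclass
class Unit:
    text: str                                              # retrieval unit at some granularity
    children: List["Unit"] = field(default_factory=list)   # finer-grained units inside it

def coarse_score(query: str, unit: Unit) -> float:
    # Cheap lexical overlap standing in for a low-capacity, coarse-grained retriever.
    q, d = set(query.lower().split()), set(unit.text.lower().split())
    return len(q & d) / (len(q) or 1)

def fine_score(query: str, unit: Unit) -> float:
    # Stand-in for a higher-capacity scorer (e.g., a dense retriever or reranker):
    # overlap normalized by unit length, so shorter, more focused units rank higher.
    q, d = set(query.lower().split()), unit.text.lower().split()
    return len(q & set(d)) / ((len(d) ** 0.5) or 1.0)

Scorer = Callable[[str, Unit], float]

def funnel_retrieve(query: str, corpus: List[Unit],
                    stages: List[Tuple[Scorer, int]]) -> List[Unit]:
    """Run candidates through successive (scorer, top_k) stages: each stage keeps
    its top_k survivors, then expands them into their finer-grained children for
    the next, higher-capacity stage."""
    candidates = corpus
    for i, (scorer, top_k) in enumerate(stages):
        ranked = sorted(candidates, key=lambda u: scorer(query, u), reverse=True)[:top_k]
        if i == len(stages) - 1:
            return ranked
        candidates = [child for u in ranked for child in (u.children or [u])]
    return candidates

if __name__ == "__main__":
    passages = [
        Unit("FunnelRAG builds a coarse-to-fine progressive retrieval pipeline"),
        Unit("flat retrieval uses a single retriever at constant granularity"),
        Unit("a recipe for cooking pasta with garlic and olive oil"),
        Unit("tuning hyperparameters of gradient boosted trees"),
    ]
    # Group pairs of passages into coarse "documents", purely for illustration.
    docs = [Unit(a.text + " " + b.text, [a, b])
            for a, b in zip(passages[::2], passages[1::2])]
    hits = funnel_retrieve("coarse-to-fine progressive retrieval", docs,
                           stages=[(coarse_score, 1), (fine_score, 1)])
    print([h.text for h in hits])

Each stage hands a smaller, finer-grained candidate set to a more capable (and more expensive) scorer, which is the effectiveness-efficiency trade-off the abstract describes.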

@article{zhao2025_2410.10293,
  title={FunnelRAG: A Coarse-to-Fine Progressive Retrieval Paradigm for RAG},
  author={Xinping Zhao and Yan Zhong and Zetian Sun and Xinshuo Hu and Zhenyu Liu and Dongfang Li and Baotian Hu and Min Zhang},
  journal={arXiv preprint arXiv:2410.10293},
  year={2025}
}