We present QuOTE (Question-Oriented Text Embeddings), a novel enhancement to retrieval-augmented generation (RAG) systems that improves document representation for more accurate and nuanced retrieval. Unlike traditional RAG pipelines, which embed raw text chunks, QuOTE augments each chunk with hypothetical questions that the chunk could answer, enriching the representation space. This augmentation better aligns document embeddings with user query semantics and helps address issues such as ambiguity and context-dependent relevance. Through extensive experiments across diverse benchmarks, we demonstrate that QuOTE significantly enhances retrieval accuracy, including in multi-hop question-answering tasks. Our findings highlight the versatility of question generation as a fundamental indexing strategy, opening new avenues for integrating question generation into retrieval-based AI pipelines.
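The indexing idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how questions are generated, how a question is combined with its chunk, or how results are ranked. Here the question generator is a hard-coded lookup (`toy_questions`, hypothetical, standing in for an LLM call), the embedding is a toy bag-of-words vector (`embed`, standing in for a real embedding model), each index entry embeds one question concatenated with its chunk, and retrieval deduplicates hits by chunk. All of these names and choices are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; an assumed stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(chunks, generate_questions):
    """Create one entry per (question, chunk) pair, each pointing back to its chunk."""
    index = []
    for chunk_id, chunk in enumerate(chunks):
        for question in generate_questions(chunk):
            # Assumption: embed the question together with the chunk text.
            index.append((embed(question + " " + chunk), chunk_id))
    return index


def retrieve(query, index, chunks, k=1):
    """Rank entries by similarity to the query and return up to k distinct chunks."""
    query_vec = embed(query)
    ranked = sorted(index, key=lambda entry: cosine(query_vec, entry[0]), reverse=True)
    seen, results = set(), []
    for _, chunk_id in ranked:
        if chunk_id in seen:
            continue  # several questions can map to the same chunk
        seen.add(chunk_id)
        results.append(chunks[chunk_id])
        if len(results) == k:
            break
    return results


chunks = [
    "The Eiffel Tower was completed in 1889 in Paris.",
    "Mount Everest rises 8,849 metres above sea level.",
]

# Hypothetical, hand-written questions; a real pipeline would generate these.
toy_questions = {
    chunks[0]: ["When was the Eiffel Tower built?", "Where is the Eiffel Tower?"],
    chunks[1]: ["How tall is Mount Everest?", "What is the height of Everest?"],
}

index = build_index(chunks, toy_questions.__getitem__)
print(retrieve("How tall is Everest?", index, chunks, k=1))
```

The index holds more entries than there are chunks (one per generated question), which is why the retrieval step deduplicates by chunk before returning results.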
@article{neeser2025_2502.10976,
  title={QuOTE: Question-Oriented Text Embeddings},
  author={Andrew Neeser and Kaylen Latimer and Aadyant Khatri and Chris Latimer and Naren Ramakrishnan},
  journal={arXiv preprint arXiv:2502.10976},
  year={2025}
}