RAS: Retrieval-And-Structuring for Knowledge-Intensive LLM Generation

16 February 2025

Abstract

Retrieval-augmented language models often struggle with knowledge-intensive tasks due to inefficient retrieval, unstructured knowledge integration, and single-pass architectures. We present Retrieval-And-Structuring (RAS), a novel framework that dynamically constructs and reasons over query-specific knowledge graphs through iterative retrieval and structuring. RAS introduces four key technical innovations: (1) a theme-scoped retrieval mechanism that efficiently narrows the search space while maintaining retrieval quality, (2) an action planning module that determines knowledge needs and generates focused sub-queries, (3) a dynamic knowledge structuring approach that converts retrieved text into an evolving knowledge graph, and (4) a graph-augmented answering component that leverages the accumulated structured information. Our framework achieves state-of-the-art performance, surpassing leading baselines by 6.4% with open-source language models and 7.0% with proprietary models on seven knowledge-intensive generation datasets across all evaluation metrics. Detailed ablation studies verify the contribution of each technical component to the overall system performance.
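The four components listed in the abstract can be pictured as a single loop: plan a sub-query, retrieve within a theme, structure the retrieved text into triples, and answer from the accumulated graph. The sketch below is only an illustration under strong assumptions and is not the authors' implementation. Every name in it (`Doc`, `theme_scoped_retrieve`, `structure`, `ras_loop`, `toy_plan`, `toy_answer`) is hypothetical, and the learned components of the paper are replaced by trivial stand-ins: word overlap for retrieval, a fixed `subject | relation | object` text format for structuring, and hand-written rules for planning and answering.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    theme: str  # coarse topic label used to scope retrieval (assumption)
    text: str   # toy format: "subject | relation | object"


def tokens(text):
    """Lowercased word set; a stand-in for a real text encoder."""
    return set(text.lower().replace("?", "").replace(".", "").split())


def theme_scoped_retrieve(query, theme, corpus, seen, k=1):
    """Narrow the corpus to one theme first, then rank by word overlap."""
    scoped = [d for d in corpus if d.theme == theme and d.text not in seen]
    ranked = sorted(scoped, key=lambda d: len(tokens(query) & tokens(d.text)),
                    reverse=True)
    return ranked[:k]


def structure(doc):
    """Turn retrieved text into a triple (stand-in for a text-to-graph model)."""
    subject, relation, obj = (part.strip() for part in doc.text.split("|"))
    return (subject, relation, obj)


def ras_loop(question, theme, corpus, plan, answer, max_steps=4):
    """Iterate plan -> retrieve -> structure until the planner is satisfied."""
    graph, seen = [], set()
    for _ in range(max_steps):
        sub_query = plan(question, graph)
        if sub_query is None:  # planner judges the graph sufficient
            break
        for doc in theme_scoped_retrieve(sub_query, theme, corpus, seen):
            seen.add(doc.text)
            graph.append(structure(doc))
    return answer(question, graph), graph


def toy_plan(question, graph):
    """Hand-written planner for one two-hop question (illustration only)."""
    authors = [o for _, r, o in graph if r == "author"]
    if not authors:
        return "Dune author"
    if not any(r == "born in" for _, r, _ in graph):
        return f"{authors[0]} born in"
    return None


def toy_answer(question, graph):
    """Read the answer off the graph (stand-in for graph-augmented generation)."""
    for _, relation, obj in graph:
        if relation == "born in":
            return obj
    return "unknown"


corpus = [
    Doc("literature", "Dune | author | Frank Herbert"),
    Doc("literature", "Frank Herbert | born in | Tacoma"),
    Doc("literature", "Neuromancer | author | William Gibson"),
    Doc("geography", "Dune | located in | Sahara"),  # off-theme distractor
]

result, graph = ras_loop("Where was the author of Dune born?", "literature",
                         corpus, toy_plan, toy_answer)
print(result)  # Tacoma
```

In this toy run the planner issues two sub-queries, the graph grows by one triple per step, and the off-theme document is never scored because scoping happens before ranking.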

@article{jiang2025_2502.10996,
  title={RAS: Retrieval-And-Structuring for Knowledge-Intensive LLM Generation},
  author={Pengcheng Jiang and Lang Cao and Ruike Zhu and Minhao Jiang and Yunyi Zhang and Jimeng Sun and Jiawei Han},
  journal={arXiv preprint arXiv:2502.10996},
  year={2025}
}
