Rethinking All Evidence: Enhancing Trustworthy Retrieval-Augmented Generation via Conflict-Driven Summarization
- RALM, HILM

Retrieval-Augmented Generation (RAG) enhances large language models (LLMs) by integrating their parametric knowledge with external retrieved content. However, knowledge conflicts caused by internal inconsistencies or noisy retrieved content can severely undermine the generation reliability of RAG systems. In this work, we argue that LLMs should rethink all evidence, including both retrieved content and internal knowledge, before generating responses. We propose CARE-RAG (Conflict-Aware and Reliable Evidence for RAG), a novel framework that improves trustworthiness through Conflict-Driven Summarization of all available evidence. CARE-RAG first derives parameter-aware evidence by comparing parameter records to identify diverse internal perspectives. It then refines the retrieved evidence to produce context-aware evidence, removing irrelevant or misleading content. To detect and summarize conflicts, we distill a 3B LLaMA3.2 model to perform conflict-driven summarization, enabling reliable synthesis across multiple sources. To further ensure evaluation integrity, we introduce a QA Repair step to correct outdated or ambiguous benchmark answers. Experiments on revised QA datasets with retrieval data show that CARE-RAG consistently outperforms strong RAG baselines, especially in scenarios with noisy or conflicting evidence.
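The abstract describes a multi-stage pipeline: probing parametric knowledge, filtering retrieved passages, and synthesizing both through a distilled conflict-driven summarizer. The sketch below is a minimal, hedged illustration of that flow, not the authors' implementation; all helper names (`llm`, `summarizer`), prompts, and signatures are hypothetical stand-ins.

```python
# Minimal sketch of the CARE-RAG flow described in the abstract.
# `llm` and `summarizer` are hypothetical callables mapping a prompt
# string to a completion string; swap in any real model client.

from typing import Callable, List

LLM = Callable[[str], str]

def derive_parameter_aware_evidence(llm: LLM, question: str,
                                    n_samples: int = 3) -> List[str]:
    """Probe parametric knowledge by sampling several closed-book
    answers, surfacing diverse internal perspectives."""
    prompt = f"Answer from your own knowledge only: {question}"
    return [llm(prompt) for _ in range(n_samples)]

def refine_retrieved_evidence(llm: LLM, question: str,
                              passages: List[str]) -> List[str]:
    """Filter retrieved passages down to context-aware evidence,
    dropping irrelevant or misleading content."""
    kept = []
    for p in passages:
        verdict = llm(f"Question: {question}\nPassage: {p}\n"
                      "Is this passage relevant and not misleading? yes/no")
        if verdict.strip().lower().startswith("yes"):
            kept.append(p)
    return kept

def conflict_driven_summary(summarizer: LLM, question: str,
                            internal: List[str],
                            external: List[str]) -> str:
    """Ask the distilled summarizer (a 3B LLaMA3.2 in the paper) to
    detect conflicts across evidence sources and synthesize a
    reliable summary."""
    evidence = "\n".join(f"[internal] {e}" for e in internal)
    evidence += "\n" + "\n".join(f"[retrieved] {e}" for e in external)
    return summarizer(f"Question: {question}\nEvidence:\n{evidence}\n"
                      "Identify conflicts and summarize the reliable facts.")

def care_rag_answer(llm: LLM, summarizer: LLM, question: str,
                    passages: List[str]) -> str:
    """End-to-end: rethink all evidence, then generate from the summary."""
    internal = derive_parameter_aware_evidence(llm, question)
    external = refine_retrieved_evidence(llm, question, passages)
    summary = conflict_driven_summary(summarizer, question, internal, external)
    return llm(f"Question: {question}\nEvidence summary: {summary}\nAnswer:")
```

The key design point the abstract emphasizes is that generation conditions on the conflict-aware summary rather than on raw retrieved text, so noisy or contradictory passages are reconciled before they reach the final answer step.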
@article{chen2025_2507.01281,
  title={Rethinking All Evidence: Enhancing Trustworthy Retrieval-Augmented Generation via Conflict-Driven Summarization},
  author={Juan Chen and Baolong Bi and Wei Zhang and Jingyan Sui and Xiaofei Zhu and Yuanzhuo Wang and Lingrui Mei and Shenghua Liu},
  journal={arXiv preprint arXiv:2507.01281},
  year={2025}
}