Recent advances in large language models (LLMs) have significantly improved multi-hop question answering (QA) through direct Chain-of-Thought (CoT) reasoning. However, the irreversible nature of CoT leads to error accumulation, making mistakes in multi-hop reasoning difficult to correct. This paper introduces ReAgent, a Reversible multi-Agent collaborative framework augmented with explicit backtracking mechanisms that enable reversible multi-hop reasoning. By incorporating text-based retrieval, information aggregation, and validation, our system can detect and correct errors mid-reasoning, leading to more robust and interpretable QA outcomes. Empirical evaluations across three benchmarks indicate ReAgent's efficacy, yielding an average improvement of about 6\% over baseline models. The framework and experiments serve as a foundation for future work on error-tolerant QA systems.
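As a rough illustration of what reversible multi-hop reasoning with explicit backtracking could look like, the following minimal Python sketch commits one candidate fact per hop, validates the partial chain, and undoes the most recent hop when validation fails. It is not the ReAgent implementation: the function names `reversible_reasoning` and `links_connect`, the `"X->Y"` fact encoding, and the single-process depth-first search are all assumptions made for this example, and the retrieval and multi-agent components described above are replaced by a fixed list of candidates.

```python
from typing import Callable, List, Optional


def reversible_reasoning(
    candidates_per_hop: List[List[str]],
    is_consistent: Callable[[List[str]], bool],
) -> Optional[List[str]]:
    """Search for a consistent chain with one fact per hop.

    Hypothetical helper: each hop tentatively commits a candidate fact
    and rolls it back if the chain cannot be completed consistently.
    """
    chain: List[str] = []

    def step(hop: int) -> bool:
        if hop == len(candidates_per_hop):
            return True  # every hop has a validated fact
        for fact in candidates_per_hop[hop]:
            chain.append(fact)  # tentatively commit this hop
            if is_consistent(chain) and step(hop + 1):
                return True
            chain.pop()  # backtrack: undo the commitment
        return False

    return chain if step(0) else None


def links_connect(chain: List[str]) -> bool:
    """Toy validator (assumption): the entity ending one link must start the next."""
    return all(
        a.split("->")[1] == b.split("->")[0]
        for a, b in zip(chain, chain[1:])
    )


# The first hop initially picks "A->B"; no second-hop fact continues from B,
# so that choice is rolled back and "A->C" is tried instead.
hops = [["A->B", "A->C"], ["C->D"]]
answer = reversible_reasoning(hops, links_connect)
```

In this toy run the error in the first hop is only exposed at the second hop, which is the situation where a purely forward CoT chain would have no way to recover.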
@article{xinjie2025_2503.06951,
  title={ReAgent: Reversible Multi-Agent Reasoning for Knowledge-Enhanced Multi-Hop QA},
  author={Zhao Xinjie and Fan Gao and Rui Yang and Yingjian Chen and Yuyang Wang and Ying Zhu and Jiacheng Tang and Irene Li},
  journal={arXiv preprint arXiv:2503.06951},
  year={2025}
}