An Analysis of Decoding Methods for LLM-based Agents for Faithful Multi-Hop Question Answering

Abstract

Large Language Models (LLMs) frequently produce factually inaccurate outputs, a phenomenon known as hallucination, which limits their accuracy on knowledge-intensive NLP tasks. Retrieval-augmented generation and agentic frameworks such as Reasoning and Acting (ReAct) can address this issue by giving the model access to external knowledge. However, LLMs often fail to remain faithful to the retrieved information. Mitigating this failure mode is critical, especially when LLMs are required to reason over the retrieved information. Recent research has explored training-free decoding strategies that improve the faithfulness of model generations. We present a systematic analysis of how combining the ReAct framework with such decoding strategies (namely, DeCoRe, DoLa, and CAD) influences the faithfulness of LLM-generated answers. Our results show that combining an agentic framework for knowledge retrieval with decoding methods that enhance faithfulness can increase accuracy on downstream multi-hop question answering tasks. For example, we observe an F1 increase from 19.5 to 32.6 on HotpotQA when using ReAct and DoLa.

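To make the decoding side of this combination concrete, below is a minimal sketch of context-aware decoding (CAD)-style contrastive logit adjustment, which penalizes next-token choices the model would also make without the retrieved evidence. The model name, prompts, and the weight alpha are illustrative assumptions, not the paper's exact configuration; the adjustment formula follows the published CAD formulation.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder model; the paper evaluates larger LLMs
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def cad_next_token(context: str, question: str, alpha: float = 1.0) -> int:
    """Greedily pick the next token by contrasting logits computed with and
    without the retrieved context: (1 + alpha) * logits(y | context, question)
    minus alpha * logits(y | question) upweights context-faithful tokens."""
    with_ctx = tokenizer(context + "\n" + question, return_tensors="pt")
    without_ctx = tokenizer(question, return_tensors="pt")
    with torch.no_grad():
        logits_ctx = model(**with_ctx).logits[0, -1]    # conditioned on evidence
        logits_plain = model(**without_ctx).logits[0, -1]  # parametric knowledge only
    adjusted = (1 + alpha) * logits_ctx - alpha * logits_plain
    return int(torch.argmax(adjusted))

In an agentic setup such as ReAct, a step like this would replace standard greedy decoding when generating the final answer from retrieved passages; alpha controls how strongly the retrieved context is favored over the model's parametric knowledge.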
@article{murphy2025_2503.23415,
  title={An Analysis of Decoding Methods for LLM-based Agents for Faithful Multi-Hop Question Answering},
  author={Alexander Murphy and Mohd Sanad Zaki Rizvi and Aden Haussmann and Ping Nie and Guifu Liu and Aryo Pradipta Gema and Pasquale Minervini},
  journal={arXiv preprint arXiv:2503.23415},
  year={2025}
}