
DebUnc: Improving Large Language Model Agent Communication With Uncertainty Metrics

8 July 2024


Abstract

Multi-agent debates have been introduced to improve the accuracy of Large Language Models (LLMs) by having multiple agents discuss solutions to a problem over several rounds of debate. However, models often generate incorrect yet confident-sounding responses, which can mislead others. This issue arises partly because agents do not consider how confident their peers are. To address this, we propose DebUnc, a debate framework that uses uncertainty metrics to assess agent confidence. Confidence is then conveyed through a modified attention mechanism that adjusts token weights, or through textual prompts. Evaluations across benchmarks show that attention-based methods are particularly effective and that performance continues to improve as uncertainty estimation becomes more reliable. The code is available at this https URL.
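
The abstract does not spell out how token weights are adjusted, so the following is only a minimal sketch of one plausible reading: after the usual softmax, the attention weight on each token written by a peer agent is multiplied by that agent's confidence, and the weights are renormalized to sum to 1. The function names, the `token_owner` labeling, and the choice of a multiplicative confidence factor are all assumptions made for illustration, not the paper's implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw attention scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def confidence_scaled_attention(scores, token_owner, confidence):
    """Rescale one query's attention weights by per-agent confidence.

    scores      -- raw attention scores over the context tokens
    token_owner -- for each token, the id of the agent that wrote it,
                   or None for tokens not from a peer (hypothetical labeling)
    confidence  -- dict mapping agent id to a positive confidence factor
                   (assumed to come from some uncertainty metric)
    """
    weights = softmax(scores)
    # Tokens without a known owner keep a neutral factor of 1.0.
    scaled = [w * confidence.get(owner, 1.0)
              for w, owner in zip(weights, token_owner)]
    total = sum(scaled)
    # Renormalize so the adjusted weights form a distribution again.
    return [s / total for s in scaled]


if __name__ == "__main__":
    adjusted = confidence_scaled_attention(
        scores=[0.0, 0.0, 0.0, 0.0],
        token_owner=[None, "agent_a", "agent_b", "agent_b"],
        confidence={"agent_a": 3.0, "agent_b": 1.0},
    )
    print(adjusted)
```

In this sketch a more confident agent's tokens receive a larger share of attention while the total stays normalized. The textual-prompt alternative mentioned in the abstract would instead state each agent's confidence in the prompt and leave attention untouched.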

