Judging with Many Minds: Do More Perspectives Mean Less Prejudice?

26 May 2025
Chiyu Ma
Enpei Zhang
Yilun Zhao
Wenjun Liu
Yaning Jia
Peijun Qing
Lin Shi
Arman Cohan
Yujun Yan
Soroush Vosoughi
    LLMAG, ELM
Main: 8 pages
Figures: 25
Bibliography: 4 pages
Tables: 13
Appendix: 24 pages
Abstract

LLM-as-Judge has emerged as a scalable alternative to human evaluation, enabling large language models (LLMs) to provide reward signals during training. While recent work has explored multi-agent extensions such as multi-agent debate and meta-judging to enhance evaluation quality, the question of how intrinsic biases manifest in these settings remains underexplored. In this study, we conduct a systematic analysis of four diverse bias types: position bias, verbosity bias, chain-of-thought bias, and bandwagon bias. We evaluate these biases across two widely adopted multi-agent LLM-as-Judge frameworks: Multi-Agent-Debate and LLM-as-Meta-Judge. Our results show that the debate framework amplifies biases sharply after the initial debate round, and that this increased bias is sustained in subsequent rounds, while meta-judge approaches exhibit greater resistance. We further investigate the incorporation of PINE, a leading single-agent debiasing method, as a bias-free agent within these systems. The results reveal that this bias-free agent effectively reduces biases in debate settings but provides less benefit in meta-judge scenarios. Our work provides a comprehensive study of bias behavior in multi-agent LLM-as-Judge systems and highlights the need for targeted bias mitigation strategies in collaborative evaluation settings.
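
As a rough illustration of one of the bias types named above, position bias in a pairwise judge can be probed by presenting the same two answers in both orders and checking whether the preferred answer changes. The sketch below is an assumption for illustration only, not the metric or protocol used in the paper: the `Judge` signature, the `position_flip_rate` function, and the two toy judges are all hypothetical names introduced here.

```python
from typing import Callable, List, Tuple

# Hypothetical judge interface (an assumption, not from the paper):
# takes (question, first_answer, second_answer) and returns
# "first" or "second" for the answer it prefers.
Judge = Callable[[str, str, str], str]


def position_flip_rate(judge: Judge, items: List[Tuple[str, str, str]]) -> float:
    """Fraction of items where the judge's preferred answer changes
    when the two candidate answers are shown in swapped order."""
    if not items:
        raise ValueError("items must be non-empty")
    flips = 0
    for question, answer_a, answer_b in items:
        original = judge(question, answer_a, answer_b)  # answer_a shown first
        swapped = judge(question, answer_b, answer_a)   # answer_b shown first
        # Map the positional verdicts back to the underlying answers.
        picked_original = answer_a if original == "first" else answer_b
        picked_swapped = answer_b if swapped == "first" else answer_a
        if picked_original != picked_swapped:
            flips += 1
    return flips / len(items)


def always_first(question: str, first: str, second: str) -> str:
    """Toy judge that always prefers whatever is shown first."""
    return "first"


def longer_answer(question: str, first: str, second: str) -> str:
    """Toy judge that prefers the longer answer regardless of position."""
    return "first" if len(first) >= len(second) else "second"


items = [("q", "short", "a much longer answer")]
always_first_rate = position_flip_rate(always_first, items)
longer_answer_rate = position_flip_rate(longer_answer, items)
```

A judge that depends only on position flips on every item, while the length-based toy judge never flips under swapping. That second judge is still biased, just toward verbosity rather than position, which is one reason the bias types are analyzed separately.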

@article{ma2025_2505.19477,
  title={Judging with Many Minds: Do More Perspectives Mean Less Prejudice?},
  author={Chiyu Ma and Enpei Zhang and Yilun Zhao and Wenjun Liu and Yaning Jia and Peijun Qing and Lin Shi and Arman Cohan and Yujun Yan and Soroush Vosoughi},
  journal={arXiv preprint arXiv:2505.19477},
  year={2025}
}