Beyond Performance: Quantifying and Mitigating Label Bias in LLMs

North American Chapter of the Association for Computational Linguistics (NAACL), 2024
4 May 2024
Philipp Benz
Maitreya Patel
arXiv: 2405.02743

Papers citing "Beyond Performance: Quantifying and Mitigating Label Bias in LLMs"

14 / 14 papers shown
Improving Score Reliability of Multiple Choice Benchmarks with Consistency Evaluation and Altered Answer Choices
Paulo Cavalin
Cassia Sanctos
Marcelo Grave
Claudio S. Pinhanez
Yago Primerano
8
0
0
26 Nov 2025
Quantifying and Mitigating Selection Bias in LLMs: A Transferable LoRA Fine-Tuning and Efficient Majority Voting Approach
Blessed Guda
Lawrence Francis
Gabrial Zencha A.
Carlee Joe-Wong
Moise Busogi
8
0
0
17 Nov 2025
Person-Centric Annotations of LAION-400M: Auditing Bias and Its Transfer to Models
Leander Girrbach
Stephan Alaniz
Genevieve Smith
Trevor Darrell
Zeynep Akata
181
1
0
04 Oct 2025
Hearing the Order: Investigating Selection Bias in Large Audio-Language Models
Yu-Xiang Lin
Chen-An Li
Sheng-Lun Wei
Po-Chun Chen
Hsin-Hsi Chen
Hung-yi Lee
108
0
0
01 Oct 2025
Metric assessment protocol in the context of answer fluctuation on MCQ tasks
Ekaterina Goliakova
X. Renard
Marie-Jeanne Lesot
Thibault Laugel
Christophe Marsala
Marcin Detyniecki
107
0
0
21 Jul 2025
PromptSuite: A Task-Agnostic Framework for Multi-Prompt Generation
Eliya Habba
Noam Dahan
Gili Lior
Gabriel Stanovsky
LRM
310
1
0
20 Jul 2025
SATA-BENCH: Select All That Apply Benchmark for Multiple Choice Questions
Weijie Xu
Shixian Cui
Xi Fang
Chi Xue
Stephanie Eckman
Chandan K. Reddy
ELM
243
4
0
31 May 2025
Through the LLM Looking Glass: A Socratic Probing of Donkeys, Elephants, and Markets
Molly Kennedy
Ayyoob Imani
Timo Spinde
Hinrich Schütze
236
0
0
20 Mar 2025
Towards AI-assisted Academic Writing
Daniel J. Liebling
Malcolm Kane
Madeleine Grunde-Mclaughlin
Ian J. Lang
Subhashini Venugopalan
Michael P. Brenner
196
3
0
17 Mar 2025
DOVE: A Large-Scale Multi-Dimensional Predictions Dataset Towards Meaningful LLM Evaluation
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Eliya Habba
Ofir Arviv
Itay Itzhak
Yotam Perlitz
Elron Bandel
Leshem Choshen
Michal Shmueli-Scheuer
Gabriel Stanovsky
311
10
0
03 Mar 2025
Aligning Black-box Language Models with Human Judgments
North American Chapter of the Association for Computational Linguistics (NAACL), 2025
Gerrit J. J. van den Burg
Gen Suzuki
Wei Liu
Murat Sensoy
ALM
239
2
0
07 Feb 2025
Improving Model Evaluation using SMART Filtering of Benchmark Datasets
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Vipul Gupta
Candace Ross
David Pantoja
R. Passonneau
Megan Ung
Adina Williams
650
11
0
26 Oct 2024
Mitigating Selection Bias with Node Pruning and Auxiliary Options
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Hyeong Kyu Choi
Weijie Xu
Chi Xue
Stephanie Eckman
Chandan K. Reddy
362
10
0
27 Sep 2024
Self-Recognition in Language Models
Tim R. Davidson
Viacheslav Surkov
V. Veselovsky
Giuseppe Russo
Robert West
Çağlar Gülçehre
PILM
475
8
0
09 Jul 2024