Reinforcement Learning from Multi-role Debates as Feedback for Bias Mitigation in LLMs

15 April 2024
Ruoxi Cheng, Haoxuan Ma, Shuirong Cao, Jiaqi Li, Aihua Pei, Zhiqiang Wang, Pengliang Ji, Haoyu Wang, Jiaqi Huo
arXiv:2404.10160 · AI4CE

Papers citing "Reinforcement Learning from Multi-role Debates as Feedback for Bias Mitigation in LLMs"

13 of 13 papers shown

 1. Internet of Agents: Fundamentals, Applications, and Challenges
    Yuntao Wang, Shaolong Guo, Yanghe Pan, Zhou Su, Fahao Chen, Tom H. Luan, Peng Li, Jiawen Kang, Dusit Niyato
    LLMAG, LM&Ro, AI4CE · 12 May 2025

 2. DMRL: Data- and Model-aware Reward Learning for Data Extraction
    Zhiqiang Wang, Ruoxi Cheng
    07 May 2025

 3. Towards Implicit Bias Detection and Mitigation in Multi-Agent LLM Interactions
    Angana Borah, Rada Mihalcea
    03 Oct 2024

 4. Training Language Models to Win Debates with Self-Play Improves Judge Accuracy
    Samuel Arnesen, David Rein, Julian Michael
    ELM · 25 Sep 2024

 5. Debatrix: Multi-dimensional Debate Judge with Iterative Chronological Analysis Based on LLM
    Jingcong Liang, Rong Ye, Meng Han, Ruofei Lai, Xinyu Zhang, Xuanjing Huang, Zhongyu Wei
    12 Mar 2024

 6. Confidence Matters: Revisiting Intrinsic Self-Correction Capabilities of Large Language Models
    Loka Li, Zhenhao Chen, Guan-Hong Chen, Yixuan Zhang, Yusheng Su, Eric P. Xing, Kun Zhang
    LRM · 19 Feb 2024

 7. Can LLMs Produce Faithful Explanations For Fact-checking? Towards Faithful Explainable Fact-Checking via Multi-Agent Debate
    Kyungha Kim, Sangyun Lee, Kung-Hsiang Huang, Hou Pong Chan, Manling Li, Heng Ji
    LRM · 12 Feb 2024

 8. Investigating Subtler Biases in LLMs: Ageism, Beauty, Institutional, and Nationality Bias in Generative Models
    M. Kamruzzaman, M. M. I. Shovon, Gene Louis Kim
    16 Sep 2023

 9. Fairness Reprogramming
    Guanhua Zhang, Yihua Zhang, Yang Zhang, Wenqi Fan, Qing Li, Sijia Liu, Shiyu Chang
    AAML · 21 Sep 2022

10. Training language models to follow instructions with human feedback
    Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe
    OSLM, ALM · 04 Mar 2022

11. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
    Jason W. Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, F. Xia, Ed H. Chi, Quoc Le, Denny Zhou
    LM&Ro, LRM, AI4CE, ReLM · 28 Jan 2022

12. BBQ: A Hand-Built Bias Benchmark for Question Answering
    Alicia Parrish, Angelica Chen, Nikita Nangia, Vishakh Padmakumar, Jason Phang, Jana Thompson, Phu Mon Htut, Sam Bowman
    15 Oct 2021

13. Fine-Tuning Language Models from Human Preferences
    Daniel M. Ziegler, Nisan Stiennon, Jeff Wu, Tom B. Brown, Alec Radford, Dario Amodei, Paul Christiano, G. Irving
    ALM · 18 Sep 2019