ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

Fine-tuning language models to find agreement among humans with diverse preferences

28 November 2022
Michiel A. Bakker
Martin Chadwick
Hannah R. Sheahan
Michael Henry Tessler
Lucy Campbell-Gillingham
Jan Balaguer
Nat McAleese
Amelia Glaese
John Aslanides
M. Botvinick
Christopher Summerfield
    ALM

Papers citing "Fine-tuning language models to find agreement among humans with diverse preferences"

50 / 123 papers shown
AgentsCoDriver: Large Language Model Empowered Collaborative Driving with Lifelong Learning
Senkang Hu, Zhengru Fang, Zihan Fang, Yiqin Deng, Xianhao Chen, Yuguang Fang
54 · 33 · 0 · 09 Apr 2024

Aligning Diffusion Models by Optimizing Human Utility
Shufan Li, Konstantinos Kallidromitis, Akash Gokul, Yusuke Kato, Kazuki Kozuka
105 · 27 · 0 · 06 Apr 2024

The Strong Pull of Prior Knowledge in Large Language Models and Its Impact on Emotion Recognition
Georgios Chochlakis, Alexandros Potamianos, Kristina Lerman, Shrikanth Narayanan
27 · 5 · 0 · 25 Mar 2024

ChatGPT Incorrectness Detection in Software Reviews
M. Tanzil, Junaed Younus Khan, Gias Uddin
19 · 4 · 0 · 25 Mar 2024

MedInsight: A Multi-Source Context Augmentation Framework for Generating Patient-Centric Medical Responses using Large Language Models
Subash Neupane, Shaswata Mitra, Sudip Mittal, Noorbakhsh Amiri Golilarz, Shahram Rahimi, Amin Amirlatifi
60 · 3 · 0 · 13 Mar 2024

Provable Multi-Party Reinforcement Learning with Diverse Human Feedback
Huiying Zhong, Zhun Deng, Weijie J. Su, Zhiwei Steven Wu, Linjun Zhang
52 · 13 · 0 · 08 Mar 2024

Exploring the Potential of Large Language Models for Improving Digital Forensic Investigation Efficiency
Akila Wickramasekara, F. Breitinger, Mark Scanlon
42 · 8 · 0 · 29 Feb 2024

Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards
Haoxiang Wang, Yong Lin, Wei Xiong, Rui Yang, Shizhe Diao, Shuang Qiu, Han Zhao, Tong Zhang
40 · 70 · 0 · 28 Feb 2024

SoFA: Shielded On-the-fly Alignment via Priority Rule Following
Xinyu Lu, Bowen Yu, Yaojie Lu, Hongyu Lin, Haiyang Yu, Le Sun, Xianpei Han, Yongbin Li
55 · 13 · 0 · 27 Feb 2024

Unintended Impacts of LLM Alignment on Global Representation
Michael Joseph Ryan, William B. Held, Diyi Yang
35 · 40 · 0 · 22 Feb 2024

Wikibench: Community-Driven Data Curation for AI Evaluation on Wikipedia
Tzu-Sheng Kuo, Aaron L Halfaker, Zirui Cheng, Jiwoo Kim, Meng-Hsin Wu, Tongshuang Wu, Kenneth Holstein, Haiyi Zhu
57 · 21 · 0 · 21 Feb 2024

Large Language Models for Data Annotation: A Survey
Zhen Tan, Dawei Li, Song Wang, Alimohammad Beigi, Bohan Jiang, Amrita Bhattacharjee, Mansooreh Karami, Jundong Li, Lu Cheng, Huan Liu [SyDa]
44 · 49 · 0 · 21 Feb 2024

ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs
Fengqing Jiang, Zhangchen Xu, Luyao Niu, Zhen Xiang, Bhaskar Ramasubramanian, Bo Li, Radha Poovendran
26 · 86 · 0 · 19 Feb 2024

Can LLMs Speak For Diverse People? Tuning LLMs via Debate to Generate Controllable Controversial Statements
Ming Li, Jiuhai Chen, Lichang Chen, Tianyi Zhou
66 · 17 · 0 · 16 Feb 2024

MaxMin-RLHF: Towards Equitable Alignment of Large Language Models with Diverse Human Preferences
Souradip Chakraborty, Jiahao Qiu, Hui Yuan, Alec Koppel, Furong Huang, Dinesh Manocha, Amrit Singh Bedi, Mengdi Wang [ALM]
30 · 47 · 0 · 14 Feb 2024

ChatGPT vs LLaMA: Impact, Reliability, and Challenges in Stack Overflow Discussions
Leuson Da Silva, Jordan Samhi, Foutse Khomh [ALM, SILM, ELM, AI4MH]
26 · 10 · 0 · 13 Feb 2024

Mercury: A Code Efficiency Benchmark for Code Large Language Models
Mingzhe Du, A. Luu, Bin Ji, Qian Liu, See-Kiong Ng [ALM, ELM, OffRL]
24 · 5 · 0 · 12 Feb 2024

Calibrating Long-form Generations from Large Language Models
Yukun Huang, Yixin Liu, Raghuveer Thirukovalluru, Arman Cohan, Bhuwan Dhingra
19 · 7 · 0 · 09 Feb 2024

A Roadmap to Pluralistic Alignment
Taylor Sorensen, Jared Moore, Jillian R. Fisher, Mitchell L. Gordon, Niloofar Mireshghallah, ..., Liwei Jiang, Ximing Lu, Nouha Dziri, Tim Althoff, Yejin Choi
65 · 80 · 0 · 07 Feb 2024

Transforming and Combining Rewards for Aligning Large Language Models
Zihao Wang, Chirag Nagpal, Jonathan Berant, Jacob Eisenstein, Alex D'Amour, Oluwasanmi Koyejo, Victor Veitch
14 · 11 · 0 · 01 Feb 2024

Even-if Explanations: Formal Foundations, Priorities and Complexity
Gianvincenzo Alfano, S. Greco, Domenico Mandaglio, Francesco Parisi, Reza Shahbazian, I. Trubitsyna
26 · 2 · 0 · 17 Jan 2024

Align on the Fly: Adapting Chatbot Behavior to Established Norms
Chunpu Xu, Steffi Chern, Ethan Chern, Ge Zhang, Zekun Wang, Ruibo Liu, Jing Li, Jie Fu, Pengfei Liu
11 · 20 · 0 · 26 Dec 2023

From Google Gemini to OpenAI Q* (Q-Star): A Survey of Reshaping the Generative Artificial Intelligence (AI) Research Landscape
Timothy R. McIntosh, Teo Susnjak, Tong Liu, Paul Watters, Malka N. Halgamuge
79 · 46 · 0 · 18 Dec 2023

A Survey of Reasoning with Foundation Models
Jiankai Sun, Chuanyang Zheng, E. Xie, Zhengying Liu, Ruihang Chu, ..., Xipeng Qiu, Yi-Chen Guo, Hui Xiong, Qun Liu, Zhenguo Li [ReLM, LRM, AI4CE]
22 · 76 · 0 · 17 Dec 2023

"I Want It That Way": Enabling Interactive Decision Support Using Large Language Models and Constraint Programming
Connor Lawless, Jakob Schoeffer, Lindy Le, Kael Rowan, Shilad Sen, Cristina St. Hill, Jina Suh, Bahar Sarrafzadeh
33 · 8 · 0 · 12 Dec 2023

Safe-CLIP: Removing NSFW Concepts from Vision-and-Language Models
Samuele Poppi, Tobia Poppi, Federico Cocchi, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara [VLM]
19 · 8 · 0 · 27 Nov 2023

Diffusion Model Alignment Using Direct Preference Optimization
Bram Wallace, Meihua Dang, Rafael Rafailov, Linqi Zhou, Aaron Lou, Senthil Purushwalkam, Stefano Ermon, Caiming Xiong, Shafiq R. Joty, Nikhil Naik [EGVM]
33 · 224 · 0 · 21 Nov 2023

Aligned: A Platform-based Process for Alignment
Ethan Shaotran, Ido Pesok, Sam Jones, Emi Liu
11 · 1 · 0 · 15 Nov 2023

Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer
Bowen Tan, Yun Zhu, Lijuan Liu, Eric P. Xing, Zhiting Hu, Jindong Chen [ALM, LRM]
37 · 7 · 0 · 12 Nov 2023

Identifying and Mitigating Vulnerabilities in LLM-Integrated Applications
Fengqing Jiang, Zhangchen Xu, Luyao Niu, Boxin Wang, Jinyuan Jia, Bo Li, Radha Poovendran [AAML]
11 · 18 · 0 · 07 Nov 2023

Leveraging Large Language Models for Collective Decision-Making
Marios Papachristou, Longqi Yang, Chin-Chia Hsu [LLMAG]
31 · 2 · 0 · 03 Nov 2023

Knowledge Editing for Large Language Models: A Survey
Song Wang, Yaochen Zhu, Haochen Liu, Zaiyi Zheng, Chen Chen, Jundong Li [KELM]
66 · 133 · 0 · 24 Oct 2023

Clinfo.ai: An Open-Source Retrieval-Augmented Large Language Model System for Answering Medical Questions using Scientific Literature
Alejandro Lozano, Scott L. Fleming, Chia-Chun Chiang, Nigam Shah [ELM, RALM]
26 · 32 · 0 · 24 Oct 2023

Can ChatGPT Perform Reasoning Using the IRAC Method in Analyzing Legal Scenarios Like a Lawyer?
Xiaoxi Kang, Lizhen Qu, Lay-Ki Soon, Adnan Trakic, Terry Yue Zhuo, Patrick Charles Emerton, Genevieve Grant [LRM, AILaw, ELM]
115 · 13 · 0 · 23 Oct 2023

Which Prompts Make The Difference? Data Prioritization For Efficient Human LLM Evaluation
M. Boubdir, Edward Kim, B. Ermiş, Marzieh Fadaee, Sara Hooker [ALM]
31 · 18 · 0 · 22 Oct 2023

Towards Understanding Sycophancy in Language Models
Mrinank Sharma, Meg Tong, Tomasz Korbak, D. Duvenaud, Amanda Askell, ..., Oliver Rausch, Nicholas Schiefer, Da Yan, Miranda Zhang, Ethan Perez
209 · 178 · 0 · 20 Oct 2023

CoMPosT: Characterizing and Evaluating Caricature in LLM Simulations
Myra Cheng, Tiziano Piccardi, Diyi Yang [LLMAG]
16 · 67 · 0 · 17 Oct 2023

ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models
Ziniu Li, Tian Xu, Yushun Zhang, Zhihang Lin, Yang Yu, Ruoyu Sun, Zhimin Luo
19 · 47 · 0 · 16 Oct 2023

Towards Better Evaluation of Instruction-Following: A Case-Study in Summarization
Ondrej Skopek, Rahul Aralikatte, Sian Gooding, Victor Carbune [ELM]
39 · 17 · 0 · 12 Oct 2023

The Past, Present and Better Future of Feedback Learning in Large Language Models for Subjective Human Preferences and Values
Hannah Rose Kirk, Andrew M. Bean, Bertie Vidgen, Paul Röttger, Scott A. Hale [ALM]
19 · 41 · 0 · 11 Oct 2023

Confronting Reward Model Overoptimization with Constrained RLHF
Ted Moskovitz, Aaditya K. Singh, DJ Strouse, T. Sandholm, Ruslan Salakhutdinov, Anca D. Dragan, Stephen Marcus McAleer
29 · 47 · 0 · 06 Oct 2023

Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization
Zhanhui Zhou, Jie Liu, Chao Yang, Jing Shao, Yu Liu, Xiangyu Yue, Wanli Ouyang, Yu Qiao
24 · 47 · 0 · 05 Oct 2023

The Empty Signifier Problem: Towards Clearer Paradigms for Operationalising "Alignment" in Large Language Models
Hannah Rose Kirk, Bertie Vidgen, Paul Röttger, Scott A. Hale
39 · 2 · 0 · 03 Oct 2023

Consistent Aggregation of Objectives with Diverse Time Preferences Requires Non-Markovian Rewards
Silviu Pitis
35 · 5 · 0 · 30 Sep 2023

Decolonial AI Alignment: Openness, Viśeṣa-Dharma, and Including Excluded Knowledges
Kush R. Varshney
36 · 2 · 0 · 10 Sep 2023

Generative Social Choice
Sara Fish, Paul Gölz, David C. Parkes, Ariel D. Procaccia, Gili Rusak, Itai Shapira, Manuel Wüthrich
23 · 26 · 0 · 03 Sep 2023

Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
Stephen Casper, Xander Davies, Claudia Shi, T. Gilbert, Jérémy Scheurer, ..., Erdem Biyik, Anca Dragan, David M. Krueger, Dorsa Sadigh, Dylan Hadfield-Menell [ALM, OffRL]
39 · 470 · 0 · 27 Jul 2023

Evaluating the Moral Beliefs Encoded in LLMs
Nino Scherrer, Claudia Shi, Amir Feder, David M. Blei
25 · 115 · 0 · 26 Jul 2023

Decoding ChatGPT: A Taxonomy of Existing Research, Current Challenges, and Possible Future Directions
S. Sohail, Faiza Farhat, Yassine Himeur, Mohammad Nadeem, D. Madsen, Yashbir Singh, Shadi Atalla, W. Mansoor
20 · 112 · 0 · 26 Jul 2023

Opinion Mining Using Population-tuned Generative Language Models
Allmin Pradhap Singh Susaiyah, Abhinay Pandya, Aki Härmä
8 · 0 · 0 · 24 Jul 2023