Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2402.12366
Cited By
A Critical Evaluation of AI Feedback for Aligning Large Language Models
19 February 2024
Archit Sharma
Sedrick Scott Keh
Eric Mitchell
Chelsea Finn
Kushal Arora
Thomas Kollar
ALM
LLMAG
Re-assign community
ArXiv
PDF
HTML
Papers citing
"A Critical Evaluation of AI Feedback for Aligning Large Language Models"
9 / 9 papers shown
Title
RLTHF: Targeted Human Feedback for LLM Alignment
Yifei Xu
Tusher Chakraborty
Emre Kıcıman
Bibek Aryal
Eduardo Rodrigues
...
Rafael Padilha
Leonardo Nunes
Shobana Balakrishnan
Songwu Lu
Ranveer Chandra
98
1
0
24 Feb 2025
Personality Alignment of Large Language Models
Minjun Zhu
Linyi Yang
Yue Zhang
Yue Zhang
ALM
55
5
0
21 Aug 2024
SAIL: Self-Improving Efficient Online Alignment of Large Language Models
Mucong Ding
Souradip Chakraborty
Vibhu Agrawal
Zora Che
Alec Koppel
Mengdi Wang
Amrit Singh Bedi
Furong Huang
26
9
0
21 Jun 2024
Humor in AI: Massive Scale Crowd-Sourced Preferences and Benchmarks for Cartoon Captioning
Jifan Zhang
Lalit P. Jain
Yang Guo
Jiayi Chen
Kuan Lok Zhou
...
Scott Sievert
Timothy Rogers
Kevin Jamieson
Robert Mankoff
Robert Nowak
29
5
0
15 Jun 2024
Margin-aware Preference Optimization for Aligning Diffusion Models without Reference
Jiwoo Hong
Sayak Paul
Noah Lee
Kashif Rasul
James Thorne
Jongheon Jeong
31
13
0
10 Jun 2024
Social Choice Should Guide AI Alignment in Dealing with Diverse Human Feedback
Vincent Conitzer
Rachel Freedman
J. Heitzig
Wesley H. Holliday
Bob M. Jacobs
...
Eric Pacuit
Stuart Russell
Hailey Schoelkopf
Emanuel Tewolde
W. Zwicker
31
28
0
16 Apr 2024
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
DeepSeek-AI Xiao Bi
:
Xiao Bi
Deli Chen
Guanting Chen
...
Yao Zhao
Shangyan Zhou
Shunfeng Zhou
Qihao Zhu
Yuheng Zou
LRM
ALM
139
298
0
05 Jan 2024
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
303
11,730
0
04 Mar 2022
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
275
1,561
0
18 Sep 2019
1