Robust Reinforcement Learning from Corrupted Human Feedback
arXiv 2406.15568 · 21 June 2024
Alexander Bukharin, Ilgee Hong, Haoming Jiang, Zichong Li, Qingru Zhang, Zixuan Zhang, Tuo Zhao
Papers citing "Robust Reinforcement Learning from Corrupted Human Feedback" (4 of 4 papers shown)
Robust Reinforcement Learning from Human Feedback for Large Language Models Fine-Tuning
Kai Ye, Hongyi Zhou, Jin Zhu, Francesco Quinzan, C. Shi
03 Apr 2025

KTO: Model Alignment as Prospect Theoretic Optimization
Kawin Ethayarajh, Winnie Xu, Niklas Muennighoff, Dan Jurafsky, Douwe Kiela
02 Feb 2024

Training language models to follow instructions with human feedback [OSLM, ALM]
Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe
04 Mar 2022

Adversarial Machine Learning at Scale [AAML]
Alexey Kurakin, Ian Goodfellow, Samy Bengio
04 Nov 2016