2405.11204
Cited By
Learning from Imperfect Human Feedback: a Tale from Corruption-Robust Dueling
18 May 2024
Yuwei Cheng, Fan Yao, Xuefeng Liu, Haifeng Xu
Papers citing "Learning from Imperfect Human Feedback: a Tale from Corruption-Robust Dueling"
2 / 2 papers shown
Title
Robust Reinforcement Learning from Human Feedback for Large Language Models Fine-Tuning
Kai Ye, Hongyi Zhou, Jin Zhu, Francesco Quinzan, C. Shi
20
0
0
03 Apr 2025
Nearly Optimal Algorithms for Linear Contextual Bandits with Adversarial Corruptions
Jiafan He, Dongruo Zhou, Tong Zhang, Quanquan Gu
61
46
0
13 May 2022