2405.17346
Prompt Optimization with Human Feedback
27 May 2024
Xiaoqiang Lin
Zhongxiang Dai
Arun Verma
See-Kiong Ng
P. Jaillet
K. H. Low
AAML
Papers citing
"Prompt Optimization with Human Feedback"
8 / 8 papers shown
Title
Prompt Optimization with Logged Bandit Data
Haruka Kiyohara
Daniel Yiming Cao
Yuta Saito
Thorsten Joachims
61
0
0
03 Apr 2025
Self-Supervised Prompt Optimization
Jinyu Xiang
Jiayi Zhang
Zhaoyang Yu
Fengwei Teng
Jinhao Tu
Xinbing Liang
Sirui Hong
Chenglin Wu
Yuyu Luo
OffRL
LRM
57
5
0
07 Feb 2025
LiPO: Listwise Preference Optimization through Learning-to-Rank
Tianqi Liu
Zhen Qin
Junru Wu
Jiaming Shen
Misha Khalman
...
Mohammad Saleh
Simon Baumgartner
Jialu Liu
Peter J. Liu
Xuanhui Wang
133
47
0
28 Jan 2025
Neural Dueling Bandits: Preference-Based Optimization with Human Feedback
Arun Verma
Zhongxiang Dai
Xiaoqiang Lin
P. Jaillet
K. H. Low
19
5
0
24 Jul 2024
Direct Preference Optimization with an Offset
Afra Amini
Tim Vieira
Ryan Cotterell
68
54
0
16 Feb 2024
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
278
3,784
0
18 Apr 2021
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
273
1,561
0
18 Sep 2019