Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2310.09520
Cited By
Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model
14 October 2023
H. Deng
Colin Raffel
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model"
13 / 13 papers shown
Title
Sailing AI by the Stars: A Survey of Learning from Rewards in Post-Training and Test-Time Scaling of Large Language Models
Xiaobao Wu
LRM
65
0
0
05 May 2025
TTRL: Test-Time Reinforcement Learning
Yuxin Zuo
Kaiyan Zhang
Shang Qu
Li Sheng
Xuekai Zhu
Biqing Qi
Youbang Sun
Ganqu Cui
Ning Ding
Bowen Zhou
OffRL
41
1
0
22 Apr 2025
Selective Prompt Anchoring for Code Generation
Yuan Tian
Tianyi Zhang
80
3
0
24 Feb 2025
A Critical Look At Tokenwise Reward-Guided Text Generation
Ahmad Rashid
Ruotian Wu
Julia Grosse
Agustinus Kristiadi
Pascal Poupart
OffRL
68
0
0
17 Feb 2025
Controllable Safety Alignment: Inference-Time Adaptation to Diverse Safety Requirements
Jingyu Zhang
Ahmed Elgohary
Ahmed Magooda
Daniel Khashabi
Benjamin Van Durme
50
2
0
11 Oct 2024
Diffusion Guided Language Modeling
Justin Lovelace
Varsha Kishore
Yiwei Chen
Kilian Q. Weinberger
31
6
0
08 Aug 2024
Merge, Ensemble, and Cooperate! A Survey on Collaborative Strategies in the Era of Large Language Models
Jinliang Lu
Ziliang Pang
Min Xiao
Yaochen Zhu
Rui Xia
Jiajun Zhang
MoMe
27
17
0
08 Jul 2024
Cascade Reward Sampling for Efficient Decoding-Time Alignment
Bolian Li
Yifan Wang
A. Grama
Ruqi Zhang
Ruqi Zhang
AI4TS
47
8
0
24 Jun 2024
An Emulator for Fine-Tuning Large Language Models using Small Language Models
Eric Mitchell
Rafael Rafailov
Archit Sharma
Chelsea Finn
Christopher D. Manning
ALM
22
51
0
19 Oct 2023
Classifiers are Better Experts for Controllable Text Generation
Askhat Sitdikov
Nikita Balagansky
Daniil Gavrilov
Alexander Markov
30
7
0
15 May 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
278
3,784
0
18 Apr 2021
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
223
4,424
0
23 Jan 2020
1