On Reinforcement Learning and Distribution Matching for Fine-Tuning Language Models with no Catastrophic Forgetting

1 June 2022
Tomasz Korbak
Hady ElSahar
Germán Kruszewski
Marc Dymetman
    CLL

Papers citing "On Reinforcement Learning and Distribution Matching for Fine-Tuning Language Models with no Catastrophic Forgetting"

48 / 48 papers shown
BLEUBERI: BLEU is a surprisingly effective reward for instruction following
Yapei Chang
Yekyung Kim
Michael Krumdick
Amir Zadeh
Chuan Li
Chris Tanner
Mohit Iyyer
ALM
17
0
0
16 May 2025
Efficient Reinforcement Finetuning via Adaptive Curriculum Learning
Taiwei Shi
Yiyang Wu
Linxin Song
Dinesh Manocha
Jieyu Zhao
LRM
78
1
0
07 Apr 2025
Sample, Don't Search: Rethinking Test-Time Alignment for Language Models
Gonçalo Faria
Noah A. Smith
34
0
0
04 Apr 2025
MultiClear: Multimodal Soft Exoskeleton Glove for Transparent Object Grasping Assistance
Chen Hu
Timothy Neate
Shan Luo
Letizia Gionfrida
49
0
0
04 Apr 2025
Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing
Jaihoon Kim
Taehoon Yoon
Jisung Hwang
Minhyuk Sung
DiffM
54
1
0
25 Mar 2025
Representation-based Reward Modeling for Efficient Safety Alignment of Large Language Model
Qiyuan Deng
X. Bai
Kehai Chen
Yaowei Wang
Liqiang Nie
Min Zhang
OffRL
63
0
0
13 Mar 2025
LADDER: Self-Improving LLMs Through Recursive Problem Decomposition
Toby Simonds
Akira Yoshiyama
LRM
32
3
0
02 Mar 2025
R1-T1: Fully Incentivizing Translation Capability in LLMs via Reasoning Learning
Minggui He
Yilun Liu
Shimin Tao
Yuanchang Luo
Hongyong Zeng
...
Daimeng Wei
Weibin Meng
Hao Yang
Boxing Chen
Osamu Yoshie
LRM
65
2
0
27 Feb 2025
PIPA: Preference Alignment as Prior-Informed Statistical Estimation
Junbo Li
Zhangyang Wang
Qiang Liu
OffRL
102
0
0
09 Feb 2025
Direct Unlearning Optimization for Robust and Safe Text-to-Image Models
Yong-Hyun Park
Sangdoo Yun
Jin-Hwa Kim
Junho Kim
Geonhui Jang
Yonghyun Jeong
Junghyo Jo
Gayoung Lee
76
13
0
17 Jan 2025
Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model
Yueqin Yin
Shentao Yang
Yujia Xie
Ziyi Yang
Yuting Sun
Hany Awadalla
Weizhu Chen
Mingyuan Zhou
50
0
0
07 Jan 2025
Energy-Based Preference Model Offers Better Offline Alignment than the Bradley-Terry Preference Model
Yuzhong Hong
Hanshan Zhang
Junwei Bao
Hongfei Jiang
Yang Song
OffRL
77
1
0
18 Dec 2024
Guaranteed Generation from Large Language Models
Minbeom Kim
Thibaut Thonet
Jos Rozen
Hwaran Lee
Kyomin Jung
Marc Dymetman
46
1
0
09 Oct 2024
Reward Learning From Preference With Ties
Jinsong Liu
Dongdong Ge
Ruihao Zhu
29
3
0
05 Oct 2024
Reasoning Elicitation in Language Models via Counterfactual Feedback
Alihan Hüyük
Xinnuo Xu
Jacqueline Maasch
Aditya V. Nori
Javier González
ReLM
LRM
151
1
0
02 Oct 2024
FlipGuard: Defending Preference Alignment against Update Regression with Constrained Optimization
Mingye Zhu
Yi Liu
Quan Wang
Junbo Guo
Zhendong Mao
26
1
0
01 Oct 2024
AnyBipe: An End-to-End Framework for Training and Deploying Bipedal Robots Guided by Large Language Models
Yifei Yao
Wentao He
Chenyu Gu
Jiaheng Du
Fuwei Tan
Zhen Zhu
Junguo Lu
OffRL
31
2
0
13 Sep 2024
A Survey of Mamba
Shuwei Shi
Shibing Chu
Rui An
Wenqi Fan
Yuee Xie
Hui Liu
Yuanping Chen
Qing Li
AI4CE
42
26
0
02 Aug 2024
INTELLECT: Adapting Cyber Threat Detection to Heterogeneous Computing Environments
Simone Magnani
Liubov Nedoshivina
Roberto Doriguzzi-Corin
Stefano Braghin
Domenico Siracusa
53
0
0
17 Jul 2024
From Decoding to Meta-Generation: Inference-time Algorithms for Large Language Models
Sean Welleck
Amanda Bertsch
Matthew Finlayson
Hailey Schoelkopf
Alex Xie
Graham Neubig
Ilia Kulikov
Zaid Harchaoui
33
50
0
24 Jun 2024
Cascade Reward Sampling for Efficient Decoding-Time Alignment
Bolian Li
Yifan Wang
A. Grama
Ruqi Zhang
AI4TS
49
9
0
24 Jun 2024
QUEST: Quality-Aware Metropolis-Hastings Sampling for Machine Translation
Gonçalo R. A. Faria
Sweta Agrawal
António Farinhas
Ricardo Rei
José G. C. de Souza
André F. T. Martins
26
4
0
28 May 2024
Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data
Fahim Tajwar
Anika Singh
Archit Sharma
Rafael Rafailov
Jeff Schneider
Tengyang Xie
Stefano Ermon
Chelsea Finn
Aviral Kumar
44
106
0
22 Apr 2024
Asymptotics of Language Model Alignment
Joy Qiping Yang
Salman Salamatian
Ziteng Sun
A. Suresh
Ahmad Beirami
63
21
0
02 Apr 2024
Direct Alignment of Draft Model for Speculative Decoding with Chat-Fine-Tuned LLMs
Raghavv Goel
Mukul Gagrani
Wonseok Jeon
Junyoung Park
Mingu Lee
Christopher Lott
ALM
26
5
0
29 Feb 2024
Don't Forget Your Reward Values: Language Model Alignment via Value-based Calibration
Xin Mao
Fengming Li
Huimin Xu
Wei Zhang
A. Luu
ALM
45
6
0
25 Feb 2024
Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment
Rui Yang
Xiaoman Pan
Feng Luo
Shuang Qiu
Han Zhong
Dong Yu
Jianshu Chen
103
67
0
15 Feb 2024
BRAIn: Bayesian Reward-conditioned Amortized Inference for natural language generation from feedback
Gaurav Pandey
Yatin Nandwani
Tahira Naseem
Mayank Mishra
Guangxuan Xu
Dinesh Raghu
Sachindra Joshi
Asim Munawar
Ramón Fernández Astudillo
BDL
44
3
0
04 Feb 2024
Uncertainty-Penalized Reinforcement Learning from Human Feedback with Diverse Reward LoRA Ensembles
Yuanzhao Zhai
Han Zhang
Yu Lei
Yue Yu
Kele Xu
Dawei Feng
Bo Ding
Huaimin Wang
AI4CE
72
32
0
30 Dec 2023
A density estimation perspective on learning from pairwise human preferences
Vincent Dumoulin
Daniel D. Johnson
Pablo Samuel Castro
Hugo Larochelle
Yann Dauphin
29
12
0
23 Nov 2023
Exploring the impact of low-rank adaptation on the performance, efficiency, and regularization of RLHF
Simeng Sun
Dhawal Gupta
Mohit Iyyer
19
17
0
16 Sep 2023
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
Stephen Casper
Xander Davies
Claudia Shi
T. Gilbert
Jérémy Scheurer
...
Erdem Biyik
Anca Dragan
David M. Krueger
Dorsa Sadigh
Dylan Hadfield-Menell
ALM
OffRL
47
472
0
27 Jul 2023
Learning to Generate Better Than Your LLM
Jonathan D. Chang
Kianté Brantley
Rajkumar Ramamurthy
Dipendra Kumar Misra
Wen Sun
19
41
0
20 Jun 2023
Preference-grounded Token-level Guidance for Language Model Fine-tuning
Shentao Yang
Shujian Zhang
Congying Xia
Yihao Feng
Caiming Xiong
Mi Zhou
26
23
0
01 Jun 2023
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Rafael Rafailov
Archit Sharma
E. Mitchell
Stefano Ermon
Christopher D. Manning
Chelsea Finn
ALM
58
3,354
0
29 May 2023
disco: a toolkit for Distributional Control of Generative Models
Germán Kruszewski
Jos Rozen
Marc Dymetman
24
4
0
08 Mar 2023
Pretraining Language Models with Human Preferences
Tomasz Korbak
Kejian Shi
Angelica Chen
Rasika Bhalerao
C. L. Buckley
Jason Phang
Sam Bowman
Ethan Perez
ALM
SyDa
36
207
0
16 Feb 2023
Auditing large language models: a three-layered approach
Jakob Mokander
Jonas Schuett
Hannah Rose Kirk
Luciano Floridi
AILaw
MLAU
45
194
0
16 Feb 2023
Aligning Language Models with Preferences through f-divergence Minimization
Dongyoung Go
Tomasz Korbak
Germán Kruszewski
Jos Rozen
Nahyeon Ryu
Marc Dymetman
26
69
0
16 Feb 2023
RL with KL penalties is better viewed as Bayesian inference
Tomasz Korbak
Ethan Perez
Christopher L. Buckley
OffRL
38
73
0
23 May 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
313
11,953
0
04 Mar 2022
Controlling Conditional Language Models without Catastrophic Forgetting
Tomasz Korbak
Hady ElSahar
Germán Kruszewski
Marc Dymetman
CLL
AI4CE
15
33
0
01 Dec 2021
Challenges in Detoxifying Language Models
Johannes Welbl
Amelia Glaese
J. Uesato
Sumanth Dathathri
John F. J. Mellor
Lisa Anne Hendricks
Kirsty Anderson
Pushmeet Kohli
Ben Coppin
Po-Sen Huang
LM&MA
250
193
0
15 Sep 2021
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
280
1,595
0
18 Sep 2019
The Woman Worked as a Babysitter: On Biases in Language Generation
Emily Sheng
Kai-Wei Chang
Premkumar Natarajan
Nanyun Peng
217
616
0
03 Sep 2019
Language GANs Falling Short
Massimo Caccia
Lucas Caccia
W. Fedus
Hugo Larochelle
Joelle Pineau
Laurent Charlin
124
215
0
06 Nov 2018
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Z. Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
716
6,746
0
26 Sep 2016
Deep Reinforcement Learning for Dialogue Generation
Jiwei Li
Will Monroe
Alan Ritter
Michel Galley
Jianfeng Gao
Dan Jurafsky
214
1,327
0
05 Jun 2016