Stabilizing RLHF through Advantage Model and Selective Rehearsal

18 September 2023
Baolin Peng, Linfeng Song, Ye Tian, Lifeng Jin, Haitao Mi, Dong Yu
Links: arXiv (abs) · PDF · HTML · HuggingFace (11 upvotes)

Papers citing "Stabilizing RLHF through Advantage Model and Selective Rehearsal"

10 / 10 papers shown

Mapping Post-Training Forgetting in Language Models at Scale (20 Oct 2025)
Jackson Harmon, Andreas Hochlehnert, Matthias Bethge, Ameya Prabhu
Tags: CLL, KELM

HSCR: Hierarchical Self-Contrastive Rewarding for Aligning Medical Vision Language Models (ACL 2025; 01 Jun 2025)
Songtao Jiang, Yan Zhang, Yeying Jin, Hongwei Wang, Y. Wu, Yang Feng, Jian Wu, Zuozhu Liu

Improving LLM General Preference Alignment via Optimistic Online Mirror Descent (24 Feb 2025)
Yuheng Zhang, Dian Yu, Tao Ge, Linfeng Song, Zhichen Zeng, Haitao Mi, Nan Jiang, Dong Yu

Modality-Fair Preference Optimization for Trustworthy MLLM Alignment (IJCAI 2024; 20 Oct 2024)
Songtao Jiang, Yan Zhang, Ruizhe Chen, Yeying Jin, Qinglin He, Yang Feng, Jian Wu, Zuozhu Liu
Tags: MoE, MLLM

Systematic Evaluation of LLM-as-a-Judge in LLM Alignment Tasks: Explainable Metrics and Diverse Prompt Templates (23 Aug 2024)
Hui Wei, Shenghua He, Tian Xia, Andy H. Wong, Jingyang Lin, Mei Han
Tags: ALM, ELM

Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning (30 Jun 2024)
Yuheng Zhang, Dian Yu, Baolin Peng, Linfeng Song, Ye Tian, Mingyue Huo, Nan Jiang, Haitao Mi, Dong Yu

Dense Reward for Free in Reinforcement Learning from Human Feedback (01 Feb 2024)
Alex J. Chan, Hao Sun, Samuel Holt, M. van der Schaar

Enabling Language Models to Implicitly Learn Self-Improvement (02 Oct 2023)
Ziqi Wang, Le Hou, Tianjian Lu, Yuexin Wu, Yunxuan Li, Hongkun Yu, Heng Ji
Tags: ReLM, LRM

Reward Engineering for Generating Semi-structured Explanation (Findings 2023; 15 Sep 2023)
Jiuzhou Han, Wray Buntine, Ehsan Shareghi
Tags: LRM

Leftover Lunch: Advantage-based Offline Reinforcement Learning for Language Models (ICLR 2023; 24 May 2023)
Ashutosh Baheti, Ximing Lu, Faeze Brahman, Ronan Le Bras, Maarten Sap, Mark O. Riedl