Adversarial Preference Optimization: Enhancing Your Alignment via RM-LLM Game

Annual Meeting of the Association for Computational Linguistics (ACL), 2024
14 November 2023
Pengyu Cheng, Yifan Yang, Jian Li, Yong Dai, Tianhao Hu, Peixin Cao, Nan Du, Xiaolong Li
arXiv (abs) · PDF · HTML · GitHub (54★)

Papers citing "Adversarial Preference Optimization: Enhancing Your Alignment via RM-LLM Game"

24 / 24 papers shown

DAP: A Discrete-token Autoregressive Planner for Autonomous Driving
Bowen Ye, Bin Zhang, Hang Zhao
17 Nov 2025

SSFO: Self-Supervised Faithfulness Optimization for Retrieval-Augmented Generation
Xiaqiang Tang, Yi Wang, Keyu Hu, Rui Xu, Chuang Li, Weigao Sun, Jian Li, Sihong Xie
RALM
24 Aug 2025

VerIF: Verification Engineering for Reinforcement Learning in Instruction Following
Hao Peng, Yunjia Qi, Xiaozhi Wang, Bin Xu, Lei Hou, Juanzi Li
OffRL
11 Jun 2025

A Survey on Progress in LLM Alignment from the Perspective of Reward Design
Miaomiao Ji, Yanqiu Wu, Zhibin Wu, Shoujin Wang, Jian Yang, Mark Dras, Usman Naseem
05 May 2025

Energy-Based Reward Models for Robust Language Model Alignment
Anamika Lochab, Ruqi Zhang
17 Apr 2025

Stackelberg Self-Annotation: A Robust Approach to Data-Efficient LLM Alignment
Xu Chu, Zhixin Zhang, Tianyu Jia, Yujie Jin
25 Feb 2025

RLHF in an SFT Way: From Optimal Solution to Reward-Weighted Alignment
Yuhao Du, Hui Yuan, Pengyu Cheng, Zhihong Chen, Yuejiao Xie, Xiang Wan, Anningzhe Gao
16 Feb 2025

Inverse-RLignment: Large Language Model Alignment from Demonstrations through Inverse Reinforcement Learning
Hao Sun, M. van der Schaar
28 Jan 2025

Holistic Utility Preference Learning for Listwise Alignment
Jiacong Zhou, Xianyun Wang, Jun Yu
17 Oct 2024

On the Limited Generalization Capability of the Implicit Reward Model Induced by Direct Preference Optimization
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Yong Lin, Skyler Seto, Maartje ter Hoeve, Katherine Metcalf, B. Theobald, Xuan Wang, Yizhe Zhang, Chen Huang, Tong Zhang
05 Sep 2024

Systematic Evaluation of LLM-as-a-Judge in LLM Alignment Tasks: Explainable Metrics and Diverse Prompt Templates
Hui Wei, Shenghua He, Tian Xia, Andy H. Wong, Jingyang Lin, Mei Han
ALM, ELM
23 Aug 2024

RL on Incorrect Synthetic Data Scales the Efficiency of LLM Math Reasoning by Eight-Fold
Amrith Rajagopal Setlur, Saurabh Garg, Xinyang Geng, Naman Garg, Virginia Smith, Aviral Kumar
20 Jun 2024

A Survey on Human Preference Learning for Large Language Models
Ruili Jiang, Kehai Chen, Xuefeng Bai, Zhixuan He, Juntao Li, Muyun Yang, Tiejun Zhao, Liqiang Nie, Min Zhang
17 Jun 2024

Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs
Neural Information Processing Systems (NeurIPS), 2024
Rui Yang, Ruomeng Ding, Yong Lin, Huan Zhang, Tong Zhang
14 Jun 2024

SPO: Multi-Dimensional Preference Sequential Alignment With Implicit Reward Modeling
Xingzhou Lou, Junge Zhang, Jian Xie, Lifeng Liu, Dong Yan, Kaiqi Huang
21 May 2024

Towards Adapting Open-Source Large Language Models for Expert-Level Clinical Note Generation
Hanyin Wang, Chufan Gao, Bolun Liu, Qiping Xu, Guleid Hussein, Mohamad El Labban, Kingsley Iheasirim, H. Korsapati, Chuck Outcalt, Jimeng Sun
LM&MA, AI4MH
25 Apr 2024

Self-playing Adversarial Language Game Enhances LLM Reasoning
Pengyu Cheng, Tianhao Hu, Han Xu, Zhisong Zhang, Yong Dai, Lei Han, Nan Du, Xiaolong Li
SyDa, LRM, ReLM
16 Apr 2024

Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences
Corby Rosset, Ching-An Cheng, Arindam Mitra, Michael Santacroce, Ahmed Hassan Awadallah, Tengyang Xie
04 Apr 2024

Improving Reinforcement Learning from Human Feedback Using Contrastive Rewards
Wei Shen, Xiaoying Zhang, Yuanshun Yao, Rui Zheng, Hongyi Guo, Yang Liu
ALM
12 Mar 2024

Overcoming Reward Overoptimization via Adversarial Policy Optimization with Lightweight Uncertainty Estimation
Xiaoying Zhang, Jean-François Ton, Wei Shen, Hongning Wang, Yang Liu
08 Mar 2024

Accelerating Greedy Coordinate Gradient via Probe Sampling
Yiran Zhao, Wenyue Zheng, Tianle Cai, Xuan Long Do, Kenji Kawaguchi, Anirudh Goyal, Michael Shieh
02 Mar 2024

Relative Preference Optimization: Enhancing LLM Alignment through Contrasting Responses across Identical and Diverse Prompts
Yueqin Yin, Zhendong Wang, Yi Gu, Hai Huang, Weizhu Chen, Mingyuan Zhou
12 Feb 2024

Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models
International Conference on Machine Learning (ICML), 2024
Zixiang Chen, Yihe Deng, Huizhuo Yuan, Kaixuan Ji, Quanquan Gu
SyDa
02 Jan 2024

On Diversified Preferences of Large Language Model Alignment
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Dun Zeng, Yong Dai, Pengyu Cheng, Longyue Wang, Tianhao Hu, Wanshun Chen, Nan Du, Zenglin Xu
ALM
12 Dec 2023