ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2402.18571
  4. Cited By
Arithmetic Control of LLMs for Diverse User Preferences: Directional
  Preference Alignment with Multi-Objective Rewards

Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards

28 February 2024
Haoxiang Wang
Yong Lin
Wei Xiong
Rui Yang
Shizhe Diao
Shuang Qiu
Han Zhao
Tong Zhang
ArXivPDFHTML

Papers citing "Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards"

13 / 63 papers shown
Title
Prometheus 2: An Open Source Language Model Specialized in Evaluating
  Other Language Models
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models
Seungone Kim
Juyoung Suk
Shayne Longpre
Bill Yuchen Lin
Jamin Shin
Sean Welleck
Graham Neubig
Moontae Lee
Kyungjae Lee
Minjoon Seo
MoMe
ALM
ELM
49
163
0
02 May 2024
DPO Meets PPO: Reinforced Token Optimization for RLHF
DPO Meets PPO: Reinforced Token Optimization for RLHF
Han Zhong
Guhao Feng
Guhao Feng
Li Zhao
Di He
Jiang Bian
Liwei Wang
Jiang Bian
Liwei Wang
47
55
0
29 Apr 2024
FGAIF: Aligning Large Vision-Language Models with Fine-grained AI Feedback
FGAIF: Aligning Large Vision-Language Models with Fine-grained AI Feedback
Liqiang Jing
Xinya Du
68
17
0
07 Apr 2024
ROPO: Robust Preference Optimization for Large Language Models
ROPO: Robust Preference Optimization for Large Language Models
Xize Liang
Chao Chen
Shuang Qiu
Jie Wang
Yue-bo Wu
Zhihang Fu
Zhihao Shi
Feng Wu
Jieping Ye
40
1
0
05 Apr 2024
Strengthening Multimodal Large Language Model with Bootstrapped
  Preference Optimization
Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization
Renjie Pi
Tianyang Han
Wei Xiong
Jipeng Zhang
Runtao Liu
Rui Pan
Tong Zhang
MLLM
33
33
0
13 Mar 2024
Controllable Preference Optimization: Toward Controllable
  Multi-Objective Alignment
Controllable Preference Optimization: Toward Controllable Multi-Objective Alignment
Yiju Guo
Ganqu Cui
Lifan Yuan
Ning Ding
Jiexin Wang
...
Ruobing Xie
Jie Zhou
Yankai Lin
Zhiyuan Liu
Maosong Sun
28
56
0
29 Feb 2024
Online Iterative Reinforcement Learning from Human Feedback with General
  Preference Model
Online Iterative Reinforcement Learning from Human Feedback with General Preference Model
Chen Ye
Wei Xiong
Yuheng Zhang
Nan Jiang
Tong Zhang
OffRL
36
9
0
11 Feb 2024
Panacea: Pareto Alignment via Preference Adaptation for LLMs
Panacea: Pareto Alignment via Preference Adaptation for LLMs
Yifan Zhong
Chengdong Ma
Xiaoyuan Zhang
Ziran Yang
Haojun Chen
Qingfu Zhang
Siyuan Qi
Yaodong Yang
49
30
0
03 Feb 2024
Self-Rewarding Language Models
Self-Rewarding Language Models
Weizhe Yuan
Richard Yuanzhe Pang
Kyunghyun Cho
Xian Li
Sainbayar Sukhbaatar
Jing Xu
Jason Weston
ReLM
SyDa
ALM
LRM
230
291
0
18 Jan 2024
Are Large Language Models Really Robust to Word-Level Perturbations?
Are Large Language Models Really Robust to Word-Level Perturbations?
Haoyu Wang
Guozheng Ma
Cong Yu
Ning Gui
Linrui Zhang
...
Sen Zhang
Li Shen
Xueqian Wang
Peilin Zhao
Dacheng Tao
KELM
18
22
0
20 Sep 2023
Training language models to follow instructions with human feedback
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,261
0
28 Jan 2022
Fine-Tuning Language Models from Human Preferences
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
275
1,561
0
18 Sep 2019
Previous
12