Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2310.05910
Cited By
SALMON: Self-Alignment with Instructable Reward Models
9 October 2023
Zhiqing Sun
Yikang Shen
Hongxin Zhang
Qinhong Zhou
Zhenfang Chen
David D. Cox
Yiming Yang
Chuang Gan
ALM
SyDa
Re-assign community
ArXiv
PDF
HTML
Papers citing
"SALMON: Self-Alignment with Instructable Reward Models"
34 / 34 papers shown
Title
QM-ToT: A Medical Tree of Thoughts Reasoning Framework for Quantized Model
Zongxian Yang
Jiayu Qian
Z. Huang
Kay Chen Tan
LM&MA
LRM
28
0
0
13 Apr 2025
A Comprehensive Survey of Reward Models: Taxonomy, Applications, Challenges, and Future
Jialun Zhong
Wei Shen
Yanzeng Li
Songyang Gao
Hua Lu
Yicheng Chen
Yang Zhang
Wei Zhou
Jinjie Gu
Lei Zou
LRM
38
2
0
12 Apr 2025
VideoSAVi: Self-Aligned Video Language Models without Human Supervision
Yogesh Kulkarni
Pooyan Fazli
VLM
98
2
0
01 Dec 2024
From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge
Dawei Li
Bohan Jiang
Liangjie Huang
Alimohammad Beigi
Chengshuai Zhao
...
Canyu Chen
Tianhao Wu
Kai Shu
Lu Cheng
Huan Liu
ELM
AILaw
108
63
0
25 Nov 2024
Self-Generated Critiques Boost Reward Modeling for Language Models
Yue Yu
Zhengxing Chen
Aston Zhang
L Tan
Chenguang Zhu
...
Suchin Gururangan
Chao-Yue Zhang
Melanie Kambadur
Dhruv Mahajan
Rui Hou
LRM
ALM
87
14
0
25 Nov 2024
Towards Full Delegation: Designing Ideal Agentic Behaviors for Travel Planning
Song Jiang
Da JU
Andrew Cohen
Sasha Mitts
Aaron Foss
Justine T Kao
Xian Li
Yuandong Tian
62
2
0
21 Nov 2024
Fictitious Synthetic Data Can Improve LLM Factuality via Prerequisite Learning
Yujian Liu
Shiyu Chang
Tommi Jaakkola
Yang Zhang
23
0
0
25 Oct 2024
A Survey on Data Synthesis and Augmentation for Large Language Models
Ke Wang
Jiahui Zhu
Minjie Ren
Z. Liu
Shiwei Li
...
Chenkai Zhang
Xiaoyu Wu
Qiqi Zhan
Qingjie Liu
Yunhong Wang
SyDa
38
15
0
16 Oct 2024
Varying Shades of Wrong: Aligning LLMs with Wrong Answers Only
Jihan Yao
Wenxuan Ding
Shangbin Feng
Lucy Lu Wang
Yulia Tsvetkov
25
0
0
14 Oct 2024
MACPO: Weak-to-Strong Alignment via Multi-Agent Contrastive Preference Optimization
Yougang Lyu
Lingyong Yan
Zihan Wang
Dawei Yin
Pengjie Ren
Maarten de Rijke
Z. Z. Ren
55
6
0
10 Oct 2024
PREDICT: Preference Reasoning by Evaluating Decomposed preferences Inferred from Candidate Trajectories
Stephane Aroca-Ouellette
Natalie Mackraz
B. Theobald
Katherine Metcalf
28
0
0
08 Oct 2024
FastAdaSP: Multitask-Adapted Efficient Inference for Large Speech Language Model
Yichen Lu
Jiaqi Song
Chao-Han Huck Yang
Shinji Watanabe
16
0
0
03 Oct 2024
Preference-Guided Reflective Sampling for Aligning Language Models
Hai Ye
Hwee Tou Ng
24
3
0
22 Aug 2024
Internal Consistency and Self-Feedback in Large Language Models: A Survey
Xun Liang
Shichao Song
Zifan Zheng
Hanyu Wang
Qingchen Yu
...
Rong-Hua Li
Peng Cheng
Zhonghao Wang
Feiyu Xiong
Zhiyu Li
HILM
LRM
58
24
0
19 Jul 2024
A Survey on Self-Evolution of Large Language Models
Zhengwei Tao
Ting-En Lin
Xiancai Chen
Hangyu Li
Yuchuan Wu
Yongbin Li
Zhi Jin
Fei Huang
Dacheng Tao
Jingren Zhou
LRM
LM&Ro
49
21
0
22 Apr 2024
Multilingual Large Language Model: A Survey of Resources, Taxonomy and Frontiers
Libo Qin
Qiguang Chen
Yuhang Zhou
Zhi Chen
Yinghui Li
Lizi Liao
Min Li
Wanxiang Che
Philip S. Yu
LRM
47
36
0
07 Apr 2024
Direct Preference Optimization of Video Large Multimodal Models from Language Model Reward
Ruohong Zhang
Liangke Gui
Zhiqing Sun
Yihao Feng
Keyang Xu
...
Di Fu
Chunyuan Li
Alexander G. Hauptmann
Yonatan Bisk
Yiming Yang
MLLM
43
57
0
01 Apr 2024
ChatGLM-RLHF: Practices of Aligning Large Language Models with Human Feedback
Zhenyu Hou
Yiin Niu
Zhengxiao Du
Xiaohan Zhang
Xiao Liu
...
Qinkai Zheng
Minlie Huang
Hongning Wang
Jie Tang
Yuxiao Dong
ALM
22
17
0
01 Apr 2024
Large Language Models for Data Annotation: A Survey
Zhen Tan
Dawei Li
Song Wang
Alimohammad Beigi
Bohan Jiang
Amrita Bhattacharjee
Mansooreh Karami
Jundong Li
Lu Cheng
Huan Liu
SyDa
42
44
0
21 Feb 2024
Ask Optimal Questions: Aligning Large Language Models with Retriever's Preference in Conversational Search
Chanwoong Yoon
Gangwoo Kim
Byeongguk Jeon
Sungdong Kim
Yohan Jo
Jaewoo Kang
RALM
KELM
35
11
0
19 Feb 2024
Step-On-Feet Tuning: Scaling Self-Alignment of LLMs via Bootstrapping
Haoyu Wang
Guozheng Ma
Ziqiao Meng
Zeyu Qin
Li Shen
...
Liu Liu
Yatao Bian
Tingyang Xu
Xueqian Wang
Peilin Zhao
55
12
0
12 Feb 2024
Direct Language Model Alignment from Online AI Feedback
Shangmin Guo
Biao Zhang
Tianlin Liu
Tianqi Liu
Misha Khalman
...
Thomas Mesnard
Yao-Min Zhao
Bilal Piot
Johan Ferret
Mathieu Blondel
ALM
23
129
0
07 Feb 2024
Biospheric AI
Marcin Korecki
24
0
0
31 Jan 2024
Weaver: Foundation Models for Creative Writing
Tiannan Wang
Jiamin Chen
Qingrui Jia
Shuai Wang
Ruoyu Fang
...
Xiaohua Xu
Ningyu Zhang
Huajun Chen
Yuchen Eleanor Jiang
Wangchunshu Zhou
20
18
0
30 Jan 2024
Human-Instruction-Free LLM Self-Alignment with Limited Samples
Hongyi Guo
Yuanshun Yao
Wei Shen
Jiaheng Wei
Xiaoying Zhang
Zhaoran Wang
Yang Liu
93
20
0
06 Jan 2024
Adversarial Preference Optimization: Enhancing Your Alignment via RM-LLM Game
Pengyu Cheng
Yifan Yang
Jian Li
Yong Dai
Tianhao Hu
Peixin Cao
Nan Du
Xiaolong Li
21
28
0
14 Nov 2023
The Alignment Ceiling: Objective Mismatch in Reinforcement Learning from Human Feedback
Nathan Lambert
Roberto Calandra
ALM
13
30
0
31 Oct 2023
Aligning Large Language Models through Synthetic Feedback
Sungdong Kim
Sanghwan Bae
Jamin Shin
Soyoung Kang
Donghyun Kwak
Kang Min Yoo
Minjoon Seo
ALM
SyDa
73
67
0
23 May 2023
Learning by Distilling Context
Charles Burton Snell
Dan Klein
Ruiqi Zhong
ReLM
LRM
161
44
0
30 Sep 2022
Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
Deep Ganguli
Liane Lovitt
John Kernion
Amanda Askell
Yuntao Bai
...
Nicholas Joseph
Sam McCandlish
C. Olah
Jared Kaplan
Jack Clark
218
441
0
23 Aug 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
303
11,730
0
04 Mar 2022
Understanding Dataset Difficulty with
V
\mathcal{V}
V
-Usable Information
Kawin Ethayarajh
Yejin Choi
Swabha Swayamdipta
154
157
0
16 Oct 2021
Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh
Albert Webson
Colin Raffel
Stephen H. Bach
Lintang Sutawika
...
T. Bers
Stella Biderman
Leo Gao
Thomas Wolf
Alexander M. Rush
LRM
205
1,651
0
15 Oct 2021
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
275
1,561
0
18 Sep 2019
1