2408.12109
Cited By
RoVRM: A Robust Visual Reward Model Optimized via Auxiliary Textual Preference Data
22 August 2024
Chenglong Wang
Yang Gan
Yifu Huo
Yongyu Mu
Murun Yang
Qiaozhi He
Tong Xiao
Chunliang Zhang
Tongran Liu
Quan Du
Di Yang
Jingbo Zhu
VLM
Papers citing
"RoVRM: A Robust Visual Reward Model Optimized via Auxiliary Textual Preference Data"
7 / 7 papers shown
Quality-Driven Curation of Remote Sensing Vision-Language Data via Learned Scoring Models
Dilxat Muhtar
Enzhuo Zhang
Zhenshi Li
Feng-Xue Gu
Yanglangxing He
P. Xiao
Xueliang Zhang
28
2
0
02 Mar 2025
InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model
Yuhang Zang
Xiaoyi Dong
Pan Zhang
Yuhang Cao
Ziyu Liu
...
Haodong Duan
W. Zhang
Kai Chen
D. Lin
Jiaqi Wang
VLM
68
17
0
21 Jan 2025
LRHP: Learning Representations for Human Preferences via Preference Pairs
Chenglong Wang
Yang Gan
Yifu Huo
Yongyu Mu
Qiaozhi He
Murun Yang
Tong Xiao
Chunliang Zhang
Tongran Liu
Jingbo Zhu
AI4TS
27
1
0
06 Oct 2024
Get more for less: Principled Data Selection for Warming Up Fine-Tuning in LLMs
Feiyang Kang
H. Just
Yifan Sun
Himanshu Jahagirdar
Yuanzhi Zhang
Rongxing Du
Anit Kumar Sahu
Ruoxi Jia
35
17
0
05 May 2024
LESS: Selecting Influential Data for Targeted Instruction Tuning
Mengzhou Xia
Sadhika Malladi
Suchin Gururangan
Sanjeev Arora
Danqi Chen
68
180
0
06 Feb 2024
Silkie: Preference Distillation for Large Visual Language Models
Lei Li
Zhihui Xie
Mukai Li
Shunian Chen
Peiyi Wang
Liang Chen
Yazheng Yang
Benyou Wang
Lingpeng Kong
MLLM
96
67
0
17 Dec 2023
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022