Title |
---|
HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models Haoran Que Feiyu Duan Liqun He Yutao Mou Wangchunshu Zhou ...Ge Zhang Junran Peng Zhaoxiang Zhang Songyang Zhang Kai Chen |
RRM: Robust Reward Model Training Mitigates Reward Hacking Tianqi Liu Wei Xiong Jie Jessie Ren Lichang Chen Junru Wu ...Yuan Liu Bilal Piot Abe Ittycheriah Aviral Kumar Mohammad Saleh |
Towards a Unified View of Preference Learning for Large Language Models: A Survey Bofei Gao Feifan Song Yibo Miao Zefan Cai Z. Yang ...Houfeng Wang Zhifang Sui Peiyi Wang Baobao Chang |
Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization Yuxin Jiang Bo Huang Yufei Wang Xingshan Zeng Liangyou Li Yasheng Wang Xin Jiang Lifeng Shang Ruiming Tang Wei Wang |