2410.12735
CREAM: Consistency Regularized Self-Rewarding Language Models
16 October 2024
Z. Wang
Weilei He
Zhiyuan Liang
Xuchao Zhang
Chetan Bansal
Ying Wei
Weitong Zhang
Huaxiu Yao
ALM
Papers citing
"CREAM: Consistency Regularized Self-Rewarding Language Models"
5 / 5 papers shown
Title
Anyprefer: An Agentic Framework for Preference Data Synthesis
Yiyang Zhou
Z. Wang
Tianle Wang
Shangyu Xing
Peng Xia
...
Chetan Bansal
Weitong Zhang
Ying Wei
Mohit Bansal
Huaxiu Yao
44
0
0
27 Apr 2025
Adaptive Helpfulness-Harmlessness Alignment with Preference Vectors
Ren-Wei Liang
Chin-Ting Hsu
Chan-Hung Yu
Saransh Agrawal
Shih-Cheng Huang
Shang-Tse Chen
Kuan-Hao Huang
Shao-Hua Sun
49
0
0
27 Apr 2025
Right Question is Already Half the Answer: Fully Unsupervised LLM Reasoning Incentivization
Qingyang Zhang
Haitao Wu
Changqing Zhang
Peilin Zhao
Yatao Bian
ReLM
LRM
41
1
0
08 Apr 2025
Self-Consistency of the Internal Reward Models Improves Self-Rewarding Language Models
Xin Zhou
Yiwen Guo
Ruotian Ma
Tao Gui
Qi Zhang
Xuanjing Huang
LRM
54
1
0
13 Feb 2025
Calibrated Self-Rewarding Vision Language Models
Yiyang Zhou
Zhiyuan Fan
Dongjie Cheng
Sihan Yang
Zhaorun Chen
Chenhang Cui
Xiyao Wang
Yun-Qing Li
Linjun Zhang
Huaxiu Yao
VLM
29
25
0
23 May 2024