Checklists Are Better Than Reward Models For Aligning Language Models

24 July 2025
Vijay Viswanathan
Yanchao Sun
Shuang Ma
Xiang Kong
Meng Cao
Graham Neubig
Tongshuang Wu
    ALM

Papers citing "Checklists Are Better Than Reward Models For Aligning Language Models"

7 / 7 papers shown
Making, not Taking, the Best of N
Ammar Khairi
Daniel D'souza
Marzieh Fadaee
Julia Kreutzer
MoMe
16
0
0
01 Oct 2025
Chasing the Tail: Effective Rubric-based Reward Modeling for Large Language Model Post-Training
Junkai Zhang
Zihao Wang
Lin Gui
Swarnashree Mysore Sathyendra
Jaehwan Jeong
Victor Veitch
Wei Wang
Yunzhong He
Bing Liu
Lifeng Jin
ALM, LRM
22
1
0
25 Sep 2025
Breaking the Exploration Bottleneck: Rubric-Scaffolded Reinforcement Learning for General LLM Reasoning
Yang Zhou
Sunzhu Li
Shunyu Liu
Wenkai Fang
Jiale Zhao
...
Hengtong Lu
Wei Chen
Yan Xie
Mingli Song
LRM
44
3
0
23 Aug 2025
Reinforcement Learning with Rubric Anchors
Zenan Huang
Yihong Zhuang
Guoshan Lu
Zeyu Qin
Haokai Xu
...
Yanmei Gu
Y Samuel Wang
Zhengkai Yang
Jianguo Li
Junbo Zhao
ALM
28
9
0
18 Aug 2025
From Clicks to Preference: A Multi-stage Alignment Framework for Generative Query Suggestion in Conversational System
Junhao Yin
Haolin Wang
Peng Bao
Ju Xu
Yongliang Wang
32
0
0
15 Aug 2025
Are Today's LLMs Ready to Explain Well-Being Concepts?
Bohan Jiang
Dawei Li
Zhen Tan
Chengshuai Zhao
Huan Liu
AI4MH
46
0
0
06 Aug 2025
Rubrics as Rewards: Reinforcement Learning Beyond Verifiable Domains
Anisha Gunjal
Anthony Wang
Elaine Lau
Vaskar Nath
Bing Liu
Sean Hendryx
OffRL
67
15
0
23 Jul 2025