arXiv:2506.15651
AutoRule: Reasoning Chain-of-thought Extracted Rule-based Rewards Improve Preference Learning
18 June 2025
Tevin Wang
Chenyan Xiong
LRM
Papers citing
"AutoRule: Reasoning Chain-of-thought Extracted Rule-based Rewards Improve Preference Learning"
4 / 4 papers shown
Title
What Generative Search Engines Like and How to Optimize Web Content Cooperatively
Yujiang Wu
Shanshan Zhong
Yubin Kim
Chenyan Xiong
0
0
0
13 Oct 2025
Beneficial Reasoning Behaviors in Agentic Search and Effective Post-training to Obtain Them
Jiahe Jin
Abhijay Paladugu
Chenyan Xiong
AIFin
LRM
56
0
0
08 Oct 2025
Reinforcement Learning Meets Large Language Models: A Survey of Advancements and Applications Across the LLM Lifecycle
Keliang Liu
Dingkang Yang
Ziyun Qian
Weijie Yin
Y. Wang
Hongsheng Li
Jun Liu
Peng Zhai
Y. Liu
Lihua Zhang
OffRL
LRM
32
0
0
20 Sep 2025
A Survey on Progress in LLM Alignment from the Perspective of Reward Design
Miaomiao Ji
Yanqiu Wu
Zhibin Wu
Shoujin Wang
Jian Yang
Mark Dras
Usman Naseem
156
6
0
05 May 2025