AutoRule: Reasoning Chain-of-thought Extracted Rule-based Rewards Improve Preference Learning

18 June 2025
Tevin Wang
Chenyan Xiong
    LRM

Papers citing "AutoRule: Reasoning Chain-of-thought Extracted Rule-based Rewards Improve Preference Learning"

4 / 4 papers shown
What Generative Search Engines Like and How to Optimize Web Content Cooperatively
Yujiang Wu
Shanshan Zhong
Yubin Kim
Chenyan Xiong
13 Oct 2025
Beneficial Reasoning Behaviors in Agentic Search and Effective Post-training to Obtain Them
Jiahe Jin
Abhijay Paladugu
Chenyan Xiong
AIFin, LRM
08 Oct 2025
Reinforcement Learning Meets Large Language Models: A Survey of Advancements and Applications Across the LLM Lifecycle
Keliang Liu
Dingkang Yang
Ziyun Qian
Weijie Yin
Y. Wang
Hongsheng Li
Jun Liu
Peng Zhai
Y. Liu
Lihua Zhang
OffRL, LRM
20 Sep 2025
A Survey on Progress in LLM Alignment from the Perspective of Reward Design
Miaomiao Ji
Yanqiu Wu
Zhibin Wu
Shoujin Wang
Jian Yang
Mark Dras
Usman Naseem
05 May 2025