arXiv:2502.16182
IPO: Your Language Model is Secretly a Preference Classifier
22 February 2025
Shivank Garg, Ayush Singh, Shweta Singh, Paras Chopra
Papers citing "IPO: Your Language Model is Secretly a Preference Classifier"
Exploring the Potential of Offline RL for Reasoning in LLMs: A Preliminary Study
Xiaoyu Tian, Sitong Zhao, Haotian Wang, Shuaiting Chen, Yiping Peng, Yunjie Ji, Han Zhao, Xiangang Li
OffRL
LRM
17
0
0
04 May 2025