2501.10799
Step-KTO: Optimizing Mathematical Reasoning through Stepwise Binary Feedback
18 January 2025
Yen-Ting Lin
Di Jin
Tengyu Xu
Tianhao Wu
Sainbayar Sukhbaatar
Chen Zhu
Yun He
Yun-Nung Chen
Jason Weston
Yuandong Tian
Arash Rahnama
Sinong Wang
Hao Ma
Han Fang
LRM
Papers citing
"Step-KTO: Optimizing Mathematical Reasoning through Stepwise Binary Feedback"
9 / 9 papers shown
SIGMA: Search-Augmented On-Demand Knowledge Integration for Agentic Mathematical Reasoning
Ali Asgarov
Umid Suleymanov
Aadyant Khatri
LRM
198
0
0
31 Oct 2025
Enhancing Large Language Model Reasoning with Reward Models: An Analytical Survey
Qiyuan Liu
Hao Xu
Xuhong Chen
Wei Chen
Yee Whye Teh
Ning Miao
ReLM
LRM
AI4CE
278
0
0
02 Oct 2025
Humanline: Online Alignment as Perceptual Loss
Sijia Liu
Niklas Muennighoff
Kawin Ethayarajh
84
0
0
29 Sep 2025
UAV-ON: A Benchmark for Open-World Object Goal Navigation with Aerial Agents
Jianqiang Xiao
Yuexuan Sun
Yixin Shao
Boxi Gan
Rongqiang Liu
Yanjing Wu
Weili Guan
Xiang Deng
272
0
0
01 Aug 2025
Co-rewarding: Stable Self-supervised RL for Eliciting Reasoning in Large Language Models
Zizhuo Zhang
Jianing Zhu
Xinmu Ge
Zihua Zhao
Zhanke Zhou
Xuan Li
Xiao Feng
Jiangchao Yao
Bo Han
ALM
LRM
284
0
0
01 Aug 2025
Flow Matching Meets PDEs: A Unified Framework for Physics-Constrained Generation
Giacomo Baldan
Qiang Liu
Alberto Guardone
Nils Thuerey
AI4CE
177
6
0
10 Jun 2025
A Survey on Large Language Models for Mathematical Reasoning
Peng-Yuan Wang
Tian-Shuo Liu
Chenyang Wang
Yi-Di Wang
Shu Yan
...
Xu-Hui Liu
Xin-Wei Chen
Jia-Cheng Xu
Ziniu Li
Yang Yu
LRM
269
18
0
10 Jun 2025
SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks
Yifei Zhou
Song Jiang
Yuandong Tian
Jason Weston
Sergey Levine
Sainbayar Sukhbaatar
Xian Li
LLMAG
LRM
391
49
0
19 Mar 2025
PIPA: Preference Alignment as Prior-Informed Statistical Estimation
Junbo Li
Zinan Lin
Qiang Liu
OffRL
417
0
0
09 Feb 2025