2507.10616
Cited By
v1
v2 (latest)
Scalpel vs. Hammer: GRPO Amplifies Existing Capabilities, SFT Replaces Them
13 July 2025
Neel Rajani
Aryo Pradipta Gema
Seraphina Goldfarb-Tarrant
Ivan Titov
Papers citing "Scalpel vs. Hammer: GRPO Amplifies Existing Capabilities, SFT Replaces Them"
4 / 4 papers shown
Zooming into Comics: Region-Aware RL Improves Fine-Grained Comic Understanding in Vision-Language Models
Yule Chen
Yufan Ren
Sabine Süsstrunk
VLM
102
0
0
09 Nov 2025
Towards a Unified View of Large Language Model Post-Training
Xingtai Lv
Yuxin Zuo
Youbang Sun
Hongyi Liu
Yuntian Wei
...
Lixuan He
Xuekai Zhu
Kaiyan Zhang
Bingning Wang
Ning Ding
OffRL
108
11
0
04 Sep 2025
AMFT: Aligning LLM Reasoners by Meta-Learning the Optimal Imitation-Exploration Balance
Lixuan He
Jie Feng
Yong Li
OffRL
LRM
235
3
0
09 Aug 2025
Revisiting LLM Reasoning via Information Bottleneck
Shiye Lei
Zhihao Cheng
Kai Jia
Dacheng Tao
LRM
169
10
0
24 Jul 2025