Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2504.11468
Cited By
SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models
10 April 2025
Hardy Chen
Haoqin Tu
Fali Wang
Hui Liu
X. Tang
Xinya Du
Yuyin Zhou
Cihang Xie
ReLM
VLM
OffRL
LRM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models"
6 / 6 papers shown
Title
Nemotron-Research-Tool-N1: Exploring Tool-Using Language Models with Reinforced Reasoning
Shaokun Zhang
Yi Dong
Jieyu Zhang
Jan Kautz
Bryan Catanzaro
Andrew Tao
Qingyun Wu
Zhiding Yu
Guilin Liu
LLMAG
OffRL
KELM
LRM
79
0
0
25 Apr 2025
Skywork R1V2: Multimodal Hybrid Reinforcement Learning for Reasoning
Chris
Yichen Wei
Yi Peng
X. Wang
Weijie Qiu
...
Jianhao Zhang
Y. Hao
Xuchen Song
Yang Liu
Yahui Zhou
OffRL
AI4TS
SyDa
LRM
VLM
67
0
0
23 Apr 2025
Learning to Reason under Off-Policy Guidance
Jianhao Yan
Yafu Li
Zican Hu
Zhi Wang
Ganqu Cui
Xiaoye Qu
Yu Cheng
Yue Zhang
OffRL
LRM
34
0
0
21 Apr 2025
LongPerceptualThoughts: Distilling System-2 Reasoning for System-1 Perception
Yuan-Hong Liao
Sven Elflein
Liu He
Laura Leal-Taixe
Yejin Choi
Sanja Fidler
David Acuna
ReLM
LRM
VLM
24
0
0
21 Apr 2025
NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation
Xiangyan Liu
Jinjie Ni
Zijian Wu
Chao Du
Longxu Dou
H. Wang
Tianyu Pang
Michael Shieh
OffRL
LRM
26
0
0
17 Apr 2025
Synthetic Data Generation & Multi-Step RL for Reasoning & Tool Use
Anna Goldie
Azalia Mirhoseini
Hao Zhou
Irene Cai
Christopher D. Manning
SyDa
OffRL
ReLM
LRM
94
3
0
07 Apr 2025
1