2410.01560
Cited By
v1
v2 (latest)
OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data
International Conference on Learning Representations (ICLR), 2025
2 October 2024
Shubham Toshniwal
Wei Du
Ivan Moshkov
Branislav Kisacanin
Alexan Ayrapetyan
Igor Gitman
LRM
Papers citing
"OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data"
50 / 83 papers shown
Not-a-Bandit: Provably No-Regret Drafter Selection in Speculative Decoding for LLMs
Hongyi Liu
Jiaji Huang
Zhen Jia
Youngsuk Park
Yu Wang
OffRL
119
1
0
22 Oct 2025
ECG-LLM-- training and evaluation of domain-specific large language models for electrocardiography
Lara Ahrens
Wilhelm Haverkamp
Nils Strodthoff
118
0
0
21 Oct 2025
Pay Attention to the Triggers: Constructing Backdoors That Survive Distillation
Giovanni De Muri
Mark Vero
Robin Staab
Martin Vechev
151
0
0
21 Oct 2025
FineVision: Open Data Is All You Need
Luis Wiedmann
Orr Zohar
Amir Mahla
Xiaohan Wang
Rui Li
Thibaud Frere
Leandro von Werra
Aritra Roy Gosthipaty
Andrés Marafioti
VLM
192
12
0
20 Oct 2025
QueST: Incentivizing LLMs to Generate Difficult Problems
Hanxu Hu
Xingxing Zhang
Jannis Vamvas
Rico Sennrich
Furu Wei
AIMat
SyDa
MQ
LRM
255
0
0
20 Oct 2025
To Infinity and Beyond: Tool-Use Unlocks Length Generalization in State Space Models
Eran Malach
Omid Saremi
Sinead Williamson
Arwen Bradley
Aryo Lotfi
Emmanuel Abbe
J. Susskind
Etai Littwin
152
0
0
16 Oct 2025
HoneyBee: Data Recipes for Vision-Language Reasoners
Hritik Bansal
Devendra Singh Sachan
Kai-Wei Chang
Aditya Grover
Gargi Ghosh
Wen-tau Yih
Ramakanth Pasunuru
VLM
LRM
146
3
0
14 Oct 2025
Pushing on Multilingual Reasoning Models with Language-Mixed Chain-of-Thought
Guijin Son
Donghun Yang
Hitesh Laxmichand Patel
Amit Agarwal
Hyunwoo Ko
...
Minhyuk Kim
Nikunj Drolia
Dasol Choi
Kyong-Ha Lee
Youngjae Yu
LRM
146
0
0
05 Oct 2025
Principled and Tractable RL for Reasoning with Diffusion Language Models
Anthony Zhan
DiffM
AI4CE
97
2
0
05 Oct 2025
GuidedSampling: Steering LLMs Towards Diverse Candidate Solutions at Inference-Time
Divij Handa
Mihir Parmar
Aswin Rrv
Md Nayem Uddin
Hamid Palangi
Chitta Baral
87
0
0
04 Oct 2025
Graph2Eval: Automatic Multimodal Task Generation for Agents via Knowledge Graphs
Yurun Chen
Xavier Hu
Y. Liu
Ziqi Wang
Zeyi Liao
...
Feng Wei
Yuxi Qian
Bo Zheng
Keting Yin
Shengyu Zhang
LLMAG
229
1
0
01 Oct 2025
Beyond English-Centric Training: How Reinforcement Learning Improves Cross-Lingual Reasoning in LLMs
Shulin Huang
Yiran Ding
Junshu Pan
Yue Zhang
OffRL
LRM
108
1
0
28 Sep 2025
Learning More with Less: A Dynamic Dual-Level Down-Sampling Framework for Efficient Policy Optimization
Chao Wang
Tao Yang
Hongtao Tian
Yunsheng Shi
Qiyao Ma
Xiaotao Liu
Ting Yao
Wenbo Ding
OffRL
117
0
0
26 Sep 2025
Exploring Solution Divergence and Its Effect on Large Language Model Problem Solving
Hang Li
Kaiqi Yang
Yucheng Chu
Hui Liu
Shucheng Zhou
MoMe
LRM
121
0
0
26 Sep 2025
ScaleDiff: Scaling Difficult Problems for Advanced Mathematical Reasoning
Qizhi Pei
Zhuoshi Pan
Honglin Lin
Xin Gao
Yu Li
Zinan Tang
Conghui He
Rui Yan
Lijun Wu
AIMat
OffRL
LRM
223
2
0
25 Sep 2025
Expanding Reasoning Potential in Foundation Model by Learning Diverse Chains of Thought Patterns
Xuemiao Zhang
Can Ren
Chengying Tu
Rongxiang Weng
Shuo Wang
Hongfei Yan
Jingang Wang
Xunliang Cai
LRM
AI4CE
209
1
0
25 Sep 2025
CogAtom: From Cognitive Atoms to Olympiad-level Mathematical Reasoning in Large Language Models
Zhuofan Chen
Jiyuan He
Yichi Zhang
Xing Hu
Haoxing Wen
Jun Bai
Wenge Rong
LRM
230
0
0
22 Sep 2025
SAIL-VL2 Technical Report
Weijie Yin
Yongjie Ye
Fangxun Shu
Yue Liao
Zijian Kang
...
Han Wang
Wenzhuo Liu
Xiao Liang
Shuicheng Yan
Chao Feng
LRM
VLM
285
4
0
17 Sep 2025
Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks
Taishi Nakamura
Satoki Ishikawa
Masaki Kawamura
Takumi Okamoto
Daisuke Nohara
Jun Suzuki
Rio Yokota
MoE
LRM
175
0
0
26 Aug 2025
Can Structured Templates Facilitate LLMs in Tackling Harder Tasks? : An Exploration of Scaling Laws by Difficulty
Zhichao Yang
Zhaoxin Fan
Gen Li
Yuanze Hu
Xinyu Wang
Ye Qiu
Xin Wang
Yifan Sun
Wenjun Wu
LRM
76
0
0
26 Aug 2025
NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model
Nvidia
Aarti Basant
Abhijit Khairnar
Abhijit Paithankar
Abhinav Khattar
...
Keith Wyss
Keshav Santhanam
Kezhi Kong
Krzysztof Pawelec
Kumar Anik
LRM
291
0
0
20 Aug 2025
Data Mixing Optimization for Supervised Fine-Tuning of Large Language Models
Yuan Li
Zhengzhong Liu
Eric P. Xing
128
1
0
16 Aug 2025
Apriel-Nemotron-15B-Thinker
Shruthan Radhakrishna
S. Parikh
Gopal Sarda
Anil Turkkan
Quaizar Vohra
...
Sathwik Tejaswi Madhusudhan
Torsten Scholak
Sébastien Paquet
Sagar Davasam
Srinivas Sunkara
LLMAG
MoE
LRM
184
2
0
13 Aug 2025
MathSmith: Towards Extremely Hard Mathematical Reasoning by Forging Synthetic Problems with a Reinforced Policy
Shaoxiong Zhan
Yanlin Lai
Ziyu Lu
Dahua Lin
Ziqing Yang
Fei Tang
LRM
115
10
0
07 Aug 2025
WarriorMath: Enhancing the Mathematical Ability of Large Language Models with a Defect-aware Framework
Yue Chen
Minghua He
Fangkai Yang
Pu Zhao
Lu Wang
...
Yuefeng Zhan
Hao Sun
Qingwei Lin
Saravan Rajmohan
Dongmei Zhang
172
2
0
02 Aug 2025
SAND-Math: Using LLMs to Generate Novel, Difficult and Useful Mathematics Questions and Answers
Chaitanya Manem
Pratik Prabhanjan Brahma
Prakamya Mishra
Zicheng Liu
Emad Barsoum
AIMat
LRM
342
4
0
28 Jul 2025
Diversity-Enhanced Reasoning for Subjective Questions
Yumeng Wang
Zhiyuan Fan
Jiayu Liu
J. Huang
Yi R. Fung
LRM
470
5
0
27 Jul 2025
PITA: Preference-Guided Inference-Time Alignment for LLM Post-Training
Sarat Chandra Bobbili
Ujwal Dinesha
Dheeraj Narasimha
S. Shakkottai
147
2
0
26 Jul 2025
GenSelect: A Generative Approach to Best-of-N
Shubham Toshniwal
Ivan Sorokin
Aleksander Ficek
Ivan Moshkov
Igor Gitman
LRM
135
6
0
23 Jul 2025
MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning
Run-Ze Fan
Zengzhi Wang
Pengfei Liu
LRM
315
11
0
22 Jul 2025
EvoLM: In Search of Lost Language Model Training Dynamics
Zhenting Qi
Fan Nie
Alexandre Alahi
James Zou
Himabindu Lakkaraju
Yilun Du
Eric P. Xing
Sham Kakade
Hanlin Zhang
304
2
0
19 Jun 2025
Test-Time-Scaling for Zero-Shot Diagnosis with Visual-Language Reasoning
Ji Young Byun
Young-Jin Park
Navid Azizan
Rama Chellappa
LM&MA
LRM
161
1
0
11 Jun 2025
TaskCraft: Automated Generation of Agentic Tasks
Dingfeng Shi
Jingyi Cao
Qianben Chen
W. Sun
W. Li
...
Jiaheng Liu
Changwang Zhang
Jun Wang
Yuchen Eleanor Jiang
Wangchunshu Zhou
302
20
0
11 Jun 2025
Reinforce LLM Reasoning through Multi-Agent Reflection
Yurun Yuan
Tengyang Xie
LRM
305
16
0
10 Jun 2025
A Survey on Large Language Models for Mathematical Reasoning
Peng-Yuan Wang
Tian-Shuo Liu
Chenyang Wang
Yi-Di Wang
Shu Yan
...
Xu-Hui Liu
Xin-Wei Chen
Jia-Cheng Xu
Ziniu Li
Yang Yu
LRM
269
18
0
10 Jun 2025
Improving Large Language Models with Concept-Aware Fine-Tuning
Michael K. Chen
Xikun Zhang
Jiaxing Huang
Dacheng Tao
273
1
0
09 Jun 2025
SPARQ: Synthetic Problem Generation for Reasoning via Quality-Diversity Algorithms
Alex Havrilla
Edward Hughes
Mikayel Samvelyan
Jacob Abernethy
SyDa
LRM
306
5
0
06 Jun 2025
Establishing Trustworthy LLM Evaluation via Shortcut Neuron Analysis
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Kejian Zhu
Shangqing Tu
Zhuoran Jin
Lei Hou
Juanzi Li
Jun Zhao
KELM
223
0
0
04 Jun 2025
DeepTheorem: Advancing LLM Reasoning for Theorem Proving Through Natural Language and Reinforcement Learning
Ziyin Zhang
Jiahao Xu
Zhiwei He
Tian Liang
Qiuzhi Liu
...
Zhuosheng Zhang
Rui Wang
Zhaopeng Tu
Haitao Mi
Dong Yu
OffRL
LRM
302
10
0
29 May 2025
Benchmarking Abstract and Reasoning Abilities Through A Theoretical Perspective
Qingchuan Ma
Yuhang Wu
Xiawu Zheng
Rongrong Ji
202
1
0
28 May 2025
LASER: Stratified Selective Sampling for Instruction Tuning with Dedicated Scoring Strategy
Paramita Mirza
Lucas Weber
Fabian Küch
272
0
0
28 May 2025
ReCopilot: Reverse Engineering Copilot in Binary Analysis
Guoqiang Chen
Huiqi Sun
Daguang Liu
Zhiqi Wang
Qiang Wang
Bin Yin
Lu Liu
Lingyun Ying
209
6
0
22 May 2025
Watch your steps: Dormant Adversarial Behaviors that Activate upon LLM Finetuning
Thibaud Gloaguen
Mark Vero
Robin Staab
Martin Vechev
AAML
469
0
0
22 May 2025
Not All Models Suit Expert Offloading: On Local Routing Consistency of Mixture-of-Expert Models
Jingcong Liang
Siyuan Wang
Miren Tian
Yitong Li
Duyu Tang
Zhongyu Wei
MoE
311
0
0
21 May 2025
Trust, But Verify: A Self-Verification Approach to Reinforcement Learning with Verifiable Rewards
Xiaoyuan Liu
Tian Liang
Zhiwei He
Jiahao Xu
Wenxuan Wang
Pinjia He
Zhaopeng Tu
Haitao Mi
Dong Yu
OffRL
ReLM
LRM
345
15
0
19 May 2025
Multi-Token Prediction Needs Registers
Anastasios Gerontopoulos
Spyros Gidaris
N. Komodakis
375
3
0
15 May 2025
FineScope: Precision Pruning for Domain-Specialized Large Language Models Using SAE-Guided Self-Data Cultivation
Chaitali Bhattacharyya
Hyunsei Lee
Junyoung Lee
Shinhyoung Jang
Il hong Suh
Yeseong Kim
297
3
0
01 May 2025
Accurate and Diverse LLM Mathematical Reasoning via Automated PRM-Guided GFlowNets
Adam Younsi
Abdalgader Abubaker
Hakim Hacid
Salem Lahlou
LRM
551
6
0
28 Apr 2025
AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset
Ivan Moshkov
Darragh Hanley
Ivan Sorokin
Shubham Toshniwal
Christof Henkel
Benedikt Schifferer
Wei Du
Igor Gitman
ReLM
LRM
286
65
0
23 Apr 2025
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models
Junxiong Wang
Wen-Ding Li
Daniele Paliotta
Daniel Ritter
Alexander M. Rush
Tri Dao
LRM
340
12
0
14 Apr 2025