2505.17612
Cited By
v1
v2 (latest)
Distilling LLM Agent into Small Models with Retrieval and Code Tools
23 May 2025
Minki Kang
Jongwon Jeong
Seanie Lee
Jaewoong Cho
Sung Ju Hwang
LRM
ArXiv (abs)
PDF
HTML
HuggingFace (80 upvotes)
Github (171★)
Papers citing
"Distilling LLM Agent into Small Models with Retrieval and Code Tools"
50 / 63 papers shown
MENTOR: A Reinforcement Learning Framework for Enabling Tool Use in Small Models via Teacher-Optimized Rewards
Changsu Choi
Hoyun Song
Dongyeon Kim
WooHyeon Jung
Minkyung Cho
Sunjin Park
NohHyeob Bae
Seona Yu
Kyungtae Lim
152
0
0
21 Oct 2025
A Survey on Collaborating Small and Large Language Models for Performance, Cost-effectiveness, Cloud-edge Privacy, and Trustworthiness
Fali Wang
Jihai Chen
Shuhua Yang
Ali Al-Lawati
Linli Tang
Hui Liu
Suhang Wang
155
2
0
14 Oct 2025
From Correction to Mastery: Reinforced Distillation of Large Language Model Agents
Yuanjie Lyu
Chengyu Wang
Jun Huang
Tong Xu
ALM
LRM
232
2
0
12 Sep 2025
TURA: Tool-Augmented Unified Retrieval Agent for AI Search
Zhejun Zhao
Yuehu Dong
Alley Liu
Lixue Zheng
Pingsheng Liu
Dongdong Shen
Long Xia
Jiashu Zhao
D. Yin
114
4
0
06 Aug 2025
AgentDistill: Training-Free Agent Distillation with Generalizable MCP Boxes
Jiahao Qiu
Xinzhe Juan
Yimin Wang
L. Yang
Xuan Qi
...
Hongru Wang
Shilong Liu
Xun Jiang
Liu Leqi
Mengdi Wang
181
9
0
17 Jun 2025
Agentic Reasoning and Tool Integration for LLMs via Reinforcement Learning
Joykirat Singh
Raghav Magazine
Yash Pandya
A. Nambi
LLMAG
KELM
OffRL
LRM
654
47
0
28 Apr 2025
T1: Tool-integrated Self-verification for Test-time Compute Scaling in Small Language Models
Minki Kang
Jongwon Jeong
Jaewoong Cho
ALM
LRM
272
6
0
07 Apr 2025
Open Deep Search: Democratizing Search with Open-source Reasoning Agents
Salaheddin Alzubi
Creston Brooks
Purva Chiniya
Edoardo Contente
Chiara von Gerlach
...
Arda Kaz
Windsor Nguyen
Sewoong Oh
Himanshu Tyagi
Pramod Viswanath
VLM
ELM
LRM
311
35
0
26 Mar 2025
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
Bowen Jin
Hansi Zeng
Zhenrui Yue
Jinsung Yoon
Sercan O. Arik
Dong Wang
Hamed Zamani
Jiawei Han
OffRL
AI4TS
LRM
RALM
ReLM
KELM
740
523
0
12 Mar 2025
DIMSUM: Discourse in Mathematical Reasoning as a Supervision Module
Krish Sharma
Niyar R. Barman
Nicholas M. Asher
Akshay Chaturvedi
LRM
AIMat
274
39
0
06 Mar 2025
Process Reward Models for LLM Agents: Practical Framework and Directions
Sanjiban Choudhury
196
33
0
17 Feb 2025
Agentic Reasoning: A Streamlined Framework for Enhancing LLM Reasoning with Agentic Tools
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Junde Wu
Jiayuan Zhu
Yuyuan Liu
Min Xu
Yueming Jin
LRM
454
25
0
07 Feb 2025
Mentor-KD: Making Small Language Models Better Multi-step Reasoners
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Hojae Lee
Junho Kim
SangKeun Lee
LRM
170
13
0
11 Oct 2024
AgentBank: Towards Generalized LLM Agents via Fine-Tuning on 50000+ Interaction Trajectories
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Yifan Song
Weimin Xiong
Xiutian Zhao
Dawei Zhu
Wenhao Wu
Ke Wang
Cheng Li
Wei Peng
Sujian Li
LLMAG
184
28
0
10 Oct 2024
HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models
International Conference on Learning Representations (ICLR), 2024
Seanie Lee
Haebin Seong
Dong Bok Lee
Minki Kang
Xiaoyin Chen
Dominik Wagner
Yoshua Bengio
Juho Lee
Sung Ju Hwang
378
13
0
02 Oct 2024
Small Language Models: Survey, Measurements, and Insights
Zhenyan Lu
Xiang Li
Dongqi Cai
Rongjie Yi
Fangming Liu
Xiwen Zhang
Nicholas D. Lane
Mengwei Xu
ObjD
LRM
449
101
0
24 Sep 2024
Qwen2.5-Coder Technical Report
Binyuan Hui
Jian Yang
Zeyu Cui
Jiaxi Yang
Dayiheng Liu
...
Fei Huang
Xingzhang Ren
Xuancheng Ren
Jingren Zhou
Junyang Lin
OSLM
315
781
0
18 Sep 2024
Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via Self-Improvement
An Yang
Beichen Zhang
Binyuan Hui
Bofei Gao
Bowen Yu
...
Mingfeng Xue
Runji Lin
Tianyu Liu
Xingzhang Ren
Zhenru Zhang
OSLM
LRM
375
669
0
18 Sep 2024
AgentInstruct: Toward Generative Teaching with Agentic Flows
Arindam Mitra
Luciano Del Corro
Guoqing Zheng
Shweti Mahajan
Dany Rouhana
...
Corby Rosset
Fillipe Silva
Hamed Khanpour
Yash Lara
Ahmed Awadallah
SyDa
414
58
0
03 Jul 2024
Safety Alignment Should Be Made More Than Just a Few Tokens Deep
International Conference on Learning Representations (ICLR), 2024
Xiangyu Qi
Ashwinee Panda
Kaifeng Lyu
Xiao Ma
Subhrajit Roy
Ahmad Beirami
Prateek Mittal
Peter Henderson
215
260
0
10 Jun 2024
Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models
Zehui Chen
Kuikun Liu
Qiuchen Wang
Wenwei Zhang
Jiangning Liu
Dahua Lin
Kai-xiang Chen
Feng Zhao
LLMAG
ALM
AIFin
291
62
0
19 Mar 2024
AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning
Jianguo Zhang
Tian Lan
Rithesh Murthy
Zhiwei Liu
Weiran Yao
...
Juan Carlos Niebles
Silvio Savarese
Shelby Heinecke
Huan Wang
Caiming Xiong
LLMAG
332
49
0
23 Feb 2024
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
Zechun Liu
Changsheng Zhao
Forrest N. Iandola
Chen Lai
Yuandong Tian
...
Ernie Chang
Yangyang Shi
Raghuraman Krishnamoorthi
Liangzhen Lai
Vikas Chandra
ALM
296
176
0
22 Feb 2024
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
Zhihong Shao
Peiyi Wang
Qihao Zhu
Runxin Xu
Jun-Mei Song
...
Haowei Zhang
Mingchuan Zhang
Yiming Li
Yu-Huan Wu
Daya Guo
ReLM
LRM
1.2K
3,611
0
05 Feb 2024
Executable Code Actions Elicit Better LLM Agents
Xingyao Wang
Yangyi Chen
Lifan Yuan
Yizhe Zhang
Yunzhu Li
Yuan Yao
Heng Ji
ELM
LLMAG
LM&Ro
743
293
0
01 Feb 2024
Distilling Mathematical Reasoning Capabilities into Small Language Models
Neural Networks (NN), 2024
Xunyu Zhu
Jian Li
Yong Liu
Can Ma
Weiping Wang
LRM
219
26
0
22 Jan 2024
AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Shuofei Qiao
Ningyu Zhang
Runnan Fang
Yujie Luo
Wangchunshu Zhou
Yuchen Eleanor Jiang
Chengfei Lv
Huajun Chen
LLMAG
265
66
0
10 Jan 2024
If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents
Ke Yang
Jiateng Liu
John Wu
Chaoqi Yang
Yi R. Fung
...
Xu Cao
Xingyao Wang
Yiquan Wang
Chenhui Xu
Chengxiang Zhai
LLMAG
ELM
420
111
0
01 Jan 2024
Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Peiyi Wang
Lei Li
Zhihong Shao
R. X. Xu
Damai Dai
Yifei Li
Deli Chen
Y. Wu
Zhifang Sui
AIMat
LRM
ALM
410
648
0
14 Dec 2023
First-Step Advantage: Importance of Starting Right in Multi-Step Math Reasoning
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Kushal Kumar Jain
Moritz Miller
Niket Tandon
Kumar Shridhar
ReLM
LRM
264
4
0
14 Nov 2023
Agent Lumos: Unified and Modular Training for Open-Source Language Agents
Da Yin
Faeze Brahman
Abhilasha Ravichander
Khyathi Chandu
Kai-Wei Chang
Yejin Choi
Bill Yuchen Lin
LLMAG
268
58
0
09 Nov 2023
AgentTuning: Enabling Generalized Agent Abilities for LLMs
Aohan Zeng
Mingdao Liu
Rui Lu
Bowen Wang
Xiao Liu
Yuxiao Dong
Jie Tang
LM&MA
ALM
LLMAG
374
253
0
19 Oct 2023
FireAct: Toward Language Agent Fine-tuning
Baian Chen
Chang Shu
Ehsan Shareghi
Nigel Collier
Karthik Narasimhan
Shunyu Yao
ALM
LLMAG
336
154
0
09 Oct 2023
ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving
International Conference on Learning Representations (ICLR), 2023
Zhibin Gou
Zhihong Shao
Yeyun Gong
Haoran Pan
Yujiu Yang
Shiyu Huang
Nan Duan
Weizhu Chen
LRM
AI4CE
LLMAG
363
251
0
29 Sep 2023
MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback
International Conference on Learning Representations (ICLR), 2023
Xingyao Wang
Zihan Wang
Jiateng Liu
Yangyi Chen
Lifan Yuan
Hao Peng
Heng Ji
LRM
417
244
0
19 Sep 2023
Jailbroken: How Does LLM Safety Training Fail?
Neural Information Processing Systems (NeurIPS), 2023
Alexander Wei
Nika Haghtalab
Jacob Steinhardt
761
1,362
0
05 Jul 2023
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Neural Information Processing Systems (NeurIPS), 2023
Lianmin Zheng
Wei-Lin Chiang
Ying Sheng
Siyuan Zhuang
Zhanghao Wu
...
Dacheng Li
Eric Xing
Haotong Zhang
Joseph E. Gonzalez
Ion Stoica
ALM
OSLM
ELM
2.7K
6,354
0
09 Jun 2023
Orca: Progressive Learning from Complex Explanation Traces of GPT-4
Subhabrata Mukherjee
Arindam Mitra
Ganesh Jawahar
Sahaj Agarwal
Hamid Palangi
Ahmed Hassan Awadallah
ELM
ALM
LRM
422
338
0
05 Jun 2023
Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-Intensive Tasks
Neural Information Processing Systems (NeurIPS), 2023
Minki Kang
Seanie Lee
Jinheon Baek
Kenji Kawaguchi
Sung Ju Hwang
ALM
LRM
244
92
0
28 May 2023
PaD: Program-aided Distillation Can Teach Small Models Reasoning Better than Chain-of-thought Fine-tuning
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Xuekai Zhu
Biqing Qi
Kaiyan Zhang
Xingwei Long
Zhouhan Lin
Bowen Zhou
ALM
LRM
261
27
0
23 May 2023
Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Cheng-Yu Hsieh
Chun-Liang Li
Chih-Kuan Yeh
Hootan Nakhost
Yasuhisa Fujii
Alexander Ratner
Ranjay Krishna
Chen-Yu Lee
Tomas Pfister
ALM
720
713
0
03 May 2023
Reflexion: Language Agents with Verbal Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2023
Noah Shinn
Federico Cassano
Beck Labash
A. Gopinath
Karthik Narasimhan
Shunyu Yao
LLMAG
KELM
589
2,157
0
20 Mar 2023
GPT-4 Technical Report
OpenAI
Josh Achiam
Steven Adler
Sandhini Agarwal
Lama Ahmad
...
Shengjia Zhao
Tianhao Zheng
Juntang Zhuang
William Zhuk
Barret Zoph
LLMAG
MLLM
3.9K
20,416
0
15 Mar 2023
Specializing Smaller Language Models towards Multi-Step Reasoning
International Conference on Machine Learning (ICML), 2023
Yao Fu
Hao-Chun Peng
Litu Ou
Ashish Sabharwal
Tushar Khot
ReLM
LRM
219
316
0
30 Jan 2023
Large Language Models Are Reasoning Teachers
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Namgyu Ho
Laura Schmid
Se-Young Yun
ReLM
ELM
LRM
308
429
0
20 Dec 2022
Teaching Small Language Models to Reason
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Lucie Charlotte Magister
Jonathan Mallinson
Jakub Adamek
Eric Malmi
Aliaksei Severyn
LRM
AI4CE
ReLM
494
315
0
16 Dec 2022
Text Embeddings by Weakly-Supervised Contrastive Pre-training
Liang Wang
Nan Yang
Xiaolong Huang
Binxing Jiao
Linjun Yang
Daxin Jiang
Rangan Majumder
Furu Wei
VLM
606
963
0
07 Dec 2022
PAL: Program-aided Language Models
International Conference on Machine Learning (ICML), 2022
Luyu Gao
Aman Madaan
Shuyan Zhou
Uri Alon
Pengfei Liu
Yiming Yang
Jamie Callan
Graham Neubig
ReLM
LRM
498
606
0
18 Nov 2022
Large Language Models Struggle to Learn Long-Tail Knowledge
International Conference on Machine Learning (ICML), 2022
Nikhil Kandpal
H. Deng
Adam Roberts
Eric Wallace
Colin Raffel
RALM
KELM
407
541
0
15 Nov 2022
Measuring and Narrowing the Compositionality Gap in Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Ofir Press
Muru Zhang
Sewon Min
Ludwig Schmidt
Noah A. Smith
M. Lewis
ReLM
KELM
LRM
677
910
0
07 Oct 2022