ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2505.17266
  4. Cited By
Select2Reason: Efficient Instruction-Tuning Data Selection for Long-CoT Reasoning

Select2Reason: Efficient Instruction-Tuning Data Selection for Long-CoT Reasoning

22 May 2025
Cehao Yang
Xueyuan Lin
Chengjin Xu
Xuhui Jiang
Xiaojun Wu
Honghao Liu
Hui Xiong
Jian Guo
    LRM
ArXivPDFHTML

Papers citing "Select2Reason: Efficient Instruction-Tuning Data Selection for Long-CoT Reasoning"

46 / 46 papers shown
Title
MIG: Automatic Data Selection for Instruction Tuning by Maximizing Information Gain in Semantic Space
MIG: Automatic Data Selection for Instruction Tuning by Maximizing Information Gain in Semantic Space
Yicheng Chen
Yining Li
Kai Hu
Zerun Ma
Haochen Ye
Kai Chen
52
2
0
18 Apr 2025
SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild
SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild
Weihao Zeng
Yuzhen Huang
Qian Liu
Wei Liu
Keqing He
Zejun Ma
Junxian He
OffRL
ReLM
LRM
122
85
0
24 Mar 2025
CrowdSelect: Synthetic Instruction Data Selection with Multi-LLM Wisdom
Yisen Li
Lingfeng Yang
Wenxuan Shen
Pan Zhou
Yao Wan
Weiwei Lin
Danny Chen
93
1
0
03 Mar 2025
Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning
Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning
Tian Xie
Zitian Gao
Qingnan Ren
Haoming Luo
Yuqian Hong
Bryan Dai
Joey Zhou
Kai Qiu
Zhirong Wu
Chong Luo
ReLM
OffRL
LRM
127
55
0
21 Feb 2025
LIMO: Less is More for Reasoning
LIMO: Less is More for Reasoning
Yixin Ye
Zhen Huang
Yang Xiao
Ethan Chern
Shijie Xia
Pengfei Liu
LRM
124
132
0
05 Feb 2025
Demystifying Long Chain-of-Thought Reasoning in LLMs
Demystifying Long Chain-of-Thought Reasoning in LLMs
Edward Yeo
Yuxuan Tong
Morry Niu
Graham Neubig
Xiang Yue
OffRL
LRM
121
107
0
05 Feb 2025
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
DeepSeek-AI
Daya Guo
Dejian Yang
Haowei Zhang
Junxiao Song
...
Shiyu Wang
S. Yu
Shunfeng Zhou
Shuting Pan
S.S. Li
ReLM
VLM
OffRL
AI4TS
LRM
303
1,503
0
22 Jan 2025
Kimi k1.5: Scaling Reinforcement Learning with LLMs
Kimi k1.5: Scaling Reinforcement Learning with LLMs
Kimi Team
Angang Du
Bofei Gao
Bowei Xing
Changjiu Jiang
...
Zihao Huang
Ziyao Xu
Zhiyong Yang
Zonghan Yang
Zongyu Lin
OffRL
ALM
AI4TS
VLM
LRM
214
250
0
22 Jan 2025
IterSelectTune: An Iterative Training Framework for Efficient
  Instruction-Tuning Data Selection
IterSelectTune: An Iterative Training Framework for Efficient Instruction-Tuning Data Selection
Jielin Song
Siyu Liu
Bin Zhu
Yanghui Rao
38
3
0
17 Oct 2024
Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via
  Self-Improvement
Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via Self-Improvement
An Yang
Beichen Zhang
Binyuan Hui
Bofei Gao
Bowen Yu
...
Mingfeng Xue
Runji Lin
Tianyu Liu
Xingzhang Ren
Zhenru Zhang
OSLM
LRM
72
251
0
18 Sep 2024
Scaling LLM Test-Time Compute Optimally can be More Effective than
  Scaling Model Parameters
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
Charlie Snell
Jaehoon Lee
Kelvin Xu
Aviral Kumar
LRM
124
576
0
06 Aug 2024
G-DIG: Towards Gradient-based Diverse and High-quality Instruction Data
  Selection for Machine Translation
G-DIG: Towards Gradient-based Diverse and High-quality Instruction Data Selection for Machine Translation
Xingyuan Pan
Luyang Huang
Liyan Kang
Zhicheng Liu
Yu Lu
Shanbo Cheng
ALM
86
14
0
21 May 2024
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models
Yaowei Zheng
Richong Zhang
Junhao Zhang
Yanhan Ye
Zheyan Luo
Zhangchi Feng
Yongqiang Ma
101
479
0
20 Mar 2024
LiveCodeBench: Holistic and Contamination Free Evaluation of Large
  Language Models for Code
LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code
Naman Jain
King Han
Alex Gu
Wen-Ding Li
Fanjia Yan
Tianjun Zhang
Sida I. Wang
Armando Solar-Lezama
Koushik Sen
Ion Stoica
ELM
79
365
0
12 Mar 2024
SmallToLarge (S2L): Scalable Data Selection for Fine-tuning Large
  Language Models by Summarizing Training Trajectories of Small Models
SmallToLarge (S2L): Scalable Data Selection for Fine-tuning Large Language Models by Summarizing Training Trajectories of Small Models
Yu Yang
Siddhartha Mishra
Jeffrey N Chiang
Baharan Mirzasoleiman
62
21
0
12 Mar 2024
Clustering and Ranking: Diversity-preserved Instruction Selection
  through Expert-aligned Quality Estimation
Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation
Yuan Ge
Yilun Liu
Chi Hu
Weibin Meng
Shimin Tao
Xiaofeng Zhao
Hongxia Ma
Li Zhang
Hao Yang
Tong Xiao
ALM
51
33
0
28 Feb 2024
RECOST: External Knowledge Guided Data-efficient Instruction Tuning
RECOST: External Knowledge Guided Data-efficient Instruction Tuning
Qi Zhang
Yiming Zhang
Haobo Wang
Junbo Zhao
64
14
0
27 Feb 2024
OlympiadBench: A Challenging Benchmark for Promoting AGI with
  Olympiad-Level Bilingual Multimodal Scientific Problems
OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems
Chaoqun He
Renjie Luo
Yuzhuo Bai
Shengding Hu
Zhen Leng Thai
...
Yuxiang Zhang
Jie Liu
Lei Qi
Zhiyuan Liu
Maosong Sun
ELM
AIMat
76
205
0
21 Feb 2024
Smaller Language Models are capable of selecting Instruction-Tuning
  Training Data for Larger Language Models
Smaller Language Models are capable of selecting Instruction-Tuning Training Data for Larger Language Models
Dheeraj Mekala
Alex Nguyen
Jingbo Shang
ALM
42
21
0
16 Feb 2024
Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for
  Instruction Fine-Tuning
Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning
Hao Zhao
Maksym Andriushchenko
Francesco Croce
Nicolas Flammarion
ALM
124
53
0
07 Feb 2024
LESS: Selecting Influential Data for Targeted Instruction Tuning
LESS: Selecting Influential Data for Targeted Instruction Tuning
Mengzhou Xia
Sadhika Malladi
Suchin Gururangan
Sanjeev Arora
Danqi Chen
121
210
0
06 Feb 2024
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open
  Language Models
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
Zhihong Shao
Peiyi Wang
Qihao Zhu
Runxin Xu
Jun-Mei Song
...
Haowei Zhang
Mingchuan Zhang
Yiming Li
Yu-Huan Wu
Daya Guo
ReLM
LRM
94
953
0
05 Feb 2024
Superfiltering: Weak-to-Strong Data Filtering for Fast
  Instruction-Tuning
Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning
Ming Li
Yong Zhang
Shwai He
Zhitao Li
Hongyu Zhao
Jianzong Wang
Ning Cheng
Dinesh Manocha
72
72
0
01 Feb 2024
What Makes Good Data for Alignment? A Comprehensive Study of Automatic
  Data Selection in Instruction Tuning
What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning
Wei Liu
Weihao Zeng
Keqing He
Yong Jiang
Junxian He
ALM
80
231
0
25 Dec 2023
One-Shot Learning as Instruction Data Prospector for Large Language
  Models
One-Shot Learning as Instruction Data Prospector for Large Language Models
Yunshui Li
Binyuan Hui
Xiaobo Xia
Jiaxi Yang
Min Yang
...
Ling-Hao Chen
Junhao Liu
Tongliang Liu
Fei Huang
Yongbin Li
58
35
0
16 Dec 2023
Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human
  Annotations
Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations
Peiyi Wang
Lei Li
Zhihong Shao
R. X. Xu
Damai Dai
Yifei Li
Deli Chen
Y.Wu
Zhifang Sui
AIMat
LRM
ALM
83
316
0
14 Dec 2023
MoDS: Model-oriented Data Selection for Instruction Tuning
MoDS: Model-oriented Data Selection for Instruction Tuning
Qianlong Du
Chengqing Zong
Jiajun Zhang
ALM
71
83
0
27 Nov 2023
Data Diversity Matters for Robust Instruction Tuning
Data Diversity Matters for Robust Instruction Tuning
Alexander Bukharin
Tuo Zhao
97
41
0
21 Nov 2023
GPQA: A Graduate-Level Google-Proof Q&A Benchmark
GPQA: A Graduate-Level Google-Proof Q&A Benchmark
David Rein
Betty Li Hou
Asa Cooper Stickland
Jackson Petty
Richard Yuanzhe Pang
Julien Dirani
Julian Michael
Samuel R. Bowman
AI4MH
ELM
72
627
0
20 Nov 2023
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language
  Models
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models
L. Yu
Weisen Jiang
Han Shi
Jincheng Yu
Zhengying Liu
Yu Zhang
James T. Kwok
Zheng Li
Adrian Weller
Weiyang Liu
OSLM
LRM
80
363
0
21 Sep 2023
Efficient Memory Management for Large Language Model Serving with
  PagedAttention
Efficient Memory Management for Large Language Model Serving with PagedAttention
Woosuk Kwon
Zhuohan Li
Siyuan Zhuang
Ying Sheng
Lianmin Zheng
Cody Hao Yu
Joseph E. Gonzalez
Haotong Zhang
Ion Stoica
VLM
136
2,049
0
12 Sep 2023
Code Llama: Open Foundation Models for Code
Code Llama: Open Foundation Models for Code
Baptiste Rozière
Jonas Gehring
Fabian Gloeckle
Sten Sootla
Itai Gat
...
Hugo Touvron
Louis Martin
Nicolas Usunier
Thomas Scialom
Gabriel Synnaeve
ELM
ALM
84
1,990
0
24 Aug 2023
From Quantity to Quality: Boosting LLM Performance with Self-Guided Data
  Selection for Instruction Tuning
From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning
Ming Li
Yong Zhang
Zhitao Li
Jiuhai Chen
Lichang Chen
Ning Cheng
Jianzong Wang
Dinesh Manocha
Jing Xiao
102
194
0
23 Aug 2023
#InsTag: Instruction Tagging for Analyzing Supervised Fine-tuning of
  Large Language Models
#InsTag: Instruction Tagging for Analyzing Supervised Fine-tuning of Large Language Models
Keming Lu
Hongyi Yuan
Zheng Yuan
Runji Lin
Junyang Lin
Chuanqi Tan
Chang Zhou
Jingren Zhou
ALM
LRM
37
67
0
14 Aug 2023
Scaling Relationship on Learning Mathematical Reasoning with Large
  Language Models
Scaling Relationship on Learning Mathematical Reasoning with Large Language Models
Zheng Yuan
Hongyi Yuan
Cheng Li
Guanting Dong
Keming Lu
Chuanqi Tan
Chang Zhou
Jingren Zhou
LRM
ALM
64
184
0
03 Aug 2023
FlashAttention-2: Faster Attention with Better Parallelism and Work
  Partitioning
FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
Tri Dao
LRM
97
1,221
0
17 Jul 2023
Let's Verify Step by Step
Let's Verify Step by Step
Hunter Lightman
V. Kosaraju
Yura Burda
Harrison Edwards
Bowen Baker
Teddy Lee
Jan Leike
John Schulman
Ilya Sutskever
K. Cobbe
ALM
OffRL
LRM
126
1,044
0
31 May 2023
The CoT Collection: Improving Zero-shot and Few-shot Learning of
  Language Models via Chain-of-Thought Fine-Tuning
The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning
Seungone Kim
Se June Joo
Doyoung Kim
Joel Jang
Seonghyeon Ye
Jamin Shin
Minjoon Seo
ALM
RALM
LRM
45
100
0
23 May 2023
LogiCoT: Logical Chain-of-Thought Instruction-Tuning
LogiCoT: Logical Chain-of-Thought Instruction-Tuning
Hanmeng Liu
Zhiyang Teng
Leyang Cui
Chaoli Zhang
Qiji Zhou
Yue Zhang
LRM
40
26
0
20 May 2023
Towards Reasoning in Large Language Models: A Survey
Towards Reasoning in Large Language Models: A Survey
Jie Huang
Kevin Chen-Chuan Chang
LM&MA
ELM
LRM
100
606
0
20 Dec 2022
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM
BDL
LRM
AI4CE
489
3,486
0
21 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
616
9,009
0
28 Jan 2022
LoRA: Low-Rank Adaptation of Large Language Models
LoRA: Low-Rank Adaptation of Large Language Models
J. E. Hu
Yelong Shen
Phillip Wallis
Zeyuan Allen-Zhu
Yuanzhi Li
Shean Wang
Lu Wang
Weizhu Chen
OffRL
AI4TS
AI4CE
ALM
AIMat
312
10,099
0
17 Jun 2021
RoFormer: Enhanced Transformer with Rotary Position Embedding
RoFormer: Enhanced Transformer with Rotary Position Embedding
Jianlin Su
Yu Lu
Shengfeng Pan
Ahmed Murtadha
Bo Wen
Yunfeng Liu
176
2,307
0
20 Apr 2021
Measuring Mathematical Problem Solving With the MATH Dataset
Measuring Mathematical Problem Solving With the MATH Dataset
Dan Hendrycks
Collin Burns
Saurav Kadavath
Akul Arora
Steven Basart
Eric Tang
D. Song
Jacob Steinhardt
ReLM
FaML
134
2,109
0
05 Mar 2021
ZeRO: Memory Optimizations Toward Training Trillion Parameter Models
ZeRO: Memory Optimizations Toward Training Trillion Parameter Models
Samyam Rajbhandari
Jeff Rasley
Olatunji Ruwase
Yuxiong He
ALM
AI4CE
77
852
0
04 Oct 2019
1