ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2308.12032
  4. Cited By
From Quantity to Quality: Boosting LLM Performance with Self-Guided Data
  Selection for Instruction Tuning

From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning

23 August 2023
Ming Li
Yong Zhang
Zhitao Li
Jiuhai Chen
Lichang Chen
Ning Cheng
Jianzong Wang
Tianyi Zhou
Jing Xiao
ArXivPDFHTML

Papers citing "From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning"

33 / 33 papers shown
Title
ICon: In-Context Contribution for Automatic Data Selection
ICon: In-Context Contribution for Automatic Data Selection
Yixin Yang
Qingxiu Dong
Linli Yao
Fangwei Zhu
Zhifang Sui
41
0
0
08 May 2025
DeepDistill: Enhancing LLM Reasoning Capabilities via Large-Scale Difficulty-Graded Data Training
DeepDistill: Enhancing LLM Reasoning Capabilities via Large-Scale Difficulty-Graded Data Training
Xiaoyu Tian
Sitong Zhao
Haotian Wang
Shuaiting Chen
Yiping Peng
Yunjie Ji
Han Zhao
Xiangang Li
LRM
57
1
0
24 Apr 2025
MDIT: A Model-free Data Interpolation Method for Diverse Instruction Tuning
MDIT: A Model-free Data Interpolation Method for Diverse Instruction Tuning
Yangning Li
Zihua Lan
Lv Qingsong
Yinghui Li
Hai-Tao Zheng
29
0
0
09 Apr 2025
RAISE: Reinforenced Adaptive Instruction Selection For Large Language Models
RAISE: Reinforenced Adaptive Instruction Selection For Large Language Models
Lv Qingsong
Yangning Li
Zihua Lan
Zishan Xu
Jiwei Tang
Yinghui Li
Wenhao Jiang
Hai-tao Zheng
Philip S. Yu
32
0
0
09 Apr 2025
Filter Images First, Generate Instructions Later: Pre-Instruction Data Selection for Visual Instruction Tuning
Filter Images First, Generate Instructions Later: Pre-Instruction Data Selection for Visual Instruction Tuning
Bardia Safaei
Faizan Siddiqui
Jiacong Xu
Vishal M. Patel
Shao-Yuan Lo
VLM
104
0
0
10 Mar 2025
ALinFiK: Learning to Approximate Linearized Future Influence Kernel for Scalable Third-Party LLM Data Valuation
ALinFiK: Learning to Approximate Linearized Future Influence Kernel for Scalable Third-Party LLM Data Valuation
Yanzhou Pan
Huawei Lin
Yide Ran
Jiamin Chen
Xiaodong Yu
Weijie Zhao
Denghui Zhang
Zhaozhuo Xu
40
0
0
02 Mar 2025
EDGE: Efficient Data Selection for LLM Agents via Guideline Effectiveness
EDGE: Efficient Data Selection for LLM Agents via Guideline Effectiveness
Yunxiao Zhang
Guanming Xiong
Haochen Li
Wen Zhao
LLMAG
64
0
0
18 Feb 2025
The Best Instruction-Tuning Data are Those That Fit
The Best Instruction-Tuning Data are Those That Fit
Dylan Zhang
Qirun Dai
Hao Peng
ALM
115
3
0
06 Feb 2025
Kimi k1.5: Scaling Reinforcement Learning with LLMs
Kimi k1.5: Scaling Reinforcement Learning with LLMs
Kimi Team
Angang Du
Bofei Gao
Bowei Xing
Changjiu Jiang
...
Zhilin Yang
Zhiqi Huang
Zihao Huang
Ziyao Xu
Z. Yang
VLM
ALM
OffRL
AI4TS
LRM
106
132
0
22 Jan 2025
Unleashing the Power of Data Tsunami: A Comprehensive Survey on Data Assessment and Selection for Instruction Tuning of Language Models
Unleashing the Power of Data Tsunami: A Comprehensive Survey on Data Assessment and Selection for Instruction Tuning of Language Models
Yulei Qin
Yuncheng Yang
Pengcheng Guo
Gang Li
Hang Shao
Yuchen Shi
Zihan Xu
Yun Gu
Ke Li
Xing Sun
ALM
88
11
0
31 Dec 2024
Bridging the Visual Gap: Fine-Tuning Multimodal Models with Knowledge-Adapted Captions
Bridging the Visual Gap: Fine-Tuning Multimodal Models with Knowledge-Adapted Captions
Moran Yanuka
Assaf Ben-Kish
Yonatan Bitton
Idan Szpektor
Raja Giryes
VLM
45
2
0
13 Nov 2024
Stronger Models are NOT Stronger Teachers for Instruction Tuning
Stronger Models are NOT Stronger Teachers for Instruction Tuning
Zhangchen Xu
Fengqing Jiang
Luyao Niu
Bill Yuchen Lin
Radha Poovendran
ALM
48
5
0
11 Nov 2024
DELIFT: Data Efficient Language model Instruction Fine Tuning
DELIFT: Data Efficient Language model Instruction Fine Tuning
Ishika Agarwal
Krishnateja Killamsetty
Lucian Popa
Marina Danilevksy
ALM
VLM
46
2
0
07 Nov 2024
What is Wrong with Perplexity for Long-context Language Modeling?
What is Wrong with Perplexity for Long-context Language Modeling?
Lizhe Fang
Yifei Wang
Zhaoyang Liu
Chenheng Zhang
Stefanie Jegelka
Jinyang Gao
Bolin Ding
Yisen Wang
58
4
0
31 Oct 2024
Data Quality Control in Federated Instruction-tuning of Large Language Models
Data Quality Control in Federated Instruction-tuning of Large Language Models
Yaxin Du
Rui Ye
Fengting Yuchi
W. Zhao
Jingjing Qu
Y. Wang
Siheng Chen
ALM
FedML
45
0
0
15 Oct 2024
CodeACT: Code Adaptive Compute-efficient Tuning Framework for Code LLMs
CodeACT: Code Adaptive Compute-efficient Tuning Framework for Code LLMs
Weijie Lv
Xuan Xia
Sheng-Jun Huang
ALM
34
2
0
05 Aug 2024
ProcTag: Process Tagging for Assessing the Efficacy of Document Instruction Data
ProcTag: Process Tagging for Assessing the Efficacy of Document Instruction Data
Yufan Shen
Chuwei Luo
Zhaoqing Zhu
Yang Chen
Qi Zheng
Zhi Yu
Jiajun Bu
Cong Yao
36
2
0
17 Jul 2024
Curriculum Learning with Quality-Driven Data Selection
Curriculum Learning with Quality-Driven Data Selection
Biao Wu
Fang Meng
Ling-Hao Chen
19
1
0
27 Jun 2024
RuleR: Improving LLM Controllability by Rule-based Data Recycling
RuleR: Improving LLM Controllability by Rule-based Data Recycling
Ming Li
Han Chen
Chenguang Wang
Dang Nguyen
Dianqi Li
Tianyi Zhou
26
6
0
22 Jun 2024
Take the essence and discard the dross: A Rethinking on Data Selection for Fine-Tuning Large Language Models
Take the essence and discard the dross: A Rethinking on Data Selection for Fine-Tuning Large Language Models
Ziche Liu
Rui Ke
Feng Jiang
Feng Jiang
Haizhou Li
61
1
0
20 Jun 2024
CoEvol: Constructing Better Responses for Instruction Finetuning through
  Multi-Agent Cooperation
CoEvol: Constructing Better Responses for Instruction Finetuning through Multi-Agent Cooperation
Renhao Li
Minghuan Tan
Derek F. Wong
Min Yang
LLMAG
19
1
0
11 Jun 2024
Are Large Language Models the New Interface for Data Pipelines?
Are Large Language Models the New Interface for Data Pipelines?
Sylvio Barbon Junior
Paolo Ceravolo
Sven Groppe
Mustafa Jarrar
S. Maghool
Florence Sèdes
S. Sahri
M. van Keulen
LM&MA
29
8
0
06 Jun 2024
A Survey on Self-Evolution of Large Language Models
A Survey on Self-Evolution of Large Language Models
Zhengwei Tao
Ting-En Lin
Xiancai Chen
Hangyu Li
Yuchuan Wu
Yongbin Li
Zhi Jin
Fei Huang
Dacheng Tao
Jingren Zhou
LRM
LM&Ro
49
21
0
22 Apr 2024
Dataverse: Open-Source ETL (Extract, Transform, Load) Pipeline for Large Language Models
Dataverse: Open-Source ETL (Extract, Transform, Load) Pipeline for Large Language Models
Hyunbyung Park
Sukyung Lee
Gyoungjin Gim
Yungi Kim
Dahyun Kim
Chanjun Park
VLM
29
0
0
28 Mar 2024
Large Language Models and Causal Inference in Collaboration: A Survey
Large Language Models and Causal Inference in Collaboration: A Survey
Xiaoyu Liu
Paiheng Xu
Junda Wu
Jiaxin Yuan
Yifan Yang
...
Haoliang Wang
Tong Yu
Julian McAuley
Wei Ai
Furong Huang
ELM
LRM
72
5
0
14 Mar 2024
Selective Reflection-Tuning: Student-Selected Data Recycling for LLM
  Instruction-Tuning
Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning
Ming Li
Lichang Chen
Jiuhai Chen
Shwai He
Jiuxiang Gu
Tianyi Zhou
16
50
0
15 Feb 2024
Assistive Large Language Model Agents for Socially-Aware Negotiation Dialogues
Assistive Large Language Model Agents for Socially-Aware Negotiation Dialogues
Yuncheng Hua
Lizhen Qu
Gholamreza Haffari
85
5
0
29 Jan 2024
An Experimental Design Framework for Label-Efficient Supervised
  Finetuning of Large Language Models
An Experimental Design Framework for Label-Efficient Supervised Finetuning of Large Language Models
Gantavya Bhatt
Yifang Chen
Arnav M. Das
Jifan Zhang
Sang T. Truong
...
Jeff Bilmes
S. Du
Kevin G. Jamieson
Jordan T. Ash
Robert D. Nowak
29
14
0
12 Jan 2024
Rethinking the Instruction Quality: LIFT is What You Need
Rethinking the Instruction Quality: LIFT is What You Need
Yang Xu
Yongqiang Yao
Yufan Huang
Mengnan Qi
Maoquan Wang
Bin Gu
Neel Sundaresan
ALM
19
32
0
12 Dec 2023
CoachLM: Automatic Instruction Revisions Improve the Data Quality in LLM
  Instruction Tuning
CoachLM: Automatic Instruction Revisions Improve the Data Quality in LLM Instruction Tuning
Yilun Liu
Shimin Tao
Xiaofeng Zhao
Ming Zhu
Wenbing Ma
...
Min Zhang
Hongxia Ma
Li Zhang
Hao-Yu Yang
Yanfei Jiang
26
10
0
22 Nov 2023
Reflection-Tuning: Data Recycling Improves LLM Instruction-Tuning
Reflection-Tuning: Data Recycling Improves LLM Instruction-Tuning
Ming Li
Lichang Chen
Jiuhai Chen
Shwai He
Heng-Chiao Huang
Jiuxiang Gu
Tianyi Zhou
90
20
0
18 Oct 2023
Training language models to follow instructions with human feedback
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
303
11,881
0
04 Mar 2022
CrossFit: A Few-shot Learning Challenge for Cross-task Generalization in
  NLP
CrossFit: A Few-shot Learning Challenge for Cross-task Generalization in NLP
Qinyuan Ye
Bill Yuchen Lin
Xiang Ren
209
179
0
18 Apr 2021
1