ResearchTrend.AI
arXiv: 2311.15653
MoDS: Model-oriented Data Selection for Instruction Tuning

27 November 2023
Qianlong Du
Chengqing Zong
Jiajun Zhang
    ALM

Papers citing "MoDS: Model-oriented Data Selection for Instruction Tuning"

50 / 57 papers shown
Title
Text2Cypher: Data Pruning using Hard Example Selection
Makbule Gulcin Ozsoy
AAML
41
0
0
08 May 2025
ICon: In-Context Contribution for Automatic Data Selection
Yixin Yang
Qingxiu Dong
Linli Yao
Fangwei Zhu
Zhifang Sui
41
0
0
08 May 2025
DONOD: Robust and Generalizable Instruction Fine-Tuning for LLMs via Model-Intrinsic Dataset Pruning
Jucheng Hu
Simon X. Yang
Dongzhan Zhou
Lijun Wu
24
0
0
21 Apr 2025
ToolACE-R: Tool Learning with Adaptive Self-Refinement
Xingshan Zeng
W. Liu
X. Huang
Zezhong Wang
Lingzhi Wang
...
Y. Wang
Lifeng Shang
Xin Jiang
Ruiming Tang
Q. Liu
CLL
50
0
0
02 Apr 2025
Pay More Attention to the Robustness of Prompt for Instruction Data Mining
Qiang Wang
Dawei Feng
Xu Zhang
Ao Shen
Yang Xu
Bo Ding
H. Wang
AAML
41
0
0
31 Mar 2025
Towards Automatic Continual Learning: A Self-Adaptive Framework for Continual Instruction Tuning
Peiyi Lin
Fukai Zhang
Kai Niu
Hao Fu
CLL
64
0
0
20 Mar 2025
D3: Diversity, Difficulty, and Dependability-Aware Data Selection for Sample-Efficient LLM Instruction Tuning
Jia Zhang
Chen-Xi Zhang
Yao Liu
Yi-Xuan Jin
Xiao-Wen Yang
Bo Zheng
Y. Liu
Lan-Zhe Guo
47
2
0
14 Mar 2025
Large-Scale Data Selection for Instruction Tuning
Hamish Ivison
Muru Zhang
Faeze Brahman
Pang Wei Koh
Pradeep Dasigi
ALM
71
1
0
03 Mar 2025
Beyond QA Pairs: Assessing Parameter-Efficient Fine-Tuning for Fact Embedding in LLMs
Shivam Ratnakar
Abhiroop Talasila
Raghav Chamadiya
Nikhil Agarwal
Vinayak K Doifode
ALM
48
1
0
03 Mar 2025
Advancing MAPF towards the Real World: A Scalable Multi-Agent Realistic Testbed (SMART)
Jingtian Yan
Zhifei Li
William Kang
Yulun Zhang
Stephen Smith
Jiaoyang Li
43
0
0
03 Mar 2025
MathClean: A Benchmark for Synthetic Mathematical Data Cleaning
Hao Liang
Meiyi Qiang
Y. Li
Zefeng He
Yongzhen Guo
Z. Zhu
Wentao Zhang
Bin Cui
33
0
0
26 Feb 2025
SAE-V: Interpreting Multimodal Models for Enhanced Alignment
Hantao Lou
Changye Li
Jiaming Ji
Yaodong Yang
40
1
0
22 Feb 2025
The Best Instruction-Tuning Data are Those That Fit
Dylan Zhang
Qirun Dai
Hao Peng
ALM
115
3
0
06 Feb 2025
Unleashing the Power of Data Tsunami: A Comprehensive Survey on Data Assessment and Selection for Instruction Tuning of Language Models
Yulei Qin
Yuncheng Yang
Pengcheng Guo
Gang Li
Hang Shao
Yuchen Shi
Zihan Xu
Yun Gu
Ke Li
Xing Sun
ALM
88
11
0
31 Dec 2024
Preference-Oriented Supervised Fine-Tuning: Favoring Target Model Over Aligned Large Language Models
Yuchen Fan
Yuzhong Hong
Qiushi Wang
Junwei Bao
Hongfei Jiang
Yang Song
67
1
0
17 Dec 2024
ROSE: A Reward-Oriented Data Selection Framework for LLM Task-Specific Instruction Tuning
Yang Wu
Huayi Zhang
Yizheng Jiao
Lin Ma
Xiaozhong Liu
Jinhong Yu
Dongyu Zhang
Dezhi Yu
Wei Xu
73
1
0
01 Dec 2024
Seed-Free Synthetic Data Generation Framework for Instruction-Tuning LLMs: A Case Study in Thai
Parinthapat Pengpun
Can Udomcharoenchaikit
Weerayut Buaphet
Peerat Limkonchotiwat
SyDa
83
2
0
23 Nov 2024
Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering
Xinyan Guan
Yanjiang Liu
Xinyu Lu
Boxi Cao
Ben He
...
Le Sun
Jie Lou
Bowen Yu
Y. Lu
Hongyu Lin
ALM
79
2
0
18 Nov 2024
Compound-QA: A Benchmark for Evaluating LLMs on Compound Questions
Yutao Hou
Yajing Luo
Zhiwen Ruan
H. Wang
Weifeng Ge
Y. Chen
Guanhua Chen
ELM
38
0
0
15 Nov 2024
EVQAScore: A Fine-grained Metric for Video Question Answering Data Quality Evaluation
Hao Liang
Zirong Chen
W. Zhang
Wentao Zhang
31
0
0
11 Nov 2024
DELIFT: Data Efficient Language model Instruction Fine Tuning
Ishika Agarwal
Krishnateja Killamsetty
Lucian Popa
Marina Danilevsky
ALM
VLM
46
2
0
07 Nov 2024
MDCure: A Scalable Pipeline for Multi-Document Instruction-Following
Gabrielle Kaili-May Liu
Bowen Shi
Avi Caciularu
Idan Szpektor
Arman Cohan
58
3
0
30 Oct 2024
Data Quality Control in Federated Instruction-tuning of Large Language Models
Yaxin Du
Rui Ye
Fengting Yuchi
W. Zhao
Jingjing Qu
Y. Wang
Siheng Chen
ALM
FedML
45
0
0
15 Oct 2024
3DS: Decomposed Difficulty Data Selection's Case Study on LLM Medical Domain Adaptation
Hongxin Ding
Yue Fang
Runchuan Zhu
Xinke Jiang
Jinyang Zhang
Yongxin Xu
Xu Chu
Junfeng Zhao
Yasha Wang
28
0
0
13 Oct 2024
Rethinking Data Selection at Scale: Random Selection is Almost All You Need
Tingyu Xia
Bowen Yu
K. Dang
An Yang
Yuan Wu
Yuan Tian
Yi-Ju Chang
Junyang Lin
ALM
49
3
0
12 Oct 2024
Data Selection via Optimal Control for Language Models
Yuxian Gu
Li Dong
Hongning Wang
Y. Hao
Qingxiu Dong
Furu Wei
Minlie Huang
AI4CE
48
4
0
09 Oct 2024
DecorateLM: Data Engineering through Corpus Rating, Tagging, and Editing with Language Models
Ranchi Zhao
Zhen Leng Thai
Yifan Zhang
Shengding Hu
Yunqi Ba
Jie Zhou
Jie Cai
Zhiyuan Liu
Maosong Sun
23
1
0
08 Oct 2024
HyperINF: Unleashing the HyperPower of the Schulz's Method for Data Influence Estimation
Xinyu Zhou
Simin Fan
Martin Jaggi
TDI
20
0
0
07 Oct 2024
Data Proportion Detection for Optimized Data Management for Large Language Models
Hao Liang
Keshi Zhao
Yajie Yang
Bin Cui
Guosheng Dong
Zenan Zhou
Wentao Zhang
31
0
0
26 Sep 2024
What is the Role of Small Models in the LLM Era: A Survey
Lihu Chen
Gaël Varoquaux
ALM
58
23
0
10 Sep 2024
Leveraging Open Knowledge for Advancing Task Expertise in Large Language Models
Yuncheng Yang
Yulei Qin
Tong Wu
Zihan Xu
Gang Li
...
Yuchen Shi
Ke Li
Xing Sun
Jie Yang
Yun Gu
ALM
OffRL
MoE
46
0
0
28 Aug 2024
I-SHEEP: Self-Alignment of LLM from Scratch through an Iterative Self-Enhancement Paradigm
Yiming Liang
Ge Zhang
Xingwei Qu
Tianyu Zheng
Jiawei Guo
...
Jiaheng Liu
Chenghua Lin
Lei Ma
Wenhao Huang
Jiajun Zhang
ALM
43
5
0
15 Aug 2024
FANNO: Augmenting High-Quality Instruction Data with Open-Sourced LLMs Only
He Zhu
Junyou Su
Tianle Lun
Yicheng Tao
Wenjia Zhang
Zipei Fan
Guanhua Chen
ALM
23
2
0
02 Aug 2024
Synth-Empathy: Towards High-Quality Synthetic Empathy Data
Hao Liang
Linzhuang Sun
Jingxuan Wei
Xijie Huang
Linkun Sun
Bihui Yu
Conghui He
Wentao Zhang
SyDa
35
4
0
31 Jul 2024
SynthVLM: High-Efficiency and High-Quality Synthetic Data for Vision Language Models
Zheng Liu
Hao Liang
Xijie Huang
Wentao Xiong
Qinhan Yu
Linzhuang Sun
Chong Chen
Conghui He
Bin Cui
Wentao Zhang
SyDa
39
0
0
30 Jul 2024
Bailicai: A Domain-Optimized Retrieval-Augmented Generation Framework for Medical Applications
Cui Long
Yongbin Liu
Chunping Ouyang
Ying Yu
32
3
0
24 Jul 2024
Entropy Law: The Story Behind Data Compression and LLM Performance
Mingjia Yin
Chuhan Wu
Yufei Wang
Hao Wang
Wei Guo
Yasheng Wang
Y. Liu
Ruiming Tang
Defu Lian
Enhong Chen
37
19
0
09 Jul 2024
PAS: Data-Efficient Plug-and-Play Prompt Augmentation System
Miao Zheng
H. Liang
Fan Yang
Haoze Sun
Tianpeng Li
...
Kun Fang
Weipeng Chen
Bin Cui
Wentao Zhang
Zenan Zhou
RALM
37
3
0
08 Jul 2024
KeyVideoLLM: Towards Large-scale Video Keyframe Selection
Hao Liang
Jiapeng Li
Tianyi Bai
Xijie Huang
Linzhuang Sun
Zhengren Wang
Conghui He
Bin Cui
Chong Chen
Wentao Zhang
VGen
27
7
0
03 Jul 2024
Efficient-Empathy: Towards Efficient and Effective Selection of Empathy Data
Linzhuang Sun
Hao Liang
Jingxuan Wei
Linkun Sun
Bihui Yu
Bin Cui
Wentao Zhang
27
1
0
02 Jul 2024
Retrieval Augmented Instruction Tuning for Open NER with Large Language Models
Tingyu Xie
Jian Zhang
Yan Zhang
Yuanyuan Liang
Qi Li
Hongwei Wang
RALM
29
0
0
25 Jun 2024
Take the essence and discard the dross: A Rethinking on Data Selection for Fine-Tuning Large Language Models
Ziche Liu
Rui Ke
Feng Jiang
Haizhou Li
61
1
0
20 Jun 2024
Concept-skill Transferability-based Data Selection for Large Vision-Language Models
Jaewoo Lee
Boyang Li
Sung Ju Hwang
VLM
33
8
0
16 Jun 2024
A Survey of Multimodal Large Language Model from A Data-centric Perspective
Tianyi Bai
Hao Liang
Binwang Wan
Yanran Xu
Xi Li
...
Ping-Chia Huang
Jiulong Shan
Conghui He
Binhang Yuan
Wentao Zhang
47
36
0
26 May 2024
G-DIG: Towards Gradient-based Diverse and High-quality Instruction Data Selection for Machine Translation
Xingyuan Pan
Luyang Huang
Liyan Kang
Zhicheng Liu
Yu Lu
Shanbo Cheng
ALM
35
11
0
21 May 2024
SHED: Shapley-Based Automated Dataset Refinement for Instruction Fine-Tuning
Yexiao He
Ziyao Wang
Zheyu Shen
Guoheng Sun
Yucong Dai
Yongkai Wu
Hongyi Wang
Ang Li
26
11
0
23 Apr 2024
Exploring the Mystery of Influential Data for Mathematical Reasoning
Xinzhe Ni
Yeyun Gong
Zhibin Gou
Yelong Shen
Yujiu Yang
Nan Duan
Weizhu Chen
34
10
0
01 Apr 2024
PlanGPT: Enhancing Urban Planning with Tailored Language Model and Efficient Retrieval
He Zhu
Wenjia Zhang
Nuoxian Huang
Boyang Li
Luyao Niu
...
Yicheng Tao
Junyou Su
Zhaoya Gong
Chenyu Fang
Xing Liu
LLMAG
45
7
0
29 Feb 2024
Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation
Yuan Ge
Yilun Liu
Chi Hu
Weibin Meng
Shimin Tao
Xiaofeng Zhao
Hongxia Ma
Li Zhang
Hao Yang
Tong Xiao
ALM
27
24
0
28 Feb 2024
A Survey on Knowledge Distillation of Large Language Models
Xiaohan Xu
Ming Li
Chongyang Tao
Tao Shen
Reynold Cheng
Jinyang Li
Can Xu
Dacheng Tao
Tianyi Zhou
KELM
VLM
42
98
0
20 Feb 2024