2402.10430
Cited By
Smaller Language Models are capable of selecting Instruction-Tuning Training Data for Larger Language Models
16 February 2024
Dheeraj Mekala
Alex Nguyen
Jingbo Shang
ALM
Papers citing
"Smaller Language Models are capable of selecting Instruction-Tuning Training Data for Larger Language Models"
5 / 5 papers shown
The Best Instruction-Tuning Data are Those That Fit
Dylan Zhang
Qirun Dai
Hao Peng
ALM
115
3
0
06 Feb 2025
DFlow: Diverse Dialogue Flow Simulation with Large Language Models
Wanyu Du
Song Feng
James Gung
Lijia Sun
Yi Zhang
Saab Mansour
Yanjun Qi
47
0
0
18 Oct 2024
RegMix: Data Mixture as Regression for Language Model Pre-training
Qian Liu
Xiaosen Zheng
Niklas Muennighoff
Guangtao Zeng
Longxu Dou
Tianyu Pang
Jing Jiang
Min-Bin Lin
MoE
67
36
1
01 Jul 2024
Concept-skill Transferability-based Data Selection for Large Vision-Language Models
Jaewoo Lee
Boyang Li
Sung Ju Hwang
VLM
33
8
0
16 Jun 2024
Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models
Zachary Ankner
Cody Blakeney
Kartik K. Sreenivasan
Max Marion
Matthew L. Leavitt
Mansheej Paul
30
23
0
30 May 2024