2410.07064
Cited By
Data Selection via Optimal Control for Language Models
9 October 2024
Yuxian Gu, Li Dong, Hongning Wang, Y. Hao, Qingxiu Dong, Furu Wei, Minlie Huang
AI4CE
Papers citing "Data Selection via Optimal Control for Language Models"
3 / 3 papers shown
Title
QuaDMix: Quality-Diversity Balanced Data Selection for Efficient LLM Pretraining
Fengze Liu, Weidong Zhou, Binbin Liu, Zhimiao Yu, Yifan Zhang, ..., Yifeng Yu, Bingni Zhang, Xiaohuan Zhou, Taifeng Wang, Yong Cao
52
0
0
23 Apr 2025
MASS: Mathematical Data Selection via Skill Graphs for Pretraining Large Language Models
J. Li, Lu Yu, Qing Cui, Zhiqiang Zhang, Jun Zhou, Yanfang Ye, Chuxu Zhang
56
0
0
19 Mar 2025
MiniPLM: Knowledge Distillation for Pre-Training Language Models
Yuxian Gu, Hao Zhou, Fandong Meng, Jie Zhou, Minlie Huang
46
5
0
22 Oct 2024