ResearchTrend.AI
Data Mixing Laws: Optimizing Data Mixtures by Predicting Language Modeling Performance


25 March 2024
Jiasheng Ye
Peiju Liu
Tianxiang Sun
Yunhua Zhou
Jun Zhan
Xipeng Qiu

Papers citing "Data Mixing Laws: Optimizing Data Mixtures by Predicting Language Modeling Performance"

50 / 53 papers shown
R&B: Domain Regrouping and Data Mixture Balancing for Efficient Foundation Model Training
Albert Ge
Tzu-Heng Huang
John Cooper
Avi Trost
Ziyi Chu
Satya Sai Srinath Namburi GNVV
Ziyang Cai
Kendall Park
Nicholas Roberts
Frederic Sala
47
0
0
01 May 2025
QuaDMix: Quality-Diversity Balanced Data Selection for Efficient LLM Pretraining
Fengze Liu
Weidong Zhou
Binbin Liu
Zhimiao Yu
Yifan Zhang
...
Yifeng Yu
Bingni Zhang
Xiaohuan Zhou
Taifeng Wang
Yong Cao
55
0
0
23 Apr 2025
Phoenix: A Motion-based Self-Reflection Framework for Fine-grained Robotic Action Correction
Wenke Xia
Ruoxuan Feng
Dong Wang
Di Hu
25
0
0
20 Apr 2025
Meta-rater: A Multi-dimensional Data Selection Method for Pre-training Language Models
Xinlin Zhuang
Jiahui Peng
Ren Ma
Y. Wang
Tianyi Bai
Xingjian Wei
Jiantao Qiu
Chi Zhang
Ying Qian
Conghui He
36
0
0
19 Apr 2025
DataDecide: How to Predict Best Pretraining Data with Small Experiments
Ian H. Magnusson
Nguyen Tai
Ben Bogin
David Heineman
Jena D. Hwang
...
Dirk Groeneveld
Oyvind Tafjord
Noah A. Smith
Pang Wei Koh
Jesse Dodge
ALM
20
0
0
15 Apr 2025
OVERLORD: Ultimate Scaling of DataLoader for Multi-Source Large Foundation Model Training
Juntao Zhao
Qi Lu
Wei Jia
Borui Wan
Lei Zuo
...
Y. Hu
Yanghua Peng
H. Lin
Xin Liu
Chuan Wu
AI4CE
29
0
0
14 Apr 2025
From Fairness to Truthfulness: Rethinking Data Valuation Design
Dongyang Fan
Tyler J. Rotello
Sai Praneeth Karimireddy
TDI
41
0
0
07 Apr 2025
DALIP: Distribution Alignment-based Language-Image Pre-Training for Domain-Specific Data
Junjie Wu
Jiangtao Xie
Zhaolin Zhang
Qilong Wang
Q. Hu
P. Li
Sen Xu
VLM
34
0
0
02 Apr 2025
ToReMi: Topic-Aware Data Reweighting for Dynamic Pre-Training Data Selection
Xiaoxuan Zhu
Zhouhong Gu
Baiqian Wu
Suhang Zheng
Tao Wang
Tianyu Li
Hongwei Feng
Yanghua Xiao
40
0
0
01 Apr 2025
Data Mixture Optimization: A Multi-fidelity Multi-scale Bayesian Framework
Thomson Yen
Andrew Siah
Haozhe Chen
Tianyi Peng
Daniel Guetta
Hongseok Namkoong
40
0
0
26 Mar 2025
A Multi-Power Law for Loss Curve Prediction Across Learning Rate Schedules
Kairong Luo
Haodong Wen
Shengding Hu
Zhenbo Sun
Zhiyuan Liu
Maosong Sun
Kaifeng Lyu
Wenguang Chen
CLL
49
0
0
17 Mar 2025
Compute Optimal Scaling of Skills: Knowledge vs Reasoning
Nicholas Roberts
Niladri S. Chatterji
Sharan Narang
Mike Lewis
Dieuwke Hupkes
46
2
0
13 Mar 2025
Teaching LMMs for Image Quality Scoring and Interpreting
Zicheng Zhang
H. Wu
Ziheng Jia
Weisi Lin
Guangtao Zhai
55
1
0
12 Mar 2025
Statistical Deficiency for Task Inclusion Estimation
Loïc Fosse
Frédéric Béchet
Benoit Favre
Géraldine Damnati
Gwénolé Lecorvé
Maxime Darrin
Philippe Formont
Pablo Piantanida
55
0
0
07 Mar 2025
Not-Just-Scaling Laws: Towards a Better Understanding of the Downstream Impact of Language Model Design Decisions
E. Liu
Amanda Bertsch
Lintang Sutawika
Lindia Tjuatja
Patrick Fernandes
...
S.
Carolin (Haas) Lawrence
Aditi Raghunathan
Kiril Gashteovski
Graham Neubig
52
0
0
05 Mar 2025
SampleMix: A Sample-wise Pre-training Data Mixing Strategey by Coordinating Data Quality and Diversity
Xiangyu Xi
Deyang Kong
Jian Yang
Jiawei Yang
Z. Chen
Wei Wang
J. T. Wang
Xunliang Cai
Shikun Zhang
Wei Ye
57
0
0
03 Mar 2025
Mixtera: A Data Plane for Foundation Model Training
Maximilian Böther
Xiaozhe Yao
Tolga Kerimoglu
Viktor Gsteiger
Ana Klimovic
MoE
64
0
0
27 Feb 2025
Unsupervised Topic Models are Data Mixers for Pre-training Language Models
Jiahui Peng
Xinlin Zhuang
Qiu Jiantao
Ren Ma
Jing Yu
Tianyi Bai
Conghui He
28
0
0
24 Feb 2025
Optimizing Pre-Training Data Mixtures with Mixtures of Data Expert Models
Lior Belenki
Alekh Agarwal
Tianze Shi
Kristina Toutanova
MoE
41
0
0
21 Feb 2025
How to Upscale Neural Networks with Scaling Law? A Survey and Practical Guidelines
Ayan Sengupta
Yash Goel
Tanmoy Chakraborty
36
0
0
17 Feb 2025
PiKE: Adaptive Data Mixing for Multi-Task Learning Under Low Gradient Conflicts
Zeman Li
Yuan Deng
Peilin Zhong
Meisam Razaviyayn
Vahab Mirrokni
MoMe
69
1
0
10 Feb 2025
Foundations of GenIR
Qingyao Ai
Jingtao Zhan
Y. Liu
40
0
0
06 Jan 2025
The interplay between domain specialization and model size
Roseval Malaquias Junior
Ramon Pires
Thales Sales Almeida
Kenzo Sakiyama
R. Romero
R. Nogueira
41
0
0
03 Jan 2025
Loss-to-Loss Prediction: Scaling Laws for All Datasets
David Brandfonbrener
Nikhil Anand
Nikhil Vyas
Eran Malach
Sham Kakade
72
2
0
19 Nov 2024
LLM-R: A Framework for Domain-Adaptive Maintenance Scheme Generation Combining Hierarchical Agents and RAG
Laifa Tao
Qixuan Huang
Xianjun Wu
Weiwei Zhang
Yunlong Wu
Bin Li
Chen Lu
Xingshuo Hai
38
0
0
07 Nov 2024
MiniPLM: Knowledge Distillation for Pre-Training Language Models
Yuxian Gu
Hao Zhou
Fandong Meng
Jie Zhou
Minlie Huang
48
5
0
22 Oct 2024
Scalable Data Ablation Approximations for Language Models through Modular Training and Merging
Clara Na
Ian H. Magnusson
A. Jha
Tom Sherborne
Emma Strubell
Jesse Dodge
Pradeep Dasigi
MoMe
31
4
0
21 Oct 2024
Towards Neural Scaling Laws for Time Series Foundation Models
Qingren Yao
Chao-Han Huck Yang
Renhe Jiang
Yuxuan Liang
Ming Jin
Shirui Pan
AI4TS
AI4CE
32
6
0
16 Oct 2024
Adaptive Data Optimization: Dynamic Sample Selection with Scaling Laws
Yiding Jiang
Allan Zhou
Zhili Feng
Sadhika Malladi
J. Zico Kolter
25
15
0
15 Oct 2024
Scaling Laws for Predicting Downstream Performance in LLMs
Yangyi Chen
Binxuan Huang
Yifan Gao
Zhengyang Wang
Jingfeng Yang
Heng Ji
LRM
41
7
0
11 Oct 2024
Multi-Agent Collaborative Data Selection for Efficient LLM Pretraining
Tianyi Bai
Ling Yang
Zhen Hao Wong
Jiahui Peng
Xinlin Zhuang
...
Lijun Wu
Jiantao Qiu
Wentao Zhang
Binhang Yuan
Conghui He
LLMAG
23
1
0
10 Oct 2024
Extracting and Transferring Abilities For Building Multi-lingual Ability-enhanced Large Language Models
Zhipeng Chen
Liang Song
K. Zhou
Wayne Xin Zhao
B. Wang
Weipeng Chen
Ji-Rong Wen
57
0
0
10 Oct 2024
Cookbook: A framework for improving LLM generative abilities via programmatic data generating templates
A. Narayan
Mayee F. Chen
Kush S. Bhatia
Christopher Ré
SyDa
31
3
0
07 Oct 2024
Upsample or Upweight? Balanced Training on Heavily Imbalanced Datasets
Tianjian Li
Haoran Xu
Weiting Tan
Kenton Murray
Daniel Khashabi
35
1
0
06 Oct 2024
Data Proportion Detection for Optimized Data Management for Large Language Models
Hao Liang
Keshi Zhao
Yajie Yang
Bin Cui
Guosheng Dong
Zenan Zhou
Wentao Zhang
26
0
0
26 Sep 2024
Archon: An Architecture Search Framework for Inference-Time Techniques
Jon Saad-Falcon
Adrian Gamarra Lafuente
Shlok Natarajan
Nahum Maru
Hristo Todorov
...
E. Kelly Buchanan
Mayee Chen
Neel Guha
Christopher Ré
Azalia Mirhoseini
AI4CE
21
12
0
23 Sep 2024
AgentMonitor: A Plug-and-Play Framework for Predictive and Secure Multi-Agent Systems
Chi-Min Chan
Jianxuan Yu
Weize Chen
Chunyang Jiang
Xinyu Liu
Weijie Shi
Zhiyuan Liu
Wei Xue
Yike Guo
LLMAG
36
0
0
27 Aug 2024
Are Bigger Encoders Always Better in Vision Large Models?
Bozhou Li
Hao Liang
Zimo Meng
Wentao Zhang
VLM
27
3
0
01 Aug 2024
CMR Scaling Law: Predicting Critical Mixture Ratios for Continual Pre-training of Language Models
Jiawei Gu
Zacc Yang
Chuanghao Ding
Rui Zhao
Fei Tan
CLL
34
3
0
24 Jul 2024
DDK: Distilling Domain Knowledge for Efficient Large Language Models
Jiaheng Liu
Chenchen Zhang
Jinyang Guo
Yuanxing Zhang
Haoran Que
...
Congnan Liu
Wenbo Su
Jiamang Wang
Lin Qu
Bo Zheng
36
3
0
23 Jul 2024
RegMix: Data Mixture as Regression for Language Model Pre-training
Qian Liu
Xiaosen Zheng
Niklas Muennighoff
Guangtao Zeng
Longxu Dou
Tianyu Pang
Jing Jiang
Min-Bin Lin
MoE
45
34
1
01 Jul 2024
D-CPT Law: Domain-specific Continual Pre-Training Scaling Law for Large Language Models
Haoran Que
Jiaheng Liu
Ge Zhang
Chenchen Zhang
Xingwei Qu
...
Jie Fu
Wenbo Su
Jiamang Wang
Lin Qu
Bo Zheng
CLL
30
11
0
03 Jun 2024
A Survey of Multimodal Large Language Model from A Data-centric Perspective
Tianyi Bai
Hao Liang
Binwang Wan
Yanran Xu
Xi Li
...
Ping-Chia Huang
Jiulong Shan
Conghui He
Binhang Yuan
Wentao Zhang
44
31
0
26 May 2024
360Zhinao Technical Report
360Zhinao Team
26
0
0
22 May 2024
MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies
Shengding Hu
Yuge Tu
Xu Han
Chaoqun He
Ganqu Cui
...
Chaochao Jia
Guoyang Zeng
Dahai Li
Zhiyuan Liu
Maosong Sun
MoE
29
275
0
09 Apr 2024
Towards Optimal Learning of Language Models
Yuxian Gu
Li Dong
Y. Hao
Qingxiu Dong
Minlie Huang
Furu Wei
28
7
0
27 Feb 2024
Scaling laws for learning with real and surrogate data
Ayush Jain
Andrea Montanari
Eren Sasoglu
30
11
0
06 Feb 2024
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
DeepSeek-AI
Xiao Bi
Deli Chen
Guanting Chen
...
Yao Zhao
Shangyan Zhou
Shunfeng Zhou
Qihao Zhu
Yuheng Zou
LRM
ALM
133
298
0
05 Jan 2024
Don't Make Your LLM an Evaluation Benchmark Cheater
Kun Zhou
Yutao Zhu
Zhipeng Chen
Wentong Chen
Wayne Xin Zhao
Xu Chen
Yankai Lin
Ji-Rong Wen
Jiawei Han
ELM
99
136
0
03 Nov 2023
Revisiting Neural Scaling Laws in Language and Vision
Ibrahim M. Alabdulmohsin
Behnam Neyshabur
Xiaohua Zhai
145
101
0
13 Sep 2022