2412.15285
Cited By
Maximize Your Data's Potential: Enhancing LLM Accuracy with Two-Phase Pretraining
18 December 2024
Steven Feng
Shrimai Prabhumoye
Kezhi Kong
Dan Su
M. Patwary
M. Shoeybi
Bryan Catanzaro
Papers citing "Maximize Your Data's Potential: Enhancing LLM Accuracy with Two-Phase Pretraining"
2 / 2 papers shown
Datasets, Documents, and Repetitions: The Practicalities of Unequal Data Quality
Alex Fang
Hadi Pouransari
Matt Jordan
Alexander Toshev
Vaishaal Shankar
Ludwig Schmidt
Tom Gunter
74
0
0
10 Mar 2025
MIND: Math Informed syNthetic Dialogues for Pretraining LLMs
Syeda Nahida Akter
Shrimai Prabhumoye
John Kamalu
S. Satheesh
Eric Nyberg
M. Patwary
M. Shoeybi
Bryan Catanzaro
LRM
SyDa
ReLM
98
1
0
15 Oct 2024