Nemotron-CC-Math: A 133 Billion-Token-Scale High Quality Math Pretraining Dataset (arXiv:2508.15096)
20 August 2025
Rabeeh Karimi Mahabadi, S. Satheesh, Shrimai Prabhumoye, M. Patwary, Mohammad Shoeybi, Bryan Catanzaro

Papers citing "Nemotron-CC-Math: A 133 Billion-Token-Scale High Quality Math Pretraining Dataset" (3 of 3 papers shown)

Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence
Sean McLeish, Ang Li, John Kirchenbauer, Dayal Singh Kalra, Brian Bartoldson, B. Kailkhura, Avi Schwarzschild, Jonas Geiping, Tom Goldstein, Micah Goldblum
10 Nov 2025

SPICE: Self-Play In Corpus Environments Improves Reasoning
Bo Liu, Chuanyang Jin, Seungone Kim, Weizhe Yuan, Wenting Zhao, Ilia Kulikov, Xian Li, Sainbayar Sukhbaatar, Jack Lanchantin, Jason Weston
Tags: ReLM, LRM
28 Oct 2025

NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model
Nvidia, Aarti Basant, Abhijit Khairnar, Abhijit Paithankar, Abhinav Khattar, ..., Keith Wyss, Keshav Santhanam, Kezhi Kong, Krzysztof Pawelec, Kumar Anik
Tags: LRM
20 Aug 2025