2504.12427
Cited By
Position: The Most Expensive Part of an LLM should be its Training Data
16 April 2025
Nikhil Kandpal
Colin Raffel
Papers citing "Position: The Most Expensive Part of an LLM should be its Training Data"
KnowRL: Teaching Language Models to Know What They Know
Sahil Kale
Devendra Singh Dhami
KELM
13 Oct 2025
Pushing LLMs to Their Logical Reasoning Bound: The Role of Data Reasoning Intensity
Zhen Bi
Zhenlin Hu
Jinnan Yang
Mingyang Chen
Cheng Deng
...
Qing Shen
Zhenfang Liu
Kang Zhao
Ningyu Zhang
Jungang Lou
LRM
29 Sep 2025
Common Corpus: The Largest Collection of Ethical Data for LLM Pre-Training
Pierre-Carl Langlais
Carlos Rosas Hinostroza
Mattia Nee
Catherine Arnett
Pavel Chizhov
Eliot Jones
Irène Girard
David Mach
Anastasia Stasenko
Ivan P. Yamshchikov
AILaw
02 Jun 2025