Hidden Holes: topological aspects of language models
9 June 2024
Stephen Fitz, P. Romero, Jiyan Jonas Schneider
Papers citing "Hidden Holes: topological aspects of language models"
2 / 2 papers shown
Deduplicating Training Data Makes Language Models Better
Katherine Lee, Daphne Ippolito, A. Nystrom, Chiyuan Zhang, Douglas Eck, Chris Callison-Burch, Nicholas Carlini
SyDa
237
590
0
14 Jul 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao, Stella Biderman, Sid Black, Laurence Golding, Travis Hoppe, ..., Horace He, Anish Thite, Noa Nabeshima, Shawn Presser, Connor Leahy
AIMat
248
1,986
0
31 Dec 2020