arXiv:2407.13657
FuLG: 150B Romanian Corpus for Language Model Pretraining
18 July 2024
Vlad-Andrei Bădoiu
Mihai-Valentin Dumitru
Alexandru M. Gherghescu
Alexandru Agache
C. Raiciu
Papers citing "FuLG: 150B Romanian Corpus for Language Model Pretraining"
2 / 2 papers shown
Title
OLMo: Accelerating the Science of Language Models
Dirk Groeneveld
Iz Beltagy
Pete Walsh
Akshita Bhagia
Rodney Michael Kinney
...
Jesse Dodge
Kyle Lo
Luca Soldaini
Noah A. Smith
Hanna Hajishirzi
OSLM
130
349
0
01 Feb 2024
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
245
1,977
0
31 Dec 2020