2505.00022
Cited By
Aleph-Alpha-GermanWeb: Improving German-language LLM pre-training with model-based data curation and synthetic data generation
24 April 2025
Thomas F Burns
Letitia Parcalabescu
Stephan Wäldchen
Michael Barlow
Gregor Ziegltrum
Volker Stampa
Bastian Harren
Björn Deiseroth
SyDa
Papers citing
"Aleph-Alpha-GermanWeb: Improving German-language LLM pre-training with model-based data curation and synthetic data generation"
No papers