ResearchTrend.AI

arXiv: 2502.13252
Multilingual Language Model Pretraining using Machine-translated Data

20 February 2025
Jiayi Wang
Yao Lu
Maurice Weber
Max Ryabinin
David Ifeoluwa Adelani
Yihong Chen
Raphael Tang
Pontus Stenetorp
    LRM

Papers citing "Multilingual Language Model Pretraining using Machine-translated Data"

2 / 2 papers shown
Aleph-Alpha-GermanWeb: Improving German-language LLM pre-training with model-based data curation and synthetic data generation
Thomas F Burns
Letitia Parcalabescu
Stephan Wäldchen
Michael Barlow
Gregor Ziegltrum
Volker Stampa
Bastian Harren
Björn Deiseroth
SyDa
24 Apr 2025
Compass-V2 Technical Report
Sophia Maria
MoE
LRM
22 Apr 2025