Scaling Laws for Multilingual Neural Machine Translation

International Conference on Machine Learning (ICML), 2023

19 February 2023


Papers citing "Scaling Laws for Multilingual Neural Machine Translation"

27 / 27 papers shown

Encoder-Decoder or Decoder-Only? Revisiting Encoder-Decoder Large Language Model

185

30 Oct 2025

ATLAS: Adaptive Transfer Scaling Laws for Multilingual Pretraining, Finetuning, and Decoding the Curse of Multilinguality

218

24 Oct 2025

ModernVBERT: Towards Smaller Visual Document Retrievers

398

01 Oct 2025

Model Merging Scaling Laws in Large Language Models

393

29 Sep 2025

Exploring Polyglot Harmony: On Multilingual Data Allocation for Large Language Models Pretraining

178

19 Sep 2025

OLMoASR: Open Models and Data for Training Robust Speech Recognition Models

217

28 Aug 2025

Efficient Scaling for LLM-based ASR

289

06 Aug 2025

MergeBench: A Benchmark for Merging Domain-Specialized LLMs

800

16 May 2025

Scaling Laws for Conditional Emergence of Multilingual Image Captioning via Generalization from Translation

610

12 Mar 2025

(Mis)Fitting: A Survey of Scaling Laws

Margaret Li

Sneha Kudugunta

Luke Zettlemoyer

480

26 Feb 2025

Scaling Laws for Multilingual Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

292

15 Oct 2024

Scaling Optimal LR Across Token Horizons

International Conference on Learning Representations (ICLR), 2024

Xia Song

607

30 Sep 2024

EuroLLM: Multilingual Language Models for Europe

Pedro Henrique Martins

Patrick Fernandes

...

Alexandra Birch

André F. T. Martins

327

24 Sep 2024

Do Neural Scaling Laws Exist on Graph Self-Supervised Learning?

Learning on Graphs Conference (LoG), 2024

337

20 Aug 2024

Reconciling Kaplan and Chinchilla Scaling Laws

Tim Pearce

Jinyeop Song

445

12 Jun 2024

LexMatcher: Dictionary-centric Data Collection for LLM-based Machine Translation

Yue Zhang

512

03 Jun 2024

When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method

351

262

27 Feb 2024

Scaling Laws for Downstream Task Performance of Large Language Models

International Conference on Learning Representations (ICLR), 2024

383

06 Feb 2024

Selecting Large Language Model to Fine-tune via Rectified Scaling Law

Sujian Li

Xiaojun Wan

388

04 Feb 2024

CroissantLLM: A Truly Bilingual French-English Language Model

...

797

01 Feb 2024

The Universal Statistical Structure and Scaling Laws of Chaos and Turbulence

Noam Levi

Yaron Oz

AI4CE

310

02 Nov 2023

A Benchmark for Learning to Translate a New Language from One Grammar Book

International Conference on Learning Representations (ICLR), 2023

Dan Jurafsky

370

28 Sep 2023

The Underlying Scaling Laws and Universal Statistical Structure of Complex Datasets

Noam Levi

Yaron Oz

454

26 Jun 2023

Multilingual Large Language Models Are Not (Yet) Code-Switchers

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Ruochen Zhang

Samuel Cahyawijaya

Jan Christian Blaise Cruz

Genta Indra Winata

Alham Fikri Aji

LRM

1.5K

23 May 2023

When Does Monolingual Data Help Multilingual Translation: The Role of Domain and Model Scale

North American Chapter of the Association for Computational Linguistics (NAACL), 2023

445

23 May 2023

On the Pareto Front of Multilingual Neural Machine Translation

Neural Information Processing Systems (NeurIPS), 2023

408

06 Apr 2023

Causes and Cures for Interference in Multilingual Translation

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

364

14 Dec 2022