arXiv: 2302.09650
Scaling Laws for Multilingual Neural Machine Translation

International Conference on Machine Learning (ICML), 2023
19 February 2023
Patrick Fernandes
Behrooz Ghorbani
Xavier Garcia
Markus Freitag
Orhan Firat

Papers citing "Scaling Laws for Multilingual Neural Machine Translation"

27 / 27 papers shown
Encoder-Decoder or Decoder-Only? Revisiting Encoder-Decoder Large Language Model
Biao Zhang
Yong Cheng
Siamak Shakeri
Xinyi Wang
Min Ma
Orhan Firat
185
2
0
30 Oct 2025
ATLAS: Adaptive Transfer Scaling Laws for Multilingual Pretraining, Finetuning, and Decoding the Curse of Multilinguality
Shayne Longpre
Sneha Kudugunta
Niklas Muennighoff
I-Hung Hsu
Isaac Caswell
Alex Pentland
Sercan O. Arik
Chen-Yu Lee
Sayna Ebrahimi
CLL
LRM
218
6
0
24 Oct 2025
ModernVBERT: Towards Smaller Visual Document Retrievers
Paul Teiletche
Quentin Macé
Max Conti
António Loison
Gautier Viaud
Pierre Colombo
Manuel Faysse
VLM
398
12
0
01 Oct 2025
Model Merging Scaling Laws in Large Language Models
Yuanyi Wang
Yanggan Gu
Yiming Zhang
Qi Zhou
Zhaoyi Yan
C. Xie
X. Wang
Jianbo Yuan
Hongxia Yang
MoMe
393
2
0
29 Sep 2025
Exploring Polyglot Harmony: On Multilingual Data Allocation for Large Language Models Pretraining
Ping Guo
Y. Ren
Binbin Liu
Fengze Liu
Haobin Lin
Yifan Zhang
Bingni Zhang
Taifeng Wang
Yin Zheng
178
1
0
19 Sep 2025
OLMoASR: Open Models and Data for Training Robust Speech Recognition Models
Huong Ngo
Matt Deitke
Martijn Bartelds
Sarah M Pratt
Josh Gardner
Matt Jordan
Ludwig Schmidt
217
5
0
28 Aug 2025
Efficient Scaling for LLM-based ASR
Bingshen Mu
Yiwen Shao
Kun Wei
Dong Yu
Lei Xie
AuLLM
289
9
0
06 Aug 2025
MergeBench: A Benchmark for Merging Domain-Specialized LLMs
Yifei He
Siqi Zeng
Yuzheng Hu
Rui Yang
Tong Zhang
Han Zhao
MoMe
ALM
800
15
0
16 May 2025
Scaling Laws for Conditional Emergence of Multilingual Image Captioning via Generalization from Translation
Julian Spravil
Sebastian Houben
Sven Behnke
VLM
610
0
0
12 Mar 2025
(Mis)Fitting: A Survey of Scaling Laws
Margaret Li
Sneha Kudugunta
Luke Zettlemoyer
480
14
0
26 Feb 2025
Scaling Laws for Multilingual Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Yifei He
Alon Benhaim
Barun Patra
Praneetha Vaddamanu
Sanchit Ahuja
Parul Chopra
Vishrav Chaudhary
Han Zhao
Xia Song
292
19
0
15 Oct 2024
Scaling Optimal LR Across Token Horizons
International Conference on Learning Representations (ICLR), 2024
Johan Bjorck
Alon Benhaim
Vishrav Chaudhary
Furu Wei
Xia Song
607
23
0
30 Sep 2024
EuroLLM: Multilingual Language Models for Europe
Pedro Henrique Martins
Patrick Fernandes
Joao Alves
Nuno M. Guerreiro
Ricardo Rei
...
Pierre Colombo
Barry Haddow
José G. C. de Souza
Alexandra Birch
André F. T. Martins
327
94
0
24 Sep 2024
Do Neural Scaling Laws Exist on Graph Self-Supervised Learning?
Learning on Graphs Conference (LoG), 2024
Qian Ma
Haitao Mao
Jingzhe Liu
Zhehua Zhang
Chunlin Feng
Yu Song
Yihan Shao
Yao Ma
337
4
0
20 Aug 2024
Reconciling Kaplan and Chinchilla Scaling Laws
Tim Pearce
Jinyeop Song
445
27
0
12 Jun 2024
LexMatcher: Dictionary-centric Data Collection for LLM-based Machine Translation
Yongjing Yin
Jiali Zeng
Yafu Li
Fandong Meng
Yue Zhang
512
3
0
03 Jun 2024
When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method
Biao Zhang
Zhongtao Liu
Colin Cherry
Orhan Firat
LRM
351
262
0
27 Feb 2024
Scaling Laws for Downstream Task Performance of Large Language Models
International Conference on Learning Representations (ICLR), 2024
Berivan Isik
Natalia Ponomareva
Hussein Hazimeh
Dimitris Paparas
Sergei Vassilvitskii
Sanmi Koyejo
383
52
0
06 Feb 2024
Selecting Large Language Model to Fine-tune via Rectified Scaling Law
Haowei Lin
Baizhou Huang
Haotian Ye
Qinyu Chen
Zihao Wang
Sujian Li
Jianzhu Ma
Xiaojun Wan
James Zou
Yitao Liang
388
30
0
04 Feb 2024
CroissantLLM: A Truly Bilingual French-English Language Model
Manuel Faysse
Patrick Fernandes
Nuno M. Guerreiro
António Loison
Duarte M. Alves
...
François Yvon
André F.T. Martins
Gautier Viaud
Céline Hudelot
Pierre Colombo
797
56
0
01 Feb 2024
The Universal Statistical Structure and Scaling Laws of Chaos and Turbulence
Noam Levi
Yaron Oz
AI4CE
310
2
0
02 Nov 2023
A Benchmark for Learning to Translate a New Language from One Grammar Book
International Conference on Learning Representations (ICLR), 2023
Garrett Tanzer
Mirac Suzgun
Chenguang Xi
Dan Jurafsky
Luke Melas-Kyriazi
370
94
0
28 Sep 2023
The Underlying Scaling Laws and Universal Statistical Structure of Complex Datasets
Noam Levi
Yaron Oz
454
11
0
26 Jun 2023
Multilingual Large Language Models Are Not (Yet) Code-Switchers
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Ruochen Zhang
Samuel Cahyawijaya
Jan Christian Blaise Cruz
Genta Indra Winata
Alham Fikri Aji
LRM
1.5K
88
0
23 May 2023
When Does Monolingual Data Help Multilingual Translation: The Role of Domain and Model Scale
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Christos Baziotis
Biao Zhang
Alexandra Birch
Barry Haddow
445
2
0
23 May 2023
On the Pareto Front of Multilingual Neural Machine Translation
Neural Information Processing Systems (NeurIPS), 2023
Liang Chen
Shuming Ma
Dongdong Zhang
Furu Wei
Baobao Chang
MoE
408
8
0
06 Apr 2023
Causes and Cures for Interference in Multilingual Translation
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Uri Shaham
Maha Elbayad
Vedanuj Goswami
Omer Levy
Shruti Bhosale
364
32
0
14 Dec 2022