Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2106.03193
Cited By
The FLORES-101 Evaluation Benchmark for Low-Resource and Multilingual Machine Translation
6 June 2021
Naman Goyal
Cynthia Gao
Vishrav Chaudhary
Peng-Jen Chen
Guillaume Wenzek
Da Ju
Sanjan Krishnan
MarcÁurelio Ranzato
Francisco Guzman
Angela Fan
Re-assign community
ArXiv
PDF
HTML
Papers citing
"The FLORES-101 Evaluation Benchmark for Low-Resource and Multilingual Machine Translation"
50 / 81 papers shown
Title
Unveiling Language-Specific Features in Large Language Models via Sparse Autoencoders
Boyi Deng
Yu Wan
Yidan Zhang
Baosong Yang
Fuli Feng
39
0
0
08 May 2025
Bemba Speech Translation: Exploring a Low-Resource African Language
Muhammad Hazim Al Farouq
Aman Kassahun Wassie
Yasmin Moslem
38
0
0
05 May 2025
MultiLoKo: a multilingual local knowledge benchmark for LLMs spanning 31 languages
Dieuwke Hupkes
Nikolay Bogoychev
82
0
0
14 Apr 2025
Large Language Models in Numberland: A Quick Test of Their Numerical Reasoning Abilities
Roussel Rahman
ReLM
ELM
LRM
46
0
0
31 Mar 2025
Whispering in Amharic: Fine-tuning Whisper for Low-resource Language
Dawit Ketema Gete
Bedru Yimam Ahamed
Tadesse Destaw Belay
Yohannes Ayana Ejigu
Sukairaj Hafiz Imam
...
Umma Aliyu Musa
Martin Semmann
Shamsuddeen Hassan Muhammad
Henning Schreiber
Seid Muhie Yimam
43
0
0
24 Mar 2025
Arabizi vs LLMs: Can the Genie Understand the Language of Aladdin?
Perla Al Almaoui
Pierrette Bouillon
Simon Hengchen
57
0
0
28 Feb 2025
NaijaNLP: A Survey of Nigerian Low-Resource Languages
Isa Inuwa-Dutse
37
0
0
27 Feb 2025
How do Multimodal Foundation Models Encode Text and Speech? An Analysis of Cross-Lingual and Cross-Modal Representations
Hyunji Lee
Danni Liu
Supriti Sinhamahapatra
Jan Niehues
103
0
0
21 Feb 2025
Cross-Lingual Transfer of Debiasing and Detoxification in Multilingual LLMs: An Extensive Investigation
Vera Neplenbroek
Arianna Bisazza
Raquel Fernández
100
0
0
17 Feb 2025
DCAD-2000: A Multilingual Dataset across 2000+ Languages with Data Cleaning as Anomaly Detection
Yingli Shen
Wen Lai
Shuo Wang
Xueren Zhang
Kangyang Luo
Alexander M. Fraser
Maosong Sun
47
0
0
17 Feb 2025
BOUQuET: dataset, Benchmark and Open initiative for Universal Quality Evaluation in Translation
Omnilingual MT Team
Pierre Yves Andrews
Mikel Artetxe
Mariano Coria Meglioli
Marta R. Costa-jussá
...
Eduardo Sánchez
Ioannis Tsiamas
Arina Turkatenko
Albert Ventayol-Boada
Shireen Yates
98
0
0
06 Feb 2025
Multilingual Machine Translation with Open Large Language Models at Practical Scale: An Empirical Study
Menglong Cui
Pengzhi Gao
Wei Liu
Jian Luan
Bin Wang
LRM
41
0
0
04 Feb 2025
Towards Multilingual LLM Evaluation for Baltic and Nordic languages: A study on Lithuanian History
Yevhen Kostiuk
O. Vitman
Łukasz Gagała
Artur Kiulian
ELM
95
0
0
17 Jan 2025
MoCE: Adaptive Mixture of Contextualization Experts for Byte-based Neural Machine Translation
Langlin Huang
Mengyu Bu
Yang Feng
21
0
0
03 Nov 2024
Boosting LLM Translation Skills without General Ability Loss via Rationale Distillation
Junhong Wu
Yang Zhao
Yangyifan Xu
Bing Liu
Chengqing Zong
CLL
33
1
0
17 Oct 2024
Effective Self-Mining of In-Context Examples for Unsupervised Machine Translation with LLMs
Abdellah El Mekki
Muhammad Abdul-Mageed
LRM
29
0
0
14 Oct 2024
Adapters for Altering LLM Vocabularies: What Languages Benefit the Most?
HyoJung Han
Akiko Eriguchi
Haoran Xu
Hieu T. Hoang
Marine Carpuat
Huda Khayrallah
VLM
32
2
0
12 Oct 2024
Characterizing and Efficiently Accelerating Multimodal Generation Model Inference
Yejin Lee
Anna Y. Sun
Basil Hosmer
Bilge Acun
Can Balioglu
...
Ram Pasunuru
Scott Yih
Sravya Popuri
Xing Liu
Carole-Jean Wu
50
2
0
30 Sep 2024
EMMeTT: Efficient Multimodal Machine Translation Training
Piotr Żelasko
Zhehuai Chen
Mengru Wang
Daniel Galvez
Oleksii Hrinchuk
Shuoyang Ding
Ke Hu
Jagadeesh Balam
Vitaly Lavrukhin
Boris Ginsburg
25
1
0
20 Sep 2024
Shaping the Future of Endangered and Low-Resource Languages -- Our Role in the Age of LLMs: A Keynote at ECIR 2024
Josiane Mothe
34
2
0
05 Sep 2024
Misfitting With AI: How Blind People Verify and Contest AI Errors
Rahaf Alharbi
P. Lor
Jaylin Herskovitz
S. Schoenebeck
Robin Brewer
29
10
0
13 Aug 2024
Modular Sentence Encoders: Separating Language Specialization from Cross-Lingual Alignment
Yongxin Huang
Kexin Wang
Goran Glavavs
Iryna Gurevych
44
0
0
20 Jul 2024
Too Late to Train, Too Early To Use? A Study on Necessity and Viability of Low-Resource Bengali LLMs
Tamzeed Mahfuz
Satak Kumar Dey
Ruwad Naswan
Hasnaen Adil
Khondker Salman Sayeed
Haz Sameen Shahgir
29
0
0
29 Jun 2024
M2Lingual: Enhancing Multilingual, Multi-Turn Instruction Alignment in Large Language Models
Rishabh Maheshwary
Vikas Yadav
Hoang Nguyen
Khyati Mahajan
Sathwik Tejaswi Madhusudhan
40
3
0
24 Jun 2024
Crosslingual Capabilities and Knowledge Barriers in Multilingual Large Language Models
Lynn Chua
Badih Ghazi
Yangsibo Huang
Pritish Kamath
Ravi Kumar
Pasin Manurangsi
Amer Sinha
Chulin Xie
Chiyuan Zhang
51
1
0
23 Jun 2024
Datasets for Multilingual Answer Sentence Selection
Matteo Gabburo
S. Campese
Federico Agostini
Alessandro Moschitti
26
0
0
14 Jun 2024
IrokoBench: A New Benchmark for African Languages in the Age of Large Language Models
David Ifeoluwa Adelani
Jessica Ojo
Israel Abebe Azime
Jian Yun Zhuang
Jesujoba Oluwadara Alabi
...
Salomey Osei
Sokhar Samb
Tadesse Kebede Guge
Pontus Stenetorp
Pontus Stenetorp
ELM
50
7
0
05 Jun 2024
Critical Learning Periods: Leveraging Early Training Dynamics for Efficient Data Pruning
E. Chimoto
Jay Gala
Orevaoghene Ahia
Julia Kreutzer
Bruce A. Bassett
Sara Hooker
VLM
34
4
0
29 May 2024
Unlocking Parameter-Efficient Fine-Tuning for Low-Resource Language Translation
Tong Su
Xin Peng
Sarubi Thillainathan
David Guzmán
Surangika Ranathunga
En-Shiun Annie Lee
27
2
0
05 Apr 2024
A Systematic Analysis of Subwords and Cross-Lingual Transfer in Multilingual Translation
Francois Meyer
Jan Buys
24
1
0
29 Mar 2024
Where does In-context Translation Happen in Large Language Models
Suzanna Sia
David Mueller
Kevin Duh
LRM
33
0
0
07 Mar 2024
Enhancing Multilingual Capabilities of Large Language Models through Self-Distillation from Resource-Rich Languages
Yuan Zhang
Yile Wang
Zijun Liu
Shuo Wang
Xiaolong Wang
Peng Li
Maosong Sun
Yang Janet Liu
LRM
27
9
0
19 Feb 2024
Gradable ChatGPT Translation Evaluation
Hui Jiao
Bei Peng
Lu Zong
Xiaojun Zhang
Xinwei Li
33
1
0
18 Jan 2024
Machine Translation Models are Zero-Shot Detectors of Translation Direction
Michelle Wastl
Jannis Vamvas
Rico Sennrich
VLM
15
0
0
12 Jan 2024
How Vocabulary Sharing Facilitates Multilingualism in LLaMA?
Fei Yuan
Shuai Yuan
Zhiyong Wu
Lei Li
20
10
0
15 Nov 2023
MELA: Multilingual Evaluation of Linguistic Acceptability
Ziyin Zhang
Yikang Liu
Wei Huang
Junyu Mao
Rui Wang
Hai Hu
22
3
0
15 Nov 2023
A Benchmark for Learning to Translate a New Language from One Grammar Book
Garrett Tanzer
Mirac Suzgun
Chenguang Xi
Dan Jurafsky
Luke Melas-Kyriazi
24
51
0
28 Sep 2023
Towards Effective Disambiguation for Machine Translation with Large Language Models
Vivek Iyer
Pinzhen Chen
Alexandra Birch
17
10
0
20 Sep 2023
Baichuan 2: Open Large-scale Language Models
Ai Ming Yang
Bin Xiao
Bingning Wang
Borong Zhang
Ce Bian
...
Youxin Jiang
Yuchen Gao
Yupeng Zhang
Zenan Zhou
Zhiying Wu
ELM
LRM
61
699
0
19 Sep 2023
Joint Dropout: Improving Generalizability in Low-Resource Neural Machine Translation through Phrase Pair Variables
Ali Araabi
Vlad Niculae
Christof Monz
29
1
0
24 Jul 2023
Do All Languages Cost the Same? Tokenization in the Era of Commercial Language Models
Orevaoghene Ahia
Sachin Kumar
Hila Gonen
Jungo Kasai
David R. Mortensen
Noah A. Smith
Yulia Tsvetkov
29
80
0
23 May 2023
Mitigating Data Imbalance and Representation Degeneration in Multilingual Machine Translation
Wen Lai
Alexandra Chronopoulou
Alexander M. Fraser
30
4
0
22 May 2023
On the Off-Target Problem of Zero-Shot Multilingual Neural Machine Translation
Liang Chen
Shuming Ma
Dongdong Zhang
Furu Wei
Baobao Chang
10
5
0
18 May 2023
Language Model Tokenizers Introduce Unfairness Between Languages
Aleksandar Petrov
Emanuele La Malfa
Philip H. S. Torr
Adel Bibi
16
96
0
17 May 2023
Spaiche: Extending State-of-the-Art ASR Models to Swiss German Dialects
Clément Sicard
Kajetan Pyszkowski
Victor Gillioz
19
7
0
20 Apr 2023
Multilingual Machine Translation with Large Language Models: Empirical Results and Analysis
Wenhao Zhu
Hongyi Liu
Qingxiu Dong
Jingjing Xu
Shujian Huang
Lingpeng Kong
Jiajun Chen
Lei Li
LRM
29
139
0
10 Apr 2023
Approaches to Corpus Creation for Low-Resource Language Technology: the Case of Southern Kurdish and Laki
Sina Ahmadi
Zahra Azin
Sara Belelli
Antonios Anastasopoulos
19
3
0
03 Apr 2023
Hallucinations in Large Multilingual Translation Models
Nuno M. Guerreiro
Duarte M. Alves
Jonas Waldendorf
Barry Haddow
Alexandra Birch
Pierre Colombo
André F.T. Martins
VLM
HILM
LRM
18
140
0
28 Mar 2023
Bilex Rx: Lexical Data Augmentation for Massively Multilingual Machine Translation
Alex Jones
Isaac Caswell
Ishan Saxena
Orhan Firat
21
8
0
27 Mar 2023
The unreasonable effectiveness of few-shot learning for machine translation
Xavier Garcia
Yamini Bansal
Colin Cherry
George F. Foster
M. Krikun
Fan Feng
Melvin Johnson
Orhan Firat
27
102
0
02 Feb 2023
1
2
Next