mT5: A massively multilingual pre-trained text-to-text transformer

22 October 2020
Linting Xue
Noah Constant
Adam Roberts
Mihir Kale
Rami Al-Rfou
Aditya Siddhant
Aditya Barua
Colin Raffel
arXiv: 2010.11934

Papers citing "mT5: A massively multilingual pre-trained text-to-text transformer"

50 / 1,561 papers shown
Large Dual Encoders Are Generalizable Retrievers
Jianmo Ni
Chen Qu
Jing Lu
Zhuyun Dai
Gustavo Hernández Ábrego
...
Vincent Zhao
Yi Luan
Keith B. Hall
Ming-Wei Chang
Yinfei Yang
DML
541
551
0
15 Dec 2021
WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models
Benjamin Minixhofer
Fabian Paischer
Navid Rekabsaz
295
103
0
13 Dec 2021
Dependency Learning for Legal Judgment Prediction with a Unified Text-to-Text Transformer
Yunyun Huang
Xiaoyu Shen
Chuanyi Li
Jidong Ge
B. Luo
AILaw
154
24
0
13 Dec 2021
Zero-Shot Cross-Lingual Machine Reading Comprehension via Inter-sentence Dependency Graph
Liyan Xu
Xuchao Zhang
Bo Zong
Yanchi Liu
Wei Cheng
Jingchao Ni
Haifeng Chen
Bo Pan
Jinho Choi
272
4
0
01 Dec 2021
NER-BERT: A Pre-trained Model for Low-Resource Entity Tagging
Zihan Liu
Feijun Jiang
Yuxiang Hu
Chen Shi
Pascale Fung
288
42
0
01 Dec 2021
PSG: Prompt-based Sequence Generation for Acronym Extraction
Bin Li
Fei Xia
Yi-Zhong Weng
Xiusheng Huang
Bin Sun
Shutao Li
226
7
0
29 Nov 2021
Less is More: Generating Grounded Navigation Instructions from Landmarks
Su Wang
Ceslee Montgomery
Jordi Orbay
Vighnesh Birodkar
Aleksandra Faust
Izzeddin Gur
Natasha Jaques
Austin Waters
Jason Baldridge
Peter Anderson
369
78
0
25 Nov 2021
Sparse is Enough in Scaling Transformers
Sebastian Jaszczur
Aakanksha Chowdhery
Afroz Mohiuddin
Lukasz Kaiser
Wojciech Gajewski
Henryk Michalewski
Jonni Kanerva
MoE
143
120
0
24 Nov 2021
Knowledge Enhanced Sports Game Summarization
Jiaan Wang
Zhixu Li
Tingyi Zhang
Duo Zheng
Jianfeng Qu
An Liu
Lei Zhao
Zhigang Chen
AI4TS
104
14
0
24 Nov 2021
DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing
Pengcheng He
Jianfeng Gao
Weizhu Chen
754
1,554
0
18 Nov 2021
XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale
Arun Babu
Changhan Wang
Andros Tjandra
Kushal Lakhotia
Qiantong Xu
...
Yatharth Saraf
J. Pino
Alexei Baevski
Alexis Conneau
Michael Auli
SSL
378
897
0
17 Nov 2021
Induce, Edit, Retrieve: Language Grounded Multimodal Schema for Instructional Video Retrieval
Yue Yang
Joongwon Kim
Artemis Panagopoulou
Mark Yatskar
Chris Callison-Burch
LM&Ro
217
14
0
17 Nov 2021
LiT: Zero-Shot Transfer with Locked-image text Tuning
Computer Vision and Pattern Recognition (CVPR), 2021
Xiaohua Zhai
Tianlin Li
Basil Mustafa
Andreas Steiner
Daniel Keysers
Alexander Kolesnikov
Lucas Beyer
VLM
612
659
0
15 Nov 2021
Calculating Question Similarity is Enough: A New Method for KBQA Tasks
Hanyu Zhao
Shaoqing Yuan
Jiahong Leng
X. Pan
Guoqiang Wang
Ledell Wu
Jie Tang
140
0
0
15 Nov 2021
Automated question generation and question answering from Turkish texts
Fatih Çagatay Akyön
Ali Devrim Ekin Çavusoglu
Cemil Cengiz
S. Altinuc
A. Temizel
239
16
0
11 Nov 2021
A Survey on Green Deep Learning
Jingjing Xu
Wangchunshu Zhou
Zhiyi Fu
Hao Zhou
Lei Li
VLM
435
92
0
08 Nov 2021
Personalized Benchmarking with the Ludwig Benchmarking Toolkit
A. Narayan
Piero Molino
Karan Goel
Willie Neiswanger
Christopher Ré
163
11
0
08 Nov 2021
MT3: Multi-Task Multitrack Music Transcription
International Conference on Learning Representations (ICLR), 2021
Josh Gardner
Ian Simon
Ethan Manilow
Curtis Hawthorne
Jesse Engel
450
118
0
04 Nov 2021
ÚFAL at MultiLexNorm 2021: Improving Multilingual Lexical Normalization by Fine-tuning ByT5
David Samuel
Milan Straka
142
17
0
28 Oct 2021
SLAM: A Unified Encoder for Speech and Language Modeling via Speech-Text Joint Pre-Training
Ankur Bapna
Yu-An Chung
Na Wu
Anmol Gulati
Ye Jia
J. Clark
Melvin Johnson
Jason Riesa
Alexis Conneau
Yu Zhang
VLM
254
100
0
20 Oct 2021
Predicting the Performance of Multilingual NLP Models
A. Srinivasan
Sunayana Sitaram
T. Ganu
Sandipan Dandapat
Kalika Bali
Monojit Choudhury
LRM
117
34
0
17 Oct 2021
Towards Making the Most of Multilingual Pretraining for Zero-Shot Neural Machine Translation
Guanhua Chen
Shuming Ma
Yun-Nung Chen
Dongdong Zhang
Jia Pan
Wenping Wang
Furu Wei
LRM
153
17
0
16 Oct 2021
Sharpness-Aware Minimization Improves Language Model Generalization
Dara Bahri
H. Mobahi
Yi Tay
415
116
0
16 Oct 2021
Prix-LM: Pretraining for Multilingual Knowledge Base Construction
Wenxuan Zhou
Fangyu Liu
Ivan Vulić
Nigel Collier
Muhao Chen
KELM
227
21
0
16 Oct 2021
EncT5: A Framework for Fine-tuning T5 as Non-autoregressive Models
Frederick Liu
T. Huang
Shihang Lyu
Siamak Shakeri
Hongkun Yu
Jing Li
244
10
0
16 Oct 2021
Multilingual unsupervised sequence segmentation transfers to extremely low-resource languages
C.M. Downey
Shannon Drizin
Levon Haroutunian
Shivin Thukral
129
2
0
16 Oct 2021
Tricks for Training Sparse Translation Models
Dheeru Dua
Shruti Bhosale
Vedanuj Goswami
James Cross
M. Lewis
Angela Fan
MoE
310
23
0
15 Oct 2021
Why don't people use character-level machine translation?
Jindrich Libovický
Helmut Schmid
Kangyang Luo
275
34
0
15 Oct 2021
GlobalWoZ: Globalizing MultiWoZ to Develop Multilingual Task-Oriented Dialogue Systems
Bosheng Ding
Junjie Hu
Lidong Bing
Sharifah Aljunied Mahani
Shafiq Joty
Luo Si
Chunyan Miao
305
42
0
14 Oct 2021
Few-shot Controllable Style Transfer for Low-Resource Multilingual Settings
Kalpesh Krishna
Deepak Nathani
Xavier Garcia
Bidisha Samanta
Partha P. Talukdar
173
26
0
14 Oct 2021
Cross-Lingual Open-Domain Question Answering with Answer Sentence Generation
Benjamin Muller
Luca Soldaini
Rik Koncel-Kedziorski
Eric Lind
Alessandro Moschitti
LRM
179
10
0
14 Oct 2021
Bandits Don't Follow Rules: Balancing Multi-Facet Machine Translation with Multi-Armed Bandits
Julia Kreutzer
David Vilar
Artem Sokolov
193
18
0
13 Oct 2021
MSP: Multi-Stage Prompting for Making Pre-trained Language Models Better Translators
Zhixing Tan
Xiangwen Zhang
Shuo Wang
Yang Liu
VLM LRM
487
57
0
13 Oct 2021
Learning Compact Metrics for MT
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Amy Pu
Hyung Won Chung
Ankur P. Parikh
Sebastian Gehrmann
Thibault Sellam
159
112
0
12 Oct 2021
Småprat: DialoGPT for Natural Language Generation of Swedish Dialogue by Transfer Learning
Tosin Adewumi
Rickard Brannvall
Nosheen Abid
M. Pahlavan
Sana Sabah Sabry
F. Liwicki
Marcus Liwicki
96
21
0
12 Oct 2021
LaoPLM: Pre-trained Language Models for Lao
International Conference on Language Resources and Evaluation (LREC), 2021
Hongyan Wu
Yingwen Fu
Chuwei Chen
Ziyu Yang
Shengyi Jiang
267
3
0
12 Oct 2021
Balancing Average and Worst-case Accuracy in Multitask Learning
Paul Michel
Sebastian Ruder
Dani Yogatama
193
13
0
12 Oct 2021
Unsupervised Neural Machine Translation with Generative Language Models Only
Jesse Michael Han
Igor Babuschkin
Harrison Edwards
Arvind Neelakantan
Tao Xu
...
Alex Ray
Pranav Shyam
Aditya A. Ramesh
Alec Radford
Ilya Sutskever
248
40
0
11 Oct 2021
VieSum: How Robust Are Transformer-based Models on Vietnamese Summarization?
Hieu Duy Nguyen
Long Phan
J. Anibal
Alec Peltekian
H. Tran
123
5
0
08 Oct 2021
M6-10T: A Sharing-Delinking Paradigm for Efficient Multi-Trillion Parameter Pretraining
Junyang Lin
An Yang
Jinze Bai
Chang Zhou
Le Jiang
...
Jie Zhang
Yong Li
Jialin Li
Jingren Zhou
Hongxia Yang
MoE
310
46
0
08 Oct 2021
Sequential Reptile: Inter-Task Gradient Alignment for Multilingual Learning
Seanie Lee
Haebeom Lee
Juho Lee
Sung Ju Hwang
MoMe CLL
340
19
0
06 Oct 2021
Compositional generalization in semantic parsing with pretrained transformers
A. Orhan
205
8
0
30 Sep 2021
EdinSaar@WMT21: North-Germanic Low-Resource Multilingual NMT
Svetlana Tchistiakova
Jesujoba Oluwadara Alabi
Koel Dutta Chowdhury
Sourav Dutta
Dana Ruiter
VLM
133
6
0
29 Sep 2021
Cross-Lingual Language Model Meta-Pretraining
Zewen Chi
Heyan Huang
Luyang Liu
Yu Bai
Xian-Ling Mao
LRM
370
0
0
23 Sep 2021
Scalable and Efficient MoE Training for Multitask Multilingual Models
Young Jin Kim
A. A. Awan
Alexandre Muzio
Andres Felipe Cruz Salinas
Liyang Lu
Amr Hendy
Samyam Rajbhandari
Yuxiong He
Hany Awadalla
MoE
239
98
0
22 Sep 2021
Multilingual Document-Level Translation Enables Zero-Shot Transfer From Sentences to Documents
Biao Zhang
Ankur Bapna
Melvin Johnson
A. Dabirmoghaddam
N. Arivazhagan
Orhan Firat
136
15
0
21 Sep 2021
BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese
Nguyen Luong Tran
Duong Minh Le
Dat Quoc Nguyen
293
64
0
20 Sep 2021
Enforcing fairness in private federated learning via the modified method of differential multipliers
Borja Rodríguez Gálvez
Filip Granqvist
Rogier van Dalen
M. Seigel
FedML
199
58
0
17 Sep 2021
Language Models are Few-shot Multilingual Learners
Genta Indra Winata
Andrea Madotto
Mohammad Kachuee
Rosanne Liu
J. Yosinski
Pascale Fung
ELM LRM
195
153
0
16 Sep 2021
Cross-lingual Transfer of Monolingual Models
Evangelia Gogoulou
Ariel Ekgren
T. Isbister
Magnus Sahlgren
237
20
0
15 Sep 2021