ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2010.11934
  4. Cited By
mT5: A massively multilingual pre-trained text-to-text transformer

mT5: A massively multilingual pre-trained text-to-text transformer

22 October 2020
Linting Xue
Noah Constant
Adam Roberts
Mihir Kale
Rami Al-Rfou
Aditya Siddhant
Aditya Barua
Colin Raffel
ArXivPDFHTML

Papers citing "mT5: A massively multilingual pre-trained text-to-text transformer"

50 / 358 papers shown
Title
GanLM: Encoder-Decoder Pre-training with an Auxiliary Discriminator
GanLM: Encoder-Decoder Pre-training with an Auxiliary Discriminator
Jian Yang
Shuming Ma
Li Dong
Shaohan Huang
Haoyang Huang
Yuwei Yin
Dongdong Zhang
Liqun Yang
Furu Wei
Zhoujun Li
SyDa
AI4CE
32
25
0
20 Dec 2022
LR-Sum: Summarization for Less-Resourced Languages
LR-Sum: Summarization for Less-Resourced Languages
Chester Palen-Michel
Constantine Lignos
9
4
0
19 Dec 2022
Latent Diffusion for Language Generation
Latent Diffusion for Language Generation
Justin Lovelace
Varsha Kishore
Chao-gang Wan
Eliot Shekhtman
Kilian Q. Weinberger
DiffM
24
71
0
19 Dec 2022
Towards leveraging latent knowledge and Dialogue context for real-world
  conversational question answering
Towards leveraging latent knowledge and Dialogue context for real-world conversational question answering
Shaomu Tan
Denis Paperno
RALM
21
0
0
17 Dec 2022
Improving Cross-task Generalization of Unified Table-to-text Models with
  Compositional Task Configurations
Improving Cross-task Generalization of Unified Table-to-text Models with Compositional Task Configurations
Jifan Chen
Yuhao Zhang
Lan Liu
Rui Dong
Xinchi Chen
Patrick K. L. Ng
William Yang Wang
Zhiheng Huang
AI4CE
17
4
0
17 Dec 2022
DAMP: Doubly Aligned Multilingual Parser for Task-Oriented Dialogue
DAMP: Doubly Aligned Multilingual Parser for Task-Oriented Dialogue
William B. Held
Christopher Hidey
Fei Liu
Eric Zhu
Rahul Goel
Diyi Yang
Rushin Shah
13
0
0
15 Dec 2022
Multi-VALUE: A Framework for Cross-Dialectal English NLP
Multi-VALUE: A Framework for Cross-Dialectal English NLP
Caleb Ziems
William B. Held
Jingfeng Yang
Jwala Dhamala
Rahul Gupta
Diyi Yang
27
40
0
15 Dec 2022
Advancing Multilingual Pre-training: TRIP Triangular Document-level
  Pre-training for Multilingual Language Models
Advancing Multilingual Pre-training: TRIP Triangular Document-level Pre-training for Multilingual Language Models
Hongyuan Lu
Haoyang Huang
Shuming Ma
Dongdong Zhang
W. Lam
Furu Wei
22
4
0
15 Dec 2022
Fixing MoE Over-Fitting on Low-Resource Languages in Multilingual
  Machine Translation
Fixing MoE Over-Fitting on Low-Resource Languages in Multilingual Machine Translation
Maha Elbayad
Anna Y. Sun
Shruti Bhosale
MoE
46
8
0
15 Dec 2022
ERNIE-Code: Beyond English-Centric Cross-lingual Pretraining for
  Programming Languages
ERNIE-Code: Beyond English-Centric Cross-lingual Pretraining for Programming Languages
Yekun Chai
Shuohuan Wang
Chao Pang
Yu Sun
Hao Tian
Hua-Hong Wu
24
35
0
13 Dec 2022
A Survey on Natural Language Processing for Programming
A Survey on Natural Language Processing for Programming
Qingfu Zhu
Xianzhen Luo
Fang Liu
Cuiyun Gao
Wanxiang Che
23
1
0
12 Dec 2022
MiLMo:Minority Multilingual Pre-trained Language Model
MiLMo:Minority Multilingual Pre-trained Language Model
Sisi Liu
Hanru Shi
Xinhe Yu
Wugedele Bao
Yuan Sun
Xiaobing Zhao
11
0
0
04 Dec 2022
Languages You Know Influence Those You Learn: Impact of Language
  Characteristics on Multi-Lingual Text-to-Text Transfer
Languages You Know Influence Those You Learn: Impact of Language Characteristics on Multi-Lingual Text-to-Text Transfer
Benjamin Muller
Deepanshu Gupta
Siddharth Patwardhan
J. Fauconnier
David Vandyke
Sachin Agarwal
27
5
0
04 Dec 2022
Nonparametric Masked Language Modeling
Nonparametric Masked Language Modeling
Sewon Min
Weijia Shi
M. Lewis
Xilun Chen
Wen-tau Yih
Hannaneh Hajishirzi
Luke Zettlemoyer
RALM
40
48
0
02 Dec 2022
Extending the Subwording Model of Multilingual Pretrained Models for New
  Languages
Extending the Subwording Model of Multilingual Pretrained Models for New Languages
K. Imamura
Eiichiro Sumita
VLM
19
3
0
29 Nov 2022
Beyond Counting Datasets: A Survey of Multilingual Dataset Construction
  and Necessary Resources
Beyond Counting Datasets: A Survey of Multilingual Dataset Construction and Necessary Resources
Xinyan Velocity Yu
Akari Asai
Trina Chatterjee
Junjie Hu
Eunsol Choi
16
21
0
28 Nov 2022
Breaking the Representation Bottleneck of Chinese Characters: Neural
  Machine Translation with Stroke Sequence Modeling
Breaking the Representation Bottleneck of Chinese Characters: Neural Machine Translation with Stroke Sequence Modeling
Zhijun Wang
Xuebo Liu
Min Zhang
25
11
0
23 Nov 2022
GLAMI-1M: A Multilingual Image-Text Fashion Dataset
GLAMI-1M: A Multilingual Image-Text Fashion Dataset
Vaclav Kosar
A. Hoskovec
Milan Šulc
Radek Bartyzal
VLM
22
3
0
17 Nov 2022
Unified Question Answering in Slovene
Unified Question Answering in Slovene
Katja Logar
Marko Robnik-Šikonja
14
0
0
16 Nov 2022
A Benchmark and Dataset for Post-OCR text correction in Sanskrit
A Benchmark and Dataset for Post-OCR text correction in Sanskrit
Ayush Maheshwari
Nikhil Singh
Amrith Krishna
Ganesh Ramakrishnan
23
12
0
15 Nov 2022
mOKB6: A Multilingual Open Knowledge Base Completion Benchmark
mOKB6: A Multilingual Open Knowledge Base Completion Benchmark
Shubham Mittal
Keshav Kolluru
Soumen Chakrabarti
Mausam
25
4
0
13 Nov 2022
DiaASQ : A Benchmark of Conversational Aspect-based Sentiment Quadruple
  Analysis
DiaASQ : A Benchmark of Conversational Aspect-based Sentiment Quadruple Analysis
Bobo Li
Hao Fei
Fei Li
Yu-hao Wu
Jinsong Zhang
...
Jingye Li
Yijiang Liu
Lizi Liao
Tat-Seng Chua
Donghong Ji
34
41
0
10 Nov 2022
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BigScience Workshop
:
Teven Le Scao
Angela Fan
Christopher Akiki
...
Zhongli Xie
Zifan Ye
M. Bras
Younes Belkada
Thomas Wolf
VLM
89
2,301
0
09 Nov 2022
Reranking Overgenerated Responses for End-to-End Task-Oriented Dialogue
  Systems
Reranking Overgenerated Responses for End-to-End Task-Oriented Dialogue Systems
Songbo Hu
Ivan Vulić
Fangyu Liu
Anna Korhonen
30
0
0
07 Nov 2022
Dialect-robust Evaluation of Generated Text
Dialect-robust Evaluation of Generated Text
Jiao Sun
Thibault Sellam
Elizabeth Clark
Tu Vu
Timothy Dozat
Dan Garrette
Aditya Siddhant
Jacob Eisenstein
Sebastian Gehrmann
13
19
0
02 Nov 2022
Two-stage LLM Fine-tuning with Less Specialization and More
  Generalization
Two-stage LLM Fine-tuning with Less Specialization and More Generalization
Yihan Wang
Si Si
Daliang Li
Michal Lukasik
Felix X. Yu
Cho-Jui Hsieh
Inderjit S Dhillon
Sanjiv Kumar
34
29
0
01 Nov 2022
Too Brittle To Touch: Comparing the Stability of Quantization and
  Distillation Towards Developing Lightweight Low-Resource MT Models
Too Brittle To Touch: Comparing the Stability of Quantization and Distillation Towards Developing Lightweight Low-Resource MT Models
Harshita Diddee
Sandipan Dandapat
Monojit Choudhury
T. Ganu
Kalika Bali
27
5
0
27 Oct 2022
Weakly Supervised Data Augmentation Through Prompting for Dialogue
  Understanding
Weakly Supervised Data Augmentation Through Prompting for Dialogue Understanding
Maximillian Chen
Alexandros Papangelis
Chenyang Tao
Andrew Rosenbaum
Seokhwan Kim
Yang Liu
Zhou Yu
Dilek Z. Hakkani-Tür
39
32
0
25 Oct 2022
IDK-MRC: Unanswerable Questions for Indonesian Machine Reading
  Comprehension
IDK-MRC: Unanswerable Questions for Indonesian Machine Reading Comprehension
Rifki Afina Putri
Alice H. Oh
26
9
0
25 Oct 2022
EUR-Lex-Sum: A Multi- and Cross-lingual Dataset for Long-form
  Summarization in the Legal Domain
EUR-Lex-Sum: A Multi- and Cross-lingual Dataset for Long-form Summarization in the Legal Domain
Dennis Aumiller
Ashish Chouhan
Michael Gertz
ELM
AILaw
32
35
0
24 Oct 2022
RuCoLA: Russian Corpus of Linguistic Acceptability
RuCoLA: Russian Corpus of Linguistic Acceptability
Vladislav Mikhailov
T. Shamardina
Max Ryabinin
A. Pestova
I. Smurov
Ekaterina Artemova
22
28
0
23 Oct 2022
Graphemic Normalization of the Perso-Arabic Script
Graphemic Normalization of the Perso-Arabic Script
R. Doctor
Alexander Gutkin
Cibu Johny
Brian Roark
R. Sproat
36
4
0
21 Oct 2022
Gui at MixMT 2022 : English-Hinglish: An MT approach for translation of
  code mixed data
Gui at MixMT 2022 : English-Hinglish: An MT approach for translation of code mixed data
Akshat Gahoi
Jayant Duneja
Anshul Padhi
Shivam Mangale
Saransh Rajput
Tanvi Kamble
D. Sharma
Vasudeva Varma
25
3
0
21 Oct 2022
SIT at MixMT 2022: Fluent Translation Built on Giant Pre-trained Models
SIT at MixMT 2022: Fluent Translation Built on Giant Pre-trained Models
A. Khan
Hrishikesh Kanade
G. Budhrani
Preet Jhanglani
Jia Xu
71
2
0
21 Oct 2022
Maestro-U: Leveraging joint speech-text representation learning for zero
  supervised speech ASR
Maestro-U: Leveraging joint speech-text representation learning for zero supervised speech ASR
Zhehuai Chen
Ankur Bapna
Andrew Rosenberg
Yu Zhang
Bhuvana Ramabhadran
Pedro J. Moreno
Nanxin Chen
30
17
0
18 Oct 2022
Tone prediction and orthographic conversion for Basaa
Tone prediction and orthographic conversion for Basaa
I. Nikitin
Brian O'Connor
Anastasia N. Safonova
8
1
0
13 Oct 2022
Knowledge Distillation Transfer Sets and their Impact on Downstream NLU
  Tasks
Knowledge Distillation Transfer Sets and their Impact on Downstream NLU Tasks
Charith Peris
Lizhen Tan
Thomas Gueudré
Turan Gojayev
Vivi Wei
Gokmen Oz
22
4
0
10 Oct 2022
Comparing Computational Architectures for Automated Journalism
Comparing Computational Architectures for Automated Journalism
Yan Sym
João Gabriel Moura Campos
M. M. José
Fabio Gagliardi Cozman
26
0
0
08 Oct 2022
Generative Language Models for Paragraph-Level Question Generation
Generative Language Models for Paragraph-Level Question Generation
Asahi Ushio
Fernando Alva-Manchego
Jose Camacho-Collados
ELM
11
45
0
08 Oct 2022
Event Extraction: A Survey
Event Extraction: A Survey
Viet Dac Lai
15
9
0
07 Oct 2022
A New Path: Scaling Vision-and-Language Navigation with Synthetic
  Instructions and Imitation Learning
A New Path: Scaling Vision-and-Language Navigation with Synthetic Instructions and Imitation Learning
Aishwarya Kamath
Peter Anderson
Su Wang
Jing Yu Koh
Alexander Ku
Austin Waters
Yinfei Yang
Jason Baldridge
Zarana Parekh
LM&Ro
20
45
0
06 Oct 2022
Language Models are Multilingual Chain-of-Thought Reasoners
Language Models are Multilingual Chain-of-Thought Reasoners
Freda Shi
Mirac Suzgun
Markus Freitag
Xuezhi Wang
Suraj Srivats
...
Yi Tay
Sebastian Ruder
Denny Zhou
Dipanjan Das
Jason W. Wei
ReLM
LRM
170
325
0
06 Oct 2022
XDoc: Unified Pre-training for Cross-Format Document Understanding
XDoc: Unified Pre-training for Cross-Format Document Understanding
Jingye Chen
Tengchao Lv
Lei Cui
Changrong Zhang
Furu Wei
48
13
0
06 Oct 2022
Honest Students from Untrusted Teachers: Learning an Interpretable
  Question-Answering Pipeline from a Pretrained Language Model
Honest Students from Untrusted Teachers: Learning an Interpretable Question-Answering Pipeline from a Pretrained Language Model
Jacob Eisenstein
D. Andor
Bernd Bohnet
Michael Collins
David M. Mimno
LRM
187
24
0
05 Oct 2022
GROOT: Corrective Reward Optimization for Generative Sequential Labeling
GROOT: Corrective Reward Optimization for Generative Sequential Labeling
Kazuma Hashimoto
K. Raman
VLM
11
1
0
29 Sep 2022
COMPILING: A Benchmark Dataset for Chinese Complexity Controllable
  Definition Generation
COMPILING: A Benchmark Dataset for Chinese Complexity Controllable Definition Generation
Jiaxin Yuan
Cunliang Kong
Chenhui Xie
Liner Yang
Erhong Yang
22
4
0
29 Sep 2022
Bidirectional Language Models Are Also Few-shot Learners
Bidirectional Language Models Are Also Few-shot Learners
Ajay Patel
Bryan Li
Mohammad Sadegh Rasooli
Noah Constant
Colin Raffel
Chris Callison-Burch
LRM
62
45
0
29 Sep 2022
An Empirical Study on Cross-X Transfer for Legal Judgment Prediction
An Empirical Study on Cross-X Transfer for Legal Judgment Prediction
Joel Niklaus
Matthias Sturmer
Ilias Chalkidis
ELM
AILaw
30
18
0
25 Sep 2022
MonoByte: A Pool of Monolingual Byte-level Language Models
MonoByte: A Pool of Monolingual Byte-level Language Models
Hugo Queiroz Abonizio
Leandro Rodrigues de Souza
R. Lotufo
Rodrigo Nogueira
23
1
0
22 Sep 2022
A Benchmark for Understanding and Generating Dialogue between Characters
  in Stories
A Benchmark for Understanding and Generating Dialogue between Characters in Stories
Jianzhu Yao
Ziqi Liu
Jian-Yu Guan
Minlie Huang
14
1
0
18 Sep 2022
Previous
12345678
Next