arXiv: 2010.11934
mT5: A massively multilingual pre-trained text-to-text transformer

22 October 2020
Linting Xue
Noah Constant
Adam Roberts
Mihir Kale
Rami Al-Rfou
Aditya Siddhant
Aditya Barua
Colin Raffel

Papers citing "mT5: A massively multilingual pre-trained text-to-text transformer"

50 / 1,561 papers shown
BERnaT: Basque Encoders for Representing Natural Textual Diversity
Ekhi Azurmendi
Joseba Fernandez de Landa
Jaione Bengoetxea
Maite Heredia
Julen Etxaniz
Mikel Zubillaga
Ander Soraluze
A. Soroa
20
0
0
03 Dec 2025
InstanceV: Instance-Level Video Generation
Yuheng Chen
Teng Hu
Jiangning Zhang
Zhucun Xue
Ran Yi
Lizhuang Ma
DiffM
VGen
84
0
0
28 Nov 2025
Softmax Transformers are Turing-Complete
Hongjian Jiang
Michael Hahn
Georg Zetzsche
Anthony Widjaja Lin
LRM
153
0
0
25 Nov 2025
ArbESC+: Arabic Enhanced Edit Selection System Combination for Grammatical Error Correction Resolving conflict and improving system combination in Arabic GEC
Ahlam Alrehili
Areej Alhothali
KELM
132
0
0
18 Nov 2025
iSeal: Encrypted Fingerprinting for Reliable LLM Ownership Verification
Zixun Xiong
Gaoyi Wu
Qingyang Yu
Mingyu Derek Ma
Lingfeng Yao
Miao Pan
Xiaojiang Du
Hao Wang
142
0
0
12 Nov 2025
Introducing A Bangla Sentence - Gloss Pair Dataset for Bangla Sign Language Translation and Research
Neelavro Saha
Rafi Shahriyar
Nafis Ashraf Roudra
Saadman Sakib
Annajiat Alim Rasel
128
1
0
11 Nov 2025
AraFinNews: Arabic Financial Summarisation with Domain-Adapted LLMs
Mo El-Haj
Paul Rayson
AIFin
430
0
0
03 Nov 2025
HPLT 3.0: Very Large-Scale Multilingual Resources for LLM and MT. Mono- and Bi-lingual Data, Multilingual Evaluation, and Pre-Trained Models
Stephan Oepen
Nikolay Arefev
Mikko Aulamo
Marta Bañón
Maja Buljan
...
Teemu Vahtola
Dušan Variš
Fedor Vitiugin
Tea Vojtěchová
Jaume Zaragoza
178
0
0
02 Nov 2025
Leveraging the Cross-Domain & Cross-Linguistic Corpus for Low Resource NMT: A Case Study On Bhili-Hindi-English Parallel Corpus
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025
Pooja Singh
Shashwat Bhardwaj
V. Sharma
Sandeep Kumar
96
0
0
01 Nov 2025
Do You Know About My Nation? Investigating Multilingual Language Models' Cultural Literacy Through Factual Knowledge
Eshaan Tanwar
Anwoy Chatterjee
Michael Stephen Saxon
Alon Albalak
William Wang
Tanmoy Chakraborty
104
0
0
01 Nov 2025
Languages are Modalities: Cross-Lingual Alignment via Encoder Injection
Rajan Agarwal
Aarush Gupta
122
0
0
31 Oct 2025
Revisiting Multilingual Data Mixtures in Language Model Pretraining
Negar Foroutan
Paul Teiletche
Ayush Kumar Tarun
Antoine Bosselut
LRM
84
1
0
29 Oct 2025
Beyond Line-Level Filtering for the Pretraining Corpora of LLMs
Chanwoo Park
Suyoung Park
Yelim Ahn
Jongmin Kim
Jongyeon Park
Jaejin Lee
76
0
0
28 Oct 2025
MetricX-25 and GemSpanEval: Google Translate Submissions to the WMT25 Evaluation Shared Task
Juraj Juraska
Tobias Domhan
M. Finkelstein
Tetsuji Nakagawa
Geza Kovacs
Daniel Deutsch
Pidong Wang
Markus Freitag
113
3
0
28 Oct 2025
LuxIT: A Luxembourgish Instruction Tuning Dataset from Monolingual Seed Data
Julian Valline
Cedric Lothritz
Jordi Cabot
64
0
0
28 Oct 2025
Open Korean Historical Corpus: A Millennia-Scale Diachronic Collection of Public Domain Texts
Seyoung Song
Nawon Kim
Songeun Chae
K. Park
Jiho Jin
Haneul Yoo
Dong Wang
Alice Oh
45
0
0
28 Oct 2025
Cross-Lingual Summarization as a Black-Box Watermark Removal Attack
Gokul Ganesan
AAML
WaLM
450
0
0
27 Oct 2025
Multilingual Target-Stance Extraction
Ethan Mines
Bonnie Dorr
104
0
0
25 Oct 2025
DETECT: Determining Ease and Textual Clarity of German Text Simplifications
Maria Korobeynikova
Alessia Battisti
Lukas Fischer
Yingqiang Gao
76
0
0
25 Oct 2025
SentiMaithili: A Benchmark Dataset for Sentiment and Reason Generation for the Low-Resource Maithili Language
Rahul Ranjan
Mahendra Gurve
Anuj
Nitin
Yamuna Prasad
72
0
0
25 Oct 2025
Tibetan Language and AI: A Comprehensive Survey of Resources, Methods and Challenges
Cheng Huang
Nyima Tashi
Fan Gao
Yutong Liu
J. Li
...
Guojie Tang
Xiangxiang Wang
Jia Zhang
Tsengdar J. Lee
Yongbin Yu
104
0
0
22 Oct 2025
CrossNews-UA: A Cross-lingual News Semantic Similarity Benchmark for Ukrainian, Polish, Russian, and English
Daryna Dementieva
Evgeniya Sukhodolskaya
Alexander Fraser
104
0
0
22 Oct 2025
GPTFace: Generative Pre-training of Facial-Linguistic Transformer by Span Masking and Weakly Correlated Text-image Data
Yudong Li
Hao Li
Xianxu Hou
Linlin Shen
112
0
0
21 Oct 2025
Towards Context-aware Reasoning-enhanced Generative Searching in E-commerce
Zhiding Liu
Ben Chen
Mingyue Cheng
Enchong Chen
Li Li
Chenyi Lei
Wenwu Ou
Han Li
Kun Gai
LRM
132
0
0
19 Oct 2025
FarsiMCQGen: a Persian Multiple-choice Question Generation Framework
Mohammad Heydari Rad
Rezvan Afari
Saeedeh Momtazi
AI4Ed
229
0
0
16 Oct 2025
Retrofitting Small Multilingual Models for Retrieval: Matching 7B Performance with 300M Parameters
Lifu Tu
Yingbo Zhou
Semih Yavuz
LRM
76
0
0
16 Oct 2025
Efficient Seq2seq Coreference Resolution Using Entity Representations
Matt Grenander
Shay B. Cohen
Mark Steedman
116
0
0
16 Oct 2025
A fully automated and scalable Parallel Data Augmentation for Low Resource Languages using Image and Text Analytics
ACM Symposium on Applied Computing (SAC), 2023
Prawaal Sharma
Navneet Goyal
Poonam Goyal
Vishnupriyan R
40
0
0
15 Oct 2025
Saudi Sign Language Translation Using T5
Ali Alhejab
Tomas Zelezny
Lamya Alkanhal
Ivan Gruber
Yazeed Alharbi
Jakub Straka
Vaclav Javorek
Marek Hruz
Badriah Alkalifah
Ahmed M. Ali
SLR
233
0
0
13 Oct 2025
LLM Reasoning for Machine Translation: Synthetic Data Generation over Thinking Tokens
A. Zebaze
Rachel Bawden
Benoît Sagot
LRM
112
1
0
13 Oct 2025
DynaSpec: Context-aware Dynamic Speculative Sampling for Large-Vocabulary Language Models
Jinbin Zhang
Nasib Ullah
Erik Schultheis
Rohit Babbar
116
1
0
11 Oct 2025
Hierarchical Scheduling for Multi-Vector Image Retrieval
Maoliang Li
K. Li
Yaoyang Liu
Jiayu Chen
Zihao Zheng
Yinjun Wu
Xiang Chen
112
0
0
10 Oct 2025
Multilingual Generative Retrieval via Cross-lingual Semantic Compression
Yuxin Huang
Simeng Wu
Ran Song
Yan Xiang
Yantuan Xian
Shengxiang Gao
Z. Yu
RALM
91
1
0
09 Oct 2025
Learning to Rewrite Prompts for Bootstrapping LLMs on Downstream Tasks
Yuwen Tan
Xiang Xiang
Kun He
John E. Hopcroft
114
0
0
08 Oct 2025
Sunflower: A New Approach To Expanding Coverage of African Languages in Large Language Models
Benjamin Akera
Evelyn Nafula Ouma
Gilbert Yiga
Patrick Walukagga
Phionah Natukunda
...
Imran Sekalala
Nimpamya Janat Namara
Engineer Bainomugisha
Ernest Mwebaze
John Quinn
160
0
0
08 Oct 2025
The African Languages Lab: A Collaborative Approach to Advancing Low-Resource African NLP
Sheriff Issaka
Keyi Wang
Yinka Ajibola
Oluwatumininu Samuel-Ipaye
Zhaoyi Zhang
...
Jemimah Osei
Carlene Ajeneza
Persis Boateng
Prisca Adwoa Dufie Yeboah
Saadia Gabriel
88
0
0
07 Oct 2025
Towards Data-Efficient Medical Imaging: A Generative and Semi-Supervised Framework
Mosong Ma
Tania Stathaki
Michalis Lazarou
MedIm
GAN
225
0
0
07 Oct 2025
OneVision: An End-to-End Generative Framework for Multi-view E-commerce Vision Search
Zexin Zheng
Huangyu Dai
Lingtao Mao
Suhua Wang
Zihan Liang
...
Yuqing Ding
Chenyi Lei
Wenwu Ou
Han Li
Kun Gai
216
0
0
07 Oct 2025
Camellia: Benchmarking Cultural Biases in LLMs for Asian Languages
Tarek Naous
Anagha Savit
Carlos Rafael Catalan
Geyang Guo
Jaehyeok Lee
...
JinYeong Bak
Keisuke Sakaguchi
Tanmoy Chakraborty
Yuki Arase
Wei Xu
92
0
0
06 Oct 2025
A Low-Resource Speech-Driven NLP Pipeline for Sinhala Dyslexia Assistance
Peshala Perera
Deshan Sumanathilaka
81
0
0
06 Oct 2025
Aligning LLMs for Multilingual Consistency in Enterprise Applications
Amit Agarwal
Hansa Meghwani
Hitesh Laxmichand Patel
Tao Sheng
Sujith Ravi
Dan Roth
237
5
0
28 Sep 2025
Sigma: Semantically Informative Pre-training for Skeleton-based Sign Language Understanding
Muxin Pu
Mei Kuan Lim
Chun Yong Chong
Chen Change Loy
108
0
0
25 Sep 2025
Feeding Two Birds or Favoring One? Adequacy-Fluency Tradeoffs in Evaluation and Meta-Evaluation of Machine Translation
Behzad Shayegh
Jan-Thorsten Peter
David Vilar
Tobias Domhan
Juraj Juraska
Markus Freitag
Lili Mou
68
0
0
24 Sep 2025
Human-Annotated NER Dataset for the Kyrgyz Language
Timur Turatali
Anton Alekseev
Gulira Jumalieva
Gulnara Kabaeva
Sergey I. Nikolenko
70
0
0
23 Sep 2025
False Friends Are Not Foes: Investigating Vocabulary Overlap in Multilingual Language Models
Julie Kallini
Dan Jurafsky
Christopher Potts
Martijn Bartelds
173
0
0
23 Sep 2025
A Rhythm-Aware Phrase Insertion for Classical Arabic Poetry Composition
Mohamad Elzohbi
Richard Zhao
80
0
0
23 Sep 2025
Scaling, Simplification, and Adaptation: Lessons from Pretraining on Machine-Translated Text
Dan John Velasco
M. R
CLL
LRM
96
0
0
22 Sep 2025
CorPipe at CRAC 2025: Evaluating Multilingual Encoders for Multilingual Coreference Resolution
Milan Straka
173
0
0
22 Sep 2025
Breaking Token Into Concepts: Exploring Extreme Compression in Token Representation Via Compositional Shared Semantics
Kavin R V
Pawan Goyal
91
0
0
22 Sep 2025
CUTE: A Multilingual Dataset for Enhancing Cross-Lingual Knowledge Transfer in Low-Resource Languages
International Conference on Computational Linguistics (COLING), 2025
Wenhao Zhuang
Yuan Sun
88
1
0
21 Sep 2025