
mT5: A massively multilingual pre-trained text-to-text transformer

22 October 2020


Papers citing "mT5: A massively multilingual pre-trained text-to-text transformer"

50 / 1,562 papers shown

Uncertainty Distillation: Teaching Language Models to Express Semantic Confidence

452

18 Mar 2025

Pensez: Less Data, Better Reasoning -- Rethinking French LLM

Huy Hoang Ha

ReLM LRM

254

17 Mar 2025

LAG-MMLU: Benchmarking Frontier LLM Understanding in Latvian and Giriama

1.0K

14 Mar 2025

Annotating Scientific Uncertainty: A comprehensive model using linguistic patterns and comparison with existing approaches

Panggih Kusuma Ningrum

298

14 Mar 2025

A Hybrid Architecture with Efficient Fine Tuning for Abstractive Patent Document Summarization

International Conference on Soft Computing and Software Engineering (ICSCSE), 2025

Nevidu Jayatilleke

Ruvan Weerasinghe

AILaw

642

13 Mar 2025

An Expanded Massive Multilingual Dataset for High-Performance Language Technologies (HPLT)

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

...

Jaume Zaragoza-Bernabeu

485

13 Mar 2025

NAMI: Efficient Image Generation via Bridged Progressive Rectified Flow Transformers

346

12 Mar 2025

Beyond Decoder-only: Large Language Models Can be Good Encoders for Machine Translation

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

...

1.1K

09 Mar 2025

Coreference as an indicator of context scope in multimodal narrative

201

07 Mar 2025

Compositional Translation: A Novel LLM-based Approach for Low-resource Machine Translation

A. Zebaze

Benoît Sagot

Rachel Bawden

262

06 Mar 2025

Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

247

05 Mar 2025

Enhancing Vietnamese VQA through Curriculum Learning on Raw and Augmented Text Representations

248

05 Mar 2025

Wikipedia in the Era of LLMs: Evolution and Risks

361

04 Mar 2025

Sherkala-Chat: Building a State-of-the-Art LLM for Kazakh in a Moderately Resourced Setting

...

404

03 Mar 2025

In-context Learning vs. Instruction Tuning: The Case of Small and Multilingual Language Models

David Ponce

Thierry Etchegoyhen

340

03 Mar 2025

Test-Time Alignment for Large Language Models via Textual Model Predictive Control

368

28 Feb 2025

Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think

304

27 Feb 2025

HuAMR: A Hungarian AMR Parser and Dataset

199

27 Feb 2025

PolyPrompt: Automating Knowledge Extraction from Multilingual Language Models with Dynamic Prompt Generation

Nathan Roll

303

27 Feb 2025

Few-Shot Multilingual Open-Domain QA from 5 Examples

Fan Jiang

Tom Drummond

Trevor Cohn

323

27 Feb 2025

Language Models' Factuality Depends on the Language of Inquiry

293

25 Feb 2025

Compressing Language Models for Specialized Domains

304

25 Feb 2025

What are Foundation Models Cooking in the Post-Soviet World?

460

25 Feb 2025

Encryption-Friendly LLM Architecture

International Conference on Learning Representations (ICLR), 2024

497

24 Feb 2025

Do Multilingual LLMs Think In English?

Lisa Schut

Y. Gal

Sebastian Farquhar

293

24 Feb 2025

Comprehensive Analysis of Transparency and Accessibility of ChatGPT, DeepSeek, And other SoTA Large Language Models

Ranjan Sapkota

Shaina Raza

Manoj Karkee

266

21 Feb 2025

Multilingual Non-Factoid Question Answering with Answer Paragraph Selection

Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), 2024

Ritwik Mishra

Sreeram Vennam

R. Shah

Ponnurangam Kumaraguru

333

20 Feb 2025

Multilingual Language Model Pretraining using Machine-translated Data

David Ifeoluwa Adelani

374

20 Feb 2025

KazMMLU: Evaluating Language Models on Kazakh, Russian, and Regional Knowledge of Kazakhstan

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

...

217

18 Feb 2025

Enhancing Multilingual LLM Pretraining with Model-Based Data Selection

Bettina Messmer

Vinko Sabolčec

Martin Jaggi

178

17 Feb 2025

Generating Text from Uniform Meaning Representation

Emma Markle

Reihaneh Iranmanesh

Shira Wein

173

17 Feb 2025

M-ABSA: A Multilingual Dataset for Aspect-Based Sentiment Analysis

459

17 Feb 2025

Balanced Multi-Factor In-Context Learning for Multilingual Large Language Models

Masahiro Kaneko

Alham Fikri Aji

Timothy Baldwin

320

17 Feb 2025

LayAlign: Enhancing Multilingual Reasoning in Large Language Models via Layer-Wise Adaptive Fusion and Alignment Strategy

North American Chapter of the Association for Computational Linguistics (NAACL), 2025

293

17 Feb 2025

ALGEN: Few-shot Inversion Attacks on Textual Embeddings using Alignment and Generation

Yiyi Chen

Qiongkai Xu

Johannes Bjerva

406

16 Feb 2025

The underlying structures of self-attention: symmetry, directionality, and emergent dynamics in Transformer training

Matteo Saponati

Pascal Sager

Pau Vilimelis Aceituno

Thilo Stadelmann

Benjamin Grewe

208

15 Feb 2025

Matina: A Large-Scale 73B Token Persian Text Corpus

North American Chapter of the Association for Computational Linguistics (NAACL), 2025

Sara Bourbour Hosseinbeigi

249

13 Feb 2025

Examining and Adapting Time for Multilingual Classification via Mixture of Temporal Experts

North American Chapter of the Association for Computational Linguistics (NAACL), 2025

211

12 Feb 2025

A Large-Scale Benchmark for Vietnamese Sentence Paraphrases

North American Chapter of the Association for Computational Linguistics (NAACL), 2025

Sang Quang Nguyen

Kiet Van Nguyen

360

11 Feb 2025

Efficient-vDiT: Efficient Video Diffusion Transformers With Attention Tile

379

10 Feb 2025

Towards the Development of Balanced Synthetic Data for Correcting Grammatical Errors in Arabic: An Approach Based on Error Tagging Model and Synthetic Data Generating Model

Ahlam Alrehili

Areej Alhothali

457

07 Feb 2025

Multilingual State Space Models for Structured Question Answering in Indic Languages

505

01 Feb 2025

A linguistically-motivated evaluation methodology for unraveling model's abilities in reading comprehension tasks

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025

394

29 Jan 2025

Commute Your Domains: Trajectory Optimality Criterion for Multi-Domain Learning

Alexey Rukhovich

Alexander Podolskiy

Irina Piontkovskaya

269

28 Jan 2025

Data Duplication: A Novel Multi-Purpose Attack Paradigm in Machine Unlearning

372

28 Jan 2025

Faster Machine Translation Ensembling with Reinforcement Learning and Competitive Correction

North American Chapter of the Association for Computational Linguistics (NAACL), 2025

223

28 Jan 2025

Test-Time Code-Switching for Cross-lingual Aspect Sentiment Triplet Extraction

North American Chapter of the Association for Computational Linguistics (NAACL), 2025

183

24 Jan 2025

Can MLLMs Generalize to Multi-Party dialog? Exploring Multilingual Response Generation in Complex Scenarios

201

20 Jan 2025

ViBidirectionMT-Eval: Machine Translation for Vietnamese-Chinese and Vietnamese-Lao language pair

Journal of Computer Science and Cybernetics (JCSC), 2025

103

15 Jan 2025

Exploring Robustness of Multilingual LLMs on Real-World Noisy Data

Amirhossein Aliakbarzadeh

Lucie Flek

Akbar Karimi

234

14 Jan 2025