Findings of the 2023 ML-SUPERB Challenge: Pre-Training and Evaluation over More Languages and Beyond

Automatic Speech Recognition & Understanding (ASRU), 2023

9 October 2023

Jiatong Shi

Yuxun Tang

Hung-yi Lee

Shinji Watanabe

LRM

ELM

Papers citing "Findings of the 2023 ML-SUPERB Challenge: Pre-Training and Evaluation over More Languages and Beyond"

50 / 51 papers shown

The ML-SUPERB 2.0 Challenge: Towards Inclusive ASR Benchmarking for All Language Varieties

...

Antonis Anastasopoulos

227

08 Sep 2025

An Exploration of Mamba for Speech Self-Supervised Models

239

14 Jun 2025

CA-SSLR: Condition-Aware Self-Supervised Learning Representation for Generalized Speech Processing

Neural Information Processing Systems (NeurIPS), 2024

Yen-Ju Lu

Jing Liu

Thomas Thebaud

Laureano Moro-Velazquez

Ariya Rastrow

Najim Dehak

Jesus Villalba

375

05 Dec 2024

ESPnet-EZ: Python-only ESPnet for Easy Fine-tuning and Integration

Spoken Language Technology Workshop (SLT), 2024

William Chen

Yifan Peng

Jiatong Shi

Vaibhav Srivastav

Shinji Watanabe

VLM

357

14 Sep 2024

The Faetar Benchmark: Speech Recognition in a Very Under-Resourced Language

Michael Ong

Sean Robertson

Leo Peckham

Alba Jorquera Jimenez de Aberasturi

754

12 Sep 2024

Towards Robust Speech Representation Learning for Thousands of Languages

William Chen

Wangyou Zhang

Yifan Peng

Xinjian Li

Jinchuan Tian

Jiatong Shi

Xuankai Chang

Soumi Maiti

Karen Livescu

Shinji Watanabe

ELM

434

30 Jun 2024

MMM: Multi-Layer Multi-Residual Multi-Stream Discrete Speech Representation from Self-supervised Learning Model

Interspeech (Interspeech), 2024

Jiatong Shi

Xutai Ma

Hirofumi Inaguma

Anna Y. Sun

Shinji Watanabe

247

14 Jun 2024

ML-SUPERB 2.0: Benchmarking Multilingual Speech Models Across Modeling Constraints, Languages, and Datasets

Jiatong Shi

Shih-Heng Wang

William Chen

Martijn Bartelds

Vanya Bannihatti Kumar

...

Xuankai Chang

Shinji Watanabe

353

12 Jun 2024

mHuBERT-147: A Compact Multilingual HuBERT Model

594

10 Jun 2024

Wav2Gloss: Generating Interlinear Glossed Text from Speech

Taiqi He

Kwanghee Choi

Lindia Tjuatja

Nathaniel R. Robinson

Jiatong Shi

Shinji Watanabe

Graham Neubig

David R. Mortensen

Lori S. Levin

VLM

257

19 Mar 2024

Evaluating Self-supervised Speech Models on a Taiwanese Hokkien Corpus

Automatic Speech Recognition & Understanding (ASRU), 2023

...

Jiatong Shi

258

06 Dec 2023

EFFUSE: Efficient Self-Supervised Feature Fusion for E2E ASR in Low Resource and Multilingual Scenarios

Interspeech (Interspeech), 2023

Tejes Srivastava

Jiatong Shi

William Chen

Shinji Watanabe

279

05 Oct 2023

Evaluating Self-Supervised Speech Representations for Indigenous American Languages

International Conference on Language Resources and Evaluation (LREC), 2023

318

05 Oct 2023

Multi-resolution HuBERT: Multi-resolution Speech Self-Supervised Learning with Masked Unit Prediction

International Conference on Learning Representations (ICLR), 2023

Jiatong Shi

322

04 Oct 2023

SSHR: Leveraging Self-supervised Hierarchical Representations for Multilingual Automatic Speech Recognition

IEEE International Conference on Multimedia and Expo (ICME), 2023

Hongfei Xue

Qijie Shao

Tommy Yuan

Peikun Chen

Jie Liu

Lei Xie

316

29 Sep 2023

Joint Prediction and Denoising for Large-scale Multilingual Self-supervised Learning

Automatic Speech Recognition & Understanding (ASRU), 2023

Jiatong Shi

Wangyou Zhang

315

26 Sep 2023

ÌròyìnSpeech: A multi-purpose Yorùbá Speech Corpus

International Conference on Language Resources and Evaluation (LREC), 2023

David Ifeoluwa Adelani

456

29 Jul 2023

Reducing Barriers to Self-Supervised Learning: HuBERT Pre-training with Academic Compute

Interspeech (Interspeech), 2023

Shinji Watanabe

314

11 Jun 2023

Exploration on HuBERT with Multiple Resolutions

Interspeech (Interspeech), 2023

Jiatong Shi

411

01 Jun 2023

Scaling Speech Technology to 1,000+ Languages

Journal of Machine Learning Research (JMLR), 2023

...

Yossi Adi

531

586

22 May 2023

ML-SUPERB: Multilingual Speech Universal PERformance Benchmark

Interspeech (Interspeech), 2023

Jiatong Shi

...

392

18 May 2023

Improving Massively Multilingual ASR With Auxiliary CTC Objectives

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Jiatong Shi

349

24 Feb 2023

NusaCrowd: Open Source Initiative for Indonesian NLP Resources

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

...

555

19 Dec 2022

SpeechMatrix: A Large-Scale Mined Corpus of Multilingual Speech-to-Speech Translations

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Paul-Ambroise Duquenne

306

08 Nov 2022

SUPERB @ SLT 2022: Challenge on Generalization and Efficiency of Self-Supervised Speech Representation Learning

Spoken Language Technology Workshop (SLT), 2022

Tzu-Quan Lin

...

327

16 Oct 2022

ASR2K: Speech Recognition for Around 2000 Languages without Audio

Interspeech (Interspeech), 2022

181

06 Sep 2022

IndicSUPERB: A Speech Processing Universal Performance Benchmark for Indian languages

AAAI Conference on Artificial Intelligence (AAAI), 2022

Mitesh M. Khapra

236

24 Aug 2022

FeaRLESS: Feature Refinement Loss for Ensembling Self-Supervised Learning Features in Robust End-to-end Speech Recognition

Interspeech (Interspeech), 2022

Szu-Jui Chen

Jiamin Xie

John H. L. Hansen

265

30 Jun 2022

Self-Supervised Speech Representation Learning: A Review

IEEE Journal on Selected Topics in Signal Processing (IEEE JSTSP), 2022

Abdel-rahman Mohamed

Hung-yi Lee

Lasse Borgholt

Jakob Drachmann Havtorn

...

789

475

21 May 2022

Muskits: an End-to-End Music Processing Toolkit for Singing Voice Synthesis

Interspeech (Interspeech), 2022

Jiatong Shi

Tao Qian

...

Peter Wu

Qin Jin

266

09 May 2022

Combining Spectral and Self-Supervised Features for Low Resource Speech Recognition and Translation

Interspeech (Interspeech), 2022

Dan Berrebbi

Jiatong Shi

Brian Yan

Osbel López-Francisco

Jonathan D. Amith

Shinji Watanabe

258

05 Apr 2022

XTREME-S: Evaluating Cross-lingual Speech Representations

Interspeech (Interspeech), 2022

...

345

21 Mar 2022

Opencpop: A High-Quality Open Source Chinese Popular Song Corpus for Singing Voice Synthesis

Interspeech (Interspeech), 2022

Pengcheng Zhu

Lei Xie

348

140

19 Jan 2022

Multi-Singer: Fast Multi-Singer Singing Voice Vocoder With A Large-Scale Corpus

ACM Multimedia (MM), 2021

Rongjie Huang

Zhou Zhao

289

129

20 Dec 2021

Textless Speech-to-Speech Translation on Real Data

Ann Lee

Hongyu Gong

Paul-Ambroise Duquenne

...

321

183

15 Dec 2021

ESPnet-SLU: Advancing Spoken Language Understanding through ESPnet

...

249

29 Nov 2021

XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale

...

573

982

17 Nov 2021

WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing

...

Jian Wu

1.4K

2,988

26 Oct 2021

Improved Language Identification Through Cross-Lingual Self-Supervised Learning

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021

Andros Tjandra

Diptanu Gon Choudhury

241

08 Jul 2021

HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2021

759

4,394

14 Jun 2021

SUPERB: Speech processing Universal PERformance Benchmark

Interspeech (Interspeech), 2021

...

643

1,141

03 May 2021

Scaling End-to-End Models for Large-Scale Multilingual ASR

Automatic Speech Recognition & Understanding (ASRU), 2021

665

30 Apr 2021

LeBenchmark: A Reproducible Framework for Assessing Self-Supervised Representation Learning from Speech

Interspeech (Interspeech), 2021

...

282

23 Apr 2021

Leveraging End-to-End ASR for Endangered Language Documentation: An Empirical Study on Yoloxóchitl Mixtec

Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2021

Jiatong Shi

Jonathan D. Amith

Rey Castillo García

Esteban Guadalupe Sierra

Kevin Duh

Shinji Watanabe

219

26 Jan 2021

VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation Learning, Semi-Supervised Learning and Interpretation

Annual Meeting of the Association for Computational Linguistics (ACL), 2021

776

675

02 Jan 2021

Massively Multilingual ASR: 50 Languages, 1 Model, 1 Billion Parameters

360

164

06 Jul 2020

Unsupervised Cross-lingual Representation Learning for Speech Recognition

Interspeech (Interspeech), 2020

541

957

24 Jun 2020

Self-Supervised Representations Improve End-to-End Speech Translation

292

22 Jun 2020

wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations

4.3K

7,960

20 Jun 2020

Learning Robust and Multilingual Speech Representations

Findings (Findings), 2020

345

102

29 Jan 2020