arXiv: 2012.03411

MLS: A Large-Scale Multilingual Dataset for Speech Research


Interspeech (Interspeech), 2020

7 December 2020

Gabriel Synnaeve


Papers citing "MLS: A Large-Scale Multilingual Dataset for Speech Research"

40 / 390 papers shown

ASR data augmentation in low-resource settings using cross-lingual multi-speaker TTS and cross-lingual voice conversion

Interspeech (Interspeech), 2022

Edresson Casanova

Alexander Korolev

Arnaldo Cândido Júnior

340

16

0

29 Mar 2022

Analyzing Language-Independent Speaker Anonymization Framework under Unseen Conditions

Interspeech (Interspeech), 2022

Xin Wang

Junichi Yamagishi

152

16

0

28 Mar 2022

Leveraging unsupervised and weakly-supervised data to improve direct speech-to-speech translation

Interspeech (Interspeech), 2022

Colin Cherry

Nobuyuki Morioka

242

24

0

24 Mar 2022

XTREME-S: Evaluating Cross-lingual Speech Representations

Interspeech (Interspeech), 2022

Patrick von Platen

...

Sebastian Ruder

294

23

0

21 Mar 2022

Visual Speech Recognition for Multiple Languages in the Wild

Nature Machine Intelligence (Nat. Mach. Intell.), 2022

Stavros Petridis

421

198

0

26 Feb 2022

Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation

The Speaker and Language Recognition Workshop (Odyssey), 2022

Massimiliano Todisco

Xin Wang

Junichi Yamagishi

Nicholas W. D. Evans

365

265

0

24 Feb 2022

Self-supervised Learning with Random-projection Quantizer for Speech Recognition

International Conference on Machine Learning (ICML), 2022

Chung-Cheng Chiu

305

233

0

03 Feb 2022

mSLAM: Massively multilingual joint pre-training for speech and text


Colin Cherry

202

124

0

03 Feb 2022

YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone

International Conference on Machine Learning (ICML), 2021

Edresson Casanova

Arnaldo Cândido Júnior

731

570

0

04 Dec 2021

The People's Speech: A Large-Scale Diverse English Speech Recognition Dataset for Commercial Usage

Juan Felipe Cerón

Maximilian Lam

Vijay Janapa Reddi

295

125

0

17 Nov 2021

XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale

Kushal Lakhotia

...

432

949

0

17 Nov 2021

Joint Unsupervised and Supervised Training for Multilingual ASR

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021

Nikhil Siddhartha

Tara N. Sainath

285

64

0

15 Nov 2021

Cross-lingual Transfer for Speech Processing using Acoustic Language Similarity

Automatic Speech Recognition & Understanding (ASRU), 2021

Peter Wu

Jiatong Shi

Shinji Watanabe

198

8

0

02 Nov 2021

Lhotse: a speech data representation library for the modern deep learning ecosystem

Willem Hagemann

Jan "Yenda" Trmal

Sanjeev Khudanpur

227

46

0

25 Oct 2021

CORAA: a large corpus of spontaneous and prepared speech manually validated for speech recognition in Brazilian Portuguese

Arnaldo Cândido Júnior

Edresson Casanova

...

Daniel Peixoto Pinto da Silva

Fernando Gorgulho Fayet

241

16

0

14 Oct 2021

Advancing the dimensionality reduction of speaker embeddings for speaker diarisation: disentangling noise and informing speech activity

280

3

0

07 Oct 2021

WenetSpeech: A 10000+ Hours Multi-domain Mandarin Corpus for Speech Recognition

Binbin Zhang

Qijie Shao

Chao Yang

...

Hui Bu

415

308

0

07 Oct 2021

Building a Noisy Audio Dataset to Evaluate Machine Learning Approaches for Automatic Speech Recognition Systems

65

4

0

04 Oct 2021

Comparison of Self-Supervised Speech Pre-Training Methods on Flemish Dutch

Jakob Poncelet

153

3

0

29 Sep 2021

Simple and Effective Zero-shot Cross-lingual Phoneme Recognition

Interspeech (Interspeech), 2021

368

121

0

23 Sep 2021

Influence of ASR and Language Model on Alzheimer's Disease Detection


Joan Codina-Filbà

Guillermo Cámbara

104

2

0

20 Sep 2021

Brazilian Portuguese Speech Recognition Using Wav2vec 2.0

International Conference on Computational Processing of the Portuguese Language (PROPOR), 2021

Edresson Casanova

174

24

0

23 Jul 2021

CarneliNet: Neural Mixture Model for Automatic Speech Recognition


Somshubra Majumdar

Jagadeesh Balam

Boris Ginsburg

120

3

0

22 Jul 2021

FST: the FAIR Speech Translation System for the IWSLT21 Multilingual Shared Task

International Workshop on Spoken Language Translation (IWSLT), 2021

Xian Li

231

11

0

14 Jul 2021

A Survey on Neural Speech Synthesis


Xu Tan

403

441

0

29 Jun 2021

HUI-Audio-Corpus-German: A high quality TTS dataset

Deutsche Jahrestagung für Künstliche Intelligenz (KI), 2021

Pascal Puchtler

107

28

0

11 Jun 2021

Unsupervised Speech Recognition

Neural Information Processing Systems (NeurIPS), 2021

445

295

0

24 May 2021

Including Signed Languages in Natural Language Processing

Annual Meeting of the Association for Computational Linguistics (ACL), 2021

Malihe Alikhani

244

134

0

11 May 2021

English Accent Accuracy Analysis in a State-of-the-Art Automatic Speech Recognition System

Guillermo Cámbara

Alex Peiró Lilja

114

3

0

09 May 2021

Scaling End-to-End Models for Large-Scale Multilingual ASR

Automatic Speech Recognition & Understanding (ASRU), 2021

Tara N. Sainath

513

84

0

30 Apr 2021

LeBenchmark: A Reproducible Framework for Assessing Self-Supervised Representation Learning from Speech

Interspeech (Interspeech), 2021

Marcely Zanon Boito

Salima Mdhaffar

...

François Portet

Solange Rossato

Fabien Ringeval

Laurent Besacier

259

72

0

23 Apr 2021

Crossing the Conversational Chasm: A Primer on Natural Language Processing for Multilingual Task-Oriented Dialogue Systems

Journal of Artificial Intelligence Research (JAIR), 2021

E. Razumovskaia

523

38

0

17 Apr 2021

A Toolbox for Construction and Analysis of Speech Datasets


Evelina Bakhturina

Vitaly Lavrukhin

Boris Ginsburg

162

14

0

11 Apr 2021

HMM-Free Encoder Pre-Training for Streaming RNN Transducer

Interspeech (Interspeech), 2021

204

3

0

02 Apr 2021

MediaSpeech: Multilanguage ASR Benchmark and Dataset


Rostislav Kolobov

Olga Omelchishina

Dmitry Menshikov

N. Mikhaylovskiy

157

30

0

30 Mar 2021

VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation Learning, Semi-Supervised Learning and Interpretation

Annual Meeting of the Association for Computational Linguistics (ACL), 2021

Chaitanya Talnikar

Mary Williamson

Emmanuel Dupoux

660

643

0

02 Jan 2021

Neural Representations for Modeling Variation in Speech


Martijn Bartelds

Wietse de Vries

Caitlin Richter

Martijn B. Wieling

215

30

0

25 Nov 2020

Swiss Parliaments Corpus, an Automatically Aligned Swiss German Speech to Standard German Text Corpus

Christian Scheller

142

29

0

06 Oct 2020

Unsupervised Cross-lingual Representation Learning for Speech Recognition

Interspeech (Interspeech), 2020

Abdel-rahman Mohamed

454

936

0

24 Jun 2020

TTS-Portuguese Corpus: a corpus for speech synthesis in Brazilian Portuguese

Edresson Casanova

João Paulo Teixeira

252

24

0

11 May 2020
