MLS: A Large-Scale Multilingual Dataset for Speech Research

7 December 2020
Vineel Pratap
Qiantong Xu
Anuroop Sriram
Gabriel Synnaeve
R. Collobert
    AuLLM

Papers citing "MLS: A Large-Scale Multilingual Dataset for Speech Research"

50 / 321 papers shown
ESB: A Benchmark For Multi-Domain End-to-End Speech Recognition
Sanchit Gandhi
Patrick von Platen
Alexander M. Rush
62
25
0
24 Oct 2022
Low-Resource Multilingual and Zero-Shot Multispeaker TTS
Florian Lux
Julia Koch
Ngoc Thang Vu
101
23
0
21 Oct 2022
Large-scale learning of generalised representations for speaker recognition
Jee-weon Jung
Hee-Soo Heo
Bong-Jin Lee
Jaesong Lee
Hye-jin Shim
Youngki Kwon
Joon Son Chung
Shinji Watanabe
CVBM
58
6
0
20 Oct 2022
Maestro-U: Leveraging joint speech-text representation learning for zero supervised speech ASR
Zhehuai Chen
Ankur Bapna
Andrew Rosenberg
Yu Zhang
Bhuvana Ramabhadran
Pedro J. Moreno
Nanxin Chen
101
17
0
18 Oct 2022
Bringing NURC/SP to Digital Life: the Role of Open-source Automatic Speech Recognition Models
L. Gris
Arnaldo Cândido Júnior
V. G. Santos
B. Dias
Marli Quadros Leite
F. Svartman
S. Aluísio
66
3
0
14 Oct 2022
On the Utility of Self-supervised Models for Prosody-related Tasks
Guan-Ting Lin
Chiyu Feng
Wei-Ping Huang
Yuan Tseng
Tzu-Han Lin
Chen-An Li
Hung-yi Lee
Nigel G. Ward
63
51
0
13 Oct 2022
Fine-tuning Wav2vec for Vocal-burst Emotion Recognition
Dang-Khanh Nguyen
Sudarshan Pant
Ngoc-Huynh Ho
Gueesang Lee
Soo-Hyung Kim
Hyung-Jeong Yang
37
3
0
01 Oct 2022
MeWEHV: Mel and Wave Embeddings for Human Voice Tasks
Andrés Vasco-Carofilis
Laura Fernández-Robles
Enrique Alegre
Eduardo Fidalgo
78
3
0
28 Sep 2022
Bangla-Wave: Improving Bangla Automatic Speech Recognition Utilizing N-gram Language Models
Mohammed Rakib
Md. Ismail Hossain
Nabeel Mohammed
Fuad Rahman
VLM
85
7
0
13 Sep 2022
Learning ASR pathways: A sparse multilingual ASR model
Mu Yang
Andros Tjandra
Chunxi Liu
David C. Zhang
Duc Le
Ozlem Kalinli
94
14
0
13 Sep 2022
Applying wav2vec2 for Speech Recognition on Bengali Common Voices Dataset
Haz Sameen Shahgir
Khondker Salman Sayeed
Tanjeem Azwad Zaman
82
9
0
11 Sep 2022
Domain Specific Wav2vec 2.0 Fine-tuning For The SE&R 2022 Challenge
A. I. S. Ferreira
Gustavo dos Reis Oliveira
106
3
0
29 Jul 2022
PoeticTTS -- Controllable Poetry Reading for Literary Studies
Julia Koch
Florian Lux
Nadja Schauffler
T. Bernhart
Felix Dieterle
Jonas Kuhn
Sandra Richter
Gabriel Viehhauser
Ngoc Thang Vu
61
5
0
11 Jul 2022
Investigating the Impact of Cross-lingual Acoustic-Phonetic Similarities on Multilingual Speech Recognition
Muhammad Umar Farooq
Thomas Hain
33
3
0
07 Jul 2022
The THUEE System Description for the IARPA OpenASR21 Challenge
Jing Zhao
Haoyu Wang
Jinpeng Li
Shuzhou Chai
Guan-Bo Wang
Guoguo Chen
Weiqiang Zhang
VLM
31
1
0
29 Jun 2022
TEVR: Improving Speech Recognition by Token Entropy Variance Reduction
Hajo N. Krabbenhöft
Erhardt Barth
72
3
0
25 Jun 2022
Exact Prosody Cloning in Zero-Shot Multispeaker Text-to-Speech
Florian Lux
Julia Koch
Ngoc Thang Vu
75
20
0
24 Jun 2022
Exploring the Effectiveness of Self-supervised Learning and Classifier Chains in Emotion Recognition of Nonverbal Vocalizations
Detai Xin
Shinnosuke Takamichi
Hiroshi Saruwatari
44
14
0
21 Jun 2022
FLEURS: Few-shot Learning Evaluation of Universal Representations of Speech
Alexis Conneau
Min Ma
Simran Khanuja
Yu Zhang
Vera Axelrod
Siddharth Dalmia
Jason Riesa
Clara E. Rivera
Ankur Bapna
VLM
153
331
0
25 May 2022
Adaptive multilingual speech recognition with pretrained models
Ngoc-Quan Pham
A. Waibel
Jan Niehues
VLM
72
23
0
24 May 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL AI4TS
273
367
0
21 May 2022
Automatic Spoken Language Identification using a Time-Delay Neural Network
Benjamin Kepecs
Homayoon Beigi
13
2
0
19 May 2022
Leveraging Pseudo-labeled Data to Improve Direct Speech-to-Speech Translation
Qianqian Dong
Fengpeng Yue
Tom Ko
Mingxuan Wang
Qibing Bai
Yu Zhang
84
16
0
18 May 2022
Quantifying Language Variation Acoustically with Few Resources
Martijn Bartelds
Martijn B. Wieling
78
13
0
05 May 2022
ASR in German: A Detailed Error Analysis
John M. Wirth
René Peinl
55
6
0
12 Apr 2022
Transducer-based language embedding for spoken language identification
Peng Shen
Xugang Lu
Hisashi Kawai
80
6
0
08 Apr 2022
MAESTRO: Matched Speech Text Representations through Modality Matching
Zhehuai Chen
Yu Zhang
Andrew Rosenberg
Bhuvana Ramabhadran
Pedro J. Moreno
Ankur Bapna
Heiga Zen
92
108
0
07 Apr 2022
Enhanced Direct Speech-to-Speech Translation Using Self-supervised Pre-training and Data Augmentation
Sravya Popuri
Peng-Jen Chen
Changhan Wang
J. Pino
Yossi Adi
Jiatao Gu
Wei-Ning Hsu
Ann Lee
135
58
0
06 Apr 2022
Towards End-to-end Unsupervised Speech Recognition
Alexander H. Liu
Wei-Ning Hsu
Michael Auli
Alexei Baevski
SSL
80
74
0
05 Apr 2022
A Study of Gender Impact in Self-supervised Models for Speech-to-Text Systems
Marcely Zanon Boito
Laurent Besacier
N. Tomashenko
Yannick Esteve
63
19
0
04 Apr 2022
End-to-End Multi-speaker ASR with Independent Vector Analysis
Robin Scheibler
Wangyou Zhang
Xuankai Chang
Shinji Watanabe
Y. Qian
59
2
0
01 Apr 2022
ASR data augmentation in low-resource settings using cross-lingual multi-speaker TTS and cross-lingual voice conversion
Edresson Casanova
C. Shulby
Alexander Korolev
Arnaldo Cândido Júnior
A. S. Soares
S. Aluísio
M. Ponti
117
14
0
29 Mar 2022
Analyzing Language-Independent Speaker Anonymization Framework under Unseen Conditions
Xiaoxiao Miao
Xin Wang
Erica Cooper
Junichi Yamagishi
N. Tomashenko
62
11
0
28 Mar 2022
Leveraging unsupervised and weakly-supervised data to improve direct speech-to-speech translation
Ye Jia
Yifan Ding
Ankur Bapna
Colin Cherry
Yu Zhang
Alexis Conneau
Nobuyuki Morioka
94
21
0
24 Mar 2022
XTREME-S: Evaluating Cross-lingual Speech Representations
Alexis Conneau
Ankur Bapna
Yu Zhang
Min Ma
Patrick von Platen
...
Orhan Firat
Michael Auli
Sebastian Ruder
Jason Riesa
Melvin Johnson
VLM AILaw ELM
153
22
0
21 Mar 2022
Visual Speech Recognition for Multiple Languages in the Wild
Pingchuan Ma
Stavros Petridis
Maja Pantic
VLM
195
151
0
26 Feb 2022
Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation
Hemlata Tak
Massimiliano Todisco
Xin Wang
Jee-weon Jung
Junichi Yamagishi
Nicholas W. D. Evans
122
168
0
24 Feb 2022
Self-supervised Learning with Random-projection Quantizer for Speech Recognition
Chung-Cheng Chiu
James Qin
Yu Zhang
Jiahui Yu
Yonghui Wu
SSL
111
169
0
03 Feb 2022
mSLAM: Massively multilingual joint pre-training for speech and text
Ankur Bapna
Colin Cherry
Yu Zhang
Ye Jia
Melvin Johnson
Yong Cheng
Simran Khanuja
Jason Riesa
Alexis Conneau
VLM
67
114
0
03 Feb 2022
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
Edresson Casanova
Julian Weber
C. Shulby
Arnaldo Cândido Júnior
Eren Golge
M. Ponti
242
415
0
04 Dec 2021
The People's Speech: A Large-Scale Diverse English Speech Recognition Dataset for Commercial Usage
Daniel Galvez
G. Diamos
Juan Ciro
Juan Felipe Cerón
Keith Achorn
Anjali Gopi
David Kanter
Maximilian Lam
Mark Mazumder
Vijay Janapa Reddi
137
103
0
17 Nov 2021
XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale
Arun Babu
Changhan Wang
Andros Tjandra
Kushal Lakhotia
Qiantong Xu
...
Yatharth Saraf
J. Pino
Alexei Baevski
Alexis Conneau
Michael Auli
SSL
114
710
0
17 Nov 2021
Joint Unsupervised and Supervised Training for Multilingual ASR
Junwen Bai
Yue Liu
Yu Zhang
Ankur Bapna
Nikhil Siddhartha
K. Sim
Tara N. Sainath
78
59
0
15 Nov 2021
Cross-lingual Transfer for Speech Processing using Acoustic Language Similarity
Peter Wu
Jiatong Shi
Yifan Zhong
Shinji Watanabe
A. Black
52
8
0
02 Nov 2021
Lhotse: a speech data representation library for the modern deep learning ecosystem
Piotr Żelasko
Daniel Povey
Jan "Yenda" Trmal
Sanjeev Khudanpur
AuLLM AI4TS
84
36
0
25 Oct 2021
CORAA: a large corpus of spontaneous and prepared speech manually validated for speech recognition in Brazilian Portuguese
Arnaldo Cândido Júnior
Edresson Casanova
A. S. Soares
F. S. Oliveira
L. Oliveira
...
Daniel Peixoto Pinto da Silva
Fernando Gorgulho Fayet
B. Carlotto
L. Gris
S. Aluísio
32
15
0
14 Oct 2021
Advancing the dimensionality reduction of speaker embeddings for speaker diarisation: disentangling noise and informing speech activity
You Jin Kim
Hee-Soo Heo
Jee-weon Jung
Youngki Kwon
Bong-Jin Lee
Joon Son Chung
75
3
0
07 Oct 2021
WenetSpeech: A 10000+ Hours Multi-domain Mandarin Corpus for Speech Recognition
Binbin Zhang
Hang Lv
Pengcheng Guo
Qijie Shao
Chao Yang
...
Hui Bu
Xiaoyu Chen
Chenchen Zeng
Di Wu
Zhendong Peng
117
231
0
07 Oct 2021
Building a Noisy Audio Dataset to Evaluate Machine Learning Approaches for Automatic Speech Recognition Systems
J. C. Duarte
S. Colcher
16
3
0
04 Oct 2021
Comparison of Self-Supervised Speech Pre-Training Methods on Flemish Dutch
Jakob Poncelet
Hugo Van hamme
SSL
56
1
0
29 Sep 2021