MLS: A Large-Scale Multilingual Dataset for Speech Research

Interspeech (Interspeech), 2020
7 December 2020
Vineel Pratap
Qiantong Xu
Anuroop Sriram
Gabriel Synnaeve
R. Collobert
    AuLLM

Papers citing "MLS: A Large-Scale Multilingual Dataset for Speech Research"

50 / 390 papers shown
From English to More Languages: Parameter-Efficient Model Reprogramming for Cross-Lingual Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Chao-Han Huck Yang
Yue Liu
Yu Zhang
Nanxin Chen
Rohit Prabhavalkar
Tara N. Sainath
Trevor Strohman
188
32
0
19 Jan 2023
Scaling Laws for Generative Mixed-Modal Language Models
International Conference on Machine Learning (ICML), 2023
Armen Aghajanyan
L. Yu
Alexis Conneau
Wei-Ning Hsu
Karen Hambardzumyan
Susan Zhang
Stephen Roller
Naman Goyal
Omer Levy
Luke Zettlemoyer
MoEVLM
314
137
0
10 Jan 2023
Supervised Acoustic Embeddings And Their Transferability Across Languages
International Conference on Natural Language and Speech Processing (ICNLSP), 2023
Sreepratha Ram
Hanan Aldarmaki
SSL
152
4
0
03 Jan 2023
ReVISE: Self-Supervised Speech Resynthesis with Visual Input for Universal and Generalized Speech Enhancement
Wei-Ning Hsu
Tal Remez
Bowen Shi
Jacob Donley
Yossi Adi
DiffM
238
14
0
21 Dec 2022
Mu$^{2}$SLAM: Multitask, Multilingual Speech and Language Models
International Conference on Machine Learning (ICML), 2022
Yong Cheng
Yu Zhang
Melvin Johnson
Wolfgang Macherey
Ankur Bapna
194
9
0
19 Dec 2022
UnitY: Two-pass Direct Speech-to-speech Translation with Discrete Units
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Hirofumi Inaguma
Sravya Popuri
Ilia Kulikov
Peng-Jen Chen
Changhan Wang
Yu-An Chung
Yun Tang
Ann Lee
Shinji Watanabe
J. Pino
320
77
0
15 Dec 2022
Ring That Bell: A Corpus and Method for Multimodal Metaphor Detection in Videos
Khalid Alnajjar
Mika Hämäläinen
Shuo Zhang
181
10
0
15 Dec 2022
Towards trustworthy phoneme boundary detection with autoregressive model and improved evaluation metric
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Hyeongju Kim
Hyeong-Seok Choi
138
2
0
13 Dec 2022
Robust Speech Recognition via Large-Scale Weak Supervision
International Conference on Machine Learning (ICML), 2022
Alec Radford
Jong Wook Kim
Tao Xu
Greg Brockman
C. McLeavey
Ilya Sutskever
OffRL
1.0K
5,873
0
06 Dec 2022
EURO: ESPnet Unsupervised ASR Open-source Toolkit
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Dongji Gao
Jiatong Shi
Shun-Po Chuang
Leibny Paola García-Perera
Hung-yi Lee
Shinji Watanabe
Sanjeev Khudanpur
229
10
0
30 Nov 2022
Dialogs Re-enacted Across Languages
Nigel G. Ward
Jonathan Avila
Emilia Rivas
Divette Marco
213
2
0
18 Nov 2022
Casual Conversations v2: Designing a large consent-driven dataset to measure algorithmic bias and robustness
C. Hazirbas
Yejin Bang
Tiezheng Yu
Parisa Assar
Bilal Porgali
...
Jacqueline Pan
Emily McReynolds
Miranda Bogen
Pascale Fung
Cristian Canton Ferrer
225
8
0
10 Nov 2022
Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Andros Tjandra
Nayan Singhal
David C. Zhang
Ozlem Kalinli
Abdel-rahman Mohamed
Duc Le
M. Seltzer
234
20
0
10 Nov 2022
Multi-blank Transducers for Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Hainan Xu
Fei Jia
Somshubra Majumdar
Shinji Watanabe
Boris Ginsburg
222
12
0
04 Nov 2022
I4U System Description for NIST SRE'20 CTS Challenge
Kong Aik Lee
Tomi Kinnunen
Daniele Colibro
C. Vair
A. Nautsch
...
Ruijie Tao
Haizhou Li
Alfonso Ortega Giménez
Longbiao Wang
L. Buera
84
1
0
02 Nov 2022
Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Takaaki Saeki
Heiga Zen
Zhehuai Chen
Nobuyuki Morioka
Gary Wang
Yu Zhang
Ankur Bapna
Andrew Rosenberg
Bhuvana Ramabhadran
277
22
0
27 Oct 2022
Multi-class Detection of Pathological Speech with Latent Features: How does it perform on unseen data?
Interspeech (Interspeech), 2022
Dominik Wagner
Ilja Baumann
Franziska Braun
Sebastian P. Bayerl
Elmar Nöth
Korbinian Riedhammer
Tobias Bocklet
193
15
0
27 Oct 2022
Improving Speech-to-Speech Translation Through Unlabeled Text
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Xuan-Phi Nguyen
Sravya Popuri
Changhan Wang
Yun Tang
Ilia Kulikov
Hongyu Gong
203
9
0
26 Oct 2022
EBEN: Extreme bandwidth extension network applied to speech signals captured with noise-resilient body-conduction microphones
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
J. Hauret
Thomas Joubaud
V. Zimpfer
Éric Bavu
171
19
0
25 Oct 2022
ESB: A Benchmark For Multi-Domain End-to-End Speech Recognition
Sanchit Gandhi
Patrick von Platen
Alexander M. Rush
142
27
0
24 Oct 2022
Low-Resource Multilingual and Zero-Shot Multispeaker TTS
Florian Lux
Julia Koch
Ngoc Thang Vu
218
26
0
21 Oct 2022
Large-scale learning of generalised representations for speaker recognition
Jee-weon Jung
Hee-Soo Heo
Bong-Jin Lee
Jaesong Lee
Hye-jin Shim
Youngki Kwon
Joon Son Chung
Shinji Watanabe
CVBM
215
6
0
20 Oct 2022
Maestro-U: Leveraging joint speech-text representation learning for zero supervised speech ASR
Spoken Language Technology Workshop (SLT), 2022
Zhehuai Chen
Ankur Bapna
Andrew Rosenberg
Yu Zhang
Bhuvana Ramabhadran
Pedro J. Moreno
Nanxin Chen
246
17
0
18 Oct 2022
Bringing NURC/SP to Digital Life: the Role of Open-source Automatic Speech Recognition Models
L. Gris
Arnaldo Cândido Júnior
V. G. Santos
B. Dias
Marli Quadros Leite
F. Svartman
S. Aluísio
161
3
0
14 Oct 2022
On the Utility of Self-supervised Models for Prosody-related Tasks
Spoken Language Technology Workshop (SLT), 2022
Guan-Ting Lin
Chiyu Feng
Wei-Ping Huang
Yuan Tseng
Tzu-Han Lin
Chen-An Li
Hung-yi Lee
Nigel G. Ward
200
61
0
13 Oct 2022
Fine-tuning Wav2vec for Vocal-burst Emotion Recognition
Dang-Khanh Nguyen
Sudarshan Pant
Ngoc-Huynh Ho
Gueesang Lee
Soo-Huyng Kim
Hyung-Jeong Yang
111
3
0
01 Oct 2022
MeWEHV: Mel and Wave Embeddings for Human Voice Tasks
IEEE Access (IEEE Access), 2022
Andrés Vasco-Carofilis
Laura Fernández-Robles
Enrique Alegre
Eduardo Fidalgo
198
4
0
28 Sep 2022
Bangla-Wave: Improving Bangla Automatic Speech Recognition Utilizing N-gram Language Models
International Conference on Software and Computer Applications (ICSCA), 2022
Mohammed Rakib
Md. Ismail Hossain
Nabeel Mohammed
Fuad Rahman
VLM
174
9
0
13 Sep 2022
Learning ASR pathways: A sparse multilingual ASR model
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Mu Yang
Andros Tjandra
Chunxi Liu
David C. Zhang
Duc Le
Ozlem Kalinli
394
14
0
13 Sep 2022
Applying wav2vec2 for Speech Recognition on Bengali Common Voices Dataset
Haz Sameen Shahgir
Khondker Salman Sayeed
Tanjeem Azwad Zaman
159
10
0
11 Sep 2022
Domain Specific Wav2vec 2.0 Fine-tuning For The SE&R 2022 Challenge
A. I. S. Ferreira
Gustavo dos Reis Oliveira
202
3
0
29 Jul 2022
PoeticTTS -- Controllable Poetry Reading for Literary Studies
Interspeech (Interspeech), 2022
Julia Koch
Florian Lux
Nadja Schauffler
T. Bernhart
Felix Dieterle
Jonas Kuhn
Sandra Richter
Gabriel Viehhauser
Ngoc Thang Vu
140
6
0
11 Jul 2022
Investigating the Impact of Cross-lingual Acoustic-Phonetic Similarities on Multilingual Speech Recognition
Interspeech (Interspeech), 2022
Muhammad Umar Farooq
Thomas Hain
83
4
0
07 Jul 2022
The THUEE System Description for the IARPA OpenASR21 Challenge
Interspeech (Interspeech), 2022
Jing Zhao
Haoyu Wang
Jinpeng Li
Shuzhou Chai
Guan-Bo Wang
Guoguo Chen
Weiqiang Zhang
VLM
118
1
0
29 Jun 2022
TEVR: Improving Speech Recognition by Token Entropy Variance Reduction
Hajo N. Krabbenhöft
Erhardt Barth
175
3
0
25 Jun 2022
Exact Prosody Cloning in Zero-Shot Multispeaker Text-to-Speech
Spoken Language Technology Workshop (SLT), 2022
Florian Lux
Julia Koch
Ngoc Thang Vu
202
23
0
24 Jun 2022
Exploring the Effectiveness of Self-supervised Learning and Classifier Chains in Emotion Recognition of Nonverbal Vocalizations
Detai Xin
Shinnosuke Takamichi
Hiroshi Saruwatari
97
15
0
21 Jun 2022
FLEURS: Few-shot Learning Evaluation of Universal Representations of Speech
Spoken Language Technology Workshop (SLT), 2022
Alexis Conneau
Min Ma
Simran Khanuja
Yu Zhang
Vera Axelrod
Siddharth Dalmia
Jason Riesa
Clara E. Rivera
Ankur Bapna
VLM
505
488
0
25 May 2022
Adaptive multilingual speech recognition with pretrained models
Interspeech (Interspeech), 2022
Ngoc-Quan Pham
A. Waibel
Jan Niehues
VLM
213
26
0
24 May 2022
Self-Supervised Speech Representation Learning: A Review
IEEE Journal on Selected Topics in Signal Processing (IEEE JSTSP), 2022
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSLAI4TS
679
445
0
21 May 2022
Automatic Spoken Language Identification using a Time-Delay Neural Network
Benjamin Kepecs
Homayoon Beigi
65
2
0
19 May 2022
Leveraging Pseudo-labeled Data to Improve Direct Speech-to-Speech Translation
Interspeech (Interspeech), 2022
Qianqian Dong
Fengpeng Yue
Tom Ko
Mingxuan Wang
Qibing Bai
Yu Zhang
245
19
0
18 May 2022
Quantifying Language Variation Acoustically with Few Resources
North American Chapter of the Association for Computational Linguistics (NAACL), 2022
Martijn Bartelds
Martijn B. Wieling
176
16
0
05 May 2022
ASR in German: A Detailed Error Analysis
John M. Wirth
René Peinl
149
7
0
12 Apr 2022
Transducer-based language embedding for spoken language identification
Interspeech (Interspeech), 2022
Peng Shen
Xugang Lu
Hisashi Kawai
178
8
0
08 Apr 2022
MAESTRO: Matched Speech Text Representations through Modality Matching
Interspeech (Interspeech), 2022
Zhehuai Chen
Yu Zhang
Andrew Rosenberg
Bhuvana Ramabhadran
Pedro J. Moreno
Ankur Bapna
Heiga Zen
250
119
0
07 Apr 2022
Enhanced Direct Speech-to-Speech Translation Using Self-supervised Pre-training and Data Augmentation
Interspeech (Interspeech), 2022
Sravya Popuri
Peng-Jen Chen
Changhan Wang
J. Pino
Yossi Adi
Jiatao Gu
Wei-Ning Hsu
Ann Lee
293
65
0
06 Apr 2022
Towards End-to-end Unsupervised Speech Recognition
Spoken Language Technology Workshop (SLT), 2022
Alexander H. Liu
Wei-Ning Hsu
Michael Auli
Alexei Baevski
SSL
235
84
0
05 Apr 2022
A Study of Gender Impact in Self-supervised Models for Speech-to-Text Systems
Interspeech (Interspeech), 2022
Marcely Zanon Boito
Laurent Besacier
N. Tomashenko
Yannick Esteve
214
24
0
04 Apr 2022
End-to-End Multi-speaker ASR with Independent Vector Analysis
Spoken Language Technology Workshop (SLT), 2022
Robin Scheibler
Wangyou Zhang
Xuankai Chang
Shinji Watanabe
Y. Qian
195
2
0
01 Apr 2022