2012.03411
MLS: A Large-Scale Multilingual Dataset for Speech Research
Interspeech (Interspeech), 2020
7 December 2020
Vineel Pratap
Qiantong Xu
Anuroop Sriram
Gabriel Synnaeve
R. Collobert
AuLLM
Papers citing
"MLS: A Large-Scale Multilingual Dataset for Speech Research"
50 / 390 papers shown
From English to More Languages: Parameter-Efficient Model Reprogramming for Cross-Lingual Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Chao-Han Huck Yang
Yue Liu
Yu Zhang
Nanxin Chen
Rohit Prabhavalkar
Tara N. Sainath
Trevor Strohman
188
32
0
19 Jan 2023
Scaling Laws for Generative Mixed-Modal Language Models
International Conference on Machine Learning (ICML), 2023
Armen Aghajanyan
L. Yu
Alexis Conneau
Wei-Ning Hsu
Karen Hambardzumyan
Susan Zhang
Stephen Roller
Naman Goyal
Omer Levy
Luke Zettlemoyer
MoE
VLM
314
137
0
10 Jan 2023
Supervised Acoustic Embeddings And Their Transferability Across Languages
International Conference on Natural Language and Speech Processing (ICNLSP), 2023
Sreepratha Ram
Hanan Aldarmaki
SSL
152
4
0
03 Jan 2023
ReVISE: Self-Supervised Speech Resynthesis with Visual Input for Universal and Generalized Speech Enhancement
Wei-Ning Hsu
Tal Remez
Bowen Shi
Jacob Donley
Yossi Adi
DiffM
238
14
0
21 Dec 2022
Mu$^{2}$SLAM: Multitask, Multilingual Speech and Language Models
International Conference on Machine Learning (ICML), 2022
Yong Cheng
Yu Zhang
Melvin Johnson
Wolfgang Macherey
Ankur Bapna
194
9
0
19 Dec 2022
UnitY: Two-pass Direct Speech-to-speech Translation with Discrete Units
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Hirofumi Inaguma
Sravya Popuri
Ilia Kulikov
Peng-Jen Chen
Changhan Wang
Yu-An Chung
Yun Tang
Ann Lee
Shinji Watanabe
J. Pino
320
77
0
15 Dec 2022
Ring That Bell: A Corpus and Method for Multimodal Metaphor Detection in Videos
Khalid Alnajjar
Mika Hämäläinen
Shuo Zhang
181
10
0
15 Dec 2022
Towards trustworthy phoneme boundary detection with autoregressive model and improved evaluation metric
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Hyeongju Kim
Hyeong-Seok Choi
138
2
0
13 Dec 2022
Robust Speech Recognition via Large-Scale Weak Supervision
International Conference on Machine Learning (ICML), 2022
Alec Radford
Jong Wook Kim
Tao Xu
Greg Brockman
C. McLeavey
Ilya Sutskever
OffRL
1.0K
5,873
0
06 Dec 2022
EURO: ESPnet Unsupervised ASR Open-source Toolkit
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Dongji Gao
Jiatong Shi
Shun-Po Chuang
Leibny Paola García-Perera
Hung-yi Lee
Shinji Watanabe
Sanjeev Khudanpur
229
10
0
30 Nov 2022
Dialogs Re-enacted Across Languages
Nigel G. Ward
Jonathan Avila
Emilia Rivas
Divette Marco
213
2
0
18 Nov 2022
Casual Conversations v2: Designing a large consent-driven dataset to measure algorithmic bias and robustness
C. Hazirbas
Yejin Bang
Tiezheng Yu
Parisa Assar
Bilal Porgali
...
Jacqueline Pan
Emily McReynolds
Miranda Bogen
Pascale Fung
Cristian Canton Ferrer
225
8
0
10 Nov 2022
Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Andros Tjandra
Nayan Singhal
David C. Zhang
Ozlem Kalinli
Abdel-rahman Mohamed
Duc Le
M. Seltzer
234
20
0
10 Nov 2022
Multi-blank Transducers for Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Hainan Xu
Fei Jia
Somshubra Majumdar
Shinji Watanabe
Boris Ginsburg
222
12
0
04 Nov 2022
I4U System Description for NIST SRE'20 CTS Challenge
Kong Aik Lee
Tomi Kinnunen
Daniele Colibro
C. Vair
A. Nautsch
...
Ruijie Tao
Haizhou Li
Alfonso Ortega Giménez
Longbiao Wang
L. Buera
84
1
0
02 Nov 2022
Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Takaaki Saeki
Heiga Zen
Zhehuai Chen
Nobuyuki Morioka
Gary Wang
Yu Zhang
Ankur Bapna
Andrew Rosenberg
Bhuvana Ramabhadran
277
22
0
27 Oct 2022
Multi-class Detection of Pathological Speech with Latent Features: How does it perform on unseen data?
Interspeech (Interspeech), 2022
Dominik Wagner
Ilja Baumann
Franziska Braun
Sebastian P. Bayerl
Elmar Nöth
Korbinian Riedhammer
Tobias Bocklet
193
15
0
27 Oct 2022
Improving Speech-to-Speech Translation Through Unlabeled Text
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Xuan-Phi Nguyen
Sravya Popuri
Changhan Wang
Yun Tang
Ilia Kulikov
Hongyu Gong
203
9
0
26 Oct 2022
EBEN: Extreme bandwidth extension network applied to speech signals captured with noise-resilient body-conduction microphones
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
J. Hauret
Thomas Joubaud
V. Zimpfer
Éric Bavu
171
19
0
25 Oct 2022
ESB: A Benchmark For Multi-Domain End-to-End Speech Recognition
Sanchit Gandhi
Patrick von Platen
Alexander M. Rush
142
27
0
24 Oct 2022
Low-Resource Multilingual and Zero-Shot Multispeaker TTS
Florian Lux
Julia Koch
Ngoc Thang Vu
218
26
0
21 Oct 2022
Large-scale learning of generalised representations for speaker recognition
Jee-weon Jung
Hee-Soo Heo
Bong-Jin Lee
Jaesong Lee
Hye-jin Shim
Youngki Kwon
Joon Son Chung
Shinji Watanabe
CVBM
215
6
0
20 Oct 2022
Maestro-U: Leveraging joint speech-text representation learning for zero supervised speech ASR
Spoken Language Technology Workshop (SLT), 2022
Zhehuai Chen
Ankur Bapna
Andrew Rosenberg
Yu Zhang
Bhuvana Ramabhadran
Pedro J. Moreno
Nanxin Chen
246
17
0
18 Oct 2022
Bringing NURC/SP to Digital Life: the Role of Open-source Automatic Speech Recognition Models
L. Gris
Arnaldo Cândido Júnior
V. G. Santos
B. Dias
Marli Quadros Leite
F. Svartman
S. Aluísio
161
3
0
14 Oct 2022
On the Utility of Self-supervised Models for Prosody-related Tasks
Spoken Language Technology Workshop (SLT), 2022
Guan-Ting Lin
Chiyu Feng
Wei-Ping Huang
Yuan Tseng
Tzu-Han Lin
Chen-An Li
Hung-yi Lee
Nigel G. Ward
200
61
0
13 Oct 2022
Fine-tuning Wav2vec for Vocal-burst Emotion Recognition
Dang-Khanh Nguyen
Sudarshan Pant
Ngoc-Huynh Ho
Gueesang Lee
Soo-Hyung Kim
Hyung-Jeong Yang
111
3
0
01 Oct 2022
MeWEHV: Mel and Wave Embeddings for Human Voice Tasks
IEEE Access (IEEE Access), 2022
Andrés Vasco-Carofilis
Laura Fernández-Robles
Enrique Alegre
Eduardo Fidalgo
198
4
0
28 Sep 2022
Bangla-Wave: Improving Bangla Automatic Speech Recognition Utilizing N-gram Language Models
International Conference on Software and Computer Applications (ICSCA), 2022
Mohammed Rakib
Md. Ismail Hossain
Nabeel Mohammed
Fuad Rahman
VLM
174
9
0
13 Sep 2022
Learning ASR pathways: A sparse multilingual ASR model
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Mu Yang
Andros Tjandra
Chunxi Liu
David C. Zhang
Duc Le
Ozlem Kalinli
394
14
0
13 Sep 2022
Applying wav2vec2 for Speech Recognition on Bengali Common Voices Dataset
Haz Sameen Shahgir
Khondker Salman Sayeed
Tanjeem Azwad Zaman
159
10
0
11 Sep 2022
Domain Specific Wav2vec 2.0 Fine-tuning For The SE&R 2022 Challenge
A. I. S. Ferreira
Gustavo dos Reis Oliveira
202
3
0
29 Jul 2022
PoeticTTS -- Controllable Poetry Reading for Literary Studies
Interspeech (Interspeech), 2022
Julia Koch
Florian Lux
Nadja Schauffler
T. Bernhart
Felix Dieterle
Jonas Kuhn
Sandra Richter
Gabriel Viehhauser
Ngoc Thang Vu
140
6
0
11 Jul 2022
Investigating the Impact of Cross-lingual Acoustic-Phonetic Similarities on Multilingual Speech Recognition
Interspeech (Interspeech), 2022
Muhammad Umar Farooq
Thomas Hain
83
4
0
07 Jul 2022
The THUEE System Description for the IARPA OpenASR21 Challenge
Interspeech (Interspeech), 2022
Jing Zhao
Haoyu Wang
Jinpeng Li
Shuzhou Chai
Guan-Bo Wang
Guoguo Chen
Weiqiang Zhang
VLM
118
1
0
29 Jun 2022
TEVR: Improving Speech Recognition by Token Entropy Variance Reduction
Hajo N. Krabbenhöft
Erhardt Barth
175
3
0
25 Jun 2022
Exact Prosody Cloning in Zero-Shot Multispeaker Text-to-Speech
Spoken Language Technology Workshop (SLT), 2022
Florian Lux
Julia Koch
Ngoc Thang Vu
202
23
0
24 Jun 2022
Exploring the Effectiveness of Self-supervised Learning and Classifier Chains in Emotion Recognition of Nonverbal Vocalizations
Detai Xin
Shinnosuke Takamichi
Hiroshi Saruwatari
97
15
0
21 Jun 2022
FLEURS: Few-shot Learning Evaluation of Universal Representations of Speech
Spoken Language Technology Workshop (SLT), 2022
Alexis Conneau
Min Ma
Simran Khanuja
Yu Zhang
Vera Axelrod
Siddharth Dalmia
Jason Riesa
Clara E. Rivera
Ankur Bapna
VLM
505
488
0
25 May 2022
Adaptive multilingual speech recognition with pretrained models
Interspeech (Interspeech), 2022
Ngoc-Quan Pham
A. Waibel
Jan Niehues
VLM
213
26
0
24 May 2022
Self-Supervised Speech Representation Learning: A Review
IEEE Journal on Selected Topics in Signal Processing (IEEE JSTSP), 2022
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
679
445
0
21 May 2022
Automatic Spoken Language Identification using a Time-Delay Neural Network
Benjamin Kepecs
Homayoon Beigi
65
2
0
19 May 2022
Leveraging Pseudo-labeled Data to Improve Direct Speech-to-Speech Translation
Interspeech (Interspeech), 2022
Qianqian Dong
Fengpeng Yue
Tom Ko
Mingxuan Wang
Qibing Bai
Yu Zhang
245
19
0
18 May 2022
Quantifying Language Variation Acoustically with Few Resources
North American Chapter of the Association for Computational Linguistics (NAACL), 2022
Martijn Bartelds
Martijn B. Wieling
176
16
0
05 May 2022
ASR in German: A Detailed Error Analysis
John M. Wirth
René Peinl
149
7
0
12 Apr 2022
Transducer-based language embedding for spoken language identification
Interspeech (Interspeech), 2022
Peng Shen
Xugang Lu
Hisashi Kawai
178
8
0
08 Apr 2022
MAESTRO: Matched Speech Text Representations through Modality Matching
Interspeech (Interspeech), 2022
Zhehuai Chen
Yu Zhang
Andrew Rosenberg
Bhuvana Ramabhadran
Pedro J. Moreno
Ankur Bapna
Heiga Zen
250
119
0
07 Apr 2022
Enhanced Direct Speech-to-Speech Translation Using Self-supervised Pre-training and Data Augmentation
Interspeech (Interspeech), 2022
Sravya Popuri
Peng-Jen Chen
Changhan Wang
J. Pino
Yossi Adi
Jiatao Gu
Wei-Ning Hsu
Ann Lee
293
65
0
06 Apr 2022
Towards End-to-end Unsupervised Speech Recognition
Spoken Language Technology Workshop (SLT), 2022
Alexander H. Liu
Wei-Ning Hsu
Michael Auli
Alexei Baevski
SSL
235
84
0
05 Apr 2022
A Study of Gender Impact in Self-supervised Models for Speech-to-Text Systems
Interspeech (Interspeech), 2022
Marcely Zanon Boito
Laurent Besacier
N. Tomashenko
Yannick Esteve
214
24
0
04 Apr 2022
End-to-End Multi-speaker ASR with Independent Vector Analysis
Spoken Language Technology Workshop (SLT), 2022
Robin Scheibler
Wangyou Zhang
Xuankai Chang
Shinji Watanabe
Y. Qian
195
2
0
01 Apr 2022