2006.13979
Unsupervised Cross-lingual Representation Learning for Speech Recognition
24 June 2020
Alexis Conneau
Alexei Baevski
R. Collobert
Abdel-rahman Mohamed
Michael Auli
SSL
Papers citing
"Unsupervised Cross-lingual Representation Learning for Speech Recognition"
50 / 402 papers shown
Title
Accidental Learners: Spoken Language Identification in Multilingual Self-Supervised Models
Travis M. Bartley
Fei Jia
Krishna C. Puvvada
Samuel Kriman
Boris Ginsburg
SSL
21
6
0
09 Nov 2022
Comparative layer-wise analysis of self-supervised speech models
Ankita Pasad
Bowen Shi
Karen Livescu
SSL
22
109
0
08 Nov 2022
When to Laugh and How Hard? A Multimodal Approach to Detecting Humor and its Intensity
Khalid Alnajjar
Mika Hämäläinen
Jörg Tiedemann
Jorma T. Laaksonen
M. Kurimo
24
2
0
03 Nov 2022
Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing
Yonggan Fu
Yang Zhang
Kaizhi Qian
Zhifan Ye
Zhongzhi Yu
Cheng-I Jeff Lai
Yingyan Lin
24
8
0
02 Nov 2022
M-SpeechCLIP: Leveraging Large-Scale, Pre-Trained Models for Multilingual Speech to Image Retrieval
Layne Berry
Yi-Jen Shih
Hsuan-Fu Wang
Heng-Jui Chang
Hung-yi Lee
David F. Harwath
VLM
8
9
0
02 Nov 2022
Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech
Takaaki Saeki
Heiga Zen
Zhehuai Chen
Nobuyuki Morioka
Gary Wang
Yu Zhang
Ankur Bapna
Andrew Rosenberg
Bhuvana Ramabhadran
61
19
0
27 Oct 2022
FreeVC: Towards High-Quality Text-Free One-Shot Voice Conversion
Jingyi Li
Weiping Tu
Li Xiao
43
96
0
27 Oct 2022
Learning Music Representations with wav2vec 2.0
Alessandro Ragano
Emmanouil Benetos
Andrew Hines
22
9
0
27 Oct 2022
Iterative pseudo-forced alignment by acoustic CTC loss for self-supervised ASR domain adaptation
F. López
Jordi Luque
6
6
0
27 Oct 2022
Training Autoregressive Speech Recognition Models with Limited in-domain Supervision
Chak-Fai Li
Francis Keith
William Hartmann
M. Snover
14
0
0
27 Oct 2022
Efficient Utilization of Large Pre-Trained Models for Low Resource ASR
Peter Vieting
Christoph Luscher
Julian Dierkes
Ralf Schluter
Hermann Ney
33
5
0
26 Oct 2022
Multitask Detection of Speaker Changes, Overlapping Speech and Voice Activity Using wav2vec 2.0
Marie Kunesova
Zbynek Zajíc
SSL
VLM
13
15
0
26 Oct 2022
Investigating self-supervised, weakly supervised and fully supervised training approaches for multi-domain automatic speech recognition: a study on Bangladeshi Bangla
Ahnaf Mozib Samin
M. Kobir
Md. Mushtaq Shahriyar Rafee
M. F. Ahmed
Mehedi Hasan
Partha Ghosh
Shafkat Kibria
M. S. Rahman
SSL
18
0
0
24 Oct 2022
Optimizing Bilingual Neural Transducer with Synthetic Code-switching Text Generation
Thien Nguyen
Nathalie Tran
Liuhui Deng
Thiago Fraga da Silva
Matthew Radzihovsky
...
Honza Silovsky
Arnab Ghoshal
M. Martel
Bharat Ram Ambati
Mohamed Ali
22
5
0
21 Oct 2022
Experiments on Turkish ASR with Self-Supervised Speech Representation Learning
Ali Safaya
E. Erzin
16
1
0
13 Oct 2022
Multilingual Zero Resource Speech Recognition Base on Self-Supervise Pre-Trained Acoustic Models
Haoyu Wang
Weiqiang Zhang
Hongbin Suo
Yulong Wan
8
0
0
13 Oct 2022
Automatic Speech Recognition of Low-Resource Languages Based on Chukchi
Anastasia N. Safonova
Tatiana Yudina
Emil Nadimanov
Cydnie Davenport
11
3
0
11 Oct 2022
Fine-tuning Wav2vec for Vocal-burst Emotion Recognition
Dang-Khanh Nguyen
Sudarshan Pant
Ngoc-Huynh Ho
Gueesang Lee
Soo-Huyng Kim
Hyung-Jeong Yang
22
3
0
01 Oct 2022
MeWEHV: Mel and Wave Embeddings for Human Voice Tasks
Andrés Vasco-Carofilis
Laura Fernández-Robles
Enrique Alegre
Eduardo Fidalgo
40
1
0
28 Sep 2022
Learning to Drop Out: An Adversarial Approach to Training Sequence VAEs
Đorđe Miladinovic
Kumar Shridhar
Kushal Kumar Jain
Max B. Paulus
J. M. Buhmann
Mrinmaya Sachan
Carl Allen
DRL
21
5
0
26 Sep 2022
Cross-domain Voice Activity Detection with Self-Supervised Representations
Sina Alisamir
F. Ringeval
François Portet
17
3
0
22 Sep 2022
Bangla-Wave: Improving Bangla Automatic Speech Recognition Utilizing N-gram Language Models
Mohammed Rakib
Md. Ismail Hossain
Nabeel Mohammed
Fuad Rahman
VLM
17
6
0
13 Sep 2022
Learning ASR pathways: A sparse multilingual ASR model
Mu Yang
Andros Tjandra
Chunxi Liu
David C. Zhang
Duc Le
Ozlem Kalinli
33
13
0
13 Sep 2022
Multilingual Transformer Language Model for Speech Recognition in Low-resource Languages
Li Miao
Jian Wu
Piyush Behre
Shuangyu Chang
S. Parthasarathy
16
2
0
08 Sep 2022
ASR2K: Speech Recognition for Around 2000 Languages without Audio
Xinjian Li
Florian Metze
David R. Mortensen
A. Black
Shinji Watanabe
15
27
0
06 Sep 2022
Distinguishing between pre- and post-treatment in the speech of patients with chronic obstructive pulmonary disease
Andreas Triantafyllopoulos
M. Fendler
A. Batliner
Maurice Gerczuk
Shahin Amiriparian
Thomas Berghaus
Björn W. Schuller
10
7
0
26 Jul 2022
SecretGen: Privacy Recovery on Pre-Trained Models via Distribution Discrimination
Zhu-rong Yuan
Fan Wu
Yunhui Long
Chaowei Xiao
Bo-wen Li
18
8
0
25 Jul 2022
When Is TTS Augmentation Through a Pivot Language Useful?
Nathaniel R. Robinson
Perez Ogayo
Swetha Gangu
David R. Mortensen
Shinji Watanabe
12
9
0
20 Jul 2022
u-HuBERT: Unified Mixed-Modal Speech Pretraining And Zero-Shot Transfer to Unlabeled Modality
Wei-Ning Hsu
Bowen Shi
SSL
VLM
19
41
0
14 Jul 2022
A Comparative Study of Self-supervised Speech Representation Based Voice Conversion
Wen-Chin Huang
Shu-Wen Yang
Tomoki Hayashi
T. Toda
10
15
0
10 Jul 2022
Non-Linear Pairwise Language Mappings for Low-Resource Multilingual Acoustic Model Fusion
Muhammad Umar Farooq
Darshan Adiga Haniya Narayana
Thomas Hain
11
2
0
07 Jul 2022
Investigating the Impact of Cross-lingual Acoustic-Phonetic Similarities on Multilingual Speech Recognition
Muhammad Umar Farooq
Thomas Hain
11
3
0
07 Jul 2022
Improving Low-Resource Speech Recognition with Pretrained Speech Models: Continued Pretraining vs. Semi-Supervised Training
Mitchell DeHaven
J. Billa
VLM
AI4TS
15
8
0
01 Jul 2022
Toward Low-Cost End-to-End Spoken Language Understanding
Marco Dinarelli
M. Naguib
François Portet
20
5
0
01 Jul 2022
The THUEE System Description for the IARPA OpenASR21 Challenge
Jing Zhao
Haoyu Wang
Jinpeng Li
Shuzhou Chai
Guan-Bo Wang
Guoguo Chen
Weiqiang Zhang
VLM
22
1
0
29 Jun 2022
Wav2Vec-Aug: Improved self-supervised training with limited data
Anuroop Sriram
Michael Auli
Alexei Baevski
SSL
VLM
6
15
0
27 Jun 2022
Few-Shot Cross-Lingual TTS Using Transferable Phoneme Embedding
Wei-Ping Huang
Po-Chun Chen
Sung-Feng Huang
Hung-yi Lee
19
1
0
27 Jun 2022
Annotated Speech Corpus for Low Resource Indian Languages: Awadhi, Bhojpuri, Braj and Magahi
Ritesh Kumar
Siddharth Singh
Shyam Ratan
Mohith S Raj
Sonal Sinha
Bornini Lahiri
Vivek Seshadri
Kalika Bali
Atul Kr. Ojha
14
6
0
26 Jun 2022
TEVR: Improving Speech Recognition by Token Entropy Variance Reduction
Hajo N. Krabbenhöft
Erhardt Barth
14
3
0
25 Jun 2022
Distilling a Pretrained Language Model to a Multilingual ASR Model
Kwanghee Choi
Hyung-Min Park
VLM
11
10
0
25 Jun 2022
The MuSe 2022 Multimodal Sentiment Analysis Challenge: Humor, Emotional Reactions, and Stress
Lukas Christ
Shahin Amiriparian
Alice Baird
Panagiotis Tzirakis
Alexander Kathan
...
Eva-Maria Messner
Andreas Konig
Alan S. Cowen
Erik Cambria
Björn W. Schuller
27
60
0
23 Jun 2022
COVYT: Introducing the Coronavirus YouTube and TikTok speech dataset featuring the same speakers with and without infection
Andreas Triantafyllopoulos
A. Semertzidou
Meishu Song
Florian B. Pokorny
Björn W. Schuller
44
2
0
20 Jun 2022
Transformer-based Automatic Speech Recognition of Formal and Colloquial Czech in MALACH Project
Jan Lehecka
J. Psutka
Josef Psutka
8
4
0
15 Jun 2022
Exploring Capabilities of Monolingual Audio Transformers using Large Datasets in Automatic Speech Recognition of Czech
Jan Lehecka
J. Svec
A. Pražák
J. Psutka
14
12
0
15 Jun 2022
LAE: Language-Aware Encoder for Monolingual and Multilingual ASR
Jinchuan Tian
Jianwei Yu
Chunlei Zhang
Chao Weng
Yuexian Zou
Dong Yu
AuLLM
17
25
0
05 Jun 2022
Speaker Identification using Speech Recognition
Syeda Rabia Arshad
Syed Mujtaba Haider
Abdul Basit Mughal
6
1
0
29 May 2022
FLEURS: Few-shot Learning Evaluation of Universal Representations of Speech
Alexis Conneau
Min Ma
Simran Khanuja
Yu Zhang
Vera Axelrod
Siddharth Dalmia
Jason Riesa
Clara E. Rivera
Ankur Bapna
VLM
78
282
0
25 May 2022
T-Modules: Translation Modules for Zero-Shot Cross-Modal Machine Translation
Paul-Ambroise Duquenne
Hongyu Gong
Benoît Sagot
Holger Schwenk
22
18
0
24 May 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
128
349
0
21 May 2022
End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions
Wonjune Kang
M. Hasegawa-Johnson
D. Roy
24
8
0
19 May 2022