arXiv: 2006.13979
Unsupervised Cross-lingual Representation Learning for Speech Recognition

24 June 2020
Alexis Conneau
Alexei Baevski
R. Collobert
Abdel-rahman Mohamed
Michael Auli
    SSL

Papers citing "Unsupervised Cross-lingual Representation Learning for Speech Recognition"

50 / 402 papers shown
Title
Accidental Learners: Spoken Language Identification in Multilingual Self-Supervised Models
Travis M. Bartley
Fei Jia
Krishna C. Puvvada
Samuel Kriman
Boris Ginsburg
SSL
21
6
0
09 Nov 2022
Comparative layer-wise analysis of self-supervised speech models
Ankita Pasad
Bowen Shi
Karen Livescu
SSL
22
109
0
08 Nov 2022
When to Laugh and How Hard? A Multimodal Approach to Detecting Humor and its Intensity
Khalid Alnajjar
Mika Hämäläinen
Jörg Tiedemann
Jorma T. Laaksonen
M. Kurimo
24
2
0
03 Nov 2022
Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing
Yonggan Fu
Yang Zhang
Kaizhi Qian
Zhifan Ye
Zhongzhi Yu
Cheng-I Jeff Lai
Yingyan Lin
24
8
0
02 Nov 2022
M-SpeechCLIP: Leveraging Large-Scale, Pre-Trained Models for Multilingual Speech to Image Retrieval
Layne Berry
Yi-Jen Shih
Hsuan-Fu Wang
Heng-Jui Chang
Hung-yi Lee
David F. Harwath
VLM
8
9
0
02 Nov 2022
Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech
Takaaki Saeki
Heiga Zen
Zhehuai Chen
Nobuyuki Morioka
Gary Wang
Yu Zhang
Ankur Bapna
Andrew Rosenberg
Bhuvana Ramabhadran
61
19
0
27 Oct 2022
FreeVC: Towards High-Quality Text-Free One-Shot Voice Conversion
Jingyi Li
Weiping Tu
Li Xiao
43
96
0
27 Oct 2022
Learning Music Representations with wav2vec 2.0
Alessandro Ragano
Emmanouil Benetos
Andrew Hines
22
9
0
27 Oct 2022
Iterative pseudo-forced alignment by acoustic CTC loss for self-supervised ASR domain adaptation
F. López
Jordi Luque
6
6
0
27 Oct 2022
Training Autoregressive Speech Recognition Models with Limited in-domain Supervision
Chak-Fai Li
Francis Keith
William Hartmann
M. Snover
14
0
0
27 Oct 2022
Efficient Utilization of Large Pre-Trained Models for Low Resource ASR
Peter Vieting
Christoph Luscher
Julian Dierkes
Ralf Schluter
Hermann Ney
33
5
0
26 Oct 2022
Multitask Detection of Speaker Changes, Overlapping Speech and Voice Activity Using wav2vec 2.0
Marie Kunesova
Zbynek Zajíc
SSL
VLM
13
15
0
26 Oct 2022
Investigating self-supervised, weakly supervised and fully supervised training approaches for multi-domain automatic speech recognition: a study on Bangladeshi Bangla
Ahnaf Mozib Samin
M. Kobir
Md. Mushtaq Shahriyar Rafee
M. F. Ahmed
Mehedi Hasan
Partha Ghosh
Shafkat Kibria
M. S. Rahman
SSL
18
0
0
24 Oct 2022
Optimizing Bilingual Neural Transducer with Synthetic Code-switching Text Generation
Thien Nguyen
Nathalie Tran
Liuhui Deng
Thiago Fraga da Silva
Matthew Radzihovsky
...
Honza Silovsky
Arnab Ghoshal
M. Martel
Bharat Ram Ambati
Mohamed Ali
22
5
0
21 Oct 2022
Experiments on Turkish ASR with Self-Supervised Speech Representation Learning
Ali Safaya
E. Erzin
16
1
0
13 Oct 2022
Multilingual Zero Resource Speech Recognition Base on Self-Supervise Pre-Trained Acoustic Models
Haoyu Wang
Weiqiang Zhang
Hongbin Suo
Yulong Wan
8
0
0
13 Oct 2022
Automatic Speech Recognition of Low-Resource Languages Based on Chukchi
Anastasia N. Safonova
Tatiana Yudina
Emil Nadimanov
Cydnie Davenport
11
3
0
11 Oct 2022
Fine-tuning Wav2vec for Vocal-burst Emotion Recognition
Dang-Khanh Nguyen
Sudarshan Pant
Ngoc-Huynh Ho
Gueesang Lee
Soo-Hyung Kim
Hyung-Jeong Yang
22
3
0
01 Oct 2022
MeWEHV: Mel and Wave Embeddings for Human Voice Tasks
Andrés Vasco-Carofilis
Laura Fernández-Robles
Enrique Alegre
Eduardo Fidalgo
40
1
0
28 Sep 2022
Learning to Drop Out: An Adversarial Approach to Training Sequence VAEs
Đorđe Miladinović
Kumar Shridhar
Kushal Kumar Jain
Max B. Paulus
J. M. Buhmann
Mrinmaya Sachan
Carl Allen
DRL
21
5
0
26 Sep 2022
Cross-domain Voice Activity Detection with Self-Supervised Representations
Sina Alisamir
F. Ringeval
François Portet
17
3
0
22 Sep 2022
Bangla-Wave: Improving Bangla Automatic Speech Recognition Utilizing N-gram Language Models
Mohammed Rakib
Md. Ismail Hossain
Nabeel Mohammed
Fuad Rahman
VLM
17
6
0
13 Sep 2022
Learning ASR pathways: A sparse multilingual ASR model
Mu Yang
Andros Tjandra
Chunxi Liu
David C. Zhang
Duc Le
Ozlem Kalinli
33
13
0
13 Sep 2022
Multilingual Transformer Language Model for Speech Recognition in Low-resource Languages
Li Miao
Jian Wu
Piyush Behre
Shuangyu Chang
S. Parthasarathy
16
2
0
08 Sep 2022
ASR2K: Speech Recognition for Around 2000 Languages without Audio
Xinjian Li
Florian Metze
David R. Mortensen
A. Black
Shinji Watanabe
15
27
0
06 Sep 2022
Distinguishing between pre- and post-treatment in the speech of patients with chronic obstructive pulmonary disease
Andreas Triantafyllopoulos
M. Fendler
A. Batliner
Maurice Gerczuk
Shahin Amiriparian
Thomas Berghaus
Björn W. Schuller
10
7
0
26 Jul 2022
SecretGen: Privacy Recovery on Pre-Trained Models via Distribution Discrimination
Zhu-rong Yuan
Fan Wu
Yunhui Long
Chaowei Xiao
Bo-wen Li
18
8
0
25 Jul 2022
When Is TTS Augmentation Through a Pivot Language Useful?
Nathaniel R. Robinson
Perez Ogayo
Swetha Gangu
David R. Mortensen
Shinji Watanabe
12
9
0
20 Jul 2022
u-HuBERT: Unified Mixed-Modal Speech Pretraining And Zero-Shot Transfer to Unlabeled Modality
Wei-Ning Hsu
Bowen Shi
SSL
VLM
19
41
0
14 Jul 2022
A Comparative Study of Self-supervised Speech Representation Based Voice Conversion
Wen-Chin Huang
Shu-Wen Yang
Tomoki Hayashi
T. Toda
10
15
0
10 Jul 2022
Non-Linear Pairwise Language Mappings for Low-Resource Multilingual Acoustic Model Fusion
Muhammad Umar Farooq
Darshan Adiga Haniya Narayana
Thomas Hain
11
2
0
07 Jul 2022
Investigating the Impact of Cross-lingual Acoustic-Phonetic Similarities on Multilingual Speech Recognition
Muhammad Umar Farooq
Thomas Hain
11
3
0
07 Jul 2022
Improving Low-Resource Speech Recognition with Pretrained Speech Models: Continued Pretraining vs. Semi-Supervised Training
Mitchell DeHaven
J. Billa
VLM
AI4TS
15
8
0
01 Jul 2022
Toward Low-Cost End-to-End Spoken Language Understanding
Marco Dinarelli
M. Naguib
François Portet
20
5
0
01 Jul 2022
The THUEE System Description for the IARPA OpenASR21 Challenge
Jing Zhao
Haoyu Wang
Jinpeng Li
Shuzhou Chai
Guan-Bo Wang
Guoguo Chen
Weiqiang Zhang
VLM
22
1
0
29 Jun 2022
Wav2Vec-Aug: Improved self-supervised training with limited data
Anuroop Sriram
Michael Auli
Alexei Baevski
SSL
VLM
6
15
0
27 Jun 2022
Few-Shot Cross-Lingual TTS Using Transferable Phoneme Embedding
Wei-Ping Huang
Po-Chun Chen
Sung-Feng Huang
Hung-yi Lee
19
1
0
27 Jun 2022
Annotated Speech Corpus for Low Resource Indian Languages: Awadhi, Bhojpuri, Braj and Magahi
Ritesh Kumar
Siddharth Singh
Shyam Ratan
Mohith S Raj
Sonal Sinha
Bornini Lahiri
Vivek Seshadri
Kalika Bali
Atul Kr. Ojha
14
6
0
26 Jun 2022
TEVR: Improving Speech Recognition by Token Entropy Variance Reduction
Hajo N. Krabbenhöft
Erhardt Barth
14
3
0
25 Jun 2022
Distilling a Pretrained Language Model to a Multilingual ASR Model
Kwanghee Choi
Hyung-Min Park
VLM
11
10
0
25 Jun 2022
The MuSe 2022 Multimodal Sentiment Analysis Challenge: Humor, Emotional Reactions, and Stress
Lukas Christ
Shahin Amiriparian
Alice Baird
Panagiotis Tzirakis
Alexander Kathan
...
Eva-Maria Messner
Andreas Konig
Alan S. Cowen
Erik Cambria
Björn W. Schuller
27
60
0
23 Jun 2022
COVYT: Introducing the Coronavirus YouTube and TikTok speech dataset featuring the same speakers with and without infection
Andreas Triantafyllopoulos
A. Semertzidou
Meishu Song
Florian B. Pokorny
Björn W. Schuller
44
2
0
20 Jun 2022
Transformer-based Automatic Speech Recognition of Formal and Colloquial Czech in MALACH Project
Jan Lehecka
J. Psutka
Josef Psutka
8
4
0
15 Jun 2022
Exploring Capabilities of Monolingual Audio Transformers using Large Datasets in Automatic Speech Recognition of Czech
Jan Lehecka
J. Svec
A. Pražák
J. Psutka
14
12
0
15 Jun 2022
LAE: Language-Aware Encoder for Monolingual and Multilingual ASR
Jinchuan Tian
Jianwei Yu
Chunlei Zhang
Chao Weng
Yuexian Zou
Dong Yu
AuLLM
17
25
0
05 Jun 2022
Speaker Identification using Speech Recognition
Syeda Rabia Arshad
Syed Mujtaba Haider
Abdul Basit Mughal
6
1
0
29 May 2022
FLEURS: Few-shot Learning Evaluation of Universal Representations of Speech
Alexis Conneau
Min Ma
Simran Khanuja
Yu Zhang
Vera Axelrod
Siddharth Dalmia
Jason Riesa
Clara E. Rivera
Ankur Bapna
VLM
78
282
0
25 May 2022
T-Modules: Translation Modules for Zero-Shot Cross-Modal Machine Translation
Paul-Ambroise Duquenne
Hongyu Gong
Benoît Sagot
Holger Schwenk
22
18
0
24 May 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
128
349
0
21 May 2022
End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions
Wonjune Kang
M. Hasegawa-Johnson
D. Roy
24
8
0
19 May 2022