Unsupervised Cross-lingual Representation Learning for Speech Recognition

24 June 2020
Alexis Conneau
Alexei Baevski
R. Collobert
Abdel-rahman Mohamed
Michael Auli
    SSL

Papers citing "Unsupervised Cross-lingual Representation Learning for Speech Recognition"

50 / 402 papers shown
Title
From `Snippet-lects' to Doculects and Dialects: Leveraging Neural Representations of Speech for Placing Audio Signals in a Language Landscape
Severine Guillaume
Guillaume Wisniewski
Alexis Michaud
13
2
0
29 May 2023
CommonAccent: Exploring Large Acoustic Pretrained Models for Accent Classification Based on Common Voice
Juan Pablo Zuluaga
Sara Ahmed
Danielius Visockas
Cem Subakan
VLM
14
10
0
29 May 2023
DisfluencyFixer: A tool to enhance Language Learning through Speech To Speech Disfluency Correction
Vineet Bhat
P. Jyothi
P. Bhattacharyya
19
0
0
26 May 2023
Transfer Learning for Personality Perception via Speech Emotion Recognition
Yuanchao Li
P. Bell
Catherine Lai
CVBM
19
3
0
25 May 2023
PoCaPNet: A Novel Approach for Surgical Phase Recognition Using Speech and X-Ray Images
Kubilay Can Demir
Tobias Weise
M. May
Axel Schmid
Andreas K. Maier
Seung Hee Yang
32
1
0
25 May 2023
Cross-lingual Knowledge Transfer and Iterative Pseudo-labeling for Low-Resource Speech Recognition with Transducers
J. Silovský
Liuhui Deng
Arturo Argueta
Tresi Arvizo
Roger Hsiao
Sasha Kuznietsov
Yiu-Chang Lin
Xiaoqiang Xiao
Yuanyuan Zhang
16
2
0
23 May 2023
Scaling Speech Technology to 1,000+ Languages
Vineel Pratap
Andros Tjandra
Bowen Shi
Paden Tomasello
Arun Babu
...
Yossi Adi
Xiaohui Zhang
Wei-Ning Hsu
Alexis Conneau
Michael Auli
VLM
77
298
0
22 May 2023
Language-universal phonetic encoder for low-resource speech recognition
Siyuan Feng
Ming Tu
Rui Xia
Chuanzeng Huang
Yuxuan Wang
31
2
0
19 May 2023
Language-Universal Phonetic Representation in Multilingual Speech Pretraining for Low-Resource Speech Recognition
Siyuan Feng
Ming Tu
Rui Xia
Chuanzeng Huang
Yuxuan Wang
35
5
0
19 May 2023
ML-SUPERB: Multilingual Speech Universal PERformance Benchmark
Jiatong Shi
Dan Berrebbi
William Chen
Ho-Lam Chung
En-Pei Hu
...
Xuankai Chang
Shang-Wen Li
Abdel-rahman Mohamed
Hung-yi Lee
Shinji Watanabe
ELM
55
58
0
18 May 2023
DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning
Alexander H. Liu
Heng-Jui Chang
Michael Auli
Wei-Ning Hsu
James R. Glass
22
24
0
17 May 2023
The Interpreter Understands Your Meaning: End-to-end Spoken Language Understanding Aided by Speech Translation
Mutian He
Philip N. Garner
36
4
0
16 May 2023
Improving Cascaded Unsupervised Speech Translation with Denoising Back-translation
Yu-Kuan Fu
Liang-Hsuan Tseng
Jiatong Shi
Chen An Li
Tsung-Yuan Hsu
Shinji Watanabe
Hung-yi Lee
15
4
0
12 May 2023
Exploration of Language Dependency for Japanese Self-Supervised Speech Representation Models
Takanori Ashihara
Takafumi Moriya
Kohei Matsuura
Tomohiro Tanaka
25
3
0
09 May 2023
A Virtual Simulation-Pilot Agent for Training of Air Traffic Controllers
Juan Pablo Zuluaga
Amrutha Prasad
Iuliia Nigmatulina
P. Motlícek
Matthias Kleinert
24
21
0
16 Apr 2023
Looking Similar, Sounding Different: Leveraging Counterfactual Cross-Modal Pairs for Audiovisual Representation Learning
Nikhil Singh
Chih-Wei Wu
Iroro Orife
Mahdi M. Kalayeh
23
2
0
12 Apr 2023
Multilingual Word Error Rate Estimation: e-WER3
Shammur A. Chowdhury
Ahmed M. Ali
16
7
0
02 Apr 2023
AfroDigits: A Community-Driven Spoken Digit Dataset for African Languages
Chris C. Emezue
Sanchit Gandhi
Lewis Tunstall
Abubakar Abid
Josh Meyer
...
Douwe Kiela
Yacine Jernite
Julien Chaumond
Merve Noyan
Omar Sanseviero
25
2
0
22 Mar 2023
A Complete Survey on Generative AI (AIGC): Is ChatGPT from GPT-4 to GPT-5 All You Need?
Chaoning Zhang
Chenshuang Zhang
Sheng Zheng
Yu Qiao
Chenghao Li
...
Lik-Hang Lee
Yang Yang
Heng Tao Shen
In So Kweon
Choong Seon Hong
79
159
0
21 Mar 2023
Cocktail HuBERT: Generalized Self-Supervised Pre-training for Mixture and Single-Source Speech
Maryam Fazel-Zarandi
Wei-Ning Hsu
SSL
16
8
0
20 Mar 2023
Cascading and Direct Approaches to Unsupervised Constituency Parsing on Spoken Sentences
Yuan Tseng
Cheng-I Jeff Lai
Hung-yi Lee
SSL
35
4
0
15 Mar 2023
Learning Cross-lingual Visual Speech Representations
Andreas Zinonos
A. Haliassos
Pingchuan Ma
Stavros Petridis
M. Pantic
SSL
6
8
0
14 Mar 2023
A Hierarchical Regression Chain Framework for Affective Vocal Burst Recognition
Jinchao Li
Xixin Wu
Kaitao Song
Dongsheng Li
Xunying Liu
Helen M. Meng
20
2
0
14 Mar 2023
Cross-lingual Alzheimer's Disease detection based on paralinguistic and pre-trained features
Xuchu Chen
Yujiang Pu
Jinpeng Li
Weiqiang Zhang
17
14
0
14 Mar 2023
wav2vec and its current potential to Automatic Speech Recognition in German for the usage in Digital History: A comparative assessment of available ASR-technologies for the use in cultural heritage contexts
Michael Fleck
Wolfgang Göderle
21
0
0
06 Mar 2023
WESPER: Zero-shot and Realtime Whisper to Normal Voice Conversion for Whisper-based Speech Interactions
Jun Rekimoto
38
19
0
03 Mar 2023
WhisperX: Time-Accurate Speech Transcription of Long-Form Audio
Max Bain
Jaesung Huh
Tengda Han
Andrew Zisserman
26
202
0
01 Mar 2023
Language-Universal Adapter Learning with Knowledge Distillation for End-to-End Multilingual Speech Recognition
Zhijie Shen
Wu Guo
Bin Gu
44
4
0
28 Feb 2023
Exploring Self-supervised Pre-trained ASR Models For Dysarthric and Elderly Speech Recognition
Shujie Hu
Xurong Xie
Zengrui Jin
Mengzhe Geng
Yi Wang
Mingyu Cui
Jiajun Deng
Xunying Liu
Helen M. Meng
19
30
0
28 Feb 2023
Improving Massively Multilingual ASR With Auxiliary CTC Objectives
William Chen
Brian Yan
Jiatong Shi
Yifan Peng
Soumi Maiti
Shinji Watanabe
39
38
0
24 Feb 2023
Speaker and Language Change Detection using Wav2vec2 and Whisper
Tijn Berns
Nik Vaessen
David A. van Leeuwen
46
4
0
18 Feb 2023
MAC: A unified framework boosting low resource automatic speech recognition
Zeping Min
Qian Ge
Zhong Li
E. Weinan
11
1
0
05 Feb 2023
From English to More Languages: Parameter-Efficient Model Reprogramming for Cross-Lingual Speech Recognition
Chao-Han Huck Yang
Bo-wen Li
Yu Zhang
Nanxin Chen
Rohit Prabhavalkar
Tara N. Sainath
Trevor Strohman
11
28
0
19 Jan 2023
Adapting Multilingual Speech Representation Model for a New, Underresourced Language through Multilingual Fine-tuning and Continued Pretraining
Karol Nowakowski
M. Ptaszynski
Kyoko Murasaki
Jagna Nieuwazny
15
23
0
18 Jan 2023
MooseNet: A Trainable Metric for Synthesized Speech with a PLDA Module
Ondřej Plátek
Ondrej Dusek
21
2
0
17 Jan 2023
BayesSpeech: A Bayesian Transformer Network for Automatic Speech Recognition
Will Rieger
BDL
UQCV
11
0
0
16 Jan 2023
Learning Audio-Driven Viseme Dynamics for 3D Face Animation
Linchao Bao
Haoxian Zhang
Yue Qian
Tangli Xue
Changan Chen
Xuefei Zhe
Di Kang
3DH
20
12
0
15 Jan 2023
Automated speech- and text-based classification of neuropsychiatric conditions in a multidiagnostic setting
L. Hansen
R. Rocca
A. Simonsen
A. Parola
V. Bliksted
...
Dan Bang
Kristian Tylén
Ethan Weed
S. Ostergaard
Riccardo Fusaroli
38
3
0
13 Jan 2023
Supervised Acoustic Embeddings And Their Transferability Across Languages
Sreepratha Ram
Hanan Aldarmaki
SSL
19
3
0
03 Jan 2023
Sample-Efficient Unsupervised Domain Adaptation of Speech Recognition Systems: A case study for Modern Greek
Georgios Paraskevopoulos
Theodoros Kouzelis
Georgios Rouvalis
Athanasios Katsamanis
V. Katsouros
Alexandros Potamianos
VLM
23
7
0
31 Dec 2022
Pushing the performances of ASR models on English and Spanish accents
Pooja Chitkara
M. Rivière
Jade Copet
Frank Zhang
Yatharth Saraf
13
0
0
22 Dec 2022
End-to-End Automatic Speech Recognition model for the Sudanese Dialect
Ayman Mansour
Wafaa F. Mukhtar
14
1
0
21 Dec 2022
Mu$^{2}$SLAM: Multitask, Multilingual Speech and Language Models
Yong Cheng
Yu Zhang
Melvin Johnson
Wolfgang Macherey
Ankur Bapna
20
8
0
19 Dec 2022
Effectiveness of Text, Acoustic, and Lattice-based representations in Spoken Language Understanding tasks
Esaú Villatoro-Tello
S. Madikeri
Juan Pablo Zuluaga
Bidisha Sharma
Seyyed Saeed Sarfjoo
Iuliia Nigmatulina
P. Motlícek
A. Ivanov
A. Ganapathiraju
15
3
0
16 Dec 2022
BLASER: A Text-Free Speech-to-Speech Translation Evaluation Metric
Mingda Chen
Paul-Ambroise Duquenne
Pierre Yves Andrews
Justine T. Kao
Alexandre Mourachko
Holger Schwenk
Marta R. Costa-jussá
14
17
0
16 Dec 2022
Improved Self-Supervised Multilingual Speech Representation Learning Combined with Auxiliary Language Information
Fenglin Ding
Genshun Wan
Pengcheng Li
Jia-Yu Pan
Cong Liu
SSL
25
1
0
07 Dec 2022
BARTSmiles: Generative Masked Language Models for Molecular Representations
Gayane Chilingaryan
Hovhannes Tamoyan
Ani Tevosyan
N. Babayan
L. Khondkaryan
Karen Hambardzumyan
Zaven Navoyan
Hrant Khachatrian
Armen Aghajanyan
SSL
27
25
0
29 Nov 2022
Towards continually learning new languages
Ngoc-Quan Pham
J. Niehues
A. Waibel
CLL
11
1
0
21 Nov 2022
Self-Transriber: Few-shot Lyrics Transcription with Self-training
Xiaoxue Gao
Xianghu Yue
Haizhou Li
28
7
0
18 Nov 2022
MaskedSpeech: Context-aware Speech Synthesis with Masking Strategy
Ya-Jie Zhang
Wei Song
Ya Yue
Zhengchen Zhang
Youzheng Wu
Xiaodong He
21
7
0
11 Nov 2022