Unsupervised Cross-lingual Representation Learning for Speech Recognition

24 June 2020
Alexis Conneau
Alexei Baevski
R. Collobert
Abdel-rahman Mohamed
Michael Auli
    SSL

Papers citing "Unsupervised Cross-lingual Representation Learning for Speech Recognition"

50 / 402 papers shown
Extending Multilingual Speech Synthesis to 100+ Languages without Transcribed Data
Takaaki Saeki
Gary Wang
Nobuyuki Morioka
Isaac Elias
Kyle Kastner
...
Andrew Rosenberg
Bhuvana Ramabhadran
Heiga Zen
Francoise Beaufays
Hadar Shemtov
36
13
0
29 Feb 2024
Experimental Study: Enhancing Voice Spoofing Detection Models with wav2vec 2.0
Taein Kang
Soyul Han
Sunmook Choi
Jaejin Seo
Sanghyeok Chung
Seungeun Lee
Seungsang Oh
Il-Youp Kwak
41
8
0
27 Feb 2024
The Effect of Batch Size on Contrastive Self-Supervised Speech Representation Learning
Nik Vaessen
David A. van Leeuwen
30
3
0
21 Feb 2024
Soft-Weighted CrossEntropy Loss for Continous Alzheimer's Disease Detection
Xiaohui Zhang
Wenjie Fu
Mangui Liang
35
1
0
19 Feb 2024
Establishing degrees of closeness between audio recordings along different dimensions using large-scale cross-lingual models
Maxime Fily
Guillaume Wisniewski
Severine Guillaume
Gilles Adda
Alexis Michaud
22
1
0
08 Feb 2024
REBORN: Reinforcement-Learned Boundary Segmentation with Iterative Training for Unsupervised ASR
Liang-Hsuan Tseng
En-Pei Hu
Cheng-Han Chiang
Yuan Tseng
Hung-yi Lee
Lin-shan Lee
Shao-Hua Sun
59
1
0
06 Feb 2024
The last Dance : Robust backdoor attack via diffusion models and bayesian approach
Orson Mengara
DiffM
32
4
0
05 Feb 2024
Layer-Wise Analysis of Self-Supervised Acoustic Word Embeddings: A Study on Speech Emotion Recognition
Alexandra Saliba
Yuanchao Li
Ramon Sanabria
Catherine Lai
38
8
0
04 Feb 2024
Predicting positive transfer for improved low-resource speech recognition using acoustic pseudo-tokens
Nay San
Georgios Paraskevopoulos
Aryaman Arora
Xiluo He
Prabhjot Kaur
Oliver Adams
Dan Jurafsky
31
7
0
03 Feb 2024
Are Paralinguistic Representations all that is needed for Speech Emotion Recognition?
Orchid Chetia Phukan
Gautam Siddharth Kashyap
Arun Balaji Buduru
Rajesh Sharma
29
0
0
02 Feb 2024
AccentFold: A Journey through African Accents for Zero-Shot ASR Adaptation to Target Accents
A. Owodunni
Aditya Yadavalli
Chris C. Emezue
Tobi Olatunji
Clinton Mbataku
22
1
0
02 Feb 2024
SpeechBERTScore: Reference-Aware Automatic Evaluation of Speech Generation Leveraging NLP Evaluation Metrics
Takaaki Saeki
Soumi Maiti
Shinnosuke Takamichi
Shinji Watanabe
Hiroshi Saruwatari
22
11
0
30 Jan 2024
Benchmarking Large Multimodal Models against Common Corruptions
Jiawei Zhang
Tianyu Pang
Chao Du
Yi Ren
Bo-wen Li
Min-Bin Lin
MLLM
22
14
0
22 Jan 2024
TranSentence: Speech-to-speech Translation via Language-agnostic Sentence-level Speech Encoding without Language-parallel Data
Seung-Bin Kim
Sang-Hoon Lee
Seong-Whan Lee
22
4
0
17 Jan 2024
Efficient Adapter Finetuning for Tail Languages in Streaming Multilingual ASR
Junwen Bai
Bo-wen Li
Qiujia Li
Tara N. Sainath
Trevor Strohman
28
3
0
17 Jan 2024
XLS-R Deep Learning Model for Multilingual ASR on Low-Resource Languages: Indonesian, Javanese, and Sundanese
Panji Arisaputra
Alif Tri Handoyo
Amalia Zahra
18
4
0
12 Jan 2024
Singer Identity Representation Learning using Self-Supervised Techniques
Bernardo Torres
Stefan Lattner
Gaël Richard
SSL
30
8
0
10 Jan 2024
Towards a Foundation Purchasing Model: Pretrained Generative Autoregression on Transaction Sequences
Piotr Skalski
David Sutton
Stuart Burrell
Iker Perez
Jason Wong
AI4TS
32
2
0
03 Jan 2024
Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction
Minchan Kim
Myeonghun Jeong
Byoung Jin Choi
Semin Kim
Joun Yeop Lee
Nam Soo Kim
AI4TS
23
4
0
03 Jan 2024
Enhancing Pre-trained ASR System Fine-tuning for Dysarthric Speech Recognition using Adversarial Data Augmentation
Huimeng Wang
Zengrui Jin
Mengzhe Geng
Shujie Hu
Guinan Li
Tianzi Wang
Haoning Xu
Xunying Liu
19
10
0
01 Jan 2024
ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations
Cheng Gong
Xin Wang
Erica Cooper
Dan Wells
Longbiao Wang
Jianwu Dang
Korin Richmond
Junichi Yamagishi
24
21
0
22 Dec 2023
FusDom: Combining In-Domain and Out-of-Domain Knowledge for Continuous Self-Supervised Learning
Ashish Seth
Sreyan Ghosh
S. Umesh
Dinesh Manocha
CLL
22
1
0
20 Dec 2023
Stable Distillation: Regularizing Continued Pre-training for Low-Resource Automatic Speech Recognition
Ashish Seth
Sreyan Ghosh
S. Umesh
Dinesh Manocha
CLL
26
1
0
20 Dec 2023
Efficiency-oriented approaches for self-supervised speech representation learning
Luis Lugo
Valentin Vielzeuf
SSL
21
1
0
18 Dec 2023
What to Remember: Self-Adaptive Continual Learning for Audio Deepfake Detection
Xiaohui Zhang
Jiangyan Yi
Chenglong Wang
Chuyuan Zhang
Siding Zeng
Jianhua Tao
57
25
0
15 Dec 2023
Attention-Guided Adaptation for Code-Switching Speech Recognition
Bobbi Aditya
Mahdin Rohmatillah
Liang-Hsuan Tai
Jen-Tzung Chien
21
8
0
14 Dec 2023
Optimizing Two-Pass Cross-Lingual Transfer Learning: Phoneme Recognition and Phoneme to Grapheme Translation
Wonjun Lee
Gary Geunbae Lee
Yunsu Kim
29
0
0
06 Dec 2023
Evaluating Self-supervised Speech Models on a Taiwanese Hokkien Corpus
Yi-Hui Chou
Kalvin Chang
Meng-Ju Wu
Winston Ou
Alice Wen-Hsin Bi
...
Iu-Tshian Phoann
Winnie Chang
Chenxuan Cui
Noel Chen
Jiatong Shi
37
3
0
06 Dec 2023
PMMTalk: Speech-Driven 3D Facial Animation from Complementary Pseudo Multi-modal Features
Tianshun Han
Shengnan Gui
Yiqing Huang
Baihui Li
Lijian Liu
...
Quan Lu
Ruicong Zhi
Yanyan Liang
Du Zhang
Jun Wan
VGen
27
1
0
05 Dec 2023
A Quantitative Approach to Understand Self-Supervised Models as Cross-lingual Feature Extractors
Shuyue Stella Li
Beining Xu
Xiangyu Zhang
Hexin Liu
Wen-Han Chao
Leibny Paola García
SSL
29
4
0
27 Nov 2023
Multilingual self-supervised speech representations improve the speech recognition of low-resource African languages with codeswitching
Tolúlopé Ogúnremí
Christopher D. Manning
Dan Jurafsky
17
5
0
25 Nov 2023
The taste of IPA: Towards open-vocabulary keyword spotting and forced alignment in any language
Jian Zhu
Changbing Yang
Farhan Samir
Jahurul Islam
32
4
0
14 Nov 2023
Transduce and Speak: Neural Transducer for Text-to-Speech with Semantic Token Prediction
Minchan Kim
Myeonghun Jeong
Byoung Jin Choi
Dongjune Lee
N. Kim
AI4TS
25
10
0
06 Nov 2023
Multilingual DistilWhisper: Efficient Distillation of Multi-task Speech Models via Language-Specific Experts
Thomas Palmeira Ferraz
Marcely Zanon Boito
Caroline Brun
Vassilina Nikoulina
29
12
0
02 Nov 2023
Combining Language Models For Specialized Domains: A Colorful Approach
Daniel Eitan
Menachem Pirchi
Neta Glazer
Shai Meital
Gil Ayach
Gidon Krendel
Aviv Shamsian
Aviv Navon
Gil Hetz
Joseph Keshet
11
1
0
30 Oct 2023
Pre-trained Speech Processing Models Contain Human-Like Biases that Propagate to Speech Emotion Recognition
Isaac Slaughter
Craig Greenberg
Reva Schwartz
Aylin Caliskan
22
4
0
29 Oct 2023
TorchAudio 2.1: Advancing speech recognition, self-supervised learning, and audio processing components for PyTorch
Jeff Hwang
Moto Hira
Caroline Chen
Xiaohui Zhang
Zhaoheng Ni
...
Yumeng Tao
Robin Scheibler
Samuele Cornell
Sean Kim
Stavros Petridis
38
22
0
27 Oct 2023
Quantifying the Dialect Gap and its Correlates Across Languages
Anjali Kantharuban
Ivan Vulić
Anna Korhonen
57
20
0
23 Oct 2023
Spatial HuBERT: Self-supervised Spatial Speech Representation Learning for a Single Talker from Multi-channel Audio
Antoni Dimitriadis
Siqi Pan
V. Sethu
Beena Ahmed
SSL
20
3
0
17 Oct 2023
Advancing Audio Emotion and Intent Recognition with Large Pre-Trained Models and Bayesian Inference
Dejan Porjazovski
Yaroslav Getman
Tamás Grósz
M. Kurimo
28
3
0
16 Oct 2023
Homophone Disambiguation Reveals Patterns of Context Mixing in Speech Transformers
Hosein Mohebbi
Grzegorz Chrupała
Willem H. Zuidema
A. Alishahi
28
12
0
15 Oct 2023
Fast Word Error Rate Estimation Using Self-Supervised Representations for Speech and Text
Chanho Park
Chengsong Lu
Mingjie Chen
Thomas Hain
28
3
0
12 Oct 2023
Enhancing expressivity transfer in textless speech-to-speech translation
J. Duret
Benjamin O’Brien
Yannick Esteve
Titouan Parcollet
43
2
0
11 Oct 2023
Temporally Aligning Long Audio Interviews with Questions: A Case Study in Multimodal Data Integration
Piyush Singh Pasi
Karthikeya Battepati
P. Jyothi
Ganesh Ramakrishnan
T. Mahapatra
Manoj Singh
51
0
0
10 Oct 2023
Findings of the 2023 ML-SUPERB Challenge: Pre-Training and Evaluation over More Languages and Beyond
Jiatong Shi
William Chen
Dan Berrebbi
Hsiu-Hsuan Wang
Wei-Ping Huang
...
Yuxun Tang
Shang-Wen Li
Abdelrahman Mohamed
Hung-yi Lee
Shinji Watanabe
LRM
ELM
34
15
0
09 Oct 2023
XLS-R fine-tuning on noisy word boundaries for unsupervised speech segmentation into words
Robin Algayres
Pablo Diego-Simon
Benoît Sagot
Emmanuel Dupoux
36
1
0
08 Oct 2023
Zero-shot Cross-lingual Transfer without Parallel Corpus
Yuyang Zhang
Xiaofeng Han
Baojun Wang
VLM
29
0
0
07 Oct 2023
Evaluating Self-Supervised Speech Representations for Indigenous American Languages
Chih-Chen Chen
William Chen
Rodolfo Zevallos
John E. Ortega
34
7
0
05 Oct 2023
Zero Resource Code-switched Speech Benchmark Using Speech Utterance Pairs For Multiple Spoken Languages
Kuan-Po Huang
Chih-Kai Yang
Yu-Kuan Fu
Ewan Dunbar
Hung-yi Lee
29
5
0
04 Oct 2023
Multi-resolution HuBERT: Multi-resolution Speech Self-Supervised Learning with Masked Unit Prediction
Jiatong Shi
H. Inaguma
Xutai Ma
Ilia Kulikov
Anna Y. Sun
45
24
0
04 Oct 2023