Unsupervised Cross-lingual Representation Learning for Speech Recognition

24 June 2020
Alexis Conneau
Alexei Baevski
R. Collobert
Abdel-rahman Mohamed
Michael Auli
    SSL

Papers citing "Unsupervised Cross-lingual Representation Learning for Speech Recognition"

50 / 402 papers shown
Analyzing Speech Unit Selection for Textless Speech-to-Speech Translation
J. Duret
Yannick Esteve
Titouan Parcollet
41
0
0
08 Jul 2024
TokenVerse: Unifying Speech and NLP Tasks via Transducer-based ASR
Shashi Kumar
S. Madikeri
Juan Zuluaga-Gomez
Iuliia Nigmatulina
Esaú Villatoro-Tello
Sergio Burdisso
P. Motlícek
Karthik Pandia
A. Ganapathiraju
44
0
0
05 Jul 2024
Continual Learning Optimizations for Auto-regressive Decoder of Multilingual ASR systems
Chin Yuen Kwok
J. Yip
Eng Siong Chng
CLL
32
1
0
04 Jul 2024
Self-supervised ASR Models and Features For Dysarthric and Elderly Speech Recognition
Shujie Hu
Xurong Xie
Mengzhe Geng
Zengrui Jin
Jiajun Deng
...
Yi Wang
Mingyu Cui
Tianzi Wang
Helen Meng
Xunying Liu
43
5
0
03 Jul 2024
uDistil-Whisper: Label-Free Data Filtering for Knowledge Distillation in Low-Data Regimes
Abdul Waheed
Karima Kadaoui
Bhiksha Raj
Muhammad Abdul-Mageed
32
1
0
01 Jul 2024
Towards Robust Speech Representation Learning for Thousands of Languages
William Chen
Wangyou Zhang
Yifan Peng
Xinjian Li
Jinchuan Tian
Jiatong Shi
Xuankai Chang
Soumi Maiti
Karen Livescu
Shinji Watanabe
ELM
40
6
0
30 Jun 2024
Less Forgetting for Better Generalization: Exploring Continual-learning Fine-tuning Methods for Speech Self-supervised Representations
Salah Zaiem
Titouan Parcollet
S. Essid
CLL
28
3
0
30 Jun 2024
BESTOW: Efficient and Streamable Speech Language Model with the Best of Two Worlds in GPT and T5
Zhehuai Chen
He Huang
Oleksii Hrinchuk
Krishna C. Puvvada
Nithin Rao Koluguri
Piotr Żelasko
Jagadeesh Balam
Boris Ginsburg
AuLLM
RALM
34
10
0
28 Jun 2024
Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model
Jiawen Huang
Emmanouil Benetos
41
1
0
25 Jun 2024
Leveraging Parameter-Efficient Transfer Learning for Multi-Lingual Text-to-Speech Adaptation
Yingting Li
Ambuj Mehrish
Bryan Chew
Bo Cheng
Soujanya Poria
38
0
0
25 Jun 2024
Speech Analysis of Language Varieties in Italy
Moreno La Quatra
Alkis Koudounas
Elena Baralis
Sabato Marco Siniscalchi
27
3
0
22 Jun 2024
The Greek podcast corpus: Competitive speech models for low-resourced languages with weakly supervised data
Georgios Paraskevopoulos
Chara Tsoukala
Athanasios Katsamanis
V. Katsouros
OffRL
23
0
0
21 Jun 2024
MultiTalk: Enhancing 3D Talking Head Generation Across Languages with Multilingual Video Dataset
Kim Sung-Bin
Lee Chae-Yeon
Gihun Son
Oh Hyun-Bin
Janghoon Ju
Suekyeong Nam
Tae-Hyun Oh
34
11
0
20 Jun 2024
Seamless Language Expansion: Enhancing Multilingual Mastery in Self-Supervised Models
Jing Xu
Minglin Wu
Xixin Wu
Helen Meng
CLL
32
1
0
20 Jun 2024
ManWav: The First Manchu ASR Model
Jean Seo
Minha Kang
Sungjoo Byun
Sangah Lee
18
1
0
19 Jun 2024
Medical Spoken Named Entity Recognition
Khai Le-Duc
David Thulke
Hung-Phong Tran
Long Vo-Dang
Khai-Nguyen Nguyen
Truong Son-Hy
Ralf Schluter
41
0
0
19 Jun 2024
GigaSpeech 2: An Evolving, Large-Scale and Multi-domain ASR Corpus for Low-Resource Languages with Automated Crawling, Transcription and Refinement
Yifan Yang
Zheshu Song
Jianheng Zhuo
Mingyu Cui
Jinpeng Li
...
Shuai Fan
Kai Yu
Wei-Qiang Zhang
Guoguo Chen
Xie Chen
14
8
0
17 Jun 2024
Large Language Models for Dysfluency Detection in Stuttered Speech
Dominik Wagner
Sebastian P. Bayerl
Ilja Baumann
K. Riedhammer
Elmar Nöth
Tobias Bocklet
42
3
0
16 Jun 2024
Impact of Speech Mode in Automatic Pathological Speech Detection
S. A. Sheikh
Ina Kodrasi
29
3
0
14 Jun 2024
SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages
Holy Lovenia
Rahmad Mahendra
Salsabil Maulana Akbar
Lester James Validad Miranda
Jennifer Santoso
...
Genta Indra Winata
Ruochen Zhang
Fajri Koto
Zheng-Xin Yong
Samuel Cahyawijaya
79
9
0
14 Jun 2024
Language Complexity and Speech Recognition Accuracy: Orthographic Complexity Hurts, Phonological Complexity Doesn't
Chihiro Taguchi
David Chiang
26
2
0
13 Jun 2024
An Initial Investigation of Language Adaptation for TTS Systems under Low-resource Scenarios
Cheng Gong
Erica Cooper
Xin Wang
Chunyu Qiang
Mengzhe Geng
...
Jianwu Dang
Marc Tessier
Aidan Pine
Korin Richmond
Junichi Yamagishi
35
2
0
13 Jun 2024
Emotion Manipulation Through Music -- A Deep Learning Interactive Visual Approach
Adel N. Abdalla
Jared Osborne
Razvan Andonie
23
1
0
12 Jun 2024
MM-KWS: Multi-modal Prompts for Multilingual User-defined Keyword Spotting
Zhiqi Ai
Zhiyong Chen
Shugong Xu
32
2
0
11 Jun 2024
mHuBERT-147: A Compact Multilingual HuBERT Model
Marcely Zanon Boito
Vivek Iyer
Nikolaos Lagos
Laurent Besacier
Ioan Calapodescu
VLM
62
8
0
10 Jun 2024
To Distill or Not to Distill? On the Robustness of Robust Knowledge Distillation
Abdul Waheed
Karima Kadaoui
Muhammad Abdul-Mageed
VLM
38
3
0
06 Jun 2024
Spontaneous Speech-Based Suicide Risk Detection Using Whisper and Large Language Models
Ziyun Cui
Chang Lei
Wen Wu
Yinan Duan
Diyang Qu
Ji Wu
Runsen Chen
Chao Zhang
31
2
0
06 Jun 2024
Self-Supervised Singing Voice Pre-Training towards Speech-to-Singing Conversion
Ruiqi Li
Rongjie Huang
Yongqi Wang
Zhiqing Hong
Zhou Zhao
32
1
0
04 Jun 2024
SimpleSpeech: Towards Simple and Efficient Text-to-Speech with Scalar Latent Transformer Diffusion Models
Dongchao Yang
Dingdong Wang
Haohan Guo
Xueyuan Chen
Xixin Wu
Helen M. Meng
57
25
0
04 Jun 2024
Whistle: Data-Efficient Multilingual and Crosslingual Speech Recognition via Weakly Phonetic Supervision
Saierdaer Yusuyin
Te Ma
Hao Huang
Wenbo Zhao
Zhijian Ou
44
2
0
04 Jun 2024
Deep Learning for Assessment of Oral Reading Fluency
Mithilesh Vaidya
Binaya Kumar Sahoo
Preeti Rao
13
0
0
29 May 2024
Robust Singing Voice Transcription Serves Synthesis
Ruiqi Li
Yu Zhang
Yongqi Wang
Zhiqing Hong
Rongjie Huang
Zhou Zhao
38
7
0
16 May 2024
Selfsupervised learning for pathological speech detection
S. A. Sheikh
28
0
0
16 May 2024
Continued Pretraining for Domain Adaptation of Wav2vec2.0 in Automatic Speech Recognition for Elementary Math Classroom Settings
Ahmed Adel Attia
Dorottya Demszky
Tolúlopé Ògúnrèmí
Jing Liu
Carol Y. Espy-Wilson
CLL
VLM
19
2
0
15 May 2024
THERADIA WoZ: An Ecological Corpus for Appraisal-based Affect Research in Healthcare
Hippolyte Fournier
Sina Alisamir
Safaa Azzakhnini
Hannah Chainay
Olivier Koenig
...
Joan Fruitet
Franck Tarpin-Bernard
Solange Rossato
François Portet
F. Ringeval
26
2
0
10 May 2024
TIPAA-SSL: Text Independent Phone-to-Audio Alignment based on Self-Supervised Learning and Knowledge Transfer
Noé Tits
Prernna Bhatnagar
Thierry Dutoit
33
0
0
03 May 2024
Efficient Compression of Multitask Multilingual Speech Models
Thomas Palmeira Ferraz
39
0
0
02 May 2024
Killkan: The Automatic Speech Recognition Dataset for Kichwa with Morphosyntactic Information
Chihiro Taguchi
Jefferson Saransig
Dayana Velásquez
David Chiang
16
0
0
23 Apr 2024
StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations
Sen Liu
Yiwei Guo
Xie Chen
Kai Yu
24
1
0
23 Apr 2024
Semantically Corrected Amharic Automatic Speech Recognition
Samuael Adnew
Paul Pu Liang
28
0
0
20 Apr 2024
Masked Modeling Duo: Towards a Universal Audio Pre-training Framework
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
Noboru Harada
K. Kashino
34
10
0
09 Apr 2024
VietMed: A Dataset and Benchmark for Automatic Speech Recognition of Vietnamese in the Medical Domain
Khai Le-Duc
LM&MA
36
8
0
08 Apr 2024
Africa-Centric Self-Supervised Pre-Training for Multilingual Speech Representation in a Sub-Saharan Context
Antoine Caubrière
Elodie Gauthier
21
1
0
02 Apr 2024
Automated Assessment of Encouragement and Warmth in Classrooms Leveraging Multimodal Emotional Features and ChatGPT
Ruikun Hou
Tim Fütterer
B. Bühler
Efe Bozkir
Peter Gerjets
Ulrich Trautwein
Enkelejda Kasneci
33
7
0
01 Apr 2024
Encoding of lexical tone in self-supervised models of spoken language
Gaofei Shen
Michaela Watkins
A. Alishahi
Arianna Bisazza
Grzegorz Chrupala
30
4
0
25 Mar 2024
XLAVS-R: Cross-Lingual Audio-Visual Speech Representation Learning for Noise-Robust Speech Perception
HyoJung Han
Mohamed Anwar
J. Pino
Wei-Ning Hsu
Marine Carpuat
Bowen Shi
Changhan Wang
VLM
32
9
0
21 Mar 2024
Unimodal Multi-Task Fusion for Emotional Mimicry Intensity Prediction
Tobias Hallmen
Fabian Deuser
Norbert Oswald
Elisabeth André
33
2
0
18 Mar 2024
Improving Acoustic Word Embeddings through Correspondence Training of Self-supervised Speech Representations
Amit Meghanani
Thomas Hain
SSL
35
1
0
13 Mar 2024
Automatic Speech Recognition (ASR) for the Diagnosis of pronunciation of Speech Sound Disorders in Korean children
Taekyung Ahn
Yeonjung Hong
Younggon Im
Do Hyung Kim
Dayoung Kang
...
Jae Won Kim
Min Jung Kim
Ah-ra Cho
Dae-Hyun Jang
Hosung Nam
16
1
0
13 Mar 2024
Speech Robust Bench: A Robustness Benchmark For Speech Recognition
Muhammad A. Shah
David Solans Noguero
Mikko A. Heikkilä
Nicolas Kourtellis
19
5
0
08 Mar 2024