Unsupervised Cross-lingual Representation Learning for Speech Recognition

24 June 2020

Papers citing "Unsupervised Cross-lingual Representation Learning for Speech Recognition"

50 / 402 papers shown

UniAudio: An Audio Foundation Model Toward Universal Audio Generation Dongchao Yang Jinchuan Tian Xuejiao Tan Rongjie Huang Songxiang Liu ... Jiang Bian Xixin Wu Zhou Zhao Shinji Watanabe Helen M. Meng CVBM AuLLM 22 114 0 01 Oct 2023
AfriSpeech-200: Pan-African Accented Speech Dataset for Clinical and General Domain ASR Tobi Olatunji Tejumade Afonja Aditya Yadavalli Chris C. Emezue Sahib Singh ... Joanne I. Osuchukwu Salomey Osei A. Tonja Naome A. Etori Clinton Mbataku 22 14 0 30 Sep 2023
SSHR: Leveraging Self-supervised Hierarchical Representations for Multilingual Automatic Speech Recognition Hongfei Xue Qijie Shao Tommy Yuan Peikun Chen Jie Liu Lei Xie 32 2 0 29 Sep 2023
Joint Prediction and Denoising for Large-scale Multilingual Self-supervised Learning William Chen Jiatong Shi Brian Yan Dan Berrebbi Wangyou Zhang Yifan Peng Xuankai Chang Soumi Maiti Shinji Watanabe 24 8 0 26 Sep 2023
Unsupervised Pre-Training for Vietnamese Automatic Speech Recognition in the HYKIST Project Khai Le-Duc 11 2 0 26 Sep 2023
Unsupervised Representations Improve Supervised Learning in Speech Emotion Recognition Amirali Soltani Tehrani Niloufar Faridani Ramin Toosi SSL 17 2 0 22 Sep 2023
CoMFLP: Correlation Measure based Fast Search on ASR Layer Pruning W. Liu Zhiyuan Peng Tan Lee 11 1 0 21 Sep 2023
Improving Speech Inversion Through Self-Supervised Embeddings and Enhanced Tract Variables Ahmed Adel Attia Yashish M. Siriwardena Carol Y. Espy-Wilson SSL 35 4 0 17 Sep 2023
The complementary roles of non-verbal cues for Robust Pronunciation Assessment Yassine El Kheir Shammur A. Chowdhury Ahmed M. Ali 23 1 0 14 Sep 2023
L1-aware Multilingual Mispronunciation Detection Framework Yassine El Kheir Shammur Absar Chowdhury Ahmed M. Ali 16 0 0 14 Sep 2023
CoLLD: Contrastive Layer-to-layer Distillation for Compressing Multilingual Pre-trained Speech Encoders Heng-Jui Chang Ning Dong Ruslan Mavlyutov Sravya Popuri Yu-An Chung 40 6 0 14 Sep 2023
Can large-scale vocoded spoofed data improve speech spoofing countermeasure with a self-supervised front end? Xin Wang Junichi Yamagishi SyDa 50 23 0 12 Sep 2023
LeBenchmark 2.0: a Standardized, Replicable and Enhanced Framework for Self-supervised Representations of French Speech Titouan Parcollet H. Nguyen Solène Evain Marcely Zanon Boito Adrien Pupier ... François Portet Solange Rossato F. Ringeval D. Schwab Laurent Besacier 40 15 0 11 Sep 2023
Towards generalisable and calibrated synthetic speech detection with self-supervised representations Octavian Pascu Adriana Stan Dan Oneaţă Elisabeta Oneata H. Cucu SSL 28 5 0 11 Sep 2023
RepCodec: A Speech Representation Codec for Speech Tokenization Zhichao Huang Chutong Meng Tom Ko 14 22 0 31 Aug 2023
Decoupled Structure for Improved Adaptability of End-to-End Models Keqi Deng P. Woodland AuLLM 22 2 0 25 Aug 2023
Indonesian Automatic Speech Recognition with XLSR-53 Panji Arisaputra Amalia Zahra 16 6 0 20 Aug 2023
Decoding Emotions: A comprehensive Multilingual Study of Speech Models for Speech Emotion Recognition Anant Singh Akshat Gupta 26 4 0 17 Aug 2023
A Novel Self-training Approach for Low-resource Speech Recognition Satwinder Singh Feng Hou Ruili Wang 14 9 0 10 Aug 2023
Comparative Analysis of the wav2vec 2.0 Feature Extractor Peter Vieting Ralf Schlüter Hermann Ney 20 2 0 08 Aug 2023
Universal Automatic Phonetic Transcription into the International Phonetic Alphabet Chihiro Taguchi Yusuke Sakai Parisa Haghani David Chiang 25 4 0 07 Aug 2023
Do You Remember? Overcoming Catastrophic Forgetting for Fake Audio Detection Xiaohui Zhang Jiangyan Yi J. Tao Chenglong Wang Chuyuan Zhang CLL 34 22 0 07 Aug 2023
Federated Representation Learning for Automatic Speech Recognition Guruprasad V Ramesh Gopinath Chennupati Milind Rao Anit Kumar Sahu Ariya Rastrow J. Droppo 26 0 0 03 Aug 2023
Careful Whisper -- leveraging advances in automatic speech recognition for robust and interpretable aphasia subtype classification Laurin Wagner M. Zusag Theresa Bloder 8 9 0 02 Aug 2023
Prompting Large Language Models with Speech Recognition Abilities Yassir Fathullah Chunyang Wu Egor Lakomkin J. Jia Yuan Shangguan ... Wenhan Xiong Jay Mahadeokar Ozlem Kalinli Christian Fuegen M. Seltzer AuLLM 27 124 0 21 Jul 2023
Replay to Remember: Continual Layer-Specific Fine-tuning for German Speech Recognition Theresa Pekarek-Rosin S. Wermter VLM CLL 19 2 0 14 Jul 2023
Language-Routing Mixture of Experts for Multilingual and Code-Switching Speech Recognition Wenxuan Wang Guodong Ma Yuke Li Binbin Du MoE 14 23 0 12 Jul 2023
Performance Comparison of Pre-trained Models for Speech-to-Text in Turkish: Whisper-Small and Wav2Vec2-XLS-R-300M Ö. B. Mercan Sercan Cepni D. E. Tasar Şükrü Ozan VLM 15 1 0 06 Jul 2023
Learning Multilingual Expressive Speech Representation for Prosody Prediction without Parallel Data J. Duret Titouan Parcollet Yannick Esteve 27 4 0 29 Jun 2023
DSE-TTS: Dual Speaker Embedding for Cross-Lingual Text-to-Speech Sen Liu Yiwei Guo Chenpeng Du Xie Chen Kai Yu 24 6 0 25 Jun 2023
Master-ASR: Achieving Multilingual Scalability and Low-Resource Adaptation in ASR with Modular Learning Zhongzhi Yu Yang Zhang Kaizhi Qian Y. Fu Yingyan Lin 33 13 0 23 Jun 2023
Recent Advances in Direct Speech-to-text Translation Chen Xu Rong Ye Qianqian Dong Chengqi Zhao Tom Ko Mingxuan Wang Tong Xiao Jingbo Zhu 19 18 0 20 Jun 2023
SelfTalk: A Self-Supervised Commutative Training Diagram to Comprehend 3D Talking Faces Ziqiao Peng Yihao Luo Yue Shi Hao-Xuan Xu Xiangyu Zhu Jun He Hongyan Liu Zhaoxin Fan 52 39 0 19 Jun 2023
Evaluation of Speech Representations for MOS prediction F. S. Oliveira Edresson Casanova Arnaldo Cândido Júnior L. Gris A. S. Soares A. R. G. Filho 24 4 0 16 Jun 2023
ITALIC: An Italian Intent Classification Dataset Alkis Koudounas Moreno La Quatra Lorenzo Vaiani Luca Colomba Giuseppe Attanasio Eliana Pastor Luca Cagliero Elena Baralis 28 24 0 14 Jun 2023
Towards Building Voice-based Conversational Recommender Systems: Datasets, Potential Solutions, and Prospects Xinghua Qu Hongyang Liu Zhu Sun Xiang Yin Yew-Soon Ong Lu Lu Zejun Ma 29 3 0 14 Jun 2023
NAVER LABS Europe's Multilingual Speech Translation Systems for the IWSLT 2023 Low-Resource Track Edward Gow-Smith Alexandre Berard Marcely Zanon Boito Ioan Calapodescu 18 12 0 13 Jun 2023
A Comprehensive Survey on Applications of Transformers for Deep Learning Tasks Saidul Islam Hanae Elmekki Ahmed Elsebai Jamal Bentahar Najat Drawel Gaith Rjoub Witold Pedrycz ViT MedIm 19 171 0 11 Jun 2023
Low-rank Adaptation Method for Wav2vec2-based Fake Audio Detection Chenglong Wang Jiangyan Yi Xiaohui Zhang J. Tao Le Xu Ruibo Fu 21 10 0 09 Jun 2023
Zambezi Voice: A Multilingual Speech Corpus for Zambian Languages Claytone Sikasote Kalinda Siaminwe Stanly Mwape Bangiwe Zulu Mofya Phiri Martin Phiri David Zulu Mayumbo Nyirenda Antonios Anastasopoulos 20 6 0 07 Jun 2023
Introduction to Latent Variable Energy-Based Models: A Path Towards Autonomous Machine Intelligence Anna Dawid Yann LeCun DRL 24 29 0 05 Jun 2023
An Information-Theoretic Analysis of Self-supervised Discrete Representations of Speech Badr M. Abdullah Mohammed Maqsood Shaik Bernd Möbius Dietrich Klakow SSL 15 7 0 04 Jun 2023
Acoustic Word Embeddings for Untranscribed Target Languages with Continued Pretraining and Learned Pooling Ramon Sanabria Ondřej Klejch Hao Tang Sharon Goldwater 22 1 0 03 Jun 2023
Multi-View Multi-Task Representation Learning for Mispronunciation Detection Yassine El Kheir Shammur A. Chowdhury Ahmed M. Ali 14 4 0 02 Jun 2023
Speech Translation with Foundation Models and Optimal Transport: UPC at IWSLT23 Ioannis Tsiamas Gerard I. Gállego José A. R. Fonollosa Marta R. Costa-jussá OT 16 3 0 02 Jun 2023
Improved Cross-Lingual Transfer Learning For Automatic Speech Translation Sameer Khurana Nauman Dawalatabad Antoine Laurent Luis Vicente Pablo Gimeno Victoria Mingote James R. Glass VLM 14 1 0 01 Jun 2023
Spoken Language Identification System for English-Mandarin Code-Switching Child-Directed Speech Shashi Kant Gupta Sushant Hiray Prashant Kukde 20 3 0 01 Jun 2023
AfriNames: Most ASR models "butcher" African Names Tobi Olatunji Tejumade Afonja Bonaventure F. P. Dossou A. Tonja Chris C. Emezue Amina Mardiyyah Rufai Sahib Singh 19 5 0 01 Jun 2023
Findings of the VarDial Evaluation Campaign 2023 Noëmi Aepli Çağrı Çöltekin Rob van der Goot T. Jauhiainen Mourhaf Kazzaz Nikola Ljubešić Kai North Barbara Plank Yves Scherrer Marcos Zampieri 19 29 0 31 May 2023
Simple yet Effective Code-Switching Language Identification with Multitask Pre-Training and Transfer Learning Shuyue Stella Li Cihan Xiao Tianjian Li Bismarck Odoom 26 3 0 31 May 2023