Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2111.09296
Cited By
XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale
17 November 2021
Arun Babu
Changhan Wang
Andros Tjandra
Kushal Lakhotia
Qiantong Xu
Naman Goyal
Kritika Singh
Patrick von Platen
Yatharth Saraf
J. Pino
Alexei Baevski
Alexis Conneau
Michael Auli
SSL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale"
50 / 115 papers shown
Title
Replay to Remember: Continual Layer-Specific Fine-tuning for German Speech Recognition
Theresa Pekarek-Rosin
S. Wermter
VLM
CLL
24
2
0
14 Jul 2023
Leveraging multilingual transfer for unsupervised semantic acoustic word embeddings
C. Jacobs
Herman Kamper
32
1
0
05 Jul 2023
Boosting Norwegian Automatic Speech Recognition
Javier de la Rosa
Rolv-Arild Braaten
P. Kummervold
Freddy Wetjen
Svein Arne Brygfjeld
33
7
0
04 Jul 2023
Allophant: Cross-lingual Phoneme Recognition with Articulatory Attributes
Kevin Glocker
Aaricia Herygers
Munir Georges
21
4
0
07 Jun 2023
On the Robustness of Arabic Speech Dialect Identification
Peter Sullivan
AbdelRahim Elmadany
Muhammad Abdul-Mageed
23
8
0
01 Jun 2023
Findings of the VarDial Evaluation Campaign 2023
Noëmi Aepli
Çagri Çöltekin
Rob van der Goot
T. Jauhiainen
Mourhaf Kazzaz
Nikola Ljubesic
Kai North
Barbara Plank
Yves Scherrer
Marcos Zampieri
19
29
0
31 May 2023
mPLM-Sim: Better Cross-Lingual Similarity and Transfer in Multilingual Pretrained Language Models
Peiqin Lin
Chengzhi Hu
Zheyu Zhang
André F. T. Martins
Hinrich Schütze
27
1
0
23 May 2023
The defender's perspective on automatic speaker verification: An overview
Haibin Wu
Jiawen Kang
Lingwei Meng
H. Meng
Hung-yi Lee
AAML
24
14
0
22 May 2023
Hystoc: Obtaining word confidences for fusion of end-to-end ASR systems
Karel Beneš
M. Kocour
L. Burget
32
2
0
21 May 2023
DUB: Discrete Unit Back-translation for Speech Translation
Dong Zhang
Rong Ye
Tom Ko
Mingxuan Wang
Yaqian Zhou
13
23
0
19 May 2023
ML-SUPERB: Multilingual Speech Universal PERformance Benchmark
Jiatong Shi
Dan Berrebbi
William Chen
Ho-Lam Chung
En-Pei Hu
...
Xuankai Chang
Shang-Wen Li
Abdel-rahman Mohamed
Hung-yi Lee
Shinji Watanabe
ELM
55
58
0
18 May 2023
The Interpreter Understands Your Meaning: End-to-end Spoken Language Understanding Aided by Speech Translation
Mutian He
Philip N. Garner
38
4
0
16 May 2023
Exploration of Language Dependency for Japanese Self-Supervised Speech Representation Models
Takanori Ashihara
Takafumi Moriya
Kohei Matsuura
Tomohiro Tanaka
25
3
0
09 May 2023
AI-Synthesized Voice Detection Using Neural Vocoder Artifacts
Chengzhe Sun
Shan Jia
Shuwei Hou
Siwei Lyu
30
38
0
25 Apr 2023
Spaiche: Extending State-of-the-Art ASR Models to Swiss German Dialects
Clément Sicard
Kajetan Pyszkowski
Victor Gillioz
24
7
0
20 Apr 2023
AfroDigits: A Community-Driven Spoken Digit Dataset for African Languages
Chris C. Emezue
Sanchit Gandhi
Lewis Tunstall
Abubakar Abid
Josh Meyer
...
Douwe Kiela
Yacine Jernite
Julien Chaumond
Merve Noyan
Omar Sanseviero
25
2
0
22 Mar 2023
Transformers in Speech Processing: A Survey
S. Latif
Aun Zaidi
Heriberto Cuayáhuitl
Fahad Shamshad
Moazzam Shoukat
Junaid Qadir
42
47
0
21 Mar 2023
Leveraging supplementary text data to kick-start automatic speech recognition system development with limited transcriptions
Nay San
Martijn Bartelds
Blaine Billings
Ella de Falco
Hendi Feriza
Johan Safri
Wawan Sahrozi
Ben Foley
Bradley McDonnell
Dan Jurafsky
13
9
0
09 Feb 2023
Adapting Multilingual Speech Representation Model for a New, Underresourced Language through Multilingual Fine-tuning and Continued Pretraining
Karol Nowakowski
M. Ptaszynski
Kyoko Murasaki
Jagna Nieuwazny
15
23
0
18 Jan 2023
2nd Swiss German Speech to Standard German Text Shared Task at SwissText 2022
Michel Plüss
Yanick Schraner
Christian Scheller
Manfred Vogel
29
2
0
17 Jan 2023
Automated speech- and text-based classification of neuropsychiatric conditions in a multidiagnostic setting
L. Hansen
R. Rocca
A. Simonsen
A. Parola
V. Bliksted
...
Dan Bang
Kristian Tylén
Ethan Weed
S. Ostergaard
Riccardo Fusaroli
43
3
0
13 Jan 2023
Perceive and predict: self-supervised speech representation based loss functions for speech enhancement
George Close
William Ravenscroft
Thomas Hain
Stefan Goetze
SSL
30
12
0
11 Jan 2023
SegAugment: Maximizing the Utility of Speech Translation Data with Segmentation-based Augmentations
Ioannis Tsiamas
José A. R. Fonollosa
Marta R. Costa-jussá
41
6
0
19 Dec 2022
Efficient Self-supervised Learning with Contextualized Target Representations for Vision, Speech and Language
Alexei Baevski
Arun Babu
Wei-Ning Hsu
Michael Auli
VLM
SSL
32
92
0
14 Dec 2022
Towards continually learning new languages
Ngoc-Quan Pham
J. Niehues
A. Waibel
CLL
11
1
0
21 Nov 2022
NANSY++: Unified Voice Synthesis with Neural Analysis and Synthesis
Hyeong-Seok Choi
Jinhyeok Yang
Juheon Lee
Hyeongju Kim
20
46
0
17 Nov 2022
Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities
Andros Tjandra
Nayan Singhal
David C. Zhang
Ozlem Kalinli
Abdel-rahman Mohamed
Duc Le
M. Seltzer
32
12
0
10 Nov 2022
Accidental Learners: Spoken Language Identification in Multilingual Self-Supervised Models
Travis M. Bartley
Fei Jia
Krishna C. Puvvada
Samuel Kriman
Boris Ginsburg
SSL
21
6
0
09 Nov 2022
SpeechMatrix: A Large-Scale Mined Corpus of Multilingual Speech-to-Speech Translations
Paul-Ambroise Duquenne
Hongyu Gong
Ning Dong
Jingfei Du
Ann Lee
Vedanuj Goswani
Changhan Wang
J. Pino
Benoît Sagot
Holger Schwenk
37
34
0
08 Nov 2022
Textless Direct Speech-to-Speech Translation with Discrete Speech Representation
Xinjian Li
Ye Jia
Chung-Cheng Chiu
23
23
0
31 Oct 2022
A Compact End-to-End Model with Local and Global Context for Spoken Language Identification
Fei Jia
Nithin Rao Koluguri
Jagadeesh Balam
Boris Ginsburg
25
3
0
27 Oct 2022
Iterative pseudo-forced alignment by acoustic CTC loss for self-supervised ASR domain adaptation
F. López
Jordi Luque
6
6
0
27 Oct 2022
Training Autoregressive Speech Recognition Models with Limited in-domain Supervision
Chak-Fai Li
Francis Keith
William Hartmann
M. Snover
14
0
0
27 Oct 2022
Bloom Library: Multimodal Datasets in 300+ Languages for a Variety of Downstream Tasks
Colin Leong
Joshua Nemecek
Jacob Mansdorfer
Anna Filighera
A. Owodunni
Daniel Whitenack
VLM
AI4CE
41
24
0
26 Oct 2022
Bootstrapping meaning through listening: Unsupervised learning of spoken sentence embeddings
Jian Zhu
Zuoyu Tian
Yadong Liu
Cong Zhang
Chia-wen Lo
SSL
30
2
0
23 Oct 2022
A Textless Metric for Speech-to-Speech Comparison
Laurent Besacier
S. Ribeiro
Olivier Galibert
Ioan Calapodescu
33
5
0
21 Oct 2022
Pre-trained Speech Representations as Feature Extractors for Speech Quality Assessment in Online Conferencing Applications
Bastiaan Tamm
Helena Balabin
Rik Vandenberghe
Hugo Van hamme
36
9
0
01 Oct 2022
The Kriston AI System for the VoxCeleb Speaker Recognition Challenge 2022
Qutang Cai
Guoqiang Hong
Zhijian Ye
Ximin Li
Haizhou Li
33
7
0
23 Sep 2022
Effectiveness of Mining Audio and Text Pairs from Public Data for Improving ASR Systems for Low-Resource Languages
Kaushal Bhogale
A. Raman
Tahir Javed
Sumanth Doddapaneni
Anoop Kunchukuttan
Pratyush Kumar
Mitesh M. Khapra
19
22
0
26 Aug 2022
IndicSUPERB: A Speech Processing Universal Performance Benchmark for Indian languages
Tahir Javed
Kaushal Bhogale
A. Raman
Anoop Kunchukuttan
Pratyush Kumar
Mitesh M. Khapra
ELM
30
20
0
24 Aug 2022
Low-Level Physiological Implications of End-to-End Learning of Speech Recognition
Louise Coppieters de Gibson
Philip N. Garner
21
1
0
22 Aug 2022
FeaRLESS: Feature Refinement Loss for Ensembling Self-Supervised Learning Features in Robust End-to-end Speech Recognition
Szu-Jui Chen
Jiamin Xie
John H. L. Hansen
35
8
0
30 Jun 2022
Predicting within and across language phoneme recognition performance of self-supervised learning speech pre-trained models
Han Ji
T. Patel
O. Scharenborg
34
7
0
24 Jun 2022
Transformer-based Automatic Speech Recognition of Formal and Colloquial Czech in MALACH Project
Jan Lehecka
J. Psutka
Josef Psutka
8
4
0
15 Jun 2022
Exploring Capabilities of Monolingual Audio Transformers using Large Datasets in Automatic Speech Recognition of Czech
Jan Lehecka
J. Svec
A. Pražák
J. Psutka
14
12
0
15 Jun 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
128
349
0
21 May 2022
SAMU-XLSR: Semantically-Aligned Multimodal Utterance-level Cross-Lingual Speech Representation
Sameer Khurana
Antoine Laurent
James R. Glass
25
36
0
17 May 2022
Wav2Seq: Pre-training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages
Felix Wu
Kwangyoun Kim
Shinji Watanabe
Kyu Jeong Han
Ryan T. McDonald
Kilian Q. Weinberger
Yoav Artzi
SyDa
42
37
0
02 May 2022
ByT5 model for massively multilingual grapheme-to-phoneme conversion
Jian Zhu
Cong Zhang
David Jurgens
11
36
0
06 Apr 2022
Federated Self-supervised Speech Representations: Are We There Yet?
Yan Gao
Javier Fernandez-Marques
Titouan Parcollet
Abhinav Mehrotra
Nicholas D. Lane
32
13
0
06 Apr 2022
Previous
1
2
3
Next