Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2111.09296
Cited By
XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale
17 November 2021
Arun Babu
Changhan Wang
Andros Tjandra
Kushal Lakhotia
Qiantong Xu
Naman Goyal
Kritika Singh
Patrick von Platen
Yatharth Saraf
J. Pino
Alexei Baevski
Alexis Conneau
Michael Auli
SSL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale"
14 / 114 papers shown
Title
How Does Pre-trained Wav2Vec 2.0 Perform on Domain Shifted ASR? An Extensive Benchmark on Air Traffic Control Communications
Juan Pablo Zuluaga
Amrutha Prasad
Iuliia Nigmatulina
Seyyed Saeed Sarfjoo
P. Motlícek
Matthias Kleinert
H. Helmke
Oliver Ohneiser
Qingran Zhan
19
43
0
31 Mar 2022
SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks
Kai-Wei Chang
Wei-Cheng Tseng
Shang-Wen Li
Hung-yi Lee
22
22
0
31 Mar 2022
Robust Speaker Recognition with Transformers Using wav2vec 2.0
Sergey Novoselov
G. Lavrentyeva
Anastasia Avdeeva
V. Volokhov
Aleksei Gusev
ViT
11
18
0
28 Mar 2022
Leveraging unsupervised and weakly-supervised data to improve direct speech-to-speech translation
Ye Jia
Yifan Ding
Ankur Bapna
Colin Cherry
Yu Zhang
Alexis Conneau
Nobuyuki Morioka
39
20
0
24 Mar 2022
The Vicomtech Audio Deepfake Detection System based on Wav2Vec2 for the 2022 ADD Challenge
Juan M. Martín-Donas
Aitor Álvarez
30
98
0
03 Mar 2022
Audio Self-supervised Learning: A Survey
Shuo Liu
Adria Mallol-Ragolta
Emilia Parada-Cabeleiro
Kun Qian
Xingshuo Jing
Alexander Kathan
Bin Hu
Bjoern W. Schuller
SSL
35
106
0
02 Mar 2022
Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation
Hemlata Tak
Massimiliano Todisco
Xin Wang
Jee-weon Jung
Junichi Yamagishi
Nicholas W. D. Evans
32
151
0
24 Feb 2022
mSLAM: Massively multilingual joint pre-training for speech and text
Ankur Bapna
Colin Cherry
Yu Zhang
Ye Jia
Melvin Johnson
Yong Cheng
Simran Khanuja
Jason Riesa
Alexis Conneau
VLM
19
111
0
03 Feb 2022
Speech Resources in the Tamasheq Language
Marcely Zanon Boito
Fethi Bougares
Florentin Barbier
Souhir Gahbiche
Loïc Barrault
Mickael Rouvier
Yannick Esteve
28
14
0
13 Jan 2022
Signal Transformer: Complex-valued Attention and Meta-Learning for Signal Recognition
Yihong Dong
Ying Peng
Muqiao Yang
Songtao Lu
Qingjiang Shi
40
9
0
05 Jun 2021
Larger-Scale Transformers for Multilingual Masked Language Modeling
Naman Goyal
Jingfei Du
Myle Ott
Giridhar Anantharaman
Alexis Conneau
90
98
0
02 May 2021
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
317
5,775
0
29 Apr 2021
Pushing the Limits of Semi-Supervised Learning for Automatic Speech Recognition
Yu Zhang
James Qin
Daniel S. Park
Wei Han
Chung-Cheng Chiu
Ruoming Pang
Quoc V. Le
Yonghui Wu
VLM
SSL
146
308
0
20 Oct 2020
Multi-task self-supervised learning for Robust Speech Recognition
Mirco Ravanelli
Jianyuan Zhong
Santiago Pascual
P. Swietojanski
João Monteiro
J. Trmal
Yoshua Bengio
SSL
189
288
0
25 Jan 2020
Previous
1
2
3