Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2002.02848
Cited By
Unsupervised pretraining transfers well across languages
7 February 2020
M. Rivière
Armand Joulin
Pierre-Emmanuel Mazaré
Emmanuel Dupoux
SSL
VLM
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Unsupervised pretraining transfers well across languages"
50 / 92 papers shown
Title
Voice Activity Projection Model with Multimodal Encoders
Takeshi Saga
Catherine Pelachaud
81
0
0
04 Jun 2025
What do self-supervised speech models know about Dutch? Analyzing advantages of language-specific pre-training
Marianne de Heer Kloots
Hosein Mohebbi
Charlotte Pouw
Gaofei Shen
Willem H. Zuidema
Martijn Bentum
SSL
52
0
0
01 Jun 2025
Visual Cues Enhance Predictive Turn-Taking for Two-Party Human Interaction
Sam O'Connor Russell
Naomi Harte
31
1
0
27 May 2025
Self-supervised learning method using multiple sampling strategies for general-purpose audio representation
Ibuki Kuroyanagi
Tatsuya Komatsu
SSL
11
2
0
25 May 2025
Yeah, Un, Oh: Continuous and Real-time Backchannel Prediction with Fine-tuning of Voice Activity Projection
K. Inoue
Divesh Lala
Gabriel Skantze
Tatsuya Kawahara
69
3
0
21 Oct 2024
Efficient Training of Self-Supervised Speech Foundation Models on a Compute Budget
Andy T. Liu
Yi-Cheng Lin
Haibin Wu
Stefan Winkler
Hung-yi Lee
119
3
0
09 Sep 2024
A Large-Scale Evaluation of Speech Foundation Models
Shu-Wen Yang
Heng-Jui Chang
Zili Huang
Andy T. Liu
Cheng-I Jeff Lai
...
Kushal Lakhotia
Shang-Wen Li
Abdelrahman Mohamed
Shinji Watanabe
Hung-yi Lee
92
26
0
15 Apr 2024
MiniSUPERB: Lightweight Benchmark for Self-supervised Speech Models
Yu-Hsiang Wang
Huan Chen
Kai-Wei Chang
Winston H. Hsu
Hung-yi Lee
115
7
0
30 May 2023
Comparison of Multilingual Self-Supervised and Weakly-Supervised Speech Pre-Training for Adaptation to Unseen Languages
Andrew Rouditchenko
Sameer Khurana
Samuel Thomas
Rogerio Feris
Leonid Karlinsky
Hilde Kuehne
David Harwath
Brian Kingsbury
James R. Glass
VLM
104
22
0
21 May 2023
AfroDigits: A Community-Driven Spoken Digit Dataset for African Languages
Chris C. Emezue
Sanchit Gandhi
Lewis Tunstall
Abubakar Abid
Josh Meyer
...
Douwe Kiela
Yacine Jernite
Julien Chaumond
Merve Noyan
Omar Sanseviero
55
2
0
22 Mar 2023
TriAAN-VC: Triple Adaptive Attention Normalization for Any-to-Any Voice Conversion
Hyun Joon Park
Seok Woo Yang
Jin Sob Kim
Wooseok Shin
S. W. Han
66
20
0
16 Mar 2023
Supervised Acoustic Embeddings And Their Transferability Across Languages
Sreepratha Ram
Hanan Aldarmaki
SSL
54
3
0
03 Jan 2023
Analysing Discrete Self Supervised Speech Representation for Spoken Language Modeling
Amitay Sicherman
Yossi Adi
88
37
0
02 Jan 2023
Disentangling Prosody Representations with Unsupervised Speech Reconstruction
Leyuan Qu
Taiha Li
C. Weber
Theresa Pekarek-Rosin
F. Ren
S. Wermter
70
10
0
14 Dec 2022
ASiT: Local-Global Audio Spectrogram vIsion Transformer for Event Classification
Sara Atito
Muhammad Awais
Wenwu Wang
Mark D. Plumbley
J. Kittler
ViT
55
11
0
23 Nov 2022
Self-supervised learning with bi-label masked speech prediction for streaming multi-talker speech recognition
Zili Huang
Zhuo Chen
Naoyuki Kanda
Jian Wu
Yiming Wang
Jinyu Li
Takuya Yoshioka
Xiaofei Wang
Peidong Wang
63
3
0
10 Nov 2022
Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing
Yonggan Fu
Yang Zhang
Kaizhi Qian
Zhifan Ye
Zhongzhi Yu
Cheng-I Jeff Lai
Yingyan Lin
161
9
0
02 Nov 2022
Audio Language Modeling using Perceptually-Guided Discrete Representations
Felix Kreuk
Yaniv Taigman
Adam Polyak
Jade Copet
Gabriel Synnaeve
Alexandre Défossez
Yossi Adi
83
4
0
02 Nov 2022
Self-supervised language learning from raw audio: Lessons from the Zero Resource Speech Challenge
Ewan Dunbar
Nicolas Hamilakis
Emmanuel Dupoux
SSL
80
30
0
27 Oct 2022
On the Utility of Self-supervised Models for Prosody-related Tasks
Guan-Ting Lin
Chiyu Feng
Wei-Ping Huang
Yuan Tseng
Tzu-Han Lin
Chen-An Li
Hung-yi Lee
Nigel G. Ward
63
51
0
13 Oct 2022
SpeechLM: Enhanced Speech Pre-Training with Unpaired Textual Data
Zi-Hua Zhang
Sanyuan Chen
Long Zhou
Yu Wu
Shuo Ren
...
Zhuoyuan Yao
Xun Gong
Lirong Dai
Jinyu Li
Furu Wei
79
57
0
30 Sep 2022
Transfer Learning of wav2vec 2.0 for Automatic Lyric Transcription
Longshen Ou
Xiangming Gu
Ye Wang
75
24
0
20 Jul 2022
RetrieverTTS: Modeling Decomposed Factors for Text-Based Speech Insertion
Dacheng Yin
Chuanxin Tang
Yanqing Liu
Xiaoqiang Wang
Zhiyuan Zhao
Yucheng Zhao
Zhiwei Xiong
Sheng Zhao
Chong Luo
83
12
0
28 Jun 2022
Predicting within and across language phoneme recognition performance of self-supervised learning speech pre-trained models
Han Ji
T. Patel
O. Scharenborg
134
8
0
24 Jun 2022
Variable-rate hierarchical CPC leads to acoustic unit discovery in speech
Santiago Cuervo
Adrian Lañcucki
R. Marxer
Paweł Rychlikowski
J. Chorowski
SSL
73
13
0
05 Jun 2022
Do self-supervised speech models develop human-like perception biases?
Juliette Millet
Ewan Dunbar
SSL
68
23
0
31 May 2022
Joint Training of Speech Enhancement and Self-supervised Model for Noise-robust ASR
Qiu-shi Zhu
Jie Zhang
Zitian Zhang
Lirong Dai
90
15
0
26 May 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
276
367
0
21 May 2022
Voice Activity Projection: Self-supervised Learning of Turn-taking Events
Erik Ekstedt
Gabriel Skantze
59
40
0
19 May 2022
SAMU-XLSR: Semantically-Aligned Multimodal Utterance-level Cross-Lingual Speech Representation
Sameer Khurana
Antoine Laurent
James R. Glass
65
37
0
17 May 2022
Automatic Data Augmentation Selection and Parametrization in Contrastive Self-Supervised Speech Representation Learning
Salah Zaiem
Titouan Parcollet
S. Essid
SSL
31
6
0
08 Apr 2022
Towards End-to-end Unsupervised Speech Recognition
Alexander H. Liu
Wei-Ning Hsu
Michael Auli
Alexei Baevski
SSL
83
74
0
05 Apr 2022
Probing phoneme, language and speaker information in unsupervised speech representations
Maureen de Seyssel
Marvin Lavechin
Yossi Adi
Emmanuel Dupoux
Guillaume Wisniewski
SSL
148
24
0
30 Mar 2022
XTREME-S: Evaluating Cross-lingual Speech Representations
Alexis Conneau
Ankur Bapna
Yu Zhang
Min Ma
Patrick von Platen
...
Orhan Firat
Michael Auli
Sebastian Ruder
Jason Riesa
Melvin Johnson
VLM
AILaw
ELM
153
22
0
21 Mar 2022
SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities
Hsiang-Sheng Tsai
Heng-Jui Chang
Wen-Chin Huang
Zili Huang
Kushal Lakhotia
...
Hsuan-Jui Chen
Shang-Wen Li
Shinji Watanabe
Abdel-rahman Mohamed
Hung-yi Lee
91
110
0
14 Mar 2022
Language Adaptive Cross-lingual Speech Representation Learning with Sparse Sharing Sub-networks
Yizhou Lu
Mingkun Huang
Xinghua Qu
Pengfei Wei
Zejun Ma
79
19
0
09 Mar 2022
Audio Self-supervised Learning: A Survey
Shuo Liu
Adria Mallol-Ragolta
Emilia Parada-Cabeleiro
Kun Qian
Xingshuo Jing
Alexander Kathan
Bin Hu
Bjoern W. Schuller
SSL
97
109
0
02 Mar 2022
A Brief Overview of Unsupervised Neural Speech Representation Learning
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
Lars Maaløe
Christian Igel
BDL
AI4TS
SSL
84
11
0
01 Mar 2022
Retriever: Learning Content-Style Representation as a Token-Level Bipartite Graph
Dacheng Yin
Xuanchi Ren
Chong Luo
Yuwang Wang
Zhiwei Xiong
Wenjun Zeng
114
13
0
24 Feb 2022
Speaker Normalization for Self-supervised Speech Emotion Recognition
Itai Gat
Hagai Aronowitz
Weizhong Zhu
E. Morais
R. Hoory
80
53
0
02 Feb 2022
A Noise-Robust Self-supervised Pre-training Model Based Speech Representation Learning for Automatic Speech Recognition
Qiu-shi Zhu
Jie Zhang
Zi-qiang Zhang
Ming Wu
Xin Fang
Lirong Dai
175
41
0
22 Jan 2022
XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale
Arun Babu
Changhan Wang
Andros Tjandra
Kushal Lakhotia
Qiantong Xu
...
Yatharth Saraf
J. Pino
Alexei Baevski
Alexis Conneau
Michael Auli
SSL
114
711
0
17 Nov 2021
Joint Unsupervised and Supervised Training for Multilingual ASR
Junwen Bai
Yue Liu
Yu Zhang
Ankur Bapna
Nikhil Siddhartha
K. Sim
Tara N. Sainath
78
59
0
15 Nov 2021
Membership Inference Attacks Against Self-supervised Speech Models
Wei-Cheng Tseng
Wei-Tsung Kao
Hung-yi Lee
96
14
0
09 Nov 2021
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing
Sanyuan Chen
Chengyi Wang
Zhengyang Chen
Yu-Huan Wu
Shujie Liu
...
Yao Qian
Jian Wu
Micheal Zeng
Xiangzhan Yu
Furu Wei
SSL
290
1,911
0
26 Oct 2021
Self-Supervised Representation Learning: Introduction, Advances and Challenges
Linus Ericsson
Henry Gouk
Chen Change Loy
Timothy M. Hospedales
SSL
OOD
AI4TS
88
279
0
18 Oct 2021
UniSpeech-SAT: Universal Speech Representation Learning with Speaker Aware Pre-Training
Sanyuan Chen
Yu Wu
Chengyi Wang
Zhengyang Chen
Zhuo Chen
...
Jian Wu
Yao Qian
Furu Wei
Jinyu Li
Xiangzhan Yu
SSL
74
93
0
12 Oct 2021
K-Wav2vec 2.0: Automatic Speech Recognition based on Joint Decoding of Graphemes and Syllables
Jounghee Kim
Pilsung Kang
VLM
35
6
0
11 Oct 2021
Injecting Text and Cross-lingual Supervision in Few-shot Learning from Self-Supervised Models
Sanjeev Khudanpur
Desh Raj
Sanjeev Khudanpur
93
6
0
10 Oct 2021
An Exploration of Self-Supervised Pretrained Representations for End-to-End Speech Recognition
Xuankai Chang
Takashi Maekaku
Pengcheng Guo
Jing Shi
Yen-Ju Lu
...
Tianzi Wang
Shu-Wen Yang
Yu Tsao
Hung-yi Lee
Shinji Watanabe
SSL
AI4TS
78
81
0
09 Oct 2021
1
2
Next