ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2110.13900
  4. Cited By
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech
  Processing

WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing

26 October 2021
Sanyuan Chen
Chengyi Wang
Zhengyang Chen
Yu-Huan Wu
Shujie Liu
Zhuo Chen
Jinyu Li
Naoyuki Kanda
Takuya Yoshioka
Xiong Xiao
Jian Wu
Long Zhou
Shuo Ren
Y. Qian
Yao Qian
Jian Wu
Micheal Zeng
Xiangzhan Yu
Furu Wei
    SSL
ArXivPDFHTML

Papers citing "WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing"

19 / 1,019 papers shown
Title
Automatic speaker verification spoofing and deepfake detection using
  wav2vec 2.0 and data augmentation
Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation
Hemlata Tak
Massimiliano Todisco
Xin Wang
Jee-weon Jung
Junichi Yamagishi
Nicholas W. D. Evans
9
150
0
24 Feb 2022
Multimodal Emotion Recognition using Transfer Learning from Speaker
  Recognition and BERT-based models
Multimodal Emotion Recognition using Transfer Learning from Speaker Recognition and BERT-based models
Sarala Padi
S. O. Sadjadi
Dinesh Manocha
Ram D. Sriram
14
34
0
16 Feb 2022
textless-lib: a Library for Textless Spoken Language Processing
textless-lib: a Library for Textless Spoken Language Processing
Eugene Kharitonov
Jade Copet
Kushal Lakhotia
Tu Nguyen
Paden Tomasello
...
A. Elkahky
Wei-Ning Hsu
Abdel-rahman Mohamed
Emmanuel Dupoux
Yossi Adi
17
32
0
15 Feb 2022
data2vec: A General Framework for Self-supervised Learning in Speech,
  Vision and Language
data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language
Alexei Baevski
Wei-Ning Hsu
Qiantong Xu
Arun Babu
Jiatao Gu
Michael Auli
SSL
VLM
ViT
19
823
0
07 Feb 2022
Self-Supervised Representation Learning for Speech Using Visual
  Grounding and Masked Language Modeling
Self-Supervised Representation Learning for Speech Using Visual Grounding and Masked Language Modeling
Puyuan Peng
David F. Harwath
SSL
20
26
0
07 Feb 2022
Robust Self-Supervised Audio-Visual Speech Recognition
Robust Self-Supervised Audio-Visual Speech Recognition
Bowen Shi
Wei-Ning Hsu
Abdel-rahman Mohamed
8
90
0
05 Jan 2022
Multi-Variant Consistency based Self-supervised Learning for Robust
  Automatic Speech Recognition
Multi-Variant Consistency based Self-supervised Learning for Robust Automatic Speech Recognition
Changfeng Gao
Gaofeng Cheng
Pengyuan Zhang
15
4
0
23 Dec 2021
Self-Supervised Learning for speech recognition with Intermediate layer
  supervision
Self-Supervised Learning for speech recognition with Intermediate layer supervision
Chengyi Wang
Yu-Huan Wu
Sanyuan Chen
Shujie Liu
Jinyu Li
Yao Qian
Zhenglu Yang
SSL
11
24
0
16 Dec 2021
A Fine-tuned Wav2vec 2.0/HuBERT Benchmark For Speech Emotion
  Recognition, Speaker Verification and Spoken Language Understanding
A Fine-tuned Wav2vec 2.0/HuBERT Benchmark For Speech Emotion Recognition, Speaker Verification and Spoken Language Understanding
Yingzhi Wang
Abdelmoumene Boumadane
A. Heba
13
123
0
04 Nov 2021
SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language
  Processing
SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing
Junyi Ao
Rui Wang
Long Zhou
Chengyi Wang
Shuo Ren
...
Yu Zhang
Zhihua Wei
Yao Qian
Jinyu Li
Furu Wei
110
148
0
14 Oct 2021
Pretext Tasks selection for multitask self-supervised speech
  representation learning
Pretext Tasks selection for multitask self-supervised speech representation learning
Salah Zaiem
Titouan Parcollet
S. Essid
Abdel Heba
SSL
12
13
0
01 Jul 2021
Signal Transformer: Complex-valued Attention and Meta-Learning for
  Signal Recognition
Signal Transformer: Complex-valued Attention and Meta-Learning for Signal Recognition
Yihong Dong
Ying Peng
Muqiao Yang
Songtao Lu
Qingjiang Shi
32
8
0
05 Jun 2021
A Review of Speaker Diarization: Recent Advances with Deep Learning
A Review of Speaker Diarization: Recent Advances with Deep Learning
Tae Jin Park
Naoyuki Kanda
Dimitrios Dimitriadis
Kyu Jeong Han
Shinji Watanabe
Shrikanth Narayanan
VLM
260
323
0
24 Jan 2021
Interspeech 2021 Deep Noise Suppression Challenge
Interspeech 2021 Deep Noise Suppression Challenge
Chandan K. A. Reddy
Harishchandra Dubey
K. Koishida
A. Nair
Vishak Gopal
Ross Cutler
Sebastian Braun
H. Gamper
R. Aichner
Sriram Srinivasan
AI4CE
66
160
0
06 Jan 2021
Bayesian HMM clustering of x-vector sequences (VBx) in speaker
  diarization: theory, implementation and analysis on standard tasks
Bayesian HMM clustering of x-vector sequences (VBx) in speaker diarization: theory, implementation and analysis on standard tasks
Federico Landini
Jan Profant
Mireia Díez
L. Burget
203
198
0
29 Dec 2020
Don't shoot butterfly with rifles: Multi-channel Continuous Speech
  Separation with Early Exit Transformer
Don't shoot butterfly with rifles: Multi-channel Continuous Speech Separation with Early Exit Transformer
Sanyuan Chen
Yu-Huan Wu
Zhuo Chen
Takuya Yoshioka
Shujie Liu
Jinyu Li
16
25
0
23 Oct 2020
Multi-task self-supervised learning for Robust Speech Recognition
Multi-task self-supervised learning for Robust Speech Recognition
Mirco Ravanelli
Jianyuan Zhong
Santiago Pascual
P. Swietojanski
João Monteiro
J. Trmal
Yoshua Bengio
SSL
171
288
0
25 Jan 2020
End-to-End Neural Speaker Diarization with Permutation-Free Objectives
End-to-End Neural Speaker Diarization with Permutation-Free Objectives
Yusuke Fujita
Naoyuki Kanda
Shota Horiguchi
Kenji Nagamatsu
Shinji Watanabe
145
242
0
12 Sep 2019
VoxCeleb2: Deep Speaker Recognition
VoxCeleb2: Deep Speaker Recognition
Joon Son Chung
Arsha Nagrani
Andrew Zisserman
206
1,954
0
14 Jun 2018
Previous
123...192021