arXiv:2111.02735 (v3, latest)
A Fine-tuned Wav2vec 2.0/HuBERT Benchmark For Speech Emotion Recognition, Speaker Verification and Spoken Language Understanding
4 November 2021
Yingzhi Wang
Abdelmoumene Boumadane
A. Heba
Papers citing "A Fine-tuned Wav2vec 2.0/HuBERT Benchmark For Speech Emotion Recognition, Speaker Verification and Spoken Language Understanding"
33 / 83 papers shown
MFSN: Multi-perspective Fusion Search Network For Pre-training Knowledge in Speech Emotion Recognition
Interspeech, 2023
Haiyang Sun
Fulin Zhang
Yingying Gao
Zheng Lian
Shilei Zhang
Junlan Feng
154
7
0
12 Jun 2023
Speech Self-Supervised Representation Benchmarking: Are We Doing it Right?
Interspeech, 2023
Salah Zaiem
Youcef Kemiche
Titouan Parcollet
S. Essid
Mirco Ravanelli
SSL
212
33
0
01 Jun 2023
CIF-PT: Bridging Speech and Text Representations for Spoken Language Understanding via Continuous Integrate-and-Fire Pre-Training
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Linhao Dong
Zhecheng An
Peihao Wu
Jun Zhang
Lu Lu
Zejun Ma
108
6
0
27 May 2023
Can ChatGPT Detect Intent? Evaluating Large Language Models for Spoken Language Understanding
Interspeech, 2023
Mutian He
Philip N. Garner
ELM
AI4MH
LRM
253
35
0
22 May 2023
Recycle-and-Distill: Universal Compression Strategy for Transformer-based Speech SSL Models with Attention Map Reusing and Masking Distillation
Interspeech, 2023
Kangwook Jang
Sungnyun Kim
Se-Young Yun
Hoi-Rim Kim
307
7
0
19 May 2023
The Interpreter Understands Your Meaning: End-to-end Spoken Language Understanding Aided by Speech Translation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Mutian He
Philip N. Garner
326
5
0
16 May 2023
Self-supervised Neural Factor Analysis for Disentangling Utterance-level Speech Representations
International Conference on Machine Learning (ICML), 2023
Wei-wei Lin
Chenhang He
Man-Wai Mak
Youzhi Tu
171
6
0
14 May 2023
Fast Conformer with Linearly Scalable Attention for Efficient Speech Recognition
Automatic Speech Recognition & Understanding (ASRU), 2023
Dima Rekesh
Nithin Rao Koluguri
Samuel Kriman
Somshubra Majumdar
Vahid Noroozi
...
Oleksii Hrinchuk
Krishna Puvvada
Ankur Kumar
Jagadeesh Balam
Boris Ginsburg
330
144
0
08 May 2023
A vector quantized masked autoencoder for audiovisual speech emotion recognition
Computer Vision and Image Understanding (CVIU), 2023
Samir Sadok
Simon Leglaive
Renaud Séguier
SSL
472
11
0
05 May 2023
A vector quantized masked autoencoder for speech emotion recognition
Samir Sadok
Simon Leglaive
Renaud Séguier
245
26
0
21 Apr 2023
Efficient Sequence Transduction by Jointly Predicting Tokens and Durations
International Conference on Machine Learning (ICML), 2023
Hainan Xu
Fei Jia
Somshubra Majumdar
Hengguan Huang
Shinji Watanabe
Boris Ginsburg
187
44
0
13 Apr 2023
Designing and Evaluating Speech Emotion Recognition Systems: A reality check case study with IEMOCAP
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Nikolaos Antoniou
Athanasios Katsamanis
Theodoros Giannakopoulos
Shrikanth Narayanan
180
24
0
03 Apr 2023
A Hierarchical Regression Chain Framework for Affective Vocal Burst Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Jinchao Li
Xixin Wu
Kaitao Song
Dongsheng Li
Xunying Liu
Helen M. Meng
144
2
0
14 Mar 2023
Skit-S2I: An Indian Accented Speech to Intent dataset
Shangeth Rajaa
Swaraj Dalmia
Kumarmanas Nethil
183
6
0
26 Dec 2022
Disentangling Prosody Representations with Unsupervised Speech Reconstruction
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Leyuan Qu
Taihao Li
C. Weber
Theresa Pekarek-Rosin
F. Ren
S. Wermter
242
16
0
14 Dec 2022
Parameter Efficient Transfer Learning for Various Speech Processing Tasks
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Shinta Otake
Rei Kawakami
Nakamasa Inoue
176
21
0
06 Dec 2022
Bidirectional Representations for Low Resource Spoken Language Understanding
Applied Sciences (Appl. Sci.), 2022
Quentin Meeus
Marie-Francine Moens
Hugo Van hamme
188
2
0
24 Nov 2022
Multi-Label Training for Text-Independent Speaker Identification
Yuqi Xue
153
0
0
14 Nov 2022
Speech-based emotion recognition with self-supervised models using attentive channel-wise correlations and label smoothing
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Sofoklis Kakouros
Themos Stafylakis
Ladislav Mošner
L. Burget
141
19
0
03 Nov 2022
Phoneme Segmentation Using Self-Supervised Speech Models
Spoken Language Technology Workshop (SLT), 2022
Luke Strgar
David Harwath
SSL
171
13
0
02 Nov 2022
Predicting Multi-Codebook Vector Quantization Indexes for Knowledge Distillation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Liyong Guo
Xiaoyu Yang
Quandong Wang
Yuxiang Kong
Zengwei Yao
...
Wei Kang
Long Lin
Mingshuang Luo
Piotr Żelasko
Daniel Povey
VLM
188
10
0
31 Oct 2022
Application of Knowledge Distillation to Multi-task Speech Representation Learning
Interspeech, 2022
Mine Kerpicci
V. Nguyen
Shuhua Zhang
Erik M. Visser
185
0
0
29 Oct 2022
SUPERB @ SLT 2022: Challenge on Generalization and Efficiency of Self-Supervised Speech Representation Learning
Spoken Language Technology Workshop (SLT), 2022
Tzu-hsun Feng
Annie Dong
Ching-Feng Yeh
Shu-Wen Yang
Tzu-Quan Lin
...
Xuankai Chang
Shinji Watanabe
Abdel-rahman Mohamed
Shang-Wen Li
Hung-yi Lee
ELM
SSL
243
38
0
16 Oct 2022
Training speech emotion classifier without categorical annotations
Meysam Shamsi
Marie Tahon
209
2
0
14 Oct 2022
An Efficient Multitask Learning Architecture for Affective Vocal Burst Analysis
Tobias Hallmen
Silvan Mertes
Dominik Schiller
Elisabeth André
126
5
0
28 Sep 2022
Exploring the Effectiveness of Self-supervised Learning and Classifier Chains in Emotion Recognition of Nonverbal Vocalizations
Detai Xin
Shinnosuke Takamichi
Hiroshi Saruwatari
97
15
0
21 Jun 2022
Self-Supervised Speech Representation Learning: A Review
IEEE Journal on Selected Topics in Signal Processing (IEEE JSTSP), 2022
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
668
444
0
21 May 2022
Hierarchical Softmax for End-to-End Low-resource Multilingual Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Qianying Liu
Zhuo Gong
Zhengdong Yang
Yuhang Yang
Sheng Li
...
Nobuaki Minematsu
Hao-Ming Huang
Fei Cheng
Chenhui Chu
Sadao Kurohashi
177
10
0
08 Apr 2022
MTI-Net: A Multi-Target Speech Intelligibility Prediction Model
Interspeech, 2022
Ryandhimas E. Zezario
Szu-Wei Fu
Fei Chen
C. Fuh
Hsin-Min Wang
Yu Tsao
277
17
0
07 Apr 2022
Probing Speech Emotion Recognition Transformers for Linguistic Knowledge
Interspeech, 2022
Andreas Triantafyllopoulos
Johannes Wagner
H. Wierstorf
Maximilian Schmitt
U. Reichel
F. Eyben
Felix Burkhardt
Björn W. Schuller
326
33
0
01 Apr 2022
Visualizations of Complex Sequences of Family-Infant Vocalizations Using Bag-of-Audio-Words Approach Based on Wav2vec 2.0 Features
Jialu Li
M. Hasegawa-Johnson
Nancy L. McElwain
122
1
0
29 Mar 2022
Dawn of the transformer era in speech emotion recognition: closing the valence gap
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Johannes Wagner
Andreas Triantafyllopoulos
H. Wierstorf
Maximilian Schmitt
Felix Burkhardt
F. Eyben
Björn W. Schuller
389
409
0
14 Mar 2022
Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019
Andy T. Liu
Shu-Wen Yang
Po-Han Chi
Po-Chun Hsu
Hung-yi Lee
SSL
482
393
0
25 Oct 2019