ResearchTrend.AI
Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings

8 April 2021
L. Pepino
Pablo Riera
Luciana Ferrer

Papers citing "Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings"

38 / 38 papers shown
Multimodal Assessment of Classroom Discourse Quality: A Text-Centered Attention-Based Multi-Task Learning Approach
Ruikun Hou
B. Bühler
Tim Fütterer
Efe Bozkir
Peter Gerjets
Ulrich Trautwein
Enkelejda Kasneci
26
0
0
12 May 2025
Exploring Local Interpretable Model-Agnostic Explanations for Speech Emotion Recognition with Distribution-Shift
Maja J. Hjuler
Line H. Clemmensen
Sneha Das
FAtt
44
0
0
07 Apr 2025
Heterogeneous bimodal attention fusion for speech emotion recognition
Jiachen Luo
Huy Phan
Lin Wang
Joshua Reiss
44
0
0
09 Mar 2025
Personalized Speech Emotion Recognition in Human-Robot Interaction using Vision Transformers
Ruchik Mishra
Andrew Frye
M. M. Rayguru
Dan O. Popa
35
1
0
16 Sep 2024
The Unreliability of Acoustic Systems in Alzheimer's Speech Datasets with Heterogeneous Recording Conditions
L. Gauder
Pablo Riera
A. Slachevsky
G. Forno
Adolfo M. Garcia
Luciana Ferrer
16
1
0
11 Sep 2024
Exploring Self-Supervised Multi-view Contrastive Learning for Speech Emotion Recognition with Limited Annotations
Bulat Khaertdinov
Pedro Jeuris
Annanda Sousa
Enrique Hortal
25
1
0
12 Jun 2024
A Large-Scale Evaluation of Speech Foundation Models
Shu-Wen Yang
Heng-Jui Chang
Zili Huang
Andy T. Liu
Cheng-I Jeff Lai
...
Kushal Lakhotia
Shang-Wen Li
Abdelrahman Mohamed
Shinji Watanabe
Hung-yi Lee
38
19
0
15 Apr 2024
Beyond the Labels: Unveiling Text-Dependency in Paralinguistic Speech Recognition Datasets
Jan Pešán
Santosh Kesiraju
Lukáš Burget
Jan "Honza" Černocký
19
0
0
12 Mar 2024
emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation
Ziyang Ma
Zhisheng Zheng
Jiaxin Ye
Jinchao Li
Zhifu Gao
Shiliang Zhang
Xie Chen
MDE
SLR
SSL
25
85
0
23 Dec 2023
Advancing Audio Emotion and Intent Recognition with Large Pre-Trained Models and Bayesian Inference
Dejan Porjazovski
Yaroslav Getman
Tamás Grósz
M. Kurimo
28
3
0
16 Oct 2023
Test-Time Training for Speech
Sri Harsha Dumpala
Chandramouli Shama Sastry
Sageev Oore
35
1
0
19 Sep 2023
Leveraging Label Information for Multimodal Emotion Recognition
Pei-Hsin Wang
Sunlu Zeng
Junqing Chen
Lu Fan
Meng Chen
Youzheng Wu
Xiaodong He
27
4
0
05 Sep 2023
MFSN: Multi-perspective Fusion Search Network For Pre-training Knowledge in Speech Emotion Recognition
Haiyang Sun
Fulin Zhang
Yingying Gao
Zheng Lian
Shilei Zhang
Junlan Feng
19
4
0
12 Jun 2023
Recycle-and-Distill: Universal Compression Strategy for Transformer-based Speech SSL Models with Attention Map Reusing and Masking Distillation
Kangwook Jang
Sungnyun Kim
Se-Young Yun
Hoi-Rim Kim
24
5
0
19 May 2023
A multimodal dynamical variational autoencoder for audiovisual speech representation learning
Samir Sadok
Simon Leglaive
Laurent Girin
Xavier Alameda-Pineda
Renaud Séguier
16
11
0
05 May 2023
A vector quantized masked autoencoder for audiovisual speech emotion recognition
Samir Sadok
Simon Leglaive
Renaud Séguier
SSL
79
6
0
05 May 2023
Designing and Evaluating Speech Emotion Recognition Systems: A reality check case study with IEMOCAP
Nikolaos Antoniou
Athanasios Katsamanis
Theodoros Giannakopoulos
Shrikanth Narayanan
19
17
0
03 Apr 2023
Leveraging TCN and Transformer for effective visual-audio fusion in continuous emotion recognition
Weiwei Zhou
Jiada Lu
Zhaolong Xiong
Weifeng Wang
19
28
0
15 Mar 2023
Knowledge-aware Bayesian Co-attention for Multimodal Emotion Recognition
Zihan Zhao
Yu Wang
Yanfeng Wang
14
18
0
20 Feb 2023
Generative Models for Improved Naturalness, Intelligibility, and Voicing of Whispered Speech
Dominik Wagner
Sebastian P. Bayerl
H. A. C. Maruri
Tobias Bocklet
19
7
0
04 Dec 2022
Predicting Multi-Codebook Vector Quantization Indexes for Knowledge Distillation
Liyong Guo
Xiaoyu Yang
Quandong Wang
Yuxiang Kong
Zengwei Yao
...
Wei Kang
Long Lin
Mingshuang Luo
Piotr Żelasko
Daniel Povey
VLM
21
7
0
31 Oct 2022
Multilevel Transformer For Multimodal Emotion Recognition
Junyi He
Meimei Wu
Meng Li
Xiaobo Zhu
Feng Ye
8
6
0
26 Oct 2022
Training speech emotion classifier without categorical annotations
Meysam Shamsi
Marie Tahon
18
2
0
14 Oct 2022
Self-Supervised Attention Networks and Uncertainty Loss Weighting for Multi-Task Emotion Recognition on Vocal Bursts
Vincent Karas
Andreas Triantafyllopoulos
Meishu Song
Björn W. Schuller
24
4
0
15 Sep 2022
Fully Automated End-to-End Fake Audio Detection
Chenglong Wang
Jiangyan Yi
J. Tao
Haiyang Sun
Xun Chen
Zhengkun Tian
Haoxin Ma
Cunhang Fan
Ruibo Fu
24
28
0
20 Aug 2022
Transfer Learning of wav2vec 2.0 for Automatic Lyric Transcription
Longshen Ou
Xiangming Gu
Ye Wang
25
21
0
20 Jul 2022
The Influence of Dataset Partitioning on Dysfluency Detection Systems
Sebastian P. Bayerl
Dominik Wagner
Elmar Nöth
Tobias Bocklet
K. Riedhammer
28
20
0
07 Jun 2022
Learning Speech Emotion Representations in the Quaternion Domain
E. Guizzo
Tillman Weyde
Simone Scardapane
Danilo Comminiello
19
18
0
05 Apr 2022
Introducing ECAPA-TDNN and Wav2Vec2.0 Embeddings to Stuttering Detection
S. A. Sheikh
Md. Sahidullah
F. Hirsch
Slim Ouni
14
17
0
04 Apr 2022
Anti-Spoofing Using Transfer Learning with Variational Information Bottleneck
Youngsik Eom
Yeonghyeon Lee
Ji Sub Um
Hoi-Rim Kim
24
25
0
04 Apr 2022
Visualizations of Complex Sequences of Family-Infant Vocalizations Using Bag-of-Audio-Words Approach Based on Wav2vec 2.0 Features
Jialu Li
M. Hasegawa-Johnson
Nancy L. McElwain
16
0
0
29 Mar 2022
The Vicomtech Audio Deepfake Detection System based on Wav2Vec2 for the 2022 ADD Challenge
Juan M. Martín-Donas
Aitor Álvarez
30
98
0
03 Mar 2022
Towards a Common Speech Analysis Engine
Hagai Aronowitz
Itai Gat
E. Morais
Weizhong Zhu
R. Hoory
12
3
0
01 Mar 2022
Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation
Hemlata Tak
Massimiliano Todisco
Xin Wang
Jee-weon Jung
Junichi Yamagishi
Nicholas W. D. Evans
29
151
0
24 Feb 2022
Speaker Normalization for Self-supervised Speech Emotion Recognition
Itai Gat
Hagai Aronowitz
Weizhong Zhu
E. Morais
R. Hoory
25
50
0
02 Feb 2022
Exploring Wav2vec 2.0 fine-tuning for improved speech emotion recognition
Li-Wei Chen
Alexander I. Rudnicky
VLM
8
121
0
12 Oct 2021
Fine-tuning wav2vec2 for speaker recognition
Nik Vaessen
David A. van Leeuwen
34
107
0
30 Sep 2021
Contrastive Unsupervised Learning for Speech Emotion Recognition
Mao Li
Bo Yang
Joshua Levy
A. Stolcke
Viktor Rozgic
Spyros Matsoukas
C. Papayiannis
Daniel Bone
Chao Wang
SSL
26
47
0
12 Feb 2021