2203.07378
Dawn of the transformer era in speech emotion recognition: closing the valence gap
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
14 March 2022
Johannes Wagner
Andreas Triantafyllopoulos
H. Wierstorf
Maximilian Schmitt
Felix Burkhardt
F. Eyben
Björn W. Schuller
Papers citing
"Dawn of the transformer era in speech emotion recognition: closing the valence gap"
50 / 130 papers shown
Title
SAR-LM: Symbolic Audio Reasoning with Large Language Models
Termeh Taheri
Yinghao Ma
Emmanouil Benetos
AuLLM
LRM
130
0
0
09 Nov 2025
Transformer Redesign for Late Fusion of Audio-Text Features on Ultra-Low-Power Edge Hardware
Stavros Mitsis
Ermos Hadjikyriakos
Humaid Ibrahim
Savvas Neofytou
Shashwat Raman
James Myles
Eiman Kanjo
74
0
0
20 Oct 2025
Switchboard-Affect: Emotion Perception Labels from Conversational Speech
Amrit Romana
Jaya Narain
Tien Dung Tran
Andrea Davis
Jason Fong
Ramya Rasipuram
Vikramjit Mitra
60
1
0
14 Oct 2025
Improving Speech Emotion Recognition with Mutual Information Regularized Generative Model
Chung-Soo Ahn
R. Rana
Sunil Sivadas
Carlos Busso
Jagath Rajapakse
65
0
0
11 Oct 2025
Deceptive Exploration in Multi-armed Bandits
I. Arda Vurankaya
Mustafa O. Karabag
Wesley A Suttle
Jesse Milzman
David Fridovich-Keil
Ufuk Topcu
60
0
0
09 Oct 2025
SEER: The Span-based Emotion Evidence Retrieval Benchmark
Aneesha Sampath
Oya Aran
E. Provost
RALM
LRM
132
0
0
03 Oct 2025
SynchroRaMa: Lip-Synchronized and Emotion-Aware Talking Face Generation via Multi-Modal Emotion Embedding
Phyo Thet Yee
D. Kollias
Sudeepta Mishra
Abhinav Dhall
VGen
88
2
0
24 Sep 2025
More Similar than Dissimilar: Modeling Annotators for Cross-Corpus Speech Emotion Recognition
James Tavernor
E. Provost
64
0
0
15 Sep 2025
Joint Effects of Argumentation Theory, Audio Modality and Data Enrichment on LLM-Based Fallacy Classification
Hongxu Zhou
Hylke Westerdijk
Khondoker Ittehadul Islam
36
0
0
14 Sep 2025
Emoanti: audio anti-deepfake with refined emotion-guided representations
Xiaokang Li
Yicheng Gong
Dinghao Zou
Xin Cao
Sunbowen Lee
80
0
0
13 Sep 2025
The MSP-Podcast Corpus
John H. L. Hansen
Reza Lotfian
K. Sridhar
Ali N. Salman
Wei-Cheng Lin
...
Abinay Reddy Naini
Seong-Gyun Leem
Luz Martinez-Lucas
Huang-Cheng Chou
Pravin Mote
68
3
0
11 Sep 2025
Speech-Based Depressive Mood Detection in the Presence of Multiple Sclerosis: A Cross-Corpus and Cross-Lingual Study
Monica Gonzalez-Machorro
U. Reichel
Pascal Hecker
Helly Hammer
Hesam Sagha
F. Eyben
Robert Hoepner
Björn Schuller
32
1
0
25 Aug 2025
EmoTale: An Enacted Speech-emotion Dataset in Danish
Maja J. Hjuler
Harald V. Skat-Rørdam
Line H. Clemmensen
Sneha Das
60
1
0
20 Aug 2025
EmoSLLM: Parameter-Efficient Adaptation of LLMs for Speech Emotion Recognition
Hugo Thimonier
Antony Perzo
Renaud Seguier
96
1
0
19 Aug 2025
RankList -- A Listwise Preference Learning Framework for Predicting Subjective Preferences
Abinay Reddy Naini
Fernando Diaz
John H. L. Hansen
72
0
0
13 Aug 2025
ProMode: A Speech Prosody Model Conditioned on Acoustic and Textual Inputs
Eray Eren
Qingju Liu
Hyeongwoo Kim
Pablo Garrido
Abeer Alwan
78
0
0
12 Aug 2025
Incorporating Contextual Paralinguistic Understanding in Large Speech-Language Models
Qiongqiong Wang
Hardik B. Sailor
Jeremy H.M Wong
Tianchi Liu
Shuo Sun
Wenyu Zhang
Muhammad Huzaifah
Nancy F. Chen
Ai Ti Aw
AuLLM
79
1
0
10 Aug 2025
Charting 15 years of progress in deep learning for speech emotion recognition: A replication study
Andreas Triantafyllopoulos
A. Batliner
B. Schuller
AI4TS
125
0
0
04 Aug 2025
HateClipSeg: A Segment-Level Annotated Dataset for Fine-Grained Hate Video Detection
Huaimin Wang
Zhuoran Wang
Roy Ka-wei Lee
VLM
92
1
0
03 Aug 2025
Unifying Listener Scoring Scales: Comparison Learning Framework for Speech Quality Assessment and Continuous Speech Emotion Recognition
Cheng-Hung Hu
Yusuke Yasuda
Akifumi Yoshimoto
Tomoki Toda
160
0
0
18 Jul 2025
InstructTTSEval: Benchmarking Complex Natural-Language Instruction Following in Text-to-Speech Systems
Kexin Huang
Qian Tu
Liwei Fan
Chenchen Yang
Dong Zhang
Shimin Li
Zhaoye Fei
Qinyuan Cheng
Xipeng Qiu
147
4
0
19 Jun 2025
MELT: Towards Automated Multimodal Emotion Data Annotation by Leveraging LLM Embedded Knowledge
Xin Jing
Jiadong Wang
Iosif Tsangko
Andreas Triantafyllopoulos
Björn Schuller
144
0
0
30 May 2025
Learning Annotation Consensus for Continuous Emotion Recognition
Ibrahim Shoer
E. Erzin
182
0
0
27 May 2025
Rhapsody: A Dataset for Highlight Detection in Podcasts
Younghan Park
Anuj Diwan
David Harwath
Eunsol Choi
184
0
0
26 May 2025
Contrastive Distillation of Emotion Knowledge from LLMs for Zero-Shot Emotion Recognition
Minxue Niu
E. Provost
VLM
290
0
0
23 May 2025
Bridging Speech Emotion Recognition and Personality: Dataset and Temporal Interaction Condition Network
Yuan Gao
Hao Shi
Yahui Fu
Chenhui Chu
Tatsuya Kawahara
208
0
0
20 May 2025
Vox-Profile: A Speech Foundation Model Benchmark for Characterizing Diverse Speaker and Speech Traits
Tiantian Feng
Jihwan Lee
Anfeng Xu
Yoonjeong Lee
Thanathai Lertpetchpun
...
Thomas Thebaud
Laureano Moro-Velazquez
D. Byrd
Najim Dehak
Zengyi Qin
210
9
0
20 May 2025
Contextual Paralinguistic Data Creation for Multi-Modal Speech-LLM: Data Condensation and Spoken QA Generation
Qiongqiong Wang
Hardik B. Sailor
Tianchi Liu
Ai Ti Aw
232
4
0
19 May 2025
Multimodal Emotion Coupling via Speech-to-Facial and Bodily Gestures in Dyadic Interaction
Von Ralph Dane Marquez Herbuela
Yukie Nagai
CVBM
75
0
0
08 May 2025
BLAB: Brutally Long Audio Bench
Orevaoghene Ahia
Martijn Bartelds
Kabir Ahuja
Hila Gonen
Valentin Hofmann
...
Noah Bennett
Shinji Watanabe
Noah A. Smith
Yulia Tsvetkov
Sachin Kumar
AuLLM
LM&MA
VLM
402
2
0
05 May 2025
BERSting at the Screams: A Benchmark for Distanced, Emotional and Shouted Speech Recognition
Computer Speech and Language (CSL), 2025
Paige Tuttosi
Mantaj Dhillon
Luna Sang
Shane Eastwood
Poorvi Bhatia
Quang Minh Dinh
Avni Kapoor
Yewon Jin
Angelica Lim
295
2
0
30 Apr 2025
Spatiotemporal Emotional Synchrony in Dyadic Interactions: The Role of Speech Conditions in Facial and Vocal Affective Alignment
Von Ralph Dane Marquez Herbuela
Yukie Nagai
114
0
0
29 Apr 2025
Affect Models Have Weak Generalizability to Atypical Speech
Jaya Narain
Amrit Romana
Vikramjit Mitra
Colin S. Lea
Shirley Ren
136
0
0
22 Apr 2025
Exploring Local Interpretable Model-Agnostic Explanations for Speech Emotion Recognition with Distribution-Shift
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Maja J. Hjuler
Line H. Clemmensen
Sneha Das
FAtt
262
3
0
07 Apr 2025
Interactive Multimodal Fusion with Temporal Modeling
Jun-chen Yu
Yongqi Wang
Lei Wang
Yang Zheng
Shengfan Xu
193
4
0
13 Mar 2025
Scaling Rich Style-Prompted Text-to-Speech Datasets
Anuj Diwan
Zhisheng Zheng
David Harwath
Eunsol Choi
CLIP
VLM
322
12
0
06 Mar 2025
Efficient Finetuning for Dimensional Speech Emotion Recognition in the Age of Transformers
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Aneesha Sampath
James Tavernor
E. Provost
287
3
0
17 Feb 2025
autrainer: A Modular and Extensible Deep Learning Toolkit for Computer Audition Tasks
Simon Rampp
Andreas Triantafyllopoulos
M. Milling
Björn Schuller
454
1
0
16 Dec 2024
EmoSphere++: Emotion-Controllable Zero-Shot Text-to-Speech via Emotion-Adaptive Spherical Vector
IEEE Transactions on Affective Computing (IEEE Trans. Affective Comput.), 2024
Deok-Hyeon Cho
Hyung-Seok Oh
Seung-Bin Kim
Seong-Whan Lee
317
21
0
04 Nov 2024
NTU-NPU System for Voice Privacy 2024 Challenge
Nikita Kuzmin
Hieu-Thi Luong
Jixun Yao
Lei Xie
Kong Aik Lee
Eng Siong Chng
210
5
0
03 Oct 2024
Emotional Dimension Control in Language Model-Based Text-to-Speech: Spanning a Broad Spectrum of Human Emotions
Kun Zhou
You Zhang
Shengkui Zhao
Hao Wang
Zexu Pan
...
Chongjia Ni
Yukun Ma
Trung Hieu Nguyen
J. Yip
Bin Ma
220
10
0
25 Sep 2024
Stimulus Modality Matters: Impact of Perceptual Evaluations from Different Modalities on Speech Emotion Recognition System Performance
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Huang-Cheng Chou
Haibin Wu
Hung-yi Lee
Chi-Chun Lee
310
3
0
16 Sep 2024
Dynamics of Collective Group Affect: Group-level Annotations and the Multimodal Modeling of Convergence and Divergence
N. Prabhu
Maria Tsfasman
Catharine Oertel
Timo Gerkmann
N. Lehmann-Willenbrock
104
2
0
13 Sep 2024
Enhancing Emotional Text-to-Speech Controllability with Natural Language Guidance through Contrastive Learning and Diffusion Models
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Xin Jing
Kun Zhou
Andreas Triantafyllopoulos
Björn W. Schuller
DiffM
156
6
0
10 Sep 2024
Privacy versus Emotion Preservation Trade-offs in Emotion-Preserving Speaker Anonymization
Spoken Language Technology Workshop (SLT), 2024
Zexin Cai
Lin Zhang
Ashi Garg
Leibny Paola García-Perera
Kevin Duh
Sanjeev Khudanpur
Nicholas Andrews
83
10
0
05 Sep 2024
Wav2Small: Distilling Wav2Vec2 to 72K parameters for Low-Resource Speech emotion recognition
Dionyssos Kounadis-Bastian
Oliver Schrufer
Anna Derington
H. Wierstorf
F. Eyben
Felix Burkhardt
Björn Schuller
209
1
0
25 Aug 2024
The Whole Is Bigger Than the Sum of Its Parts: Modeling Individual Annotators to Capture Emotional Variability
Interspeech (Interspeech), 2024
James Tavernor
Yara S. El-Tawil
E. Provost
129
3
0
21 Aug 2024
Audio Enhancement for Computer Audition -- An Iterative Training Paradigm Using Sample Importance
Journal of Computational Science and Technology (JCST), 2024
M. Milling
Shuo Liu
Andreas Triantafyllopoulos
Ilhan Aslan
Björn W. Schuller
240
4
0
12 Aug 2024
Conditioning LLMs with Emotion in Neural Machine Translation
International Workshop on Spoken Language Translation (IWSLT), 2024
Charles Brazier
Jean-Luc Rouas
CVBM
192
2
0
06 Aug 2024
Describe Where You Are: Improving Noise-Robustness for Speech Emotion Recognition with Text Description of the Environment
IEEE Transactions on Affective Computing (IEEE Trans. Affective Comput.), 2024
Seong-Gyun Leem
Daniel Fulford
J. Onnela
David Gard
John H. L. Hansen
183
2
0
25 Jul 2024