
Dawn of the transformer era in speech emotion recognition: closing the valence gap

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022

14 March 2022

Johannes Wagner

Andreas Triantafyllopoulos

Björn W. Schuller


Papers citing "Dawn of the transformer era in speech emotion recognition: closing the valence gap"

30 / 130 papers shown

Speech-based Age and Gender Prediction with Transformers Felix Burkhardt Johannes Wagner H. Wierstorf F. Eyben Björn Schuller 117 26 0 29 Jun 2023
Cross-Language Speech Emotion Recognition Using Multimodal Dual Attention Transformers Syed Muhammad Talha Zaidi S. Latif Junaid Qadir 214 15 0 23 Jun 2023
Exploring Attention Mechanisms for Multimodal Emotion Recognition in an Emergency Call Center Corpus. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023. Théo Deschamps-Berger L. Lamel Laurence Devillers 133 11 0 12 Jun 2023
PEFT-SER: On the Use of Parameter Efficient Transfer Learning Approaches For Speech Emotion Recognition Using Pre-trained Speech Models. Affective Computing and Intelligent Interaction (ACII), 2023. Tiantian Feng Shrikanth Narayanan 227 40 0 08 Jun 2023
In-the-wild Speech Emotion Conversion Using Disentangled Self-Supervised Representations and Neural Vocoder-based Resynthesis N. Prabhu N. Lehmann-Willenbrock Timo Gerkmann 155 4 0 02 Jun 2023
Leveraging Semantic Information for Efficient Self-Supervised Emotion Recognition with Audio-Textual Distilled Models. Interspeech, 2023. Danilo de Oliveira N. Prabhu Timo Gerkmann 108 10 0 30 May 2023
TrustSER: On the Trustworthiness of Fine-tuning Pre-trained Speech Embeddings For Speech Emotion Recognition. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023. Tiantian Feng Rajat Hebbar Shrikanth Narayanan 151 9 0 18 May 2023
The MuSe 2023 Multimodal Sentiment Analysis Challenge: Mimicked Emotions, Cross-Cultural Humour, and Personalisation Lukas Christ Shahin Amiriparian Alice Baird Alexander Kathan Niklas Muller ... Eva-Maria Messner Andreas Konig Alan S. Cowen Xiaoshi Zhong Björn W. Schuller 287 31 0 05 May 2023
Exploring Emerging Technologies for Requirements Elicitation Interview Training: Empirical Assessment of Robotic and Virtual Tutors Binnur Görer Fatma Başak Aydemir 221 0 0 28 Apr 2023
Lightweight Toxicity Detection in Spoken Language: A Transformer-based Approach for Edge Devices Ahlam Husni Abu Nada S. Latif Junaid Qadir 100 6 0 22 Apr 2023
Transformers in Speech Processing: A Survey S. Latif Aun Zaidi Heriberto Cuayáhuitl Fahad Shamshad Moazzam Shoukat Muhammad Usama Junaid Qadir 396 66 0 21 Mar 2023
SpeechFormer++: A Hierarchical Efficient Framework for Paralinguistic Speech Processing. IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023. Weidong Chen Xiaofen Xing Xiangmin Xu Jianxin Pang Lan Du 145 66 0 27 Feb 2023
A Vector Quantized Approach for Text to Speech Synthesis on Real-World Spontaneous Speech. AAAI Conference on Artificial Intelligence (AAAI), 2023. Li-Wei Chen Shinji Watanabe Alexander I. Rudnicky 104 46 0 08 Feb 2023
HEAR4Health: A blueprint for making computer audition a staple of modern healthcare Andreas Triantafyllopoulos Alexander Kathan Alice Baird Lukas Christ Alexander Gebhard ... Shahin Amiriparian K. D. Bartl-Pokorny A. Batliner Florian B. Pokorny Björn W. Schuller 200 10 0 25 Jan 2023
A Review of Speech-centric Trustworthy Machine Learning: Privacy, Safety, and Fairness. APSIPA Transactions on Signal and Information Processing (TASIP), 2022. Tiantian Feng Rajat Hebbar Nicholas Mehlman Xuan Shi Aditya Kommineni Shrikanth Narayanan 208 36 0 18 Dec 2022
Improving Speech Emotion Recognition with Unsupervised Speaking Style Transfer. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022. Leyuan Qu Wei Wang C. Weber F. Ren Taiha Li S. Wermter 192 4 0 16 Nov 2022
A unified one-shot prosody and speaker conversion system with self-supervised discrete speech units. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022. Li-Wei Chen Shinji Watanabe Alexander I. Rudnicky 147 8 0 12 Nov 2022
A Comparative Study of Data Augmentation Techniques for Deep Learning Based Emotion Recognition Ravi Shankar Abdouh Harouna Kenfack Arjun Somayazulu A. Venkataraman 77 5 0 09 Nov 2022
Fast Yet Effective Speech Emotion Recognition with Self-distillation. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022. Zhao Ren Thanh Tam Nguyen Yi Chang Björn W. Schuller 101 16 0 26 Oct 2022
An Overview of Affective Speech Synthesis and Conversion in the Deep Learning Era. Proceedings of the IEEE (Proc. IEEE), 2022. Andreas Triantafyllopoulos Björn W. Schuller Gökçe İymen M. Sezgin Xiangheng He ... Shuo Liu Silvan Mertes Elisabeth André Ruibo Fu Jianhua Tao 254 84 0 06 Oct 2022
Towards Multimodal Prediction of Spontaneous Humour: A Novel Dataset and First Results. IEEE Transactions on Affective Computing (IEEE TAC), 2022. Lukas Christ Shahin Amiriparian Alexander Kathan Niklas Muller Andreas Konig Björn W. Schuller 254 9 0 28 Sep 2022
An Efficient Multitask Learning Architecture for Affective Vocal Burst Analysis Tobias Hallmen Silvan Mertes Dominik Schiller Elisabeth André 103 5 0 28 Sep 2022
Self-Relation Attention and Temporal Awareness for Emotion Recognition via Vocal Burst. Italian National Conference on Sensors (INS), 2022. Dang-Linh Trinh Minh-Cong Vo Gueesang Lee 164 3 0 15 Sep 2022
Self-Supervised Attention Networks and Uncertainty Loss Weighting for Multi-Task Emotion Recognition on Vocal Bursts Vincent Karas Andreas Triantafyllopoulos Meishu Song Björn W. Schuller 136 4 0 15 Sep 2022
Exploring the Effectiveness of Self-supervised Learning and Classifier Chains in Emotion Recognition of Nonverbal Vocalizations Detai Xin Shinnosuke Takamichi Hiroshi Saruwatari 84 15 0 21 Jun 2022
COVYT: Introducing the Coronavirus YouTube and TikTok speech dataset featuring the same speakers with and without infection Andreas Triantafyllopoulos A. Semertzidou Meishu Song Florian B. Pokorny Björn W. Schuller 192 2 0 20 Jun 2022
Words are all you need? Language as an approximation for human similarity judgments. International Conference on Learning Representations (ICLR), 2022. Raja Marjieh Pol van Rijn Ilia Sucholutsky T. Sumers Harin Lee Thomas Griffiths Nori Jacoby 186 22 0 08 Jun 2022
Probing Speech Emotion Recognition Transformers for Linguistic Knowledge. Interspeech, 2022. Andreas Triantafyllopoulos Johannes Wagner H. Wierstorf Maximilian Schmitt U. Reichel F. Eyben Felix Burkhardt Björn W. Schuller 265 33 0 01 Apr 2022
Multistage linguistic conditioning of convolutional layers for speech emotion recognition Andreas Triantafyllopoulos U. Reichel Shuo Liu Simon Huber F. Eyben Björn W. Schuller 182 16 0 13 Oct 2021
Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019. Andy T. Liu Shu-Wen Yang Po-Han Chi Po-Chun Hsu Hung-yi Lee SSL 408 388 0 25 Oct 2019