ResearchTrend.AI
A vector quantized masked autoencoder for speech emotion recognition


21 April 2023
Samir Sadok
Simon Leglaive
Renaud Séguier

Papers citing "A vector quantized masked autoencoder for speech emotion recognition"

8 / 8 papers shown
Multimodal Emotion Recognition using Audio-Video Transformer Fusion with Cross Attention
Joe Dhanith
Shravan Venkatraman
Modigari Narendra
Vigya Sharma
Santhosh Malarvannan
72
0
0
20 Feb 2025
MEGA: Masked Generative Autoencoder for Human Mesh Recovery
Guénolé Fiche
Simon Leglaive
Xavier Alameda-Pineda
Francesc Moreno-Noguer
3DH
54
1
0
29 May 2024
Adaptive Speech Emotion Representation Learning Based On Dynamic Graph
Yingxue Gao
Huan Zhao
Zixing Zhang
36
1
0
07 May 2024
MultiMAE-DER: Multimodal Masked Autoencoder for Dynamic Emotion Recognition
Peihao Xiang
Chaohao Lin
Kaida Wu
Ou Bai
22
3
0
28 Apr 2024
A vector quantized masked autoencoder for audiovisual speech emotion recognition
Samir Sadok
Simon Leglaive
Renaud Séguier
SSL
68
6
0
05 May 2023
Self-attention fusion for audiovisual emotion recognition with incomplete data
K. Chumachenko
Alexandros Iosifidis
M. Gabbouj
70
37
0
26 Jan 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
258
7,337
0
11 Nov 2021
VoxCeleb2: Deep Speaker Recognition
Joon Son Chung
Arsha Nagrani
Andrew Zisserman
214
2,224
0
14 Jun 2018