ResearchTrend.AI
MultiMAE-DER: Multimodal Masked Autoencoder for Dynamic Emotion Recognition
arXiv:2404.18327

28 April 2024
Peihao Xiang
Chaohao Lin
Kaida Wu
Ou Bai

Papers citing "MultiMAE-DER: Multimodal Masked Autoencoder for Dynamic Emotion Recognition"

7 / 7 papers shown
Qieemo: Speech Is All You Need in the Emotion Recognition in Conversations
Jinming Chen
Jingyi Fang
Yuanzhong Zheng
Yaoxuan Wang
Haojun Fei
41
0
0
05 Mar 2025
Multimodal Emotion Recognition using Audio-Video Transformer Fusion with Cross Attention
Joe Dhanith
Shravan Venkatraman
Modigari Narendra
Vigya Sharma
Santhosh Malarvannan
67
0
0
20 Feb 2025
AV-SUPERB: A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models
Yuan Tseng
Layne Berry
Yi-Ting Chen
I-Hsiang Chiu
Hsuan-Hao Lin
...
Yu Tsao
Shinji Watanabe
Abdel-rahman Mohamed
Chi-Luen Feng
Hung-yi Lee
VLM
SSL
44
13
0
19 Sep 2023
Self-attention fusion for audiovisual emotion recognition with incomplete data
K. Chumachenko
Alexandros Iosifidis
M. Gabbouj
70
37
0
26 Jan 2022
A Pre-trained Audio-Visual Transformer for Emotion Recognition
Minh Tran
M. Soleymani
56
25
0
23 Jan 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
255
7,337
0
11 Nov 2021
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
Akira Fukui
Dong Huk Park
Daylen Yang
Anna Rohrbach
Trevor Darrell
Marcus Rohrbach
136
1,458
0
06 Jun 2016