2401.03424
Cited By
v3 (latest)
MLCA-AVSR: Multi-Layer Cross Attention Fusion based Audio-Visual Speech Recognition
7 January 2024
He Wang
Pengcheng Guo
Pan Zhou
Lei Xie
Github (1826★)
Papers citing "MLCA-AVSR: Multi-Layer Cross Attention Fusion based Audio-Visual Speech Recognition"
5 / 5 papers shown
DCIM-AVSR: Efficient Audio-Visual Speech Recognition via Dual Conformer Interaction Module
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Xinyu Wang
Qian Wang
Haolin Huang
Yu Fang
Mengjie Xu
31 Aug 2024
Learning Video Temporal Dynamics with Cross-Modal Attention for Robust Audio-Visual Speech Recognition
Sungnyun Kim
Kangwook Jang
Sangmin Bae
Hoirin Kim
Se-Young Yun
04 Jul 2024
MaLa-ASR: Multimedia-Assisted LLM-Based ASR
Guanrou Yang
Ziyang Ma
Fan Yu
Zhifu Gao
Shiliang Zhang
Xie Chen
AuLLM
09 Jun 2024
Enhancing Lip Reading with Multi-Scale Video and Multi-Encoder
He Wang
Pengcheng Guo
Xucheng Wan
Huan Zhou
Lei Xie
08 Apr 2024
A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition
Yusheng Dai
Hang Chen
Jun Du
Ruoyu Wang
Shihao Chen
Jie Ma
Haotian Wang
Chin-Hui Lee
07 Mar 2024