MLCA-AVSR: Multi-Layer Cross Attention Fusion based Audio-Visual Speech Recognition

7 January 2024
He Wang
Pengcheng Guo
Pan Zhou
Lei Xie
arXiv:2401.03424

Papers citing "MLCA-AVSR: Multi-Layer Cross Attention Fusion based Audio-Visual Speech Recognition"

10 / 10 papers shown
DCIM-AVSR : Efficient Audio-Visual Speech Recognition via Dual Conformer Interaction Module
Xinyu Wang
Qian Wang
Haolin Huang
Yu Fang
Mengjie Xu
Qian Wang
31
0
0
31 Aug 2024
Learning Video Temporal Dynamics with Cross-Modal Attention for Robust Audio-Visual Speech Recognition
Sungnyun Kim
Kangwook Jang
Sangmin Bae
Hoirin Kim
Se-Young Yun
47
3
0
04 Jul 2024
MaLa-ASR: Multimedia-Assisted LLM-Based ASR
Guanrou Yang
Ziyang Ma
Fan Yu
Zhifu Gao
Shiliang Zhang
Xie Chen
AuLLM
41
2
0
09 Jun 2024
Enhancing Lip Reading with Multi-Scale Video and Multi-Encoder
He Wang
Pengcheng Guo
Xucheng Wan
Huan Zhou
Lei Xie
32
2
0
08 Apr 2024
A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition
Yusheng Dai
Hang Chen
Jun Du
Ruoyu Wang
Shihao Chen
Jie Ma
Haotian Wang
Chin-Hui Lee
45
4
0
07 Mar 2024
Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling and Reliability Scoring
Joanna Hong
Minsu Kim
J. Choi
Y. Ro
29
19
0
15 Mar 2023
TSUP Speaker Diarization System for Conversational Short-phrase Speaker Diarization Challenge
Bowen Pang
Huan Zhao
Gaosheng Zhang
Xiaoyue Yang
Yanguo Sun
Li Lyna Zhang
Qing Wang
Linfu Xie
BDL
23
2
0
26 Oct 2022
E-Branchformer: Branchformer with Enhanced merging for speech recognition
Kwangyoun Kim
Felix Wu
Yifan Peng
Jing Pan
Prashant Sridhar
Kyu Jeong Han
Shinji Watanabe
61
105
0
30 Sep 2022
End-to-end Audio-visual Speech Recognition with Conformers
Pingchuan Ma
Stavros Petridis
M. Pantic
84
225
0
12 Feb 2021
Intermediate Loss Regularization for CTC-based Speech Recognition
Jaesong Lee
Shinji Watanabe
118
135
0
05 Feb 2021