ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1809.01728
  4. Cited By
Attention-based Audio-Visual Fusion for Robust Automatic Speech
  Recognition

Attention-based Audio-Visual Fusion for Robust Automatic Speech Recognition

5 September 2018
George Sterpu
Christian Saam
N. Harte
ArXivPDFHTML

Papers citing "Attention-based Audio-Visual Fusion for Robust Automatic Speech Recognition"

12 / 12 papers shown
Title
Uncovering the Visual Contribution in Audio-Visual Speech Recognition
Uncovering the Visual Contribution in Audio-Visual Speech Recognition
Zhaofeng Lin
Naomi Harte
73
1
0
20 Jan 2025
Tailored Design of Audio-Visual Speech Recognition Models using Branchformers
Tailored Design of Audio-Visual Speech Recognition Models using Branchformers
David Gimeno-Gómez
Carlos David Martínez Hinarejos
86
2
0
09 Jul 2024
Multilingual Audio-Visual Speech Recognition with Hybrid CTC/RNN-T Fast
  Conformer
Multilingual Audio-Visual Speech Recognition with Hybrid CTC/RNN-T Fast Conformer
Maxime Burchi
Krishna C. Puvvada
Jagadeesh Balam
Boris Ginsburg
Radu Timofte
25
7
0
14 Mar 2024
Trustworthy Sensor Fusion against Inaudible Command Attacks in Advanced
  Driver-Assistance System
Trustworthy Sensor Fusion against Inaudible Command Attacks in Advanced Driver-Assistance System
Jiwei Guan
Lei Pan
Chen Wang
Shui Yu
Longxiang Gao
Xi Zheng
AAML
19
3
0
30 May 2023
Streaming Audio-Visual Speech Recognition with Alignment Regularization
Streaming Audio-Visual Speech Recognition with Alignment Regularization
Pingchuan Ma
Niko Moritz
Stavros Petridis
Christian Fuegen
M. Pantic
23
2
0
03 Nov 2022
Learning Contextually Fused Audio-visual Representations for
  Audio-visual Speech Recognition
Learning Contextually Fused Audio-visual Representations for Audio-visual Speech Recognition
Zitian Zhang
Jie M. Zhang
Jian-Shu Zhang
Ming Wu
Xin Fang
Lirong Dai
SSL
22
10
0
15 Feb 2022
Audio-Visual Transformer Based Crowd Counting
Audio-Visual Transformer Based Crowd Counting
Usman Sajid
Xiangyu Chen
Hasan Sajid
Taejoon Kim
Guanghui Wang
ViT
29
22
0
04 Sep 2021
Attentive Fusion Enhanced Audio-Visual Encoding for Transformer Based
  Robust Speech Recognition
Attentive Fusion Enhanced Audio-Visual Encoding for Transformer Based Robust Speech Recognition
L. Wei
Jie M. Zhang
Junfeng Hou
Lirong Dai
4
14
0
06 Aug 2020
Multiresolution and Multimodal Speech Recognition with Transformers
Multiresolution and Multimodal Speech Recognition with Transformers
Georgios Paraskevopoulos
Srinivas Parthasarathy
Aparna Khare
Shiva Sundaram
18
29
0
29 Apr 2020
How to Teach DNNs to Pay Attention to the Visual Modality in Speech
  Recognition
How to Teach DNNs to Pay Attention to the Visual Modality in Speech Recognition
George Sterpu
Christian Saam
N. Harte
16
28
0
17 Apr 2020
Modality Attention for End-to-End Audio-visual Speech Recognition
Modality Attention for End-to-End Audio-visual Speech Recognition
Pan Zhou
Wenwen Yang
Wei Chen
Yanfeng Wang
Jia Jia
24
69
0
13 Nov 2018
Lip Reading Sentences in the Wild
Lip Reading Sentences in the Wild
Joon Son Chung
A. Senior
Oriol Vinyals
Andrew Zisserman
162
782
0
16 Nov 2016
1