Looking Enhances Listening: Recovering Missing Speech Using Images

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
13 February 2020
Tejas Srinivasan
Ramon Sanabria
Florian Metze

Papers citing "Looking Enhances Listening: Recovering Missing Speech Using Images"

10 / 10 papers shown
VHASR: A Multimodal Speech Recognition System With Vision Hotwords
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Jiliang Hu
Zuchao Li
Ping Wang
Haojun Ai
Lefei Zhang
Hai Zhao
189
3
0
01 Oct 2024
AVFormer: Injecting Vision into Frozen Speech Models for Zero-Shot AV-ASR
Computer Vision and Pattern Recognition (CVPR), 2023
Paul Hongsuck Seo
Arsha Nagrani
Cordelia Schmid
201
24
0
29 Mar 2023
Multimodal Speech Recognition for Language-Guided Embodied Agents
Interspeech, 2023
Allen Chang
Xiaoyuan Zhu
Aarav Monga
Seoho Ahn
Tejas Srinivasan
Jesse Thomason
AuLLM
348
6
0
27 Feb 2023
AVATAR: Unconstrained Audiovisual Speech Recognition
Interspeech, 2022
Valentin Gabeur
Paul Hongsuck Seo
Arsha Nagrani
Chen Sun
Alahari Karteek
Cordelia Schmid
127
16
0
15 Jun 2022
Improving Multimodal Speech Recognition by Data Augmentation and Speech Representations
Dan Oneaţă
H. Cucu
121
24
0
27 Apr 2022
Adversarial Attacks on Speech Recognition Systems for Mission-Critical Applications: A Survey
Ngoc Dung Huynh
Mohamed Reda Bouadjenek
Imran Razzak
Kevin Lee
Chetan Arora
Ali Hassani
A. Zaslavsky
AAML
182
8
0
22 Feb 2022
Listen, Look and Deliberate: Visual context-aware speech recognition using pre-trained text-video representations
Shahram Ghorbani
Yashesh Gaur
Yu Shi
Jinyu Li
117
14
0
08 Nov 2020
Multimodal Speech Recognition with Unstructured Audio Masking
Tejas Srinivasan
Ramon Sanabria
Florian Metze
Desmond Elliott
CVBM
120
10
0
16 Oct 2020
Fine-Grained Grounding for Multimodal Speech Recognition
Findings, 2020
Tejas Srinivasan
Ramon Sanabria
Florian Metze
Desmond Elliott
161
11
0
05 Oct 2020
Experience Grounds Language
Yonatan Bisk
Ari Holtzman
Jesse Thomason
Jacob Andreas
Yoshua Bengio
...
Angeliki Lazaridou
Jonathan May
Aleksandr Nisnevich
Nicolas Pinto
Joseph P. Turian
534
403
0
21 Apr 2020