ResearchTrend.AI
Listen, Look and Deliberate: Visual context-aware speech recognition using pre-trained text-video representations

8 November 2020
Shahram Ghorbani
Yashesh Gaur
Yu Shi
Jinyu Li

Papers citing "Listen, Look and Deliberate: Visual context-aware speech recognition using pre-trained text-video representations"

3 / 3 papers shown
Title
AVFormer: Injecting Vision into Frozen Speech Models for Zero-Shot AV-ASR
Paul Hongsuck Seo
Arsha Nagrani
Cordelia Schmid
29
15
0
29 Mar 2023
Can Visual Context Improve Automatic Speech Recognition for an Embodied Agent?
Pradip Pramanick
Chayan Sarkar
21
7
0
21 Oct 2022
Improving Multimodal Speech Recognition by Data Augmentation and Speech Representations
Dan Oneaţă
H. Cucu
19
19
0
27 Apr 2022