Multimodal Speech Recognition with Unstructured Audio Masking

16 October 2020

Papers citing "Multimodal Speech Recognition with Unstructured Audio Masking"

VHASR: A Multimodal Speech Recognition System With Vision Hotwords

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Jiliang Hu

Zuchao Li

Ping Wang

Haojun Ai

Lefei Zhang

Hai Zhao

01 Oct 2024

VILAS: Exploring the Effects of Vision and Language Context in Automatic Speech Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Minglun Han

Bo Xu

31 May 2023

Multimodal Speech Recognition for Language-Guided Embodied Agents

Interspeech, 2023

27 Feb 2023

Improving Multimodal Speech Recognition by Data Augmentation and Speech Representations

Dan Oneaţă

H. Cucu

27 Apr 2022

Adversarial Attacks on Speech Recognition Systems for Mission-Critical Applications: A Survey

Ngoc Dung Huynh

Mohamed Reda Bouadjenek

Imran Razzak

22 Feb 2022

MMLatch: Bottom-up Top-down Fusion for Multimodal Sentiment Analysis

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Georgios Paraskevopoulos

Efthymios Georgiou

Alexandros Potamianos

24 Jan 2022

Text is no more Enough! A Benchmark for Profile-based Spoken Language Understanding

AAAI Conference on Artificial Intelligence (AAAI), 2021

Linlin Li

22 Dec 2021

Detecting Hate Speech in Memes Using Multimodal Deep Learning Approaches: Prize-winning solution to Hateful Memes Challenge

Riza Velioglu

J. Rose

VLM

23 Dec 2020