arXiv:2004.14840
Multiresolution and Multimodal Speech Recognition with Transformers
Annual Meeting of the Association for Computational Linguistics (ACL), 2020
29 April 2020
Georgios Paraskevopoulos
Srinivas Parthasarathy
Aparna Khare
Shiva Sundaram
Papers citing "Multiresolution and Multimodal Speech Recognition with Transformers"
15 / 15 papers shown
Robust Audiovisual Speech Recognition Models with Mixture-of-Experts
Spoken Language Technology Workshop (SLT), 2024
Yihan Wu
Yifan Peng
Yichen Lu
Xuankai Chang
Ruihua Song
Shinji Watanabe
343
7
0
19 Sep 2024
SynesLM: A Unified Approach for Audio-visual Speech Recognition and Translation via Language Model and Synthetic Data
Yichen Lu
Jiaqi Song
Xuankai Chang
Hengwei Bian
Soumi Maiti
Shinji Watanabe
276
2
0
01 Aug 2024
Adaptation and Optimization of Automatic Speech Recognition (ASR) for the Maritime Domain in the Field of VHF Communication
Emin Cagatay Nakilcioglu
M. Reimann
O. John
129
7
0
01 Jun 2023
AVFormer: Injecting Vision into Frozen Speech Models for Zero-Shot AV-ASR
Computer Vision and Pattern Recognition (CVPR), 2023
Paul Hongsuck Seo
Arsha Nagrani
Cordelia Schmid
259
26
0
29 Mar 2023
Sample-Efficient Unsupervised Domain Adaptation of Speech Recognition Systems: A Case Study for Modern Greek
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Georgios Paraskevopoulos
Theodoros Kouzelis
Georgios Rouvalis
Athanasios Katsamanis
Vassilis Katsouros
Alexandros Potamianos
VLM
357
14
0
31 Dec 2022
AVATAR: Unconstrained Audiovisual Speech Recognition
Interspeech, 2022
Valentin Gabeur
Paul Hongsuck Seo
Arsha Nagrani
Chen Sun
Karteek Alahari
Cordelia Schmid
179
17
0
15 Jun 2022
Improving Multimodal Speech Recognition by Data Augmentation and Speech Representations
Dan Oneaţă
H. Cucu
158
25
0
27 Apr 2022
ASR-Aware End-to-end Neural Diarization
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Aparna Khare
Eunjung Han
Yuguang Yang
A. Stolcke
288
16
0
02 Feb 2022
MMLatch: Bottom-up Top-down Fusion for Multimodal Sentiment Analysis
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Georgios Paraskevopoulos
Efthymios Georgiou
Alexandros Potamianos
170
42
0
24 Jan 2022
Transformers for prompt-level EMA non-response prediction
Supriya Nagesh
Alexander Moreno
Stephanie M Carpenter
Jamie Yap
Soujanya Chatterjee
...
Santosh Kumar
Cho Lam
D. Wetter
Inbal Nahum-Shani
James M. Rehg
128
1
0
01 Nov 2021
LocalTrans: A Multiscale Local Transformer Network for Cross-Resolution Homography Estimation
IEEE International Conference on Computer Vision (ICCV), 2021
Ruizhi Shao
Gaochang Wu
Yuemei Zhou
Ying Fu
Yebin Liu
ViT
291
56
0
08 Jun 2021
Detecting expressions with multimodal transformers
Spoken Language Technology Workshop (SLT), 2020
Srinivas Parthasarathy
Shiva Sundaram
286
34
0
30 Nov 2020
Self-Supervised learning with cross-modal transformers for emotion recognition
Spoken Language Technology Workshop (SLT), 2020
Aparna Khare
Srinivas Parthasarathy
Shiva Sundaram
SSL
195
45
0
20 Nov 2020
Training Strategies to Handle Missing Modalities for Audio-Visual Expression Recognition
Srinivas Parthasarathy
Shiva Sundaram
330
103
0
02 Oct 2020
Multi-modal embeddings using multi-task learning for emotion recognition
Interspeech, 2020
Aparna Khare
Srinivas Parthasarathy
Shiva Sundaram
141
21
0
10 Sep 2020