Multiresolution and Multimodal Speech Recognition with Transformers

Annual Meeting of the Association for Computational Linguistics (ACL), 2020
29 April 2020
Georgios Paraskevopoulos
Srinivas Parthasarathy
Aparna Khare
Shiva Sundaram

Papers citing "Multiresolution and Multimodal Speech Recognition with Transformers"

15 / 15 papers shown
Robust Audiovisual Speech Recognition Models with Mixture-of-Experts
Spoken Language Technology Workshop (SLT), 2024
Yihan Wu
Yifan Peng
Yichen Lu
Xuankai Chang
Ruihua Song
Shinji Watanabe
343
7
0
19 Sep 2024
SynesLM: A Unified Approach for Audio-visual Speech Recognition and Translation via Language Model and Synthetic Data
Yichen Lu
Álvaro Huertas-García
Xuankai Chang
Hengwei Bian
Soumi Maiti
Shinji Watanabe
276
2
0
01 Aug 2024
Adaptation and Optimization of Automatic Speech Recognition (ASR) for the Maritime Domain in the Field of VHF Communication
Emin Cagatay Nakilcioglu
M. Reimann
O. John
129
7
0
01 Jun 2023
AVFormer: Injecting Vision into Frozen Speech Models for Zero-Shot AV-ASR
Computer Vision and Pattern Recognition (CVPR), 2023
Paul Hongsuck Seo
Arsha Nagrani
Cordelia Schmid
259
26
0
29 Mar 2023
Sample-Efficient Unsupervised Domain Adaptation of Speech Recognition Systems A case study for Modern Greek
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Georgios Paraskevopoulos
Theodoros Kouzelis
Georgios Rouvalis
Athanasios Katsamanis
Vassilis Katsouros
Alexandros Potamianos
VLM
357
14
0
31 Dec 2022
AVATAR: Unconstrained Audiovisual Speech Recognition
Interspeech, 2022
Valentin Gabeur
Paul Hongsuck Seo
Arsha Nagrani
Chen Sun
Alahari Karteek
Cordelia Schmid
179
17
0
15 Jun 2022
Improving Multimodal Speech Recognition by Data Augmentation and Speech Representations
Dan Oneaţă
H. Cucu
158
25
0
27 Apr 2022
ASR-Aware End-to-end Neural Diarization
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Aparna Khare
Eunjung Han
Yuguang Yang
A. Stolcke
288
16
0
02 Feb 2022
MMLatch: Bottom-up Top-down Fusion for Multimodal Sentiment Analysis
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Georgios Paraskevopoulos
Efthymios Georgiou
Alexandros Potamianos
170
42
0
24 Jan 2022
Transformers for prompt-level EMA non-response prediction
Supriya Nagesh
Alexander Moreno
Stephanie M Carpenter
Jamie Yap
Soujanya Chatterjee
...
Santosh Kumar
Cho Lam
D. Wetter
Inbal Nahum-Shani
James M. Rehg
128
1
0
01 Nov 2021
LocalTrans: A Multiscale Local Transformer Network for Cross-Resolution Homography Estimation
IEEE International Conference on Computer Vision (ICCV), 2021
Ruizhi Shao
Gaochang Wu
Yuemei Zhou
Ying Fu
Yebin Liu
ViT
291
56
0
08 Jun 2021
Detecting expressions with multimodal transformers
Spoken Language Technology Workshop (SLT), 2020
Srinivas Parthasarathy
Shiva Sundaram
286
34
0
30 Nov 2020
Self-Supervised learning with cross-modal transformers for emotion recognition
Spoken Language Technology Workshop (SLT), 2020
Aparna Khare
Srinivas Parthasarathy
Shiva Sundaram
SSL
195
45
0
20 Nov 2020
Training Strategies to Handle Missing Modalities for Audio-Visual Expression Recognition
Srinivas Parthasarathy
Shiva Sundaram
330
103
0
02 Oct 2020
Multi-modal embeddings using multi-task learning for emotion recognition
Interspeech, 2020
Aparna Khare
Srinivas Parthasarathy
Shiva Sundaram
141
21
0
10 Sep 2020