Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2305.09212
Cited By
Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition
16 May 2023
Yuchen Hu
Ruizhe Li
Chen Chen
Heqing Zou
Qiu-shi Zhu
E. Chng
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition"
7 / 7 papers shown
Title
Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation
Sungnyun Kim
Sungwoo Cho
Sangmin Bae
Kangwook Jang
Se-Young Yun
SSL
68
1
0
23 Jan 2025
Unifying Speech Enhancement and Separation with Gradient Modulation for End-to-End Noise-Robust Speech Separation
Yuchen Hu
Chen Chen
Heqing Zou
Xionghu Zhong
Chng Eng Siong
45
16
0
22 Feb 2023
Robust Data2vec: Noise-robust Speech Representation Learning for ASR by Combining Regression and Improved Contrastive Learning
Qiu-shi Zhu
Long Zhou
Jie M. Zhang
Shujie Liu
Yu-Chen Hu
Lirong Dai
VLM
SSL
48
36
0
27 Oct 2022
A Noise-Robust Self-supervised Pre-training Model Based Speech Representation Learning for Automatic Speech Recognition
Qiu-shi Zhu
Jie M. Zhang
Zi-qiang Zhang
Ming Wu
Xin Fang
Lirong Dai
112
39
0
22 Jan 2022
End-to-end Audio-visual Speech Recognition with Conformers
Pingchuan Ma
Stavros Petridis
M. Pantic
79
221
0
12 Feb 2021
Improved Baselines with Momentum Contrastive Learning
Xinlei Chen
Haoqi Fan
Ross B. Girshick
Kaiming He
SSL
238
3,359
0
09 Mar 2020
Lip Reading Sentences in the Wild
Joon Son Chung
A. Senior
Oriol Vinyals
Andrew Zisserman
162
782
0
16 Nov 2016
1