SynthVSR: Scaling Up Visual Speech Recognition With Synthetic Supervision

Computer Vision and Pattern Recognition (CVPR), 2023

30 March 2023

Xubo Liu

Egor Lakomkin

Konstantinos Vougioukas

Papers citing "SynthVSR: Scaling Up Visual Speech Recognition With Synthetic Supervision"

13 / 13 papers shown

Phoneme-Level Visual Speech Recognition via Point-Visual Fusion and Language Model Reconstruction

Matthew Kit Khinn Teng

Haibo Zhang

Takeshi Saitoh

139

25 Jul 2025

VALLR: Visual ASR Language Model for Lip Reading

Marshall Thomas

Edward Fish

Richard Bowden

294

27 Mar 2025

LipGen: Viseme-Guided Lip Video Generation for Enhancing Visual Speech Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025

403

08 Jan 2025

Unified Speech Recognition: A Single Model for Auditory, Visual, and Audiovisual Inputs

Neural Information Processing Systems (NeurIPS), 2024

342

04 Nov 2024

Tailored Design of Audio-Visual Speech Recognition Models using Branchformers

David Gimeno-Gómez

Carlos David Martínez Hinarejos

421

09 Jul 2024

Contrastive Learning from Synthetic Audio Doppelgängers

International Conference on Learning Representations (ICLR), 2024

Manuel Cherep

Nikhil Singh

364

09 Jun 2024

Exploring the Impact of Synthetic Data for Aerial-view Human Detection

Shuvra S. Bhattacharyya

324

24 May 2024

BRAVEn: Improving Self-Supervised Pre-training for Visual and Auditory Speech Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

258

02 Apr 2024

LiteVSR: Efficient Visual Speech Recognition by Learning from Speech Representations of Unlabeled Data

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Hendrik Laux

Anke Schmeink

138

15 Dec 2023

Do VSR Models Generalize Beyond LRS3?

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023

Sanath Narayan

Merouane Debbah

195

23 Nov 2023

AKVSR: Audio Knowledge Empowered Visual Speech Recognition by Compressing Audio Knowledge of a Pretrained Model

IEEE Transactions on Multimedia (IEEE TMM), 2023

Jeong Hun Yeo

187

15 Aug 2023

Visually-Aware Audio Captioning With Adaptive Audio-Visual Attention

Interspeech, 2022

...

414

28 Oct 2022

StyleGAN2 Distillation for Feed-forward Image Manipulation

European Conference on Computer Vision (ECCV), 2020

Yuri Viazovetskyi

V. Ivashkin

Evgenii Kashin

488

149

07 Mar 2020