SynthVSR: Scaling Up Visual Speech Recognition With Synthetic Supervision

Computer Vision and Pattern Recognition (CVPR), 2023
30 March 2023
Xubo Liu
Egor Lakomkin
Konstantinos Vougioukas
Pingchuan Ma
Honglie Chen
Rui-Cang Xie
Morrie Doulaty
Niko Moritz
J. Kolár
Stavros Petridis
Maja Pantic
Christian Fuegen

Papers citing "SynthVSR: Scaling Up Visual Speech Recognition With Synthetic Supervision"

13 / 13 papers shown
Phoneme-Level Visual Speech Recognition via Point-Visual Fusion and Language Model Reconstruction
Matthew Kit Khinn Teng
Haibo Zhang
Takeshi Saitoh
116
1
0
25 Jul 2025
VALLR: Visual ASR Language Model for Lip Reading
Marshall Thomas
Edward Fish
Richard Bowden
279
6
0
27 Mar 2025
LipGen: Viseme-Guided Lip Video Generation for Enhancing Visual Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Bowen Hao
Dongliang Zhou
Xiaojie Li
Xingyu Zhang
Liang Xie
Yue Yu
Erwei Yin
394
6
0
08 Jan 2025
Unified Speech Recognition: A Single Model for Auditory, Visual, and Audiovisual Inputs
Neural Information Processing Systems (NeurIPS), 2024
A. Haliassos
Rodrigo Mira
Honglie Chen
Zoe Landgraf
Stavros Petridis
Maja Pantic
SSL
335
14
0
04 Nov 2024
Tailored Design of Audio-Visual Speech Recognition Models using Branchformers
David Gimeno-Gómez
Carlos David Martínez Hinarejos
413
5
0
09 Jul 2024
Contrastive Learning from Synthetic Audio Doppelgängers
International Conference on Learning Representations (ICLR), 2024
Manuel Cherep
Nikhil Singh
344
1
0
09 Jun 2024
Exploring the Impact of Synthetic Data for Aerial-view Human Detection
Hyungtae Lee
Yan Zhang
Yingzhe Shen
Heesung Kwon
Shuvra S. Bhattacharyya
318
2
0
24 May 2024
BRAVEn: Improving Self-Supervised Pre-training for Visual and Auditory Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
A. Haliassos
Andreas Zinonos
Rodrigo Mira
Stavros Petridis
Maja Pantic
VLM
SSL
AI4TS
235
21
0
02 Apr 2024
LiteVSR: Efficient Visual Speech Recognition by Learning from Speech Representations of Unlabeled Data
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Hendrik Laux
Emil Mededovic
Ahmed Hallawa
Lukas Martin
A. Peine
Anke Schmeink
VLM
134
7
0
15 Dec 2023
Do VSR Models Generalize Beyond LRS3?
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Y. A. D. Djilali
Sanath Narayan
Eustache Le Bihan
Haithem Boussaid
Ebtesam Almazrouei
Merouane Debbah
187
7
0
23 Nov 2023
AKVSR: Audio Knowledge Empowered Visual Speech Recognition by Compressing Audio Knowledge of a Pretrained Model
IEEE Transactions on Multimedia (IEEE TMM), 2023
Jeong Hun Yeo
Minsu Kim
J. Choi
Dae Hoe Kim
Y. Ro
184
26
0
15 Aug 2023
Visually-Aware Audio Captioning With Adaptive Audio-Visual Attention
Interspeech (Interspeech), 2022
Xubo Liu
Qiushi Huang
Xinhao Mei
Haohe Liu
Qiuqiang Kong
...
Yu Zhang
Lilian H. Y. Tang
Mark D. Plumbley
Volkan Kılıç
Wenwu Wang
394
25
0
28 Oct 2022
StyleGAN2 Distillation for Feed-forward Image Manipulation
European Conference on Computer Vision (ECCV), 2020
Yuri Viazovetskyi
V. Ivashkin
Evgenii Kashin
472
148
0
07 Mar 2020