arXiv:2212.04970
Masked Lip-Sync Prediction by Audio-Visual Contextual Exploitation in Transformers
9 December 2022
Yasheng Sun
Hang Zhou
Kaisiyuan Wang
Qianyi Wu
Zhibin Hong
Jingtuo Liu
Errui Ding
Jingdong Wang
Ziwei Liu
Koike Hideki
Papers citing "Masked Lip-Sync Prediction by Audio-Visual Contextual Exploitation in Transformers"
12 / 12 papers shown
Title
GenSync: A Generalized Talking Head Framework for Audio-driven Multi-Subject Lip-Sync using 3D Gaussian Splatting
Anushka Agarwal
Muhammad Yusuf Hassan
Talha Chafekar
3DGS
19
0
0
03 May 2025
FREAK: Frequency-modulated High-fidelity and Real-time Audio-driven Talking Portrait Synthesis
Ziqi Ni
Ao Fu
Yi Zhou
61
0
0
06 Mar 2025
GLDiTalker: Speech-Driven 3D Facial Animation with Graph Latent Diffusion Transformer
Yihong Lin
Zhaoxin Fan
Lingyu Xiong
Liang Peng
Xiandong Li
Wenxiong Kang
Xianjia Wu
Songju Lei
Huang Xu
34
3
0
03 Aug 2024
GMTalker: Gaussian Mixture-based Audio-Driven Emotional Talking Video Portraits
Yibo Xia
Lizhen Wang
Xiang Deng
Xiaoyan Luo
Yunhong Wang
Yebin Liu
VGen
33
1
0
12 Dec 2023
Audio-Driven 3D Facial Animation from In-the-Wild Videos
Liying Lu
Tianke Zhang
Yunfei Liu
Xuangeng Chu
Yu Li
VGen
40
3
0
20 Jun 2023
MyStyle++: A Controllable Personalized Generative Prior
Libing Zeng
Lele Chen
Yinghao Xu
N. Kalantari
24
4
0
08 Jun 2023
Memories are One-to-Many Mapping Alleviators in Talking Face Generation
Anni Tang
Tianyu He
Xuejiao Tan
Jun Ling
Liang Song
CVBM
10
23
0
09 Dec 2022
EAMM: One-Shot Emotional Talking Face via Audio-Based Emotion-Aware Motion Model
Xinya Ji
Hang Zhou
Kaisiyuan Wang
Qianyi Wu
Wayne Wu
Feng Xu
Xun Cao
CVBM
50
157
0
30 May 2022
One-shot Talking Face Generation from Single-speaker Audio-Visual Correlation Learning
Suzhe Wang
Lincheng Li
Yueqing Ding
Xin Yu
CVBM
59
116
0
06 Dec 2021
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
258
7,337
0
11 Nov 2021
VoxCeleb2: Deep Speaker Recognition
Joon Son Chung
Arsha Nagrani
Andrew Zisserman
214
2,224
0
14 Jun 2018
Image-to-Image Translation with Conditional Adversarial Networks
Phillip Isola
Jun-Yan Zhu
Tinghui Zhou
Alexei A. Efros
SSeg
212
19,191
0
21 Nov 2016