2212.04248
Cited By
Talking Head Generation with Probabilistic Audio-to-Visual Diffusion Priors
IEEE International Conference on Computer Vision (ICCV), 2022
7 December 2022
Zhentao Yu
Zixin Yin
Deyu Zhou
Duomin Wang
Finn Wong
Baoyuan Wang
DiffM
Papers citing "Talking Head Generation with Probabilistic Audio-to-Visual Diffusion Priors"
36 / 36 papers shown
Title
LiveNeRF: Efficient Face Replacement Through Neural Radiance Fields Integration
Tung Vu
Hai Nguyen
Cong Tran
45
0
0
10 Nov 2025
THEval. Evaluation Framework for Talking Head Video Generation
Nabyl Quignon
Baptiste Chopin
Yaohui Wang
A. Dantcheva
EGVM
DiffM
VGen
363
1
0
06 Nov 2025
ConsistEdit: Highly Consistent and Precise Training-free Visual Editing
Zixin Yin
Ling-Hao Chen
Lionel M. Ni
Xili Dai
112
0
0
20 Oct 2025
LazyDrag: Enabling Stable Drag-Based Editing on Multi-Modal Diffusion Transformers via Explicit Correspondence
Zixin Yin
Xili Dai
Duomin Wang
Xianfang Zeng
Lionel M. Ni
Gang Yu
H. Shum
DiffM
133
1
0
15 Sep 2025
UniVerse-1: Unified Audio-Video Generation via Stitching of Experts
Duomin Wang
W. Zuo
Aojie Li
L. Chen
Xinyao Liao
Deyu Zhou
Zixin Yin
Xili Dai
Daxin Jiang
Gang Yu
DiffM
VGen
120
9
0
07 Sep 2025
EDTalk++: Full Disentanglement for Controllable Talking Head Synthesis
Shuai Tan
Bin Ji
130
0
0
19 Aug 2025
Training-Free Text-Guided Color Editing with Multi-Modal Diffusion Transformer
Zixin Yin
Xili Dai
Ling Chen
Deyu Zhou
Jianan Wang
Duomin Wang
Gang Yu
Lionel M. Ni
Lei Zhang
H. Shum
DiffM
100
1
0
12 Aug 2025
Exploring Timeline Control for Facial Motion Generation
Computer Vision and Pattern Recognition (CVPR), 2025
Yifeng Ma
Jinwei Qi
Chaonan Ji
Peng Zhang
Bang Zhang
Zhidong Deng
Liefeng Bo
VGen
212
0
0
27 May 2025
MVPortrait: Text-Guided Motion and Emotion Control for Multi-view Vivid Portrait Animation
Computer Vision and Pattern Recognition (CVPR), 2025
Yukang Lin
Hokit Fung
Jianjin Xu
Zeping Ren
Adela S.M. Lau
Guosheng Yin
Xiu Li
VGen
260
12
0
25 Mar 2025
InsTaG: Learning Personalized 3D Talking Head from Few-Second Video
Computer Vision and Pattern Recognition (CVPR), 2025
Jiahe Li
Jiawei Zhang
Xiao Bai
Jin Zheng
J. Zhou
L. Gu
335
6
0
27 Feb 2025
Dimitra: Audio-driven Diffusion model for Expressive Talking Head Generation
Baptiste Chopin
Tashvik Dhamija
P. Balaji
Yaohui Wang
A. Dantcheva
DiffM
VGen
245
2
0
24 Feb 2025
Stereo-Talker: Audio-driven 3D Human Synthesis with Prior-Guided Mixture-of-Experts
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
Xiang Deng
Youxin Pang
Xiaochen Zhao
Chao Xu
Lizhen Wang
Hongjiang Xiao
Shi Yan
Hongwen Zhang
Yebin Liu
DiffM
VGen
195
3
0
31 Oct 2024
Takin-ADA: Emotion Controllable Audio-Driven Animation with Canonical and Landmark Loss Optimization
Bin Lin
Yanzhen Yu
Jianhao Ye
Ruitao Lv
Yue Yang
Ruoye Xie
Pan Yu
Hongbin Zhou
VGen
189
2
0
18 Oct 2024
Separation of Neural Drives to Muscles from Transferred Polyfunctional Nerves using Implanted Micro-electrode Arrays
Laura Ferrante
Anna Boesendorfer
D. Barsakcioglu
Benedikt Baumgartner
Yazan Al-Ajam
Alex Woollard
Norbert Venantius Kang
Oskar Aszmann
D. Farina
204
1
0
14 Oct 2024
ReSyncer: Rewiring Style-based Generator for Unified Audio-Visually Synced Facial Performer
European Conference on Computer Vision (ECCV), 2024
Jiazhi Guan
Zhiliang Xu
Hang Zhou
Kaisiyuan Wang
Shengyi He
...
Errui Ding
Jingtuo Liu
Jingdong Wang
Youjian Zhao
Ziwei Liu
VGen
175
10
0
06 Aug 2024
A Comprehensive Taxonomy and Analysis of Talking Head Synthesis: Techniques for Portrait Generation, Driving Mechanisms, and Editing
Ming Meng
Yufei Zhao
Bo Zhang
Yonggui Zhu
Weimin Shi
Maxwell Wen
Zhaoxin Fan
VGen
281
5
0
15 Jun 2024
VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time
Sicheng Xu
Guojun Chen
Yu-Xiao Guo
Jiaolong Yang
Chong Li
Zhenyu Zang
Yizhong Zhang
Xin Tong
Baining Guo
216
168
0
16 Apr 2024
EDTalk: Efficient Disentanglement for Emotional Talking Head Synthesis
European Conference on Computer Vision (ECCV), 2024
Shuai Tan
Bin Ji
Mengxiao Bi
Ye Pan
206
63
0
02 Apr 2024
Deepfake Generation and Detection: A Benchmark and Survey
Gan Pei
Jiangning Zhang
Menghan Hu
Ying Tai
Chengjie Wang
Yunsheng Wu
Guangtao Zhai
Jian Yang
Chunhua Shen
Dacheng Tao
276
72
0
26 Mar 2024
VLOGGER: Multimodal Diffusion for Embodied Avatar Synthesis
Computer Vision and Pattern Recognition (CVPR), 2024
Enric Corona
Andrei Zanfir
Eduard Gabriel Bazavan
Nikos Kolotouros
Thiemo Alldieck
C. Sminchisescu
VGen
DiffM
164
44
0
13 Mar 2024
CustomListener: Text-guided Responsive Interaction for User-friendly Listening Head Generation
Xi Liu
Ying Guo
Cheng Zhen
Tong Li
Yingying Ao
Pengfei Yan
DiffM
295
13
0
01 Mar 2024
AVI-Talking: Learning Audio-Visual Instructions for Expressive 3D Talking Face Generation
Yasheng Sun
Wenqing Chu
Hang Zhou
Kaisiyuan Wang
Hideki Koike
130
10
0
25 Feb 2024
From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations
Computer Vision and Pattern Recognition (CVPR), 2024
Evonne Ng
Javier Romero
Timur M. Bagautdinov
Shaojie Bai
Trevor Darrell
Angjoo Kanazawa
Alexander Richard
VGen
169
67
0
03 Jan 2024
DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models
Yifeng Ma
Shiwei Zhang
Jiayu Wang
Xiang Wang
Yingya Zhang
Zhidong Deng
DiffM
301
23
0
15 Dec 2023
GMTalker: Gaussian Mixture-based Audio-Driven Emotional Talking Video Portraits
Yibo Xia
Lizhen Wang
Xiang Deng
Xiaoyan Luo
Yunhong Wang
Yebin Liu
VGen
243
2
0
12 Dec 2023
AgentAvatar: Disentangling Planning, Driving and Rendering for Photorealistic Avatar Agents
Duomin Wang
Bin Dai
Yu Deng
Baoyuan Wang
VGen
396
10
0
29 Nov 2023
GAIA: Zero-shot Talking Avatar Generation
International Conference on Learning Representations (ICLR), 2023
Tianyu He
Junliang Guo
Runyi Yu
Yuchi Wang
Jialiang Zhu
...
Chunyu Wang
Han Hu
HsiangTao Wu
Sheng Zhao
Jiang Bian
338
43
0
26 Nov 2023
HumanTOMATO: Text-aligned Whole-body Motion Generation
Shunlin Lu
Ling-Hao Chen
Ailing Zeng
Jing Lin
Ruimao Zhang
Lei Zhang
H. Shum
VGen
221
96
0
19 Oct 2023
OSM-Net: One-to-Many One-shot Talking Head Generation with Spontaneous Head Motions
Jin Liu
Xi Wang
Xiaomeng Fu
Yesheng Chai
Cai Yu
Jiao Dai
Jizhong Han
110
5
0
28 Sep 2023
A Survey on Deep Multi-modal Learning for Body Language Recognition and Generation
Li Liu
Lufei Gao
Wen-Ling Lei
Fengji Ma
Xiaotian Lin
Jin-Tao Wang
CVBM
180
9
0
17 Aug 2023
Efficient Region-Aware Neural Radiance Fields for High-Fidelity Talking Portrait Synthesis
IEEE International Conference on Computer Vision (ICCV), 2023
Jiahe Li
Jiawei Zhang
Xiao Bai
Jun Zhou
L. Gu
3DH
183
105
0
18 Jul 2023
A Comprehensive Multi-scale Approach for Speech and Dynamics Synchrony in Talking Head Generation
Louis Airale
Dominique Vaufreydaz
Xavier Alameda-Pineda
135
1
0
04 Jul 2023
Segment Everything Everywhere All at Once
Neural Information Processing Systems (NeurIPS), 2023
Xueyan Zou
Jianwei Yang
Hao Zhang
Feng Li
Linjie Li
Jianfeng Wang
Lijuan Wang
Jianfeng Gao
Yong Jae Lee
MLLM
VLM
281
655
0
13 Apr 2023
TalkCLIP: Talking Head Generation with Text-Guided Expressive Speaking Styles
IEEE transactions on multimedia (IEEE TMM), 2023
Yifeng Ma
Suzhe Wang
Yu-qiong Ding
Lincheng Li
Bowen Ma
Tangjie Lv
Changjie Fan
Zhipeng Hu
Zhidong Deng
Xin Yu
CLIP
188
33
0
01 Apr 2023
HumanMAC: Masked Motion Completion for Human Motion Prediction
IEEE International Conference on Computer Vision (ICCV), 2023
Ling-Hao Chen
Jiawei Zhang
Ye-rong Li
Yiren Pang
Xiaobo Xia
Tongliang Liu
DiffM
VGen
233
93
0
07 Feb 2023
Memories are One-to-Many Mapping Alleviators in Talking Face Generation
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Anni Tang
Tianyu He
Xuejiao Tan
Jun Ling
Liang Song
CVBM
249
27
0
09 Dec 2022