MakeItTalk: Speaker-Aware Talking-Head Animation

27 April 2020
Yang Zhou
Xintong Han
Eli Shechtman
J. Echevarria
E. Kalogerakis
Dingzeyu Li

Papers citing "MakeItTalk: Speaker-Aware Talking-Head Animation"

50 / 258 papers shown
EmoCAST: Emotional Talking Portrait via Emotive Text Description
Yiguo Jiang
Xiaodong Cun
Yong Zhang
Yudian Zheng
Fan Tang
Chi-Man Pun
DiffM
246
1
0
24 Dec 2025
EvalTalker: Learning to Evaluate Real-Portrait-Driven Multi-Subject Talking Humans
Yingjie Zhou
Xilei Zhu
Siyu Ren
Ziyi Zhao
Z. Wang
...
Fengjiao Chen
Xiaoyu Li
Xuezhi Cao
Guangtao Zhai
Xiaohong Liu
EGVM
312
0
0
01 Dec 2025
AI killed the video star. Audio-driven diffusion model for expressive talking head generation
Baptiste Chopin
Tashvik Dhamija
P. Balaji
Yaohui Wang
A. Dantcheva
DiffM, VGen
118
0
0
27 Nov 2025
Investigating self-supervised representations for audio-visual deepfake detection
Dragos-Alexandru Boldisor
Stefan Smeu
Dan Oneaţă
Elisabeta Oneata
SSL
365
0
0
21 Nov 2025
Is It Truly Necessary to Process and Fit Minutes-Long Reference Videos for Personalized Talking Face Generation?
Rui-Qing Sun
Ang Li
Zhijing Wu
Tian Lan
Qianyu Lu
Xingshan Yao
C. Xu
Xian-Ling Mao
DiffM, VGen
491
1
0
11 Nov 2025
LiveNeRF: Efficient Face Replacement Through Neural Radiance Fields Integration
Tung Vu
Hai Nguyen
Cong Tran
118
0
0
10 Nov 2025
Learning Disentangled Speech- and Expression-Driven Blendshapes for 3D Talking Face Animation
Yuxiang Mao
Zhijie Zhang
Zhiheng Zhang
Jiawei Liu
Chen Zeng
Shihong Xia
146
0
0
29 Oct 2025
MAGIC-Talk: Motion-aware Audio-Driven Talking Face Generation with Customizable Identity Control
Fatemeh Nazarieh
Zhenhua Feng
Diptesh Kanojia
Muhammad Awais
J. Kittler
DiffM, VGen
148
1
0
26 Oct 2025
Audio Driven Real-Time Facial Animation for Social Telepresence
Jiye Lee
Chenghui Li
Linh Tran
S. Wei
Jason M. Saragih
Alexander Richard
Hanbyul Joo
Shaojie Bai
VGen
193
2
0
01 Oct 2025
Human Motion Video Generation: A Survey
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025
Haiwei Xue
Xiangyang Luo
Zhanghao Hu
Shu Zhang
Xunzhi Xiang
...
Fei Ma
Zhiyong Wu
Changpeng Yang
Zonghong Dai
Fei Richard Yu
EGVM, VGen
280
35
0
04 Sep 2025
MIDAS: Multimodal Interactive Digital-humAn Synthesis via Real-time Autoregressive Video Generation
Ming Chen
Liyuan Cui
Wenyuan Zhang
Haoxian Zhang
Yan Zhou
...
Jiwen Liu
Borui Liao
Hejia Chen
Xiaoqiang Liu
Pengfei Wan
VGen
289
15
0
26 Aug 2025
Warm Chat: Diffuse Emotion-aware Interactive Talking Head Avatar with Tree-Structured Guidance
Haijie Yang
Zhenyu Zhang
Hao Tang
Jianjun Qian
Jian Yang
254
0
0
25 Aug 2025
Audio2Face-3D: Audio-driven Realistic Facial Animation For Digital Avatars
NVIDIA
Chaeyeon Chung
Ilya Fedorov
Michael Huang
Aleksey Karmanov
Dmitry Korobchenko
Roger Ribera
Yeongho Seol
CVBM
347
5
0
22 Aug 2025
Taming Transformer for Emotion-Controllable Talking Face Generation
Ziqi Zhang
Cheng Deng
CVBM
196
0
0
20 Aug 2025
EDTalk++: Full Disentanglement for Controllable Talking Head Synthesis
Shuai Tan
Bin Ji
312
3
0
19 Aug 2025
RealTalk: Realistic Emotion-Aware Lifelike Talking-Head Synthesis
Wenqing Wang
Yun Fu
173
0
0
16 Aug 2025
Text2Lip: Progressive Lip-Synced Talking Face Generation from Text via Viseme-Guided Rendering
Xu Wang
Shengeng Tang
Fei Wang
L. T. Cheng
Dan Guo
Feng Xue
Richang Hong
203
3
0
04 Aug 2025
X-Actor: Emotional and Expressive Long-Range Portrait Acting from Audio
Chenxu Zhang
Zenan Li
Hongyi Xu
You Xie
Xiaochen Zhao
...
Guoxian Song
Xin Chen
C. Liang
Jianwen Jiang
Linjie Luo
VGen
202
5
0
04 Aug 2025
Who is a Better Talker: Subjective and Objective Quality Assessment for AI-Generated Talking Heads
Yingjie Zhou
Jiezhang Cao
Zicheng Zhang
Farong Wen
Yanwei Jiang
Jun Jia
Xiaohong Liu
Xiongkuo Min
Guangtao Zhai
EGVM
86
3
0
31 Jul 2025
Mask-Free Audio-driven Talking Face Generation for Enhanced Visual Quality and Identity Preservation
Dogucan Yaman
Fevziye Irem Eyiokur
Leonard Barmann
H. K. Ekenel
Alexander H. Waibel
CVBM
258
1
0
28 Jul 2025
Silence is Golden: Leveraging Adversarial Examples to Nullify Audio Control in LDM-based Talking-Head Generation
Computer Vision and Pattern Recognition (CVPR), 2025
Yuan Gan
Jiaxu Miao
Yunze Wang
Yi Yang
AAML, DiffM
261
4
0
02 Jun 2025
FaceEditTalker: Controllable Talking Head Generation with Facial Attribute Editing
Guanwen Feng
Zhiyuan Ma
Yunan Li
Junwei Jing
Junwei Jing
Qiguang Miao
288
0
0
28 May 2025
CAD: A General Multimodal Framework for Video Deepfake Detection via Cross-Modal Alignment and Distillation
Yuxuan Du
Zhendong Wang
Yuhao Luo
Caiyong Piao
Zhiyuan Yan
Hao Li
Lichao Sun
452
8
0
21 May 2025
Model See Model Do: Speech-Driven Facial Animation with Style Control
Yifang Pan
Karan Singh
Luiz Gustavo Hafemann
DiffM
377
0
0
02 May 2025
Disentangle Identity, Cooperate Emotion: Correlation-Aware Emotional Talking Portrait Generation
Weipeng Tan
Chuming Lin
Chengming Xu
F. Xu
Xiaobin Hu
Xiaozhong Ji
Junwei Zhu
Chengjie Wang
Yanwei Fu
386
8
0
25 Apr 2025
Design Activity for Robot Faces: Evaluating Child Responses To Expressive Faces
Denielle Oliva
Joshua Knight
Tyler Becker
Heather Amistani
Monica Nicolescu
David Feil-Seifer
72
3
0
10 Apr 2025
Exploiting Temporal Audio-Visual Correlation Embedding for Audio-Driven One-Shot Talking Head Animation
IEEE Transactions on Multimedia (TMM), 2025
Zhihua Xu
Tianshui Chen
Zhijing Yang
Siyuan Peng
Keze Wang
Guanbin Li
235
3
0
08 Apr 2025
DiTaiListener: Controllable High Fidelity Listener Video Generation with Diffusion
Maksim Siniukov
Di Chang
Minh Tran
Hongkun Gong
Ashutosh Chaubey
Mohammad Soleymani
DiffM, VGen
369
7
0
05 Apr 2025
OmniTalker: One-shot Real-time Text-Driven Talking Audio-Video Generation With Multimodal Style Mimicking
Zhongjian Wang
Peng Zhang
Jinwei Qi
Guangyuan Wang
Chaonan Ji
Sheng Xu
Bang Zhang
Liefeng Bo
DiffM, VGen
460
0
0
03 Apr 2025
Monocular and Generalizable Gaussian Talking Head Animation
Computer Vision and Pattern Recognition (CVPR), 2025
Shengjie Gong
Haoyang Li
Jiapeng Tang
Dongming Hu
Shuangping Huang
Hao Chen
Tianshui Chen
Zhuoman Liu
3DGS
259
10
0
01 Apr 2025
ChatAnyone: Stylized Real-time Portrait Video Generation with Hierarchical Motion Diffusion Model
Jinwei Qi
Chaonan Ji
Sheng Xu
Peng Zhang
Bang Zhang
Liefeng Bo
DiffM, VGen
275
8
0
27 Mar 2025
DisentTalk: Cross-lingual Talking Face Generation via Semantic Disentangled Diffusion Model
Kangwei Liu
Junwu Liu
Yun Cao
Jinlin Guo
Xiaowei Yi
DiffM
289
1
0
24 Mar 2025
Teller: Real-Time Streaming Audio-Driven Portrait Animation with Autoregressive Motion Generation
Computer Vision and Pattern Recognition (CVPR), 2025
Dingcheng Zhen
Shunshun Yin
Shiyang Qin
Hou Yi
Ziwei Zhang
Siyuan Liu
Gan Qi
Ming Tao
VGen
362
13
0
24 Mar 2025
3D Engine-ready Photorealistic Avatars via Dynamic Textures
Yifan Wang
Ivan Molodetskikh
Ondrej Texler
Dimitar Dinev
395
0
0
19 Mar 2025
Unlock Pose Diversity: Accurate and Efficient Implicit Keypoint-based Spatiotemporal Diffusion for Audio-driven Talking Portrait
Chaolong Yang
Kai Yao
Yuyao Yan
Chenru Jiang
Weiguang Zhao
Jie Sun
Guangliang Cheng
Yifei Zhang
Bin Dong
K. Huang
DiffM
340
2
0
17 Mar 2025
SyncDiff: Diffusion-based Talking Head Synthesis with Bottlenecked Temporal Visual Prior for Improved Synchronization
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2025
Xulin Fan
Heting Gao
Ziyi Chen
Peng Chang
Mei Han
Mark Hasegawa-Johnson
DiffM
370
2
0
17 Mar 2025
Removing Averaging: Personalized Lip-Sync Driven Characters Based on Identity Adapter
Yanyu Zhu
Licheng Bai
Jintao Xu
Jiwei Tang
428
1
0
09 Mar 2025
FREAK: Frequency-modulated High-fidelity and Real-time Audio-driven Talking Portrait Synthesis
International Conference on Multimedia Retrieval (ICMR), 2025
Ziqi Ni
Ao Fu
Yi Zhou
489
0
0
06 Mar 2025
FLAP: Fully-controllable Audio-driven Portrait Video Generation through 3D head conditioned diffusion model
Lingzhou Mu
Baiji Liu
Ruonan Zhang
Guiming Mo
Jiawei Jin
Kai Zhang
Haozhi Huang
DiffM, VGen
626
0
0
26 Feb 2025
Dimitra: Audio-driven Diffusion model for Expressive Talking Head Generation
Baptiste Chopin
Tashvik Dhamija
P. Balaji
Yaohui Wang
A. Dantcheva
DiffM, VGen
339
4
0
24 Feb 2025
Emotion Recognition and Generation: A Comprehensive Review of Face, Speech, and Text Modalities
Rebecca Mobbs
Dimitrios Makris
Vasileios Argyriou
236
9
0
02 Feb 2025
Joint Learning of Depth and Appearance for Portrait Image Animation
Xinya Ji
Gaspard Zoss
Prashanth Chandran
Lingchen Yang
Xun Cao
B. Solenthaler
D. Bradley
3DH, MDE
392
2
0
15 Jan 2025
DEGSTalk: Decomposed Per-Embedding Gaussian Fields for Hair-Preserving Talking Face Synthesis
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Kaijun Deng
Dezhi Zheng
Jindong Xie
Jinbao Wang
Weicheng Xie
Linlin Shen
Siyang Song
3DGS
258
5
0
31 Dec 2024
FADA: Fast Diffusion Avatar Synthesis with Mixed-Supervised Multi-CFG Distillation
Computer Vision and Pattern Recognition (CVPR), 2024
Tianyun Zhong
Chao Liang
Jianwen Jiang
Gaojie Lin
Jiaqi Yang
Zhou Zhao
DiffM
577
5
0
22 Dec 2024
INFP: Audio-Driven Interactive Head Generation in Dyadic Conversations
Computer Vision and Pattern Recognition (CVPR), 2024
Yongming Zhu
Longhao Zhang
Zhengkun Rong
Tianshu Hu
Shuang Liang
Zhipeng Ge
VGen
281
29
0
05 Dec 2024
Stereo-Talker: Audio-driven 3D Human Synthesis with Prior-Guided Mixture-of-Experts
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
Xiang Deng
Youxin Pang
Xiaochen Zhao
Chao Xu
Lizhen Wang
Hongjiang Xiao
Shi Yan
Hongwen Zhang
Yebin Liu
DiffM, VGen
358
4
0
31 Oct 2024
Takin-ADA: Emotion Controllable Audio-Driven Animation with Canonical and Landmark Loss Optimization
Bin Lin
Yanzhen Yu
Jianhao Ye
Ruitao Lv
Yue Yang
Ruoye Xie
Pan Yu
Hongbin Zhou
VGen
300
4
0
18 Oct 2024
DAWN: Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking Head Video Generation
International Conference on Learning Representations (ICLR), 2024
Hanbo Cheng
Limin Lin
Chenyu Liu
Pengcheng Xia
Pengfei Hu
Jiefeng Ma
Jun Du
Jia Pan
DiffM, VGen
1.1K
6
0
17 Oct 2024
MimicTalk: Mimicking a personalized and expressive 3D talking face in minutes
Neural Information Processing Systems (NeurIPS), 2024
Zhenhui Ye
Tianyun Zhong
Yi Ren
Ziyue Jiang
Jiawei Huang
...
Chen Zhang
Zehan Wang
Xize Chen
Xiang Yin
Zhou Zhao
VGen
360
21
0
09 Oct 2024
EmoGene: Audio-Driven Emotional 3D Talking-Head Generation
IEEE International Conference on Automatic Face & Gesture Recognition (FG), 2024
Wenqing Wang
Yun Fu
VGen
411
1
0
07 Oct 2024