Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation

13 June 2024
Mingwang Xu
Hui Li
Qingkun Su
Hanlin Shang
Liwei Zhang
Ce Liu
Jingdong Wang
Yao Yao
Siyu Zhu
    VGen

Papers citing "Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation"

49 / 99 papers shown
Title
Disentangle Identity, Cooperate Emotion: Correlation-Aware Emotional Talking Portrait Generation
Weipeng Tan
Chuming Lin
Chengming Xu
F. Xu
Xiaobin Hu
Xiaozhong Ji
Junwei Zhu
Chengjie Wang
Yanwei Fu
281
3
0
25 Apr 2025
FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis
Mengchao Wang
Qiang Wang
Fan Jiang
Yaqi Fan
Yunpeng Zhang
Yonggang Qi
Kun Zhao
Mu Xu
DiffM
VGen
184
39
0
07 Apr 2025
DiTaiListener: Controllable High Fidelity Listener Video Generation with Diffusion
Maksim Siniukov
Di Chang
Minh Tran
Hongkun Gong
Ashutosh Chaubey
Mohammad Soleymani
DiffM
VGen
290
2
0
05 Apr 2025
OmniTalker: One-shot Real-time Text-Driven Talking Audio-Video Generation With Multimodal Style Mimicking
Zhongjian Wang
Peng Zhang
Jinwei Qi
Guangyuan Wang
Chaonan Ji
Sheng Xu
Bang Zhang
Liefeng Bo
DiffM
VGen
352
0
0
03 Apr 2025
Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation
Fa-Ting Hong
Zunnan Xu
Zixiang Zhou
Zhiqiang Zhang
Xiu Li
Qin Lin
Qinglin Lu
D. Xu
DiffM
VGen
427
8
0
03 Apr 2025
MoCha: Towards Movie-Grade Talking Character Synthesis
Cong Wei
Bo Sun
Haoyu Ma
Ji Hou
F. Xu
...
Kunpeng Li
Tingbo Hou
Animesh Sinha
Peter Vajda
Lei Ma
VGen
750
19
0
30 Mar 2025
MVPortrait: Text-Guided Motion and Emotion Control for Multi-view Vivid Portrait Animation
Computer Vision and Pattern Recognition (CVPR), 2025
Yukang Lin
Hokit Fung
Jianjin Xu
Zeping Ren
Adela S.M. Lau
Guosheng Yin
Xiu Li
VGen
273
12
0
25 Mar 2025
AudCast: Audio-Driven Human Video Generation by Cascaded Diffusion Transformers
Computer Vision and Pattern Recognition (CVPR), 2025
Jiazhi Guan
Kaisiyuan Wang
Zhiliang Xu
Quanwei Yang
Yasheng Sun
...
Errui Ding
Jiadong Wang
Youjian Zhao
Hang Zhou
Ziwei Liu
VGen
232
1
0
25 Mar 2025
DisentTalk: Cross-lingual Talking Face Generation via Semantic Disentangled Diffusion Model
Kangwei Liu
Junwu Liu
Yun Cao
Jinlin Guo
Xiaowei Yi
DiffM
223
0
0
24 Mar 2025
Teller: Real-Time Streaming Audio-Driven Portrait Animation with Autoregressive Motion Generation
Computer Vision and Pattern Recognition (CVPR), 2025
Dingcheng Zhen
Shunshun Yin
Shiyang Qin
Hou Yi
Ziwei Zhang
Siyuan Liu
Gan Qi
Ming Tao
VGen
227
7
0
24 Mar 2025
Re-HOLD: Video Hand Object Interaction Reenactment via adaptive Layout-instructed Diffusion Model
Computer Vision and Pattern Recognition (CVPR), 2025
Yingying Fan
Quanwei Yang
Kaisiyuan Wang
Hang Zhou
Yingying Li
Haocheng Feng
Errui Ding
Y. Wu
Jiadong Wang
DiffM
321
6
0
21 Mar 2025
PoseTraj: Pose-Aware Trajectory Control in Video Diffusion
Computer Vision and Pattern Recognition (CVPR), 2025
Longbin Ji
Lei Zhong
Pengfei Wei
Changjian Li
DiffM
VGen
227
3
0
20 Mar 2025
ExDDV: A New Dataset for Explainable Deepfake Detection in Video
Vlad Hondru
Eduard Hogea
Darian M. Onchis
Radu Tudor Ionescu
363
11
0
18 Mar 2025
SyncDiff: Diffusion-based Talking Head Synthesis with Bottlenecked Temporal Visual Prior for Improved Synchronization
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2025
Xulin Fan
Heting Gao
Ziyi Chen
Peng Chang
Mei Han
Mark Hasegawa-Johnson
DiffM
293
1
0
17 Mar 2025
RASA: Replace Anyone, Say Anything -- A Training-Free Framework for Audio-Driven and Universal Portrait Video Editing
Tianrui Pan
Lin Liu
Jie Liu
Xinsong Zhang
J. Tang
Gangshan Wu
Q. Tian
DiffM
VGen
257
0
0
14 Mar 2025
Versatile Multimodal Controls for Expressive Talking Human Animation
Zheng Qin
Ruobing Zheng
Yabing Wang
Tianqi Li
Zixin Zhu
Minghui Yang
Ming Yang
Le Wang
DiffM
VGen
251
0
0
10 Mar 2025
FREAK: Frequency-modulated High-fidelity and Real-time Audio-driven Talking Portrait Synthesis
International Conference on Multimedia Retrieval (ICMR), 2025
Ziqi Ni
Ao Fu
Yi Zhou
411
0
0
06 Mar 2025
KeyFace: Expressive Audio-Driven Facial Animation for Long Sequences via KeyFrame Interpolation
Computer Vision and Pattern Recognition (CVPR), 2025
Antoni Bigata
Michał Stypułkowski
Rodrigo Mira
Stella Bounareli
Konstantinos Vougioukas
Zoe Landgraf
Nikita Drobyshev
Maciej Ziȩba
Stavros Petridis
Maja Pantic
DiffM
VGen
283
6
0
03 Mar 2025
LAM: Large Avatar Model for One-shot Animatable Gaussian Head
Yisheng He
Xiaodong Gu
Xiaodan Ye
Chao Xu
Zhengyi Zhao
Yuan Dong
Weihao Yuan
Zilong Dong
Liefeng Bo
3DGS
501
4
0
25 Feb 2025
X-Dancer: Expressive Music to Human Dance Video Generation
Zeyuan Chen
Hongyi Xu
Guoxian Song
You Xie
Chenxu Zhang
Xiusi Chen
Chao Wang
Di Chang
Linjie Luo
VGen
301
8
0
24 Feb 2025
SayAnything: Audio-Driven Lip Synchronization with Conditional Video Diffusion
Junxian Ma
Shiwen Wang
Jian Yang
Junyi Hu
Jian Liang
Guosheng Lin
Jingbo Chen
Kai Li
Yu Meng
DiffM
VGen
305
7
0
17 Feb 2025
SyncAnimation: A Real-Time End-to-End Framework for Audio-Driven Human Pose and Talking Head Animation
International Joint Conference on Artificial Intelligence (IJCAI), 2025
Yujian Liu
Shidang Xu
Jing Guo
Dingbin Wang
Zairan Wang
Xianfeng Tan
Xiaoli Liu
96
3
0
24 Jan 2025
Joint Learning of Depth and Appearance for Portrait Image Animation
Xinya Ji
Gaspard Zoss
Prashanth Chandran
Lingchen Yang
Xun Cao
B. Solenthaler
D. Bradley
3DH
MDE
317
1
0
15 Jan 2025
DEGSTalk: Decomposed Per-Embedding Gaussian Fields for Hair-Preserving Talking Face Synthesis
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Kaijun Deng
Dezhi Zheng
Jindong Xie
Jinbao Wang
Weicheng Xie
Linlin Shen
Siyang Song
3DGS
207
2
0
31 Dec 2024
FADA: Fast Diffusion Avatar Synthesis with Mixed-Supervised Multi-CFG Distillation
Computer Vision and Pattern Recognition (CVPR), 2024
Tianyun Zhong
Chao Liang
Jianwen Jiang
Gaojie Lin
Jiaqi Yang
Zhou Zhao
DiffM
427
4
0
22 Dec 2024
Real-time One-Step Diffusion-based Expressive Portrait Videos Generation
Hanzhong Guo
Hongwei Yi
Daquan Zhou
Alexander William Bergman
Michael Lingelbach
Yizhou Yu
DiffM
260
3
0
18 Dec 2024
INFP: Audio-Driven Interactive Head Generation in Dyadic Conversations
Computer Vision and Pattern Recognition (CVPR), 2024
Yongming Zhu
Longhao Zhang
Zhengkun Rong
Tianshu Hu
Shuang Liang
Zhipeng Ge
VGen
194
14
0
05 Dec 2024
Playable Game Generation
Mingyu Yang
Junyou Li
Zhongbin Fang
Sheng Chen
Yangbin Yu
Qiang Fu
Wei Yang
Deheng Ye
VGen
264
19
0
01 Dec 2024
Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Video Diffusion Transformer
Computer Vision and Pattern Recognition (CVPR), 2024
Jiahao Cui
Hui Li
Yun Zhan
Hanlin Shang
K. Cheng
Yuqi Ma
Shan Mu
Hang Zhou
Jingdong Wang
Siyu Zhu
ViT
VGen
485
73
0
01 Dec 2024
Deepfake Media Generation and Detection in the Generative AI Era: A Survey and Outlook
Florinel-Alin Croitoru
Andrei Iulian Hiji
Vlad Hondru
Nicolae-Cătălin Ristea
Paul Irofti
Marius Popescu
Cristian Rusu
Radu Tudor Ionescu
Fahad Shahbaz Khan
Mubarak Shah
377
15
0
29 Nov 2024
OpenHumanVid: A Large-Scale High-Quality Dataset for Enhancing Human-Centric Video Generation
Computer Vision and Pattern Recognition (CVPR), 2024
Hui Li
Mingwang Xu
Yun Zhan
Shan Mu
Jiaye Li
...
Yukang Chen
Tan Chen
Mao Ye
Jingdong Wang
Siyu Zhu
VGen
496
38
0
28 Nov 2024
Sonic: Shifting Focus to Global Audio Perception in Portrait Animation
Computer Vision and Pattern Recognition (CVPR), 2024
Xiaozhong Ji
Xiaobin Hu
Zhihong Xu
Junwei Zhu
Chuming Lin
...
Donghao Luo
Yi Chen
Qin Lin
Qinglin Lu
Chengjie Wang
VGen
377
43
0
25 Nov 2024
EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation
Computer Vision and Pattern Recognition (CVPR), 2024
Rang Meng
Xingyu Zhang
Yuming Li
Chenguang Ma
415
48
0
15 Nov 2024
JoyVASA: Portrait and Animal Image Animation with Diffusion-Based Audio-Driven Facial Dynamics and Head Motion Generation
Xuyang Cao
Guoxin Wang
Sheng Shi
Jun Zhao
Yang Yao
Jintao Fei
Minyu Gao
VGen
372
6
0
14 Nov 2024
DAWN: Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking Head Video Generation
International Conference on Learning Representations (ICLR), 2024
Hanbo Cheng
Limin Lin
Chenyu Liu
Pengcheng Xia
Pengfei Hu
Jiefeng Ma
Jun Du
Jia Pan
DiffM
VGen
973
2
0
17 Oct 2024
MuseTalk: Real-Time High-Fidelity Video Dubbing via Spatio-Temporal Sampling
Yue Zhang
Minhao Liu
Zhaokang Chen
Bin Wu
Yubin Zeng
Chao Zhan
Yingjie He
Junxin Huang
Wenjiang Zhou
363
3
0
14 Oct 2024
Hallo2: Long-Duration and High-Resolution Audio-Driven Portrait Image Animation
International Conference on Learning Representations (ICLR), 2024
Jiahao Cui
Hui Li
Yao Yao
Hao Zhu
Hanlin Shang
Kaihui Cheng
Hang Zhou
Siyu Zhu
Jingdong Wang
DiffM
VGen
294
74
0
10 Oct 2024
JoyHallo: Digital human model for Mandarin
Sheng Shi
Xuyang Cao
Jun Zhao
Guoxin Wang
VGen
145
3
0
20 Sep 2024
PainDiffusion: Learning to Express Pain
Quang Tien Dam
Tri Tung Nguyen Nguyen
Yuki Endo
D. Tran
Joo-Ho Lee
VGen
306
0
0
18 Sep 2024
SVP: Style-Enhanced Vivid Portrait Talking Head Diffusion Model
Weipeng Tan
Chuming Lin
Chengming Xu
Xiaozhong Ji
Junwei Zhu
Chengjie Wang
Yanwei Fu
DiffM
131
2
0
05 Sep 2024
CyberHost: Taming Audio-driven Avatar Diffusion Model with Region Codebook Attention
Gaojie Lin
Jianwen Jiang
Chao Liang
Tianyun Zhong
Jiaqi Yang
Yanbo Zheng
VGen
DiffM
502
32
0
03 Sep 2024
MegActor-Σ: Unlocking Flexible Mixed-Modal Control in Portrait Animation with Diffusion Transformer
Shurong Yang
Huadong Li
Juhao Wu
Minhao Jing
Linze Li
Renhe Ji
Jiajun Liang
Haoqiang Fan
Jin Wang
VGen
DiffM
204
12
0
27 Aug 2024
4D Diffusion for Dynamic Protein Structure Prediction with Reference Guided Motion Alignment
Kaihui Cheng
Ce Liu
Qingkun Su
Jun Wang
Liwei Zhang
Yining Tang
Yao Yao
Siyu Zhu
Yuan Qi
DiffM
147
0
0
22 Aug 2024
Dynamic PDB: A New Dataset and a SE(3) Model Extension by Integrating Dynamic Behaviors and Physical Properties in Protein Structures
Ce Liu
Jun Wang
Zhiqiang Cai
Yingxu Wang
Huizhen Kuang
...
Yining Tang
Fenglei Cao
Limei Han
Siyu Zhu
Yuan Qi
3DV
214
9
0
22 Aug 2024
DEGAS: Detailed Expressions on Full-Body Gaussian Avatars
International Conference on 3D Vision (3DV), 2024
Zhijing Shao
D. B. Wang
Qing-Yao Tian
Yao-Dong Yang
Hengyu Meng
Zeyu Cai
Bo Dong
Yu Zhang
Kang Zhang
Zhaoxiang Wang
3DGS
256
7
0
20 Aug 2024
JambaTalk: Speech-Driven 3D Talking Head Generation Based on Hybrid Transformer-Mamba Model
Farzaneh Jafari
Stefano Berretti
Anup Basu
Mamba
355
1
0
03 Aug 2024
LinguaLinker: Audio-Driven Portraits Animation with Implicit Facial Control Enhancement
Rui Zhang
Yixiao Fang
Zhen-Zhong Lu
Pei Cheng
Zebiao Huang
Bin-Bin Fu
DiffM
VGen
160
1
0
26 Jul 2024
EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditions
Zhiyuan Chen
Jiajiong Cao
Zhiquan Chen
Yuming Li
Chenguang Ma
VGen
242
154
0
11 Jul 2024
SingingHead: A Large-scale 4D Dataset for Singing Head Animation
Sijing Wu
Yunhao Li
Weitian Zhang
Jun Jia
Yucheng Zhu
Manwen Liao
Guangtao Zhai
Xiaokang Yang
196
5
0
07 Dec 2023