GestureDiffuCLIP: Gesture Diffusion Model with CLIP Latents

ACM Transactions on Graphics (TOG), 2023
26 March 2023
Tenglong Ao
Zeyi Zhang
Libin Liu
    DiffM VGen

Papers citing "GestureDiffuCLIP: Gesture Diffusion Model with CLIP Latents"

50 / 60 papers shown
CoordSpeaker: Exploiting Gesture Captioning for Coordinated Caption-Empowered Co-Speech Gesture Generation
Fengyi Fang
Sicheng Yang
Wenming Yang
SLR
316
0
0
28 Nov 2025
Towards Reliable Human Evaluations in Gesture Generation: Insights from a Community-Driven State-of-the-Art Benchmark
Rajmund Nagy
Hendric Voss
Thanh Hoang-Minh
Mihail Tsakov
Teodor Nikolov
...
R. Mcdonnell
Michael Neff
Taras Kucherenko
Youngwoo Yoon
G. Henter
EGVM VGen
461
0
0
03 Nov 2025
Gestura: A LVLM-Powered System Bridging Motion and Semantics for Real-Time Free-Form Gesture Understanding
Zhuoming Li
Aitong Liu
Mengxi Jia
Yubi Lu
T. Zhang
Changzhi Sun
Dell Zhang
Xuelong Li
247
0
0
21 Oct 2025
MimicParts: Part-aware Style Injection for Speech-Driven 3D Motion Generation
Lianlian Liu
YongKang He
Zhaojie Chu
Xiaofen Xing
Xiangmin Xu
160
1
0
15 Oct 2025
Social Agent: Mastering Dyadic Nonverbal Behavior Generation via Conversational LLM Agents
Zeyi Zhang
Yanju Zhou
Heyuan Yao
Tenglong Ao
Xiaohang Zhan
Libin Liu
LLMAG
219
6
0
06 Oct 2025
InterAct: A Large-Scale Dataset of Dynamic, Expressive and Interactive Activities between Two People in Daily Scenarios
Proceedings of the ACM on Computer Graphics and Interactive Techniques (PACMCGIT), 2025
Leo Ho
Yinghao Huang
Dafei Qin
Mingyi Shi
Wangpok Tse
Wei Liu
Junichi Yamagishi
Taku Komura
VGen
159
2
0
06 Sep 2025
Motion-example-controlled Co-speech Gesture Generation Leveraging Large Language Models
Bohong Chen
Yumeng Li
Youyi Zheng
Yao-Xiang Ding
Kun Zhou
313
2
0
27 Jul 2025
SemGes: Semantics-aware Co-Speech Gesture Generation using Semantic Coherence and Relevance Learning
Lanmiao Liu
E. Ghaleb
Aslı Özyürek
Zerrin Yumak
SLR
315
5
0
25 Jul 2025
MOSPA: Human Motion Generation Driven by Spatial Audio
Shuyang Xu
Zhiyang Dou
Mingyi Shi
Liang Pan
Leo Ho
...
Yuan Liu
Cheng Lin
Y. Ma
Wenping Wang
Taku Komura
272
10
0
16 Jul 2025
Semantics-Aware Human Motion Generation from Audio Instructions
Graphical Models (GM), 2025
Zi-An Wang
Shihao Zou
Shiyao Yu
Mingyuan Zhang
Chao Dong
VGen
180
2
0
29 May 2025
Wav2Sem: Plug-and-Play Audio Semantic Decoupling for 3D Speech-Driven Facial Animation
Computer Vision and Pattern Recognition (CVPR), 2025
Hao Li
Ju Dai
Xin Zhao
Feng Zhou
Junjun Pan
Lei Li
219
5
0
29 May 2025
M3G: Multi-Granular Gesture Generator for Audio-Driven Full-Body Human Motion Synthesis
Zhizhuo Yin
Yuk Hang Tsui
Pan Hui
SLR VGen
389
2
0
13 May 2025
ReactDance: Hierarchical Representation for High-Fidelity and Coherent Long-Form Reactive Dance Generation
Jingzhong Lin
Xinru Li
Yuanyuan Qi
Hao Wu
Wenxiang Liu
...
Xuejiao Wang
Xiangfeng Xu
Bangyan Li
Changbo Wang
Gaoqi He
297
0
0
08 May 2025
Co$^{3}$Gesture: Towards Coherent Concurrent Co-speech 3D Gesture Generation with Interactive Diffusion
International Conference on Learning Representations (ICLR), 2025
Xingqun Qi
Yatian Wang
Hengyuan Zhang
J. Pan
Wei Xue
Shanghang Zhang
Wenhan Luo
Qifeng Liu
SLR
359
10
0
03 May 2025
EchoMask: Speech-Queried Attention-based Mask Modeling for Holistic Co-Speech Motion Generation
Xiangyue Zhang
Jianfang Li
Jiaxu Zhang
Jianqiang Ren
Liefeng Bo
Zhigang Tu
398
8
0
12 Apr 2025
Understanding Co-speech Gestures in-the-wild
Sindhu B. Hegde
KR Prajwal
Taein Kwon
Andrew Zisserman
SLR
450
3
0
28 Mar 2025
ReCoM: Realistic Co-Speech Motion Generation with Recurrent Embedded Transformer
Yong Xie
Yunlian Sun
Hongwen Zhang
Zichen Liu
Jinhui Tang
VGen
348
0
0
27 Mar 2025
SARGes: Semantically Aligned Reliable Gesture Generation via Intent Chain
Nan Gao
Yihua Bao
Dongdong Weng
Jiayi Zhao
Jia Li
Yan Zhou
Pengfei Wan
Di Zhang
SLR
391
1
0
26 Mar 2025
DIDiffGes: Decoupled Semi-Implicit Diffusion Models for Real-time Gesture Generation from Speech
AAAI Conference on Artificial Intelligence (AAAI), 2025
Yongkang Cheng
Shaoli Huang
Xuelin Chen
J. Ning
Biwei Huang
DiffM
363
3
0
21 Mar 2025
HERO: Human Reaction Generation from Videos
Chengjun Yu
Wei-dong Zhai
Yuhang Yang
Yang Cao
Zheng-jun Zha
VGen
346
11
0
11 Mar 2025
ExGes: Expressive Human Motion Retrieval and Modulation for Audio-Driven Gesture Synthesis
Xukun Zhou
Fengxin Li
Ming Chen
Yan Zhou
Pengfei Wan
Di Zhang
Yeying Jin
Zhaoxin Fan
Hongyan Liu
Jun He
DiffM VGen
482
1
0
09 Mar 2025
Maximizing Signal in Human-Model Preference Alignment
AAAI Conference on Artificial Intelligence (AAAI), 2025
Kelsey Kraus
Margaret Kroll
ALM
278
7
0
06 Mar 2025
ExpGest: Expressive Speaker Generation Using Diffusion Model and Hybrid Audio-Text Guidance
IEEE International Conference on Multimedia and Expo (ICME), 2024
Yongkang Cheng
Mingjiang Liang
Shaoli Huang
J. Ning
Wei Liu
DiffM
318
2
0
12 Oct 2024
Enabling Synergistic Full-Body Control in Prompt-Based Co-Speech Motion Generation
ACM Multimedia (MM), 2024
Bohong Chen
Yumeng Li
Yao-Xiang Ding
Tianjia Shao
Kun Zhou
296
32
0
01 Oct 2024
MM-Conv: A Multi-modal Conversational Dataset for Virtual Humans
Anna Deichler
Jim O'Regan
Jonas Beskow
284
3
0
30 Sep 2024
ProbTalk3D: Non-Deterministic Emotion Controllable Speech-Driven 3D Facial Animation Synthesis Using VQ-VAE
Motion in Games (MIG), 2024
Sichun Wu
Kazi Injamamul Haque
Zerrin Yumak
VGen
659
14
0
12 Sep 2024
Lagrangian Motion Fields for Long-term Motion Generation
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
Yifei Yang
Zikai Huang
C. Xu
Shengfeng He
371
2
0
03 Sep 2024
Combo: Co-speech holistic 3D human motion generation and efficient customizable adaptation in harmony
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
Chao Xu
Mingze Sun
Zhi-Qi Cheng
Haiwei Yang
Yang Liu
Baigui Sun
Ruqi Huang
Alexander G. Hauptmann
VGen
411
6
0
18 Aug 2024
Body of Her: A Preliminary Study on End-to-End Humanoid Agent
Tenglong Ao
LM&Ro
213
10
0
06 Aug 2024
Investigating the impact of 2D gesture representation on co-speech gesture generation
Teo Guichoux
Laure Soulier
Nicolas Obin
Catherine Pelachaud
SLR
308
0
0
21 Jun 2024
Exploiting LMM-based knowledge for image classification tasks
Maria Tzelepi
Vasileios Mezaris
VLM
258
4
0
05 Jun 2024
CoCoGesture: Toward Coherent Co-speech 3D Gesture Generation in the Wild
Xingqun Qi
Hengyuan Zhang
Yatian Wang
J. Pan
Chen Liu
...
Qixun Zhang
Shanghang Zhang
Wenhan Luo
Qifeng Liu
Qi-fei Liu
DiffM SLR
588
8
0
27 May 2024
InterAct: Capture and Modelling of Realistic, Expressive and Interactive Activities between Two Persons in Daily Scenarios
Yinghao Huang
Leo Ho
Dafei Qin
Mingyi Shi
Taku Komura
VGen
268
7
0
19 May 2024
Semantic Gesticulator: Semantics-Aware Co-Speech Gesture Synthesis
ACM Transactions on Graphics (TOG), 2024
Zeyi Zhang
Tenglong Ao
Yuyao Zhang
Qingzhe Gao
Chuan Lin
Baoquan Chen
Libin Liu
SLR
356
30
0
16 May 2024
Bridge to Non-Barrier Communication: Gloss-Prompted Fine-grained Cued Speech Gesture Generation with Diffusion Model
Wen-Ling Lei
Li Liu
Jun Wang
DiffM
302
6
0
30 Apr 2024
Large Motion Model for Unified Multi-Modal Motion Generation
Mingyuan Zhang
Daisheng Jin
Chenyang Gu
Fangzhou Hong
Zhongang Cai
...
Chongzhi Zhang
Xinying Guo
Lei Yang
Ying He
Ziwei Liu
VGen
369
74
0
01 Apr 2024
ConvoFusion: Multi-Modal Conversational Diffusion for Co-Speech Gesture Synthesis
Muhammad Hamza Mughal
Rishabh Dabral
I. Habibie
Lucia Donatelli
Marc Habermann
Christian Theobalt
SLR
174
41
0
26 Mar 2024
Lodge: A Coarse to Fine Diffusion Network for Long Dance Generation Guided by the Characteristic Dance Primitives
Ronghui Li
YuXiang Zhang
Yachao Zhang
Hongwen Zhang
Jie Guo
Yan Zhang
Yebin Liu
Xiu Li
DiffM
457
65
0
15 Mar 2024
MambaTalk: Efficient Holistic Gesture Synthesis with Selective State Space Models
Neural Information Processing Systems (NeurIPS), 2024
Zunnan Xu
Yukang Lin
Haonan Han
Sicheng Yang
Ronghui Li
Yachao Zhang
Xiu Li
Mamba
715
46
0
14 Mar 2024
DiffSal: Joint Audio and Video Learning for Diffusion Saliency Prediction
Jun Xiong
Peng Zhang
Tao You
Chuanyue Li
Wei Huang
Yufei Zha
DiffM
261
18
0
02 Mar 2024
Robot Interaction Behavior Generation based on Social Motion Forecasting for Human-Robot Interaction
Esteve Valls Mascaro
Yashuai Yan
Dongheui Lee
400
7
0
07 Feb 2024
DiffSHEG: A Diffusion-Based Approach for Real-Time Speech-driven Holistic 3D Expression and Gesture Generation
Computer Vision and Pattern Recognition (CVPR), 2024
Junming Chen
Yunfei Liu
Jianan Wang
Ailing Zeng
Yu Li
Qifeng Chen
VGen
319
69
0
09 Jan 2024
From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations
Computer Vision and Pattern Recognition (CVPR), 2024
Evonne Ng
Javier Romero
Timur M. Bagautdinov
Shaojie Bai
Trevor Darrell
Angjoo Kanazawa
Alexander Richard
VGen
299
81
0
03 Jan 2024
EMAGE: Towards Unified Holistic Co-Speech Gesture Generation via Expressive Masked Audio Gesture Modeling
Computer Vision and Pattern Recognition (CVPR), 2023
Haiyang Liu
Zihao Zhu
Giorgio Becherini
Yichen Peng
Mingyang Su
You Zhou
Xuefei Zhe
Naoya Iwamoto
Bo Zheng
Michael J. Black
SLR
1.0K
96
0
31 Dec 2023
Inter-X: Towards Versatile Human-Human Interaction Analysis
Liang Xu
Xintao Lv
Manwen Liao
Xin Jin
Shuwen Wu
...
Fengyun Rao
Xingdong Sheng
Yunhui Liu
Wenjun Zeng
Yunbo Wang
387
92
0
26 Dec 2023
MotionScript: Natural Language Descriptions for Expressive 3D Human Motions
Payam Jome Yazdian
Rachel Lagasse
Hamid Mohammadi
Eric Liu
Li Cheng
Angelica Lim
538
23
0
19 Dec 2023
BOTH2Hands: Inferring 3D Hands from Both Text Prompts and Body Dynamics
Computer Vision and Pattern Recognition (CVPR), 2023
Wenqian Zhang
Molin Huang
Yuxuan Zhou
Juze Zhang
Jingyi Yu
Jingya Wang
Lan Xu
495
10
0
13 Dec 2023
Emotional Speech-driven 3D Body Animation via Disentangled Latent Diffusion
Kiran Chhatre
Radek Daněček
Nikos Athanasiou
Giorgio Becherini
Christopher Peters
Michael J. Black
Timo Bolkart
593
45
0
07 Dec 2023
QPoser: Quantized Explicit Pose Prior Modeling for Controllable Pose Generation
Yumeng Li
Zhexu Luo
Zhong Ren
Kun Zhou
402
2
0
02 Dec 2023
SpeechAct: Towards Generating Whole-body Motion from Speech
IEEE Transactions on Visualization and Computer Graphics (TVCG), 2023
Jinsong Zhang
Minjie Zhu
Yuxiang Zhang
Yebin Liu
Kun Li
378
5
0
29 Nov 2023