
GestureDiffuCLIP: Gesture Diffusion Model with CLIP Latents

ACM Transactions on Graphics (TOG), 2023

26 March 2023


Papers citing "GestureDiffuCLIP: Gesture Diffusion Model with CLIP Latents"

50 / 60 papers shown

CoordSpeaker: Exploiting Gesture Captioning for Coordinated Caption-Empowered Co-Speech Gesture Generation

316

28 Nov 2025

Towards Reliable Human Evaluations in Gesture Generation: Insights from a Community-Driven State-of-the-Art Benchmark

...

461

03 Nov 2025

Gestura: A LVLM-Powered System Bridging Motion and Semantics for Real-Time Free-Form Gesture Understanding

247

21 Oct 2025

MimicParts: Part-aware Style Injection for Speech-Driven 3D Motion Generation

160

15 Oct 2025

Social Agent: Mastering Dyadic Nonverbal Behavior Generation via Conversational LLM Agents

219

06 Oct 2025

InterAct: A Large-Scale Dataset of Dynamic, Expressive and Interactive Activities between Two People in Daily Scenarios

Proceedings of the ACM on Computer Graphics and Interactive Techniques (PACMCGIT), 2025

159

06 Sep 2025

Motion-example-controlled Co-speech Gesture Generation Leveraging Large Language Models

313

27 Jul 2025

SemGes: Semantics-aware Co-Speech Gesture Generation using Semantic Coherence and Relevance Learning

315

25 Jul 2025

MOSPA: Human Motion Generation Driven by Spatial Audio

...

272

16 Jul 2025

Semantics-Aware Human Motion Generation from Audio Instructions

Graphical Models (GM), 2025

180

29 May 2025

Wav2Sem: Plug-and-Play Audio Semantic Decoupling for 3D Speech-Driven Facial Animation

Computer Vision and Pattern Recognition (CVPR), 2025

219

29 May 2025

M3G: Multi-Granular Gesture Generator for Audio-Driven Full-Body Human Motion Synthesis

389

13 May 2025

ReactDance: Hierarchical Representation for High-Fidelity and Coherent Long-Form Reactive Dance Generation

...

297

08 May 2025

Co$^{3}$Gesture: Towards Coherent Concurrent Co-speech 3D Gesture Generation with Interactive Diffusion

International Conference on Learning Representations (ICLR), 2025

359

03 May 2025

EchoMask: Speech-Queried Attention-based Mask Modeling for Holistic Co-Speech Motion Generation

398

12 Apr 2025

Understanding Co-speech Gestures in-the-wild

450

28 Mar 2025

ReCoM: Realistic Co-Speech Motion Generation with Recurrent Embedded Transformer

348

27 Mar 2025

SARGes: Semantically Aligned Reliable Gesture Generation via Intent Chain

391

26 Mar 2025

DIDiffGes: Decoupled Semi-Implicit Diffusion Models for Real-time Gesture Generation from Speech

AAAI Conference on Artificial Intelligence (AAAI), 2025

363

21 Mar 2025

HERO: Human Reaction Generation from Videos

346

11 Mar 2025

ExGes: Expressive Human Motion Retrieval and Modulation for Audio-Driven Gesture Synthesis

482

09 Mar 2025

Maximizing Signal in Human-Model Preference Alignment

AAAI Conference on Artificial Intelligence (AAAI), 2025

Kelsey Kraus

Margaret Kroll

ALM

278

06 Mar 2025

ExpGest: Expressive Speaker Generation Using Diffusion Model and Hybrid Audio-Text Guidance

IEEE International Conference on Multimedia and Expo (ICME), 2024

318

12 Oct 2024

Enabling Synergistic Full-Body Control in Prompt-Based Co-Speech Motion Generation

ACM Multimedia (MM), 2024

Kun Zhou

296

01 Oct 2024

MM-Conv: A Multi-modal Conversational Dataset for Virtual Humans

Anna Deichler

Jim O'Regan

Jonas Beskow

284

30 Sep 2024

ProbTalk3D: Non-Deterministic Emotion Controllable Speech-Driven 3D Facial Animation Synthesis Using VQ-VAE

Motion in Games (MIG), 2024

659

12 Sep 2024

Lagrangian Motion Fields for Long-term Motion Generation

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024

371

03 Sep 2024

Combo: Co-speech holistic 3D human motion generation and efficient customizable adaptation in harmony

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024

Alexander G. Hauptmann

VGen

411

18 Aug 2024

Body of Her: A Preliminary Study on End-to-End Humanoid Agent

Tenglong Ao

LM&Ro

213

06 Aug 2024

Investigating the impact of 2D gesture representation on co-speech gesture generation

Catherine Pelachaud

308

21 Jun 2024

Exploiting LMM-based knowledge for image classification tasks

Maria Tzelepi

Vasileios Mezaris

VLM

258

05 Jun 2024

CoCoGesture: Toward Coherent Co-speech 3D Gesture Generation in the Wild

...

588

27 May 2024

InterAct: Capture and Modelling of Realistic, Expressive and Interactive Activities between Two Persons in Daily Scenarios

Dafei Qin

Taku Komura

268

19 May 2024

Semantic Gesticulator: Semantics-Aware Co-Speech Gesture Synthesis

ACM Transactions on Graphics (TOG), 2024

356

16 May 2024

Bridge to Non-Barrier Communication: Gloss-Prompted Fine-grained Cued Speech Gesture Generation with Diffusion Model

302

30 Apr 2024

Large Motion Model for Unified Multi-Modal Motion Generation

...

Ziwei Liu

369

01 Apr 2024

ConvoFusion: Multi-Modal Conversational Diffusion for Co-Speech Gesture Synthesis

Muhammad Hamza Mughal

174

26 Mar 2024

Lodge: A Coarse to Fine Diffusion Network for Long Dance Generation Guided by the Characteristic Dance Primitives

YuXiang Zhang

Hongwen Zhang

Yebin Liu

457

15 Mar 2024

MambaTalk: Efficient Holistic Gesture Synthesis with Selective State Space Models

Neural Information Processing Systems (NeurIPS), 2024

715

14 Mar 2024

DiffSal: Joint Audio and Video Learning for Diffusion Saliency Prediction

Wei Huang

261

02 Mar 2024

Robot Interaction Behavior Generation based on Social Motion Forecasting for Human-Robot Interaction

Esteve Valls Mascaro

Yashuai Yan

Dongheui Lee

400

07 Feb 2024

DiffSHEG: A Diffusion-Based Approach for Real-Time Speech-driven Holistic 3D Expression and Gesture Generation

Computer Vision and Pattern Recognition (CVPR), 2024

Junming Chen

Yunfei Liu

Jianan Wang

Ailing Zeng

Yu Li

Qifeng Chen

VGen

319

09 Jan 2024

From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations

Computer Vision and Pattern Recognition (CVPR), 2024

299

03 Jan 2024

EMAGE: Towards Unified Holistic Co-Speech Gesture Generation via Expressive Masked Audio Gesture Modeling

Computer Vision and Pattern Recognition (CVPR), 2023

1.0K

31 Dec 2023

Inter-X: Towards Versatile Human-Human Interaction Analysis

...

387

26 Dec 2023

MotionScript: Natural Language Descriptions for Expressive 3D Human Motions

538

19 Dec 2023

BOTH2Hands: Inferring 3D Hands from Both Text Prompts and Body Dynamics

Computer Vision and Pattern Recognition (CVPR), 2023

Wenqian Zhang

Jingyi Yu

495

13 Dec 2023

Emotional Speech-driven 3D Body Animation via Disentangled Latent Diffusion

593

07 Dec 2023

QPoser: Quantized Explicit Pose Prior Modeling for Controllable Pose Generation

402

02 Dec 2023

SpeechAct: Towards Generating Whole-body Motion from Speech

IEEE Transactions on Visualization and Computer Graphics (TVCG), 2023

Jinsong Zhang

Minjie Zhu

Yuxiang Zhang

Yebin Liu

Kun Li

378

29 Nov 2023