Speech Gesture Generation from the Trimodal Context of Text, Audio, and Speaker Identity

ACM Transactions on Graphics (TOG), 2020

4 September 2020

Papers citing "Speech Gesture Generation from the Trimodal Context of Text, Audio, and Speaker Identity"

50 / 168 papers shown

Gelina: Unified Speech and Gesture Synthesis via Interleaved Token Prediction

232

30 Mar 2026

Co-speech Gesture Video Generation via Motion-Based Graph Retrieval

556

02 Dec 2025

fMRI2GES: Co-speech Gesture Reconstruction from fMRI Signal with Dual Brain Decoding Alignment

124

01 Dec 2025

CoordSpeaker: Exploiting Gesture Captioning for Coordinated Caption-Empowered Co-Speech Gesture Generation

318

28 Nov 2025

Towards Reliable Human Evaluations in Gesture Generation: Insights from a Community-Driven State-of-the-Art Benchmark

...

465

03 Nov 2025

Conveying Meaning through Gestures: An Investigation into Semantic Co-Speech Gesture Generation

Hendric Voss

Lisa Michelle Bohnenkamp

Stefan Kopp

SLR

266

20 Oct 2025

ImaGGen: Zero-Shot Generation of Co-Speech Semantic Gestures Grounded in Language and Image Input

Hendric Voss

Stefan Kopp

SLR

331

20 Oct 2025

MimicParts: Part-aware Style Injection for Speech-Driven 3D Motion Generation

162

15 Oct 2025

Social Agent: Mastering Dyadic Nonverbal Behavior Generation via Conversational LLM Agents

224

06 Oct 2025

SIG-Chat: Spatial Intent-Guided Conversational Gesture Generation Involving How, When and Where

...

320

28 Sep 2025

Towards Context-Aware Human-like Pointing Gestures with RL Motion Imitation

141

16 Sep 2025

Learning to Generate Pointing Gestures in Situated Embodied Conversational Agents

Frontiers in Robotics and AI (Front. Robot. AI), 2023

271

15 Sep 2025

SiLVERScore: Semantically-Aware Embeddings for Sign Language Generation Evaluation

292

04 Sep 2025

Human Motion Video Generation: A Survey

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025

...

280

04 Sep 2025

PP-Motion: Physical-Perceptual Fidelity Evaluation for Human Motion Generation

246

11 Aug 2025

Multimodal Quantitative Measures for Multiparty Behaviour Evaluation

International Conference on Multimodal Interaction (ICMI), 2025

133

01 Aug 2025

Motion-example-controlled Co-speech Gesture Generation Leveraging Large Language Models

317

27 Jul 2025

SemGes: Semantics-aware Co-Speech Gesture Generation using Semantic Coherence and Relevance Learning

318

25 Jul 2025

MOSPA: Human Motion Generation Driven by Spatial Audio

...

286

16 Jul 2025

Multimodal Generative AI with Autoregressive LLMs for Human Motion Understanding and Generation: A Way Forward

359

31 May 2025

MMGT: Motion Mask Guided Two-Stage Network for Co-Speech Gesture Video Generation

328

29 May 2025

Intentional Gesture: Deliver Your Intentions with Gestures for Speech

391

21 May 2025

AsynFusion: Towards Asynchronous Latent Consistency Models for Decoupled Whole-Body Audio-Driven Avatars

292

21 May 2025

M3G: Multi-Granular Gesture Generator for Audio-Driven Full-Body Human Motion Synthesis

402

13 May 2025

Inter-Diffusion Generation Model of Speakers and Listeners for Effective Communication

International Conference on Multimedia Retrieval (ICMR), 2025

400

08 May 2025

Co³Gesture: Towards Coherent Concurrent Co-speech 3D Gesture Generation with Interactive Diffusion

International Conference on Learning Representations (ICLR), 2025

360

03 May 2025

EchoMask: Speech-Queried Attention-based Mask Modeling for Holistic Co-Speech Motion Generation

399

12 Apr 2025

Understanding Co-speech Gestures in-the-wild

456

28 Mar 2025

Audio-driven Gesture Generation via Deviation Feature in the Latent Space

303

27 Mar 2025

ReCoM: Realistic Co-Speech Motion Generation with Recurrent Embedded Transformer

349

27 Mar 2025

SARGes: Semantically Aligned Reliable Gesture Generation via Intent Chain

391

26 Mar 2025

DIDiffGes: Decoupled Semi-Implicit Diffusion Models for Real-time Gesture Generation from Speech

AAAI Conference on Artificial Intelligence (AAAI), 2025

363

21 Mar 2025

MAG: Multi-Modal Aligned Autoregressive Co-Speech Gesture Generation without Vector Quantization

319

18 Mar 2025

Cosh-DiT: Co-Speech Gesture Video Synthesis via Hybrid Audio-Visual Diffusion Transformers

...

442

13 Mar 2025

ExGes: Expressive Human Motion Retrieval and Modulation for Audio-Driven Gesture Synthesis

486

09 Mar 2025

Maximizing Signal in Human-Model Preference Alignment

AAAI Conference on Artificial Intelligence (AAAI), 2025

Kelsey Kraus

Margaret Kroll

ALM

284

06 Mar 2025

HOP: Heterogeneous Topology-based Multimodal Entanglement for Co-Speech Gesture Generation

Computer Vision and Pattern Recognition (CVPR), 2025

320

03 Mar 2025

Joint Co-Speech Gesture and Expressive Talking Face Generation using Diffusion with Adapters

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024

396

18 Dec 2024

The Language of Motion: Unifying Verbal and Non-verbal Language of 3D Human Motion

Computer Vision and Pattern Recognition (CVPR), 2024

435

13 Dec 2024

Multi-Resolution Generative Modeling of Human Motion from Limited Data

David Eduardo Moreno-Villamarín

Anna Hilsmann

Peter Eisert

DiffM 3DH

296

25 Nov 2024

MoMu-Diffusion: On Learning Long-Term Motion-Music Synchronization and Correspondence

Neural Information Processing Systems (NeurIPS), 2024

308

04 Nov 2024

Stereo-Talker: Audio-driven 3D Human Synthesis with Prior-Guided Mixture-of-Experts

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024

Yebin Liu

358

31 Oct 2024

Large Body Language Models

Saif Punjwani

Larry Heck

230

21 Oct 2024

Allo-AVA: A Large-Scale Multimodal Conversational AI Dataset for Allocentric Avatar Gesture Animation

Saif Punjwani

Larry Heck

SLR VGen

222

21 Oct 2024

Emphasizing Semantic Consistency of Salient Posture for Speech-Driven Gesture Generation

ACM Multimedia (MM), 2024

Lizhuang Ma

222

17 Oct 2024

ExpGest: Expressive Speaker Generation Using Diffusion Model and Hybrid Audio-Text Guidance

IEEE International Conference on Multimedia and Expo (ICME), 2024

319

12 Oct 2024

Towards a GENEA Leaderboard -- an Extended, Living Benchmark for Evaluating and Advancing Conversational Motion Synthesis

313

08 Oct 2024

LLM Gesticulator: Leveraging Large Language Models for Scalable and Controllable Co-Speech Gesture Synthesis

289

06 Oct 2024

TANGO: Co-Speech Gesture Video Reenactment with Hierarchical Audio Motion Embedding and Diffusion Interpolation

International Conference on Learning Representations (ICLR), 2024

376

05 Oct 2024

Enabling Synergistic Full-Body Control in Prompt-Based Co-Speech Motion Generation

ACM Multimedia (MM), 2024

Kun Zhou

298

01 Oct 2024