Speech Gesture Generation from the Trimodal Context of Text, Audio, and Speaker Identity

ACM Transactions on Graphics (TOG), 2020
4 September 2020
Youngwoo Yoon
Bok Cha
Joo-Haeng Lee
Minsu Jang
Jaeyeon Lee
Jaehong Kim
Geehyuk Lee

Papers citing "Speech Gesture Generation from the Trimodal Context of Text, Audio, and Speaker Identity"

50 / 168 papers shown
Title
Co-speech Gesture Video Generation via Motion-Based Graph Retrieval
Yafei Song
Peng Zhang
Bang Zhang
DiffM, SLR
356
0
0
02 Dec 2025
fMRI2GES: Co-speech Gesture Reconstruction from fMRI Signal with Dual Brain Decoding Alignment
Chunzheng Zhu
Jialin Shao
Jianxin Lin
Yijun Wang
Jing Wang
Jinhui Tang
Kenli Li
28
0
0
01 Dec 2025
CoordSpeaker: Exploiting Gesture Captioning for Coordinated Caption-Empowered Co-Speech Gesture Generation
Fengyi Fang
Sicheng Yang
Wenming Yang
SLR
140
0
0
28 Nov 2025
Towards Reliable Human Evaluations in Gesture Generation: Insights from a Community-Driven State-of-the-Art Benchmark
Rajmund Nagy
Hendric Voss
Thanh Hoang-Minh
Mihail Tsakov
Teodor Nikolov
...
R. Mcdonnell
Michael Neff
Taras Kucherenko
Youngwoo Yoon
G. Henter
EGVM, VGen
342
0
0
03 Nov 2025
Conveying Meaning through Gestures: An Investigation into Semantic Co-Speech Gesture Generation
Hendric Voss
Lisa Michelle Bohnenkamp
Stefan Kopp
SLR
164
0
0
20 Oct 2025
ImaGGen: Zero-Shot Generation of Co-Speech Semantic Gestures Grounded in Language and Image Input
Hendric Voss
Stefan Kopp
SLR
244
0
0
20 Oct 2025
MimicParts: Part-aware Style Injection for Speech-Driven 3D Motion Generation
Lianlian Liu
YongKang He
Zhaojie Chu
Xiaofen Xing
Xiangmin Xu
100
1
0
15 Oct 2025
Gelina: Unified Speech and Gesture Synthesis via Interleaved Token Prediction
Teo Guichoux
Théodor Lemerle
Shivam Mehta
Jonas Beskow
G. Henter
Laure Soulier
Catherine Pelachaud
Nicolas Obin
107
0
0
13 Oct 2025
Social Agent: Mastering Dyadic Nonverbal Behavior Generation via Conversational LLM Agents
Zeyi Zhang
Yanju Zhou
Heyuan Yao
Tenglong Ao
Xiaohang Zhan
Libin Liu
LLMAG
154
1
0
06 Oct 2025
SIG-Chat: Spatial Intent-Guided Conversational Gesture Generation Involving How, When and Where
Yiheng Huang
Junran Peng
Silei Shen
Jingwei Yang
ZeJi Wei
...
Yan Liu
Xu-Cheng Yin
Man Zhang
Zhaoxiang Zhang
Chuanchen Luo
199
0
0
28 Sep 2025
Towards Context-Aware Human-like Pointing Gestures with RL Motion Imitation
Anna Deichler
Siyang Wang
Simon Alexanderson
Jonas Beskow
72
7
0
16 Sep 2025
Learning to Generate Pointing Gestures in Situated Embodied Conversational Agents
Frontiers in Robotics and AI (Front. Robot. AI), 2023
Anna Deichler
Siyang Wang
Simon Alexanderson
Jonas Beskow
96
12
0
15 Sep 2025
SiLVERScore: Semantically-Aware Embeddings for Sign Language Generation Evaluation
Saki Imai
Mert Inan
Anthony Sicilia
Malihe Alikhani
SLR
168
1
0
04 Sep 2025
Human Motion Video Generation: A Survey
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025
Haiwei Xue
Xiangyang Luo
Zhanghao Hu
Shu Zhang
Xunzhi Xiang
...
Fei Ma
Zhiyong Wu
Changpeng Yang
Zonghong Dai
Fei Richard Yu
EGVM, VGen
225
23
0
04 Sep 2025
PP-Motion: Physical-Perceptual Fidelity Evaluation for Human Motion Generation
Sihan Zhao
Zixuan Wang
Tianyu Luan
Jia Jia
Wentao Zhu
Jiebo Luo
Junsong Yuan
Nan Xi
EGVM
201
0
0
11 Aug 2025
Multimodal Quantitative Measures for Multiparty Behaviour Evaluation
International Conference on Multimodal Interaction (ICMI), 2025
Ojas Shirekar
Wim Pouw
Chenxu Hao
Vrushank Phadnis
Thabo Beeler
Chirag Raman
56
0
0
01 Aug 2025
Motion-example-controlled Co-speech Gesture Generation Leveraging Large Language Models
Bohong Chen
Yumeng Li
Youyi Zheng
Yao-Xiang Ding
Kun Zhou
219
1
0
27 Jul 2025
SemGes: Semantics-aware Co-Speech Gesture Generation using Semantic Coherence and Relevance Learning
Lanmiao Liu
E. Ghaleb
Aslı Özyürek
Zerrin Yumak
SLR
156
3
0
25 Jul 2025
MOSPA: Human Motion Generation Driven by Spatial Audio
Shuyang Xu
Zhiyang Dou
Mingyi Shi
Liang Pan
Leo Ho
...
Yuan Liu
Cheng Lin
Y. Ma
Wenping Wang
Taku Komura
189
3
0
16 Jul 2025
Multimodal Generative AI with Autoregressive LLMs for Human Motion Understanding and Generation: A Way Forward
Muhammad Islam
Tao Huang
Euijoon Ahn
Usman Naseem
VGen
226
4
0
31 May 2025
MMGT: Motion Mask Guided Two-Stage Network for Co-Speech Gesture Video Generation
Siyuan Wang
Jiawei Liu
Wei Wang
Yeying Jin
Jinsong Du
Zhi Han
SLR, VGen
225
0
0
29 May 2025
Intentional Gesture: Deliver Your Intentions with Gestures for Speech
Pinxin Liu
Haiyang Liu
Luchuan Song
Chenliang Xu
SLR
298
7
0
21 May 2025
AsynFusion: Towards Asynchronous Latent Consistency Models for Decoupled Whole-Body Audio-Driven Avatars
T. Zhang
Jian Zhao
Yuer Li
Zheng Zhu
Ping Hu
Zhaoxin Fan
Wenjun Wu
Xuelong Li
220
0
0
21 May 2025
M3G: Multi-Granular Gesture Generator for Audio-Driven Full-Body Human Motion Synthesis
Zhizhuo Yin
Yuk Hang Tsui
Pan Hui
SLR, VGen
253
0
0
13 May 2025
Inter-Diffusion Generation Model of Speakers and Listeners for Effective Communication
International Conference on Multimedia Retrieval (ICMR), 2025
Jinhe Huang
Yongkang Cheng
Yuming Hang
Gaoge Han
Jiajian Li
Jing Zhang
Xingjian Gu
223
0
0
08 May 2025
Co$^{3}$Gesture: Towards Coherent Concurrent Co-speech 3D Gesture Generation with Interactive Diffusion
International Conference on Learning Representations (ICLR), 2025
Xingqun Qi
Yatian Wang
Hengyuan Zhang
J. Pan
Wei Xue
Shanghang Zhang
Wenhan Luo
Qifeng Liu
SLR
280
5
0
03 May 2025
EchoMask: Speech-Queried Attention-based Mask Modeling for Holistic Co-Speech Motion Generation
Xiangyue Zhang
Jianfang Li
Jiaxu Zhang
Jianqiang Ren
Liefeng Bo
Zhigang Tu
259
2
0
12 Apr 2025
Understanding Co-speech Gestures in-the-wild
Sindhu B. Hegde
KR Prajwal
Taein Kwon
Andrew Zisserman
SLR
351
2
0
28 Mar 2025
Audio-driven Gesture Generation via Deviation Feature in the Latent Space
Jiahui Chen
Yang Huan
Runhua Shi
Chanfan Ding
Xiaoqi Mo
Siyu Xiong
Yinong He
205
0
0
27 Mar 2025
ReCoM: Realistic Co-Speech Motion Generation with Recurrent Embedded Transformer
Yong Xie
Yunlian Sun
Hongwen Zhang
Zichen Liu
Jinhui Tang
VGen
276
0
0
27 Mar 2025
SARGes: Semantically Aligned Reliable Gesture Generation via Intent Chain
Nan Gao
Yihua Bao
Dongdong Weng
Jiayi Zhao
Jia Li
Yan Zhou
Pengfei Wan
Di Zhang
SLR
284
1
0
26 Mar 2025
DIDiffGes: Decoupled Semi-Implicit Diffusion Models for Real-time Gesture Generation from Speech
AAAI Conference on Artificial Intelligence (AAAI), 2025
Yongkang Cheng
Shaoli Huang
Xuelin Chen
J. Ning
Biwei Huang
DiffM
190
1
0
21 Mar 2025
MAG: Multi-Modal Aligned Autoregressive Co-Speech Gesture Generation without Vector Quantization
Binjie Liu
Lina Liu
Sanyi Zhang
Songen Gu
Yihao Zhi
Tianyi Zhu
Lei Yang
Long Ye
SLR
206
0
0
18 Mar 2025
Cosh-DiT: Co-Speech Gesture Video Synthesis via Hybrid Audio-Visual Diffusion Transformers
Yasheng Sun
Zhiliang Xu
Hang Zhou
Jiazhi Guan
Quanwei Yang
...
Yingying Li
Haocheng Feng
Jiadong Wang
Ziwei Liu
Koike Hideki
VGen
309
2
0
13 Mar 2025
ExGes: Expressive Human Motion Retrieval and Modulation for Audio-Driven Gesture Synthesis
Xukun Zhou
Fengxin Li
Ming Chen
Yan Zhou
Pengfei Wan
Di Zhang
Yeying Jin
Zhaoxin Fan
Hongyan Liu
Jun He
DiffM, VGen
352
1
0
09 Mar 2025
Maximizing Signal in Human-Model Preference Alignment
AAAI Conference on Artificial Intelligence (AAAI), 2025
Kelsey Kraus
Margaret Kroll
ALM
209
3
0
06 Mar 2025
HOP: Heterogeneous Topology-based Multimodal Entanglement for Co-Speech Gesture Generation
Computer Vision and Pattern Recognition (CVPR), 2025
Hongye Cheng
Tianyu Wang
Guangsi Shi
Zexing Zhao
Yanwei Fu
SLR
231
3
0
03 Mar 2025
Joint Co-Speech Gesture and Expressive Talking Face Generation using Diffusion with Adapters
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024
S. Hogue
Chenxu Zhang
Yapeng Tian
Xiaohu Guo
DiffM
282
0
0
18 Dec 2024
The Language of Motion: Unifying Verbal and Non-verbal Language of 3D Human Motion
Computer Vision and Pattern Recognition (CVPR), 2024
Changan Chen
Juze Zhang
S. K. Lakshmikanth
Yusu Fang
Ruizhi Shao
Gordon Wetzstein
L. Fei-Fei
Ehsan Adeli
VGen
324
15
0
13 Dec 2024
Multi-Resolution Generative Modeling of Human Motion from Limited Data
David Eduardo Moreno-Villamarín
Anna Hilsmann
Peter Eisert
DiffM, 3DH
211
0
0
25 Nov 2024
MoMu-Diffusion: On Learning Long-Term Motion-Music Synchronization and Correspondence
Neural Information Processing Systems (NeurIPS), 2024
Fuming You
Minghui Fang
Li Tang
Rongjie Huang
Yongqi Wang
Zhou Zhao
228
4
0
04 Nov 2024
Stereo-Talker: Audio-driven 3D Human Synthesis with Prior-Guided Mixture-of-Experts
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
Xiang Deng
Youxin Pang
Xiaochen Zhao
Chao Xu
Lizhen Wang
Hongjiang Xiao
Shi Yan
Hongwen Zhang
Yebin Liu
DiffM, VGen
211
3
0
31 Oct 2024
Large Body Language Models
Saif Punjwani
Larry Heck
149
0
0
21 Oct 2024
Allo-AVA: A Large-Scale Multimodal Conversational AI Dataset for Allocentric Avatar Gesture Animation
Saif Punjwani
Larry Heck
SLR, VGen
149
3
0
21 Oct 2024
Emphasizing Semantic Consistency of Salient Posture for Speech-Driven Gesture Generation
ACM Multimedia (MM), 2024
Fengqi Liu
Hexiang Wang
Jingyu Gong
Ran Yi
Qianyu Zhou
Xuequan Lu
Jiangbo Lu
Lizhuang Ma
169
2
0
17 Oct 2024
ExpGest: Expressive Speaker Generation Using Diffusion Model and Hybrid Audio-Text Guidance
IEEE International Conference on Multimedia and Expo (ICME), 2024
Yongkang Cheng
Mingjiang Liang
Shaoli Huang
J. Ning
Wei Liu
DiffM
169
2
0
12 Oct 2024
Towards a GENEA Leaderboard -- an Extended, Living Benchmark for Evaluating and Advancing Conversational Motion Synthesis
Rajmund Nagy
Hendric Voss
Youngwoo Yoon
Taras Kucherenko
Teodor Nikolov
Thanh Hoang-Minh
R. Mcdonnell
Stefan Kopp
Michael Neff
G. Henter
191
4
0
08 Oct 2024
LLM Gesticulator: Leveraging Large Language Models for Scalable and Controllable Co-Speech Gesture Synthesis
Haozhou Pang
Tianwei Ding
Lanshan He
Ming Tao
Lu Zhang
Qi Gan
223
5
0
06 Oct 2024
TANGO: Co-Speech Gesture Video Reenactment with Hierarchical Audio Motion Embedding and Diffusion Interpolation
International Conference on Learning Representations (ICLR), 2024
Haiyang Liu
Xingchao Yang
Tomoya Akiyama
Yuantian Huang
Qiaoge Li
Shigeru Kuriyama
Takafumi Taketomi
VGen, SLR
223
22
0
05 Oct 2024
Enabling Synergistic Full-Body Control in Prompt-Based Co-Speech Motion Generation
ACM Multimedia (MM), 2024
Bohong Chen
Yumeng Li
Yao-Xiang Ding
Tianjia Shao
Kun Zhou
193
26
0
01 Oct 2024