Hierarchical Intention-Aware Expressive Motion Generation for Humanoid Robots

2 June 2025

Main: 7 pages, 5 figures, 5 tables; Bibliography: 1 page

Abstract

Effective human-robot interaction requires robots to identify human intentions and generate expressive, socially appropriate motions in real time. Existing approaches often rely on fixed motion libraries or computationally expensive generative models. We propose a hierarchical framework that combines intention-aware reasoning via in-context learning (ICL) with real-time motion generation using diffusion models. Our system introduces structured prompting with confidence scoring, fallback behaviors, and social context awareness to enable intention refinement and adaptive response. Leveraging large-scale motion datasets and efficient latent-space denoising, the framework generates diverse, physically plausible gestures suitable for dynamic humanoid interactions. Experimental validation on a physical platform demonstrates the robustness and social alignment of our method in realistic scenarios.
