HOIGPT: Learning Long Sequence Hand-Object Interaction with Language Models

24 March 2025
Mingzhen Huang
Fu-Jen Chu
Bugra Tekin
Kevin J Liang
Haoyu Ma
Weiyao Wang
Xingyu Chen
Pierre Gleize
Hongfei Xue
Siwei Lyu
Kris M. Kitani
Matt Feiszli
Hao Tang
Abstract

We introduce HOIGPT, a token-based generative method that unifies 3D hand-object interaction (HOI) perception and generation, offering the first comprehensive solution for captioning and generating high-quality 3D HOI sequences from a diverse range of conditional signals (e.g., text, objects, partial sequences). At its core, HOIGPT uses a large language model to predict the bidirectional transformation between HOI sequences and natural language descriptions. Given text inputs, HOIGPT generates a sequence of hand and object meshes; given (partial) HOI sequences, it generates text descriptions and completes the sequences. To facilitate HOI understanding with a large language model, this paper introduces two key innovations: (1) a novel physically grounded HOI tokenizer, the hand-object decomposed VQ-VAE, for discretizing HOI sequences, and (2) a motion-aware language model trained to process and generate both text and HOI tokens. Extensive experiments demonstrate that HOIGPT sets a new state of the art on both text generation (+2.01% R-Precision) and HOI generation (-2.56 FID) across multiple tasks and benchmarks.
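The abstract names two components: a hand-object decomposed VQ-VAE that discretizes continuous HOI sequences into tokens, and a language model that consumes and emits those tokens alongside text. The paper's actual architecture, losses, and feature layouts are not given on this page; the sketch below is a minimal, hypothetical illustration of the decomposed-tokenizer idea in PyTorch, where all module names, dimensions, and the per-frame hand/object feature sizes are assumptions for illustration only.

import torch
import torch.nn as nn

class MotionVQVAE(nn.Module):
    """Minimal VQ-VAE over one motion stream (hand or object).

    Hypothetical dimensions: `feat_dim` is the per-frame pose feature
    size, `codebook_size` the number of discrete tokens, `latent_dim`
    the embedding width. Training losses (reconstruction, commitment)
    and the straight-through estimator are omitted for brevity.
    """
    def __init__(self, feat_dim, codebook_size=512, latent_dim=128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(feat_dim, latent_dim), nn.ReLU(),
            nn.Linear(latent_dim, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, latent_dim), nn.ReLU(),
            nn.Linear(latent_dim, feat_dim),
        )
        self.codebook = nn.Embedding(codebook_size, latent_dim)

    def quantize(self, z):
        # Snap each frame's latent to its nearest codebook entry.
        dists = torch.cdist(z, self.codebook.weight)   # (T, codebook_size)
        tokens = dists.argmin(dim=-1)                  # (T,) discrete HOI tokens
        return tokens, self.codebook(tokens)

    def forward(self, motion):
        z = self.encoder(motion)                       # (T, latent_dim)
        tokens, z_q = self.quantize(z)
        recon = self.decoder(z_q)                      # reconstructed motion
        return recon, tokens

# Hand-object decomposition: one tokenizer per stream, as the abstract
# describes; the feature sizes below are placeholders.
hand_vqvae = MotionVQVAE(feat_dim=99)    # e.g. a MANO-style hand pose per frame
object_vqvae = MotionVQVAE(feat_dim=9)   # e.g. object rotation + translation

hand_motion = torch.randn(64, 99)        # 64 frames of hand features
object_motion = torch.randn(64, 9)

_, hand_tokens = hand_vqvae(hand_motion)
_, object_tokens = object_vqvae(object_motion)
print(hand_tokens.shape, object_tokens.shape)  # token sequences for the LM

Under this sketch, the resulting hand and object token sequences would be interleaved with text tokens in the language model's vocabulary, letting one model caption, generate, and complete HOI sequences.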

@article{huang2025_2503.19157,
  title={HOIGPT: Learning Long Sequence Hand-Object Interaction with Language Models},
  author={Mingzhen Huang and Fu-Jen Chu and Bugra Tekin and Kevin J Liang and Haoyu Ma and Weiyao Wang and Xingyu Chen and Pierre Gleize and Hongfei Xue and Siwei Lyu and Kris Kitani and Matt Feiszli and Hao Tang},
  journal={arXiv preprint arXiv:2503.19157},
  year={2025}
}