Exploring Vision Transformers for 3D Human Motion-Language Models with Motion Patches

8 May 2024

Papers citing "Exploring Vision Transformers for 3D Human Motion-Language Models with Motion Patches"


Title | Authors | Tags | Counts | Date
TMR: Text-to-Motion Retrieval Using Contrastive 3D Human Motion Synthesis | Mathis Petrovich, Michael J. Black, Gül Varol | VGen | 62 40 0 | 02 May 2023
Human Motion Diffusion Model | Guy Tevet, Sigal Raab, Brian Gordon, Yonatan Shafir, Daniel Cohen-Or, Amit H. Bermano | DiffM, VGen | 177 713 0 | 29 Sep 2022
TEACH: Temporal Action Composition for 3D Humans | Nikos Athanasiou, Mathis Petrovich, Michael J. Black, Gül Varol | | 72 138 0 | 09 Sep 2022
TM2T: Stochastic and Tokenized Modeling for the Reciprocal Generation of 3D Human Motions and Texts | Chuan Guo, Xinxin Zuo, Sen Wang, Li Cheng | VGen | 60 138 0 | 04 Jul 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation | Junnan Li, Dongxu Li, Caiming Xiong, S. Hoi | MLLM, BDL, VLM, CLIP | 380 4,010 0 | 28 Jan 2022
PSLA: Improving Audio Tagging with Pretraining, Sampling, Labeling, and Aggregation | Yuan Gong, Yu-An Chung, James R. Glass | VLM | 94 120 0 | 02 Feb 2021
The KIT Motion-Language Dataset | Matthias Plappert, Christian Mandery, Tamim Asfour | | 166 267 0 | 13 Jul 2016
ImageNet Large Scale Visual Recognition Challenge | Olga Russakovsky, Jia Deng, Hao Su, J. Krause, S. Satheesh, ..., A. Karpathy, A. Khosla, Michael S. Bernstein, Alexander C. Berg, Li Fei-Fei | VLM, ObjD | 279 39,083 0 | 01 Sep 2014