ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1712.01955
16
2

Learning to Forecast Videos of Human Activity with Multi-granularity Models and Adaptive Rendering

5 December 2017
Mengyao Zhai
Jiacheng Chen
Ruizhi Deng
Lei Chen
Ligeng Zhu
Greg Mori
    3DH
ArXivPDFHTML
Abstract

We propose an approach for forecasting video of complex human activity involving multiple people. Direct pixel-level prediction is too simple to handle the appearance variability in complex activities. Hence, we develop novel intermediate representations. An architecture combining a hierarchical temporal model for predicting human poses and encoder-decoder convolutional neural networks for rendering target appearances is proposed. Our hierarchical model captures interactions among people by adopting a dynamic group-based interaction mechanism. Next, our appearance rendering network encodes the targets' appearances by learning adaptive appearance filters using a fully convolutional network. Finally, these filters are placed in encoder-decoder neural networks to complete the rendering. We demonstrate that our model can generate videos that are superior to state-of-the-art methods, and can handle complex human activity scenarios in video forecasting.

View on arXiv
Comments on this paper