Learning to Generate Long-term Future via Hierarchical Prediction

19 April 2017
Ruben Villegas
Jimei Yang
Yuliang Zou
Sungryull Sohn
Xunyu Lin
Honglak Lee

Papers citing "Learning to Generate Long-term Future via Hierarchical Prediction"

50 / 211 papers shown
Bridging Text and Video Generation: A Survey
Nilay Kumar
Priyansh Bhandari
G. Maragatham
VGen
264
0
0
06 Oct 2025
MoReFlow: Motion Retargeting Learning through Unsupervised Flow Matching
Wontaek Kim
Tianyu Li
Sehoon Ha
173
0
0
29 Sep 2025
Integrating Reinforcement Learning with Visual Generative Models: Foundations and Advances
Yuanzhi Liang
Yijie Fang
Rui Li
Ziqi Ni
Ruijie Su
Chi Zhang
EGVM
307
2
0
14 Aug 2025
FG-DFPN: Flow Guided Deformable Frame Prediction Network
M. Akın Yılmaz
Ahmet Bilican
A. Murat Tekalp
248
0
0
14 Mar 2025
Continuous Video Process: Modeling Videos as Continuous Multi-Dimensional Processes for Video Prediction
Gaurav Shrivastava
Abhinav Shrivastava
VGen, DiffM
274
0
0
06 Dec 2024
COIN: Control-Inpainting Diffusion Prior for Human and Camera Motion Estimation
European Conference on Computer Vision (ECCV), 2024
Jiefeng Li
Ye Yuan
Davis Rempe
Haotian Zhang
Pavlo Molchanov
Cewu Lu
Jan Kautz
Umar Iqbal
DiffM, VGen
267
8
0
29 Aug 2024
Enhancing Bandwidth Efficiency for Video Motion Transfer Applications using Deep Learning Based Keypoint Prediction
Xue Bai
Tasmiah Haque
S. Mohan
Yuliang Cai
Byungheon Jeong
Adam Halasz
Srinjoy Das
214
1
0
17 Mar 2024
Predictive Temporal Attention on Event-based Video Stream for Energy-efficient Situation Awareness
Yiming Bu
Jiayang Liu
Qinru Qiu
163
2
0
14 Feb 2024
Modeling Spatio-temporal Dynamical Systems with Neural Discrete Learning and Levels-of-Experts
IEEE Transactions on Knowledge and Data Engineering (TKDE), 2024
Kun Wang
Hao Wu
Guibin Zhang
Cunchun Li
Yuxuan Liang
Yuankai Wu
Roger Zimmermann
Yang Wang
186
18
0
06 Feb 2024
SFGANS Self-supervised Future Generator for human ActioN Segmentation
Or Berman
Adam Goldbraikh
S. Laufer
241
0
0
31 Dec 2023
HMP: Hand Motion Priors for Pose and Shape Estimation from Video
Enes Duran
Muhammed Kocabas
Vasileios Choutas
Zicong Fan
Michael J. Black
3DH
183
15
0
27 Dec 2023
Earthfarseer: Versatile Spatio-Temporal Dynamical Systems Modeling in One Model
Hao Wu
Yuxuan Liang
Wei Xiong
Zhengyang Zhou
Wei-Ming Huang
Shilong Wang
Kun Wang
AI4TS
427
19
0
13 Dec 2023
PACE: Human and Camera Motion Estimation from in-the-wild Videos
Muhammed Kocabas
Ye Yuan
Pavlo Molchanov
Yunrong Guo
Michael J. Black
Otmar Hilliges
Jan Kautz
Umar Iqbal
3DH
216
32
0
20 Oct 2023
Predicting Future Spatiotemporal Occupancy Grids with Semantics for Autonomous Driving
Maneekwan Toyungyernsub
Esen Yel
Jiachen Li
Mykel J. Kochenderfer
193
4
0
03 Oct 2023
Dancing Avatar: Pose and Text-Guided Human Motion Videos Synthesis with Image Diffusion Model
Bosheng Qin
Wentao Ye
Qifan Yu
Siliang Tang
Yueting Zhuang
DiffM, VGen
162
18
0
15 Aug 2023
Does Unpredictability Influence Driving Behavior?
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023
Sepehr Samavi
Florian Shkurti
Angela P. Schoellig
128
1
0
28 Jul 2023
DDLP: Unsupervised Object-Centric Video Prediction with Deep Dynamic Latent Particles
Tal Daniel
Aviv Tamar
DiffM
254
13
0
09 Jun 2023
Putting People in Their Place: Affordance-Aware Human Insertion into Scenes
Computer Vision and Pattern Recognition (CVPR), 2023
Sumith Kulal
Tim Brooks
A. Aiken
Jiajun Wu
Jimei Yang
Jingwan Lu
Alexei A. Efros
Krishna Kumar Singh
DiffM
177
56
0
27 Apr 2023
Combining Vision and Tactile Sensation for Video Prediction
Willow Mandil
Amir M. Ghalamzan-E.
110
4
0
21 Apr 2023
Prior based Sampling for Adaptive LiDAR
Amit Shomer
S. Avidan
3DV, 3DPC, MDE
247
1
0
14 Apr 2023
Model-Based Reinforcement Learning with Isolated Imaginations
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Minting Pan
Geng Chen
Yitao Zheng
Yunbo Wang
Xiaokang Yang
338
3
0
27 Mar 2023
Towards End-to-End Generative Modeling of Long Videos with Memory-Efficient Bidirectional Transformers
Computer Vision and Pattern Recognition (CVPR), 2023
Jaehoon Yoo
Semin Kim
Doyup Lee
Chiheon Kim
Seunghoon Hong
225
6
0
20 Mar 2023
Implicit Stacked Autoregressive Model for Video Prediction
Min-seok Seo
Hakjin Lee
Do-Yeon Kim
Junghoon Seo
VGen
143
20
0
14 Mar 2023
Locomotion-Action-Manipulation: Synthesizing Human-Scene Interactions in Complex 3D Environments
IEEE International Conference on Computer Vision (ICCV), 2023
Jiye Lee
Hanbyul Joo
336
51
0
09 Jan 2023
Motion and Context-Aware Audio-Visual Conditioned Video Prediction
British Machine Vision Conference (BMVC), 2022
Yating Xu
Conghui Hu
G. Lee
VGen
382
1
0
09 Dec 2022
MIMO Is All You Need : A Strong Multi-In-Multi-Out Baseline for Video Prediction
Shuliang Ning
Mengcheng Lan
Yanran Li
Chaofeng Chen
Qian Chen
Xunlai Chen
Xiaoguang Han
Shuguang Cui
203
28
0
09 Dec 2022
SimVP: Towards Simple yet Powerful Spatiotemporal Predictive Learning
IEEE Transactions on Multimedia (IEEE TMM), 2022
Cheng Tan
Zhangyang Gao
Siyuan Li
Stan Z. Li
VLM, AI4TS
273
40
0
22 Nov 2022
Autoregressive GAN for Semantic Unconditional Head Motion Generation
Louis Airale
Xavier Alameda-Pineda
Stéphane Lathuilière
Dominique Vaufreydaz
217
4
0
02 Nov 2022
SlotFormer: Unsupervised Visual Dynamics Simulation with Object-Centric Models
International Conference on Learning Representations (ICLR), 2022
Ziyi Wu
Nikita Dvornik
Klaus Greff
Thomas Kipf
Animesh Garg
OCL, BDL
362
116
0
12 Oct 2022
Hierarchical Capsule Prediction Network for Marketing Campaigns Effect
International Conference on Information and Knowledge Management (CIKM), 2022
Zhixuan Chu
Hui Ding
Guang Zeng
Yuchen Huang
T. Yan
Yulin Kang
Sheng Li
156
9
0
22 Aug 2022
A new way of video compression via forward-referencing using deep learning
S. Rajin
M. Murshed
M. Paul
S. Teng
J. Ma
82
0
0
13 Aug 2022
Large-scale Knowledge Distillation with Elastic Heterogeneous Computing Resources
Concurrency and Computation (CCPE), 2022
Ji Liu
Daxiang Dong
Xi Wang
An Qin
Xingjian Li
P. Valduriez
Dejing Dou
Dianhai Yu
185
8
0
14 Jul 2022
Snipper: A Spatiotemporal Transformer for Simultaneous Multi-Person 3D Pose Estimation Tracking and Forecasting on a Video Snippet
Shihao Zou
Yuanlu Xu
Chao Li
Lingni Ma
Li Cheng
Minh Vo
266
17
0
09 Jul 2022
SimVP: Simpler yet Better Video Prediction
Computer Vision and Pattern Recognition (CVPR), 2022
Zhangyang Gao
Cheng Tan
Lirong Wu
Stan Z. Li
372
329
0
09 Jun 2022
Patch-based Object-centric Transformers for Efficient Video Generation
Wilson Yan
Ryogo Okumura
Stephen James
Pieter Abbeel
DiffM, ViT
269
6
0
08 Jun 2022
FlexLip: A Controllable Text-to-Lip System
Italian National Conference on Sensors (INS), 2022
Dan Oneaţă
Beáta Lőrincz
Adriana Stan
H. Cucu
151
5
0
07 Jun 2022
Cascaded Video Generation for Videos In-the-Wild
International Conference on Pattern Recognition (ICPR), 2022
Lluis Castrejon
Nicolas Ballas
Aaron Courville
VGen
191
0
0
01 Jun 2022
Iso-Dream: Isolating and Leveraging Noncontrollable Visual Dynamics in World Models
Neural Information Processing Systems (NeurIPS), 2022
Minting Pan
Geng Chen
Yunbo Wang
Xiaokang Yang
344
53
0
27 May 2022
Future Object Detection with Spatiotemporal Transformers
Adam Tonderski
Joakim Johnander
Christoffer Petersson
Kalle AAstrom
ViT
186
1
0
21 Apr 2022
When Physics Meets Machine Learning: A Survey of Physics-Informed Machine Learning
Chuizheng Meng
Sungyong Seo
Defu Cao
Sam Griesemer
Yan Liu
PINN, AI4CE
282
128
0
31 Mar 2022
Stochastic Video Prediction with Structure and Motion
Adil Kaan Akan
Sadra Safadoust
Fatma Guney
VGen
176
10
0
20 Mar 2022
Transframer: Arbitrary Frame Prediction with Generative Models
C. Nash
João Carreira
Jacob Walker
Iain Barr
Andrew Jaegle
Mateusz Malinowski
Peter W. Battaglia
ViT
277
44
0
17 Mar 2022
MSPred: Video Prediction at Multiple Spatio-Temporal Scales with Hierarchical Recurrent Networks
British Machine Vision Conference (BMVC), 2022
Angel Villar-Corrales
Ani J. Karapetyan
Andreas Boltres
Sven Behnke
375
12
0
17 Mar 2022
Show Me What and Tell Me How: Video Synthesis via Multimodal Conditioning
Computer Vision and Pattern Recognition (CVPR), 2022
Ligong Han
Jian Ren
Hsin-Ying Lee
Francesco Barbieri
Kyle Olszewski
Shervin Minaee
Dimitris N. Metaxas
Sergey Tulyakov
DiffM, VGen
224
45
0
04 Mar 2022
Filtered-CoPhy: Unsupervised Learning of Counterfactual Physics in Pixel Space
International Conference on Learning Representations (ICLR), 2022
Steeven Janny
Fabien Baradel
Natalia Neverova
M. Nadri
Greg Mori
Christian Wolf
CML
209
17
0
01 Feb 2022
Autoencoding Video Latents for Adversarial Video Generation
Sai Hemanth Kasaraneni
VGen
127
3
0
18 Jan 2022
Image Animation with Keypoint Mask
Or Toledano
Yanir Marmor
Dov Gertz
VGen
117
2
0
20 Dec 2021
A Hierarchical Spatio-Temporal Graph Convolutional Neural Network for Anomaly Detection in Videos
Xianling Zeng
Yalong Jiang
Wenrui Ding
Hongguang Li
Yafeng Hao
Zifeng Qiu
219
75
0
08 Dec 2021
GLAMR: Global Occlusion-Aware Human Mesh Recovery with Dynamic Cameras
Ye Yuan
Umar Iqbal
Pavlo Molchanov
Kris Kitani
Jan Kautz
3DH
322
153
0
02 Dec 2021
Layered Controllable Video Generation
Jiahui Huang
Yuhe Jin
K. M. Yi
Leonid Sigal
VGen
395
12
0
24 Nov 2021