ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2406.06465
  4. Cited By
AID: Adapting Image2Video Diffusion Models for Instruction-guided Video
  Prediction

AID: Adapting Image2Video Diffusion Models for Instruction-guided Video Prediction

10 June 2024
Zhen Xing
Qi Dai
Zejia Weng
Zuxuan Wu
Yu-Gang Jiang
    VGen
ArXivPDFHTML

Papers citing "AID: Adapting Image2Video Diffusion Models for Instruction-guided Video Prediction"

13 / 13 papers shown
Title
Modeling Fine-Grained Hand-Object Dynamics for Egocentric Video Representation Learning
Baoqi Pei
Y. Huang
Jilan Xu
Guo Chen
Yuping He
...
Yali Wang
Weidi Xie
Yu Qiao
Fei Wu
Limin Wang
34
0
0
02 Mar 2025
A Physical Coherence Benchmark for Evaluating Video Generation Models via Optical Flow-guided Frame Prediction
A Physical Coherence Benchmark for Evaluating Video Generation Models via Optical Flow-guided Frame Prediction
Yongfan Chen
Xiuwen Zhu
Tianyu Li
EGVM
VGen
45
3
0
08 Feb 2025
OmniVid: A Generative Framework for Universal Video Understanding
OmniVid: A Generative Framework for Universal Video Understanding
Junke Wang
Dongdong Chen
Chong Luo
Bo He
Lu Yuan
Zuxuan Wu
Yu-Gang Jiang
VLM
VGen
60
14
0
26 Mar 2024
VSTAR: Generative Temporal Nursing for Longer Dynamic Video Synthesis
VSTAR: Generative Temporal Nursing for Longer Dynamic Video Synthesis
Yumeng Li
William H. Beluch
M. Keuper
Dan Zhang
Anna Khoreva
DiffM
VGen
68
5
0
20 Mar 2024
MicroCinema: A Divide-and-Conquer Approach for Text-to-Video Generation
MicroCinema: A Divide-and-Conquer Approach for Text-to-Video Generation
Yanhui Wang
Jianmin Bao
Wenming Weng
Ruoyu Feng
Dacheng Yin
...
Yuhui Yuan
Chuanxin Tang
Xiaoyan Sun
Chong Luo
Baining Guo
DiffM
VGen
63
14
0
30 Nov 2023
Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large
  Datasets
Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets
A. Blattmann
Tim Dockhorn
Sumith Kulal
Daniel Mendelevitch
Maciej Kilian
...
Zion English
Vikram S. Voleti
Adam Letts
Varun Jampani
Robin Rombach
VGen
150
985
0
25 Nov 2023
VideoFusion: Decomposed Diffusion Models for High-Quality Video
  Generation
VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation
Zhengxiong Luo
Dayou Chen
Yingya Zhang
Yan Huang
Liangsheng Wang
Yujun Shen
Deli Zhao
Jinren Zhou
Tien-Ping Tan
DiffM
VGen
126
215
0
15 Mar 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image
  Encoders and Large Language Models
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
244
4,186
0
30 Jan 2023
CogVideo: Large-scale Pretraining for Text-to-Video Generation via
  Transformers
CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers
Wenyi Hong
Ming Ding
Wendi Zheng
Xinghan Liu
Jie Tang
DiffM
235
556
0
29 May 2022
Bridge Data: Boosting Generalization of Robotic Skills with Cross-Domain
  Datasets
Bridge Data: Boosting Generalization of Robotic Skills with Cross-Domain Datasets
F. Ebert
Yanlai Yang
Karl Schmeckpeper
Bernadette Bucher
G. Georgakis
Kostas Daniilidis
Chelsea Finn
Sergey Levine
147
212
0
27 Sep 2021
VideoGPT: Video Generation using VQ-VAE and Transformers
VideoGPT: Video Generation using VQ-VAE and Transformers
Wilson Yan
Yunzhi Zhang
Pieter Abbeel
A. Srinivas
ViT
VGen
237
342
0
20 Apr 2021
Is Space-Time Attention All You Need for Video Understanding?
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
272
1,939
0
09 Feb 2021
Xception: Deep Learning with Depthwise Separable Convolutions
Xception: Deep Learning with Depthwise Separable Convolutions
François Chollet
MDE
BDL
PINN
190
12,781
0
07 Oct 2016
1