Tell Me What Happened: Unifying Text-guided Video Completion via Multimodal Masked Video Generation

23 November 2022
Tsu-jui Fu
Licheng Yu
Ning Zhang
Cheng-Yang Fu
Jong-Chyi Su
William Yang Wang
Sean Bell
    VGen

Papers citing "Tell Me What Happened: Unifying Text-guided Video Completion via Multimodal Masked Video Generation"

10 / 10 papers shown
Masked Image Modeling: A Survey
Vlad Hondru
Florinel-Alin Croitoru
Shervin Minaee
Radu Tudor Ionescu
N. Sebe
59
6
0
13 Aug 2024
Cross-Modal Prototype based Multimodal Federated Learning under Severely Missing Modality
Huy Q. Le
Chu Myaet Thwal
Yu Qiao
Ye Lin Tun
Minh N. H. Nguyen
Choong Seon Hong
60
4
0
25 Jan 2024
MaskViT: Masked Visual Pre-Training for Video Prediction
Agrim Gupta
Stephen Tian
Yunzhi Zhang
Jiajun Wu
Roberto Martín-Martín
Li Fei-Fei
100
110
0
23 Jun 2022
CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers
Wenyi Hong
Ming Ding
Wendi Zheng
Xinghan Liu
Jie Tang
DiffM
243
556
0
29 May 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
258
7,337
0
11 Nov 2021
VideoGPT: Video Generation using VQ-VAE and Transformers
Wilson Yan
Yunzhi Zhang
Pieter Abbeel
A. Srinivas
ViT
VGen
242
482
0
20 Apr 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
253
4,735
0
24 Feb 2021
Real-time Localized Photorealistic Video Style Transfer
Xide Xia
Tianfan Xue
Wei-Sheng Lai
Zheng Sun
Abby Chang
Brian Kulis
Jiawen Chen
43
30
0
20 Oct 2020
Learning to Decompose and Disentangle Representations for Video Prediction
Jun-Ting Hsieh
Bingbin Liu
De-An Huang
Li Fei-Fei
Juan Carlos Niebles
DRL
127
302
0
11 Jun 2018
Imagine This! Scripts to Compositions to Videos
Tanmay Gupta
Dustin Schwenk
Ali Farhadi
Derek Hoiem
Aniruddha Kembhavi
CoGe
VGen
109
87
0
10 Apr 2018