ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2305.12328
  4. Cited By
InstructVid2Vid: Controllable Video Editing with Natural Language
  Instructions

InstructVid2Vid: Controllable Video Editing with Natural Language Instructions

21 May 2023
Bosheng Qin
Juncheng Li
Siliang Tang
Tat-Seng Chua
Yueting Zhuang
    VGen
    DiffM
ArXivPDFHTML

Papers citing "InstructVid2Vid: Controllable Video Editing with Natural Language Instructions"

8 / 8 papers shown
Title
VIDiff: Translating Videos via Multi-Modal Instructions with Diffusion
  Models
VIDiff: Translating Videos via Multi-Modal Instructions with Diffusion Models
Zhen Xing
Qi Dai
Zihao Zhang
Hui Zhang
Hang-Rui Hu
Zuxuan Wu
Yu-Gang Jiang
VGen
30
17
0
30 Nov 2023
Testing the Depth of ChatGPT's Comprehension via Cross-Modal Tasks Based
  on ASCII-Art: GPT3.5's Abilities in Regard to Recognizing and Generating
  ASCII-Art Are Not Totally Lacking
Testing the Depth of ChatGPT's Comprehension via Cross-Modal Tasks Based on ASCII-Art: GPT3.5's Abilities in Regard to Recognizing and Generating ASCII-Art Are Not Totally Lacking
David Bayani
MLLM
26
5
0
28 Jul 2023
Edit Everything: A Text-Guided Generative System for Images Editing
Edit Everything: A Text-Guided Generative System for Images Editing
Defeng Xie
Ruichen Wang
Jiancang Ma
Chen Chen
H. Lu
D. Yang
Fobo Shi
Xiaodong Lin
DiffM
80
31
0
27 Apr 2023
Edit-A-Video: Single Video Editing with Object-Aware Consistency
Edit-A-Video: Single Video Editing with Object-Aware Consistency
Chaehun Shin
Heeseung Kim
Che Hyun Lee
Sang-gil Lee
Sung-Hoon Yoon
DiffM
VGen
111
50
0
14 Mar 2023
Video-P2P: Video Editing with Cross-attention Control
Video-P2P: Video Editing with Cross-attention Control
Shaoteng Liu
Yuechen Zhang
Wenbo Li
Zhe-nan Lin
Jiaya Jia
DiffM
VGen
133
202
0
08 Mar 2023
CogVideo: Large-scale Pretraining for Text-to-Video Generation via
  Transformers
CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers
Wenyi Hong
Ming Ding
Wendi Zheng
Xinghan Liu
Jie Tang
DiffM
243
556
0
29 May 2022
Language Models with Image Descriptors are Strong Few-Shot
  Video-Language Learners
Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners
Zhenhailong Wang
Manling Li
Ruochen Xu
Luowei Zhou
Jie Lei
...
Chenguang Zhu
Derek Hoiem
Shih-Fu Chang
Mohit Bansal
Heng Ji
MLLM
VLM
167
134
0
22 May 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified
  Vision-Language Understanding and Generation
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
388
4,010
0
28 Jan 2022
1