2311.18837
VIDiff: Translating Videos via Multi-Modal Instructions with Diffusion Models
30 November 2023
Zhen Xing
Qi Dai
Zihao Zhang
Hui Zhang
Hang-Rui Hu
Zuxuan Wu
Yu-Gang Jiang
VGen
Papers citing
"VIDiff: Translating Videos via Multi-Modal Instructions with Diffusion Models"
21 / 21 papers shown
Aligning Generative Denoising with Discriminative Objectives Unleashes Diffusion for Visual Perception
Ziqi Pang
Xin Xu
Yu-Xiong Wang
DiffM
57
0
0
15 Apr 2025
MagicMotion: Controllable Video Generation with Dense-to-Sparse Trajectory Guidance
Quanhao Li
Zhen Xing
Rui Wang
Hui Zhang
Qi Dai
Zuxuan Wu
VGen
61
0
0
20 Mar 2025
Human2Robot: Learning Robot Actions from Paired Human-Robot Videos
Sicheng Xie
Haidong Cao
Zejia Weng
Zhen Xing
Shiwei Shen
Jiaqi Leng
Xipeng Qiu
Yanwei Fu
Zuxuan Wu
Yu Jiang
45
0
0
23 Feb 2025
StableAnimator: High-Quality Identity-Preserving Human Image Animation
Shuyuan Tu
Zhen Xing
Xintong Han
Zhi-Qi Cheng
Qi Dai
Chong Luo
Zuxuan Wu
VGen
99
13
0
26 Nov 2024
GenRec: Unifying Video Generation and Recognition with Diffusion Models
Zejia Weng
Xitong Yang
Zhen Xing
Zuxuan Wu
Yu-Gang Jiang
VGen
DiffM
25
5
0
27 Aug 2024
AID: Adapting Image2Video Diffusion Models for Instruction-guided Video Prediction
Zhen Xing
Qi Dai
Zejia Weng
Zuxuan Wu
Yu-Gang Jiang
VGen
39
10
0
10 Jun 2024
Temporally Consistent Object Editing in Videos using Extended Attention
AmirHossein Zamani
Amir G. Aghdam
Tiberiu Popa
Eugene Belilovsky
DiffM
23
1
0
01 Jun 2024
Zero-shot High-fidelity and Pose-controllable Character Animation
Bingwen Zhu
Fanyi Wang
Tianyi Lu
Peng Liu
Jingwen Su
Jinxiu Liu
Yanhao Zhang
Zuxuan Wu
Guo-Jun Qi
Yu-Gang Jiang
DiffM
VGen
35
6
0
21 Apr 2024
InstructBrush: Learning Attention-based Instruction Optimization for Image Editing
Ruoyu Zhao
Qingnan Fan
Fei Kou
Shuai Qin
Hong Gu
Wei Wu
Pengcheng Xu
Mingrui Zhu
Nannan Wang
Xinbo Gao
25
4
0
27 Mar 2024
OmniVid: A Generative Framework for Universal Video Understanding
Junke Wang
Dongdong Chen
Chong Luo
Bo He
Lu Yuan
Zuxuan Wu
Yu-Gang Jiang
VLM
VGen
63
14
0
26 Mar 2024
FDGaussian: Fast Gaussian Splatting from Single Image via Geometric-aware Diffusion Model
Qijun Feng
Zhen Xing
Zuxuan Wu
Yu-Gang Jiang
3DGS
27
4
0
15 Mar 2024
MotionEditor: Editing Video Motion via Content-Aware Diffusion
Shuyuan Tu
Qi Dai
Zhi-Qi Cheng
Hang-Rui Hu
Xintong Han
Zuxuan Wu
Yu-Gang Jiang
DiffM
VGen
20
30
0
30 Nov 2023
A Survey on Video Diffusion Models
Zhen Xing
Qijun Feng
Haoran Chen
Qi Dai
Hang-Rui Hu
Hang Xu
Zuxuan Wu
Yu-Gang Jiang
EGVM
VGen
50
112
0
16 Oct 2023
SimDA: Simple Diffusion Adapter for Efficient Video Generation
Zhen Xing
Qi Dai
Hang-Rui Hu
Zuxuan Wu
Yu-Gang Jiang
VGen
DiffM
12
81
0
18 Aug 2023
MagicBrush: A Manually Annotated Dataset for Instruction-Guided Image Editing
Kai Zhang
Lingbo Mo
Wenhu Chen
Huan Sun
Yu-Chuan Su
EGVM
105
235
0
16 Jun 2023
VideoLLM: Modeling Video Sequence with Large Language Models
Guo Chen
Yin-Dong Zheng
Jiahao Wang
Jilan Xu
Yifei Huang
...
Yi Wang
Yali Wang
Yu Qiao
Tong Lu
Limin Wang
MLLM
92
51
0
22 May 2023
Pick-a-Pic: An Open Dataset of User Preferences for Text-to-Image Generation
Yuval Kirstain
Adam Polyak
Uriel Singer
Shahbuland Matiana
Joe Penna
Omer Levy
EGVM
160
345
0
02 May 2023
Video-P2P: Video Editing with Cross-attention Control
Shaoteng Liu
Yuechen Zhang
Wenbo Li
Zhe-nan Lin
Jiaya Jia
DiffM
VGen
133
202
0
08 Mar 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
244
4,186
0
30 Jan 2023
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
278
1,939
0
09 Feb 2021
U-Net: Convolutional Networks for Biomedical Image Segmentation
Olaf Ronneberger
Philipp Fischer
Thomas Brox
SSeg
3DV
229
74,467
0
18 May 2015