StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation

Neural Information Processing Systems (NeurIPS), 2024
2 May 2024
Yupeng Zhou
Daquan Zhou
Ming-Ming Cheng
Jiashi Feng
Qibin Hou
DiffM, VGen
arXiv: 2405.01434 (abs, PDF, HTML) | HuggingFace (57 upvotes) | GitHub (6293★)

Papers citing "StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation"

50 / 67 papers shown
Title
Planning with Sketch-Guided Verification for Physics-Aware Video Generation
Yidong Huang
Zun Wang
Han Lin
Dong-Ki Kim
Shayegan Omidshafiei
Jaehong Yoon
Yue Zhang
Mohit Bansal
VGen
137
0
0
21 Nov 2025
Driving scenario generation and evaluation using a structured layer representation and foundational models
Arthur Hubert
Gamal Elghazaly
R. Frank
76
0
0
03 Nov 2025
BachVid: Training-Free Video Generation with Consistent Background and Character
Han Yan
Xibin Song
Yifu Wang
Hongdong Li
Pan Ji
Chao Ma
DiffM, VGen
108
0
0
24 Oct 2025
HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives
Yihao Meng
Hao Ouyang
Yue Yu
Qiuyu Wang
Wen Wang
...
Yixuan Li
Cheng Chen
Yanhong Zeng
Yujun Shen
Huamin Qu
VGen
104
2
0
23 Oct 2025
When and Where do Events Switch in Multi-Event Video Generation?
Ruotong Liao
Guowen Huang
Qing Cheng
Thomas Seidl
Daniel Cremers
Volker Tresp
DiffM, VGen
172
0
0
03 Oct 2025
Stitch: Training-Free Position Control in Multimodal Diffusion Transformers
Jessica Bader
Mateusz Pach
Maria A. Bravo
Serge Belongie
Zeynep Akata
96
1
0
30 Sep 2025
Plot'n Polish: Zero-shot Story Visualization and Disentangled Editing with Text-to-Image Diffusion Models
Kiymet Akdemir
Jing Shi
Kushal Kafle
Brian L. Price
Pinar Yanardag
DiffM
87
0
0
04 Sep 2025
Seeing Clearly, Forgetting Deeply: Revisiting Fine-Tuned Video Generators for Driving Simulation
Chun-Peng Chang
Chen-Yu Wang
Julian Schmidt
Holger Caesar
A. Pagani
VGen
199
1
0
22 Aug 2025
Ouroboros: Single-step Diffusion Models for Cycle-consistent Forward and Inverse Rendering
Shanlin Sun
Yifan Wang
Hanwen Zhang
Yifeng Xiong
Qin Ren
Ruogu Fang
Xiaohui Xie
Chenyu You
126
2
0
20 Aug 2025
PersonaVlog: Personalized Multimodal Vlog Generation with Multi-Agent Collaboration and Iterative Self-Correction
Xiaolu Hou
Bing Ma
Jiaxiang Cheng
Xuhua Ren
Kai Yu
Wenyue Li
Tianxiang Zheng
Qinglin Lu
DiffM, VGen
72
0
0
19 Aug 2025
CineTrans: Learning to Generate Videos with Cinematic Transitions via Masked Diffusion Models
Xiaoxue Wu
Bingjie Gao
Yu Qiao
Yaohui Wang
Xinyuan Chen
DiffM, VGen
141
5
0
15 Aug 2025
CoreEditor: Consistent 3D Editing via Correspondence-constrained Diffusion
Zhe Zhu
Honghua Chen
Peng Li
Mingqiang Wei
DiffM
89
1
0
15 Aug 2025
Gen-AFFECT: Generation of Avatar Fine-grained Facial Expressions with Consistent identiTy
Hao Yu
Rupayan Mallick
Margrit Betke
Sarah Adel Bargal
DiffM
58
0
0
13 Aug 2025
Lay2Story: Extending Diffusion Transformers for Layout-Togglable Story Generation
Ao Ma
Jiasong Feng
Ke Cao
Jing Wang
Yun Wang
Quanwei Zhang
Zhanjie Zhang
DiffM, VGen
126
4
0
12 Aug 2025
Exploring Multimodal Diffusion Transformers for Enhanced Prompt-based Image Editing
Joonghyuk Shin
Alchan Hwang
Yujin Kim
Daneul Kim
Jaesik Park
DiffM
81
2
0
11 Aug 2025
Devil is in the Detail: Towards Injecting Fine Details of Image Prompt in Image Generation via Conflict-free Guidance and Stratified Attention
Computer Vision and Pattern Recognition (CVPR), 2025
Kyungmin Jo
Jooyeol Yun
Jaegul Choo
DiffM
95
2
0
04 Aug 2025
StorySync: Training-Free Subject Consistency in Text-to-Image Generation via Region Harmonization
Gopalji Gaur
Mohammadreza Zolfaghari
Thomas Brox
DiffM
83
0
0
31 Jul 2025
Captain Cinema: Towards Short Movie Generation
Junfei Xiao
Ceyuan Yang
Lvmin Zhang
S. Cai
Yang Zhao
Yuwei Guo
Gordon Wetzstein
Maneesh Agrawala
Alan Yuille
Lu Jiang
DiffM, VGen
138
16
0
24 Jul 2025
NarrLV: Towards a Comprehensive Narrative-Centric Evaluation for Long Video Generation
X. Feng
H. Yu
M. Wu
Shuyan Hu
J. Chen
C. Zhu
J. Wu
X. Chu
K. Huang
DiffM, EGVM, VGen
373
4
0
15 Jul 2025
WAVE: Warp-Based View Guidance for Consistent Novel View Synthesis Using a Single Image
Jiwoo Park
Tae Eun Choi
Youngjun Jun
Seong Jae Hwang
DiffM
147
0
0
30 Jun 2025
OmniVCus: Feedforward Subject-driven Video Customization with Multimodal Control Conditions
Yuanhao Cai
Chentao Song
Xi Chen
Jinbo Xing
Yiwei Hu
...
Tianyu Wang
Y. Zhang
Xiaokang Yang
Zhe Lin
Alan Yuille
DiffM, VGen
214
3
0
29 Jun 2025
MultiHuman-Testbench: Benchmarking Image Generation for Multiple Humans
Shubhankar Borse
Seokeon Choi
S. Park
J. Kim
Shreya Kadambi
Risheek Garrepalli
Sungrack Yun
Munawar Hayat
Fatih Porikli
EGVM, VLM
207
2
0
25 Jun 2025
EchoShot: Multi-Shot Portrait Video Generation
Jiahao Wang
Hualian Sheng
Sijia Cai
Weizhan Zhang
Caixia Yan
Yachuang Feng
Bing Deng
Jieping Ye
DiffM, VGen
147
6
0
16 Jun 2025
AniMaker: Multi-Agent Animated Storytelling with MCTS-Driven Clip Generation
Haoyuan Shi
Yunxin Li
Xinyu Chen
Longyue Wang
Baotian Hu
Min Zhang
DiffM, VGen
282
1
0
12 Jun 2025
Consistent Story Generation: Unlocking the Potential of Zigzag Sampling
Mingxiao Li
Mang Ning
Marie-Francine Moens
DiffM
344
0
0
11 Jun 2025
ViStoryBench: Comprehensive Benchmark Suite for Story Visualization
Cailin Zhuang
Ailin Huang
Wei Cheng
J. Wu
Yaoqi Hu
...
Hengyuan Xu
Xuanyang Zhang
Xianfang Zeng
Gang Yu
Fangqiu Yi
CoGe
328
10
0
30 May 2025
Frame-Level Captions for Long Video Generation with Complex Multi Scenes
Guangcong Zheng
Jianlong Yuan
Bo Wang
Haoyang Huang
Guoqing Ma
Nan Duan
DiffM, VGen
242
0
0
27 May 2025
Action2Dialogue: Generating Character-Centric Narratives from Scene-Level Prompts
Taewon Kang
Ming C. Lin
DiffM, VGen
269
0
0
22 May 2025
StoryReasoning Dataset: Using Chain-of-Thought for Scene Understanding and Grounded Story Generation
Daniel A. P. Oliveira
David Martins de Matos
VGen
189
1
0
15 May 2025
Not All Parameters Matter: Masking Diffusion Models for Enhancing Generation Ability
Computer Vision and Pattern Recognition (CVPR), 2025
Liwen Wang
Senmao Li
Fei Yang
Jianye Wang
Ziheng Zhang
Wenshu Fan
Yijiao Wang
Jian Yang
DiffM
317
2
0
06 May 2025
CineVerse: Consistent Keyframe Synthesis for Cinematic Scene Composition
Quynh Phung
Long Mai
Fabian Caba Heilbron
Feng Liu
Jia-Bin Huang
Cusuh Ham
DiffM, VGen, CoGe
247
3
0
28 Apr 2025
StyleMe3D: Stylization with Disentangled Priors by Multiple Encoders on 3D Gaussians
Cailin Zhuang
Yaoqi Hu
Xinming Zhang
Wei Cheng
Jiacheng Bao
Shengqi Liu
Yiying Yang
Xianfang Zeng
Gang Yu
Ming Li
3DGS
274
4
0
21 Apr 2025
AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction
Junhao Cheng
Yuying Ge
Yixiao Ge
Jing Liao
Mingyu Ding
VGen, AI4CE
336
5
0
01 Apr 2025
Latent Beam Diffusion Models for Generating Visual Sequences
Guilherme Fernandes
Vasco Ramos
Regev Cohen
Idan Szpektor
João Magalhães
312
1
0
26 Mar 2025
DreamLayer: Simultaneous Multi-Layer Generation via Diffusion Model
Junjia Huang
Pengxiang Yan
Jinhang Cai
Jiyang Liu
Zhao Wang
Yitong Wang
Xinglong Wu
Guanbin Li
DiffM
198
4
0
17 Mar 2025
Personalize Anything for Free with Diffusion Transformer
Haoran Feng
Zehuan Huang
Lin Li
Hairong Lv
Lu Sheng
DiffM
302
18
0
16 Mar 2025
A Self-supervised Motion Representation for Portrait Video Generation
Qiyuan Zhang
Chenyu Wu
Wenzhang Sun
Huaize Liu
Donglin Di
Wei Chen
Changqing Zou
VGen
221
1
0
13 Mar 2025
Long-horizon Visual Instruction Generation with Logic and Attribute Self-reflection
International Conference on Learning Representations (ICLR), 2025
Yucheng Suo
Fan Ma
Kaixin Shen
Linchao Zhu
Yi Yang
VLM
324
3
0
12 Mar 2025
VRMDiff: Text-Guided Video Referring Matting Generation of Diffusion
Lehan Yang
Jincen Song
Tianlong Wang
Daiqing Qi
Weili Shi
Yuheng Liu
Sheng Li
DiffM, VOS, VGen
270
1
0
11 Mar 2025
Text2Story: Advancing Video Storytelling with Text Guidance
Taewon Kang
D. Kothandaraman
Ming C. Lin
DiffM, VGen
314
3
0
08 Mar 2025
MM-StoryAgent: Immersive Narrated Storybook Video Generation with a Multi-Agent Paradigm across Text, Image and Audio
Xuenan Xu
Jiahao Mei
Chenliang Li
Yuning Wu
Ming Yan
Shaopeng Lai
J.N. Zhang
Mengyue Wu
VGen, LLMAG
232
15
0
07 Mar 2025
How to Move Your Dragon: Text-to-Motion Synthesis for Large-Vocabulary Objects
Wonkwang Lee
Jongwon Jeong
Taehong Moon
Hyeon-Jong Kim
Jaehyeon Kim
Gunhee Kim
Byeong-Uk Lee
DiffM
365
2
0
06 Mar 2025
VisAgent: Narrative-Preserving Story Visualization Framework
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Seungkwon Kim
GyuTae Park
Sangyeon Kim
Seung-Hun Nam
199
2
0
04 Mar 2025
Dynamic Concepts Personalization from Single Videos
Rameen Abdal
Or Patashnik
Ivan Skorokhodov
Willi Menapace
Aliaksandr Siarohin
Sergey Tulyakov
Daniel Cohen-Or
Kfir Aberman
DiffM, VGen
246
6
0
21 Feb 2025
VideoAuteur: Towards Long Narrative Video Generation
Junfei Xiao
Feng Cheng
Lu Qi
Liangke Gui
Jiepeng Cen
Zhibei Ma
Yaoyao Liu
Lu Jiang
VGen
323
7
0
10 Jan 2025
Multi-subject Open-set Personalization in Video Generation
Computer Vision and Pattern Recognition (CVPR), 2025
Tsai-Shien Chen
Aliaksandr Siarohin
Willi Menapace
Yuwei Fang
Kwot Sin Lee
Ivan Skorokhodov
Kfir Aberman
Jun-Yan Zhu
Ming-Hsuan Yang
Sergey Tulyakov
DiffM, VGen
500
36
0
10 Jan 2025
Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos
Haobo Yuan
Xianrui Li
Tao Zhang
Zilong Huang
Shilin Xu
...
Yunhai Tong
Lu Qi
Jiashi Feng
Ming-Hsuan Yang
VLM
494
68
0
07 Jan 2025
Towards Precise Scaling Laws for Video Diffusion Transformers
Computer Vision and Pattern Recognition (CVPR), 2024
Yuanyang Yin
Yaqi Zhao
Mingwu Zheng
Ke Lin
Jiarong Ou
...
Pengfei Wan
Di Zhang
Baoqun Yin
Wentao Zhang
Kun Gai
341
9
0
03 Jan 2025
Enhancing Long Video Generation Consistency without Tuning
Xingyao Li
Fengzhuo Zhang
Jiachun Pan
Yunlong Hou
Vincent Y. F. Tan
Zhuoran Yang
DiffM, VGen
270
0
0
23 Dec 2024
Video Diffusion Transformers are In-Context Learners
Zhengcong Fei
Di Qiu
Changqian Yu
Debang Li
Mingyuan Fan
VGen, DiffM
763
7
0
14 Dec 2024