arXiv: 2407.08683
SEED-Story: Multimodal Long Story Generation with Large Language Model

11 July 2024
Shuai Yang
Yuying Ge
Yang Li
Yukang Chen
Yixiao Ge
Mingyu Ding
Yingcong Chen
    VGen DiffM
ArXiv (abs) | PDF | HTML | HuggingFace (26 upvotes)

Papers citing "SEED-Story: Multimodal Long Story Generation with Large Language Model"

40 / 40 papers shown
NAMeGEn: Creative Name Generation via A Novel Agent-based Multiple Personalized Goal Enhancement Framework
Shanlin Zhou
Xinpeng Wang
Jianxun Lian
Zhenghao Liu
L. Lakshmanan
Xiaoyuan Yi
Yongtao Hao
LLMAG
346
0
0
19 Nov 2025
DynaKV: Enabling Accurate and Efficient Long-Sequence LLM Decoding on Smartphones
Tuowei Wang
Minxing Huang
Fengzu Li
Ligeng Chen
Jinrui Zhang
Ju Ren
187
1
0
20 Oct 2025
LongLive: Real-time Interactive Long Video Generation
Shuai Yang
Wei Huang
Ruihang Chu
Yicheng Xiao
Yuyang Zhao
...
Enze Xie
Yihao Chen
Yao Lu
Song Han
Yukang Chen
DiffM VGen VLM
241
30
0
26 Sep 2025
Plot'n Polish: Zero-shot Story Visualization and Disentangled Editing with Text-to-Image Diffusion Models
Kiymet Akdemir
Jing Shi
Kushal Kafle
Brian L. Price
Pinar Yanardag
DiffM
129
0
0
04 Sep 2025
SpotEdit: Evaluating Visually-Guided Image Editing Methods
Sara Ghazanfari
Wei-An Lin
Haitong Tian
Ersin Yumer
DiffM
140
0
0
25 Aug 2025
FlexMUSE: Multimodal Unification and Semantics Enhancement Framework with Flexible interaction for Creative Writing
Jiahao Chen
Zhiyong Ma
Wenbiao Du
Qingyuan Chuai
87
1
0
22 Aug 2025
Story2Board: A Training-Free Approach for Expressive Storyboard Generation
David Dinkevich
Matan Levy
Omri Avrahami
Dvir Samuel
Dani Lischinski
DiffM
93
4
0
13 Aug 2025
Lay2Story: Extending Diffusion Transformers for Layout-Togglable Story Generation
Ao Ma
Jiasong Feng
Ke Cao
Jing Wang
Yun Wang
Quanwei Zhang
Zhanjie Zhang
DiffM VGen
157
5
0
12 Aug 2025
StorySync: Training-Free Subject Consistency in Text-to-Image Generation via Region Harmonization
Gopalji Gaur
Mohammadreza Zolfaghari
Thomas Brox
DiffM
152
0
0
31 Jul 2025
Aether Weaver: Multimodal Affective Narrative Co-Generation with Dynamic Scene Graphs
Saeed Ghorbani
VGen
148
0
0
29 Jul 2025
Captain Cinema: Towards Short Movie Generation
Junfei Xiao
Ceyuan Yang
Lvmin Zhang
S. Cai
Yang Zhao
Yuwei Guo
Gordon Wetzstein
Maneesh Agrawala
Alan Yuille
Lu Jiang
DiffM VGen
175
20
0
24 Jul 2025
Pushing the Limits of Safety: A Technical Report on the ATLAS Challenge 2025
Zonghao Ying
Siyang Wu
Run Hao
Peng Ying
Shixuan Sun
...
Xianglong Liu
Dawn Song
Yaoyao Liu
Juil Sock
Dacheng Tao
268
10
0
14 Jun 2025
A High-Quality Dataset and Reliable Evaluation for Interleaved Image-Text Generation
Yukang Feng
Jianwen Sun
Chuanhao Li
Zizhen Li
Jiaxin Ai
...
Yifan Chang
Sizhuo Zhou
Shenglin Zhang
Yu Dai
Kaipeng Zhang
MLLM EGVM
302
0
0
11 Jun 2025
ViStoryBench: Comprehensive Benchmark Suite for Story Visualization
Cailin Zhuang
Ailin Huang
Wei Cheng
J. Wu
Yaoqi Hu
...
Hengyuan Xu
Xuanyang Zhang
Xianfang Zeng
Gang Yu
Fangqiu Yi
CoGe
478
12
0
30 May 2025
Action2Dialogue: Generating Character-Centric Narratives from Scene-Level Prompts
Taewon Kang
Ming C. Lin
DiffM VGen
389
1
0
22 May 2025
CineVerse: Consistent Keyframe Synthesis for Cinematic Scene Composition
Quynh Phung
Long Mai
Fabian Caba Heilbron
Feng Liu
Jia-Bin Huang
Cusuh Ham
DiffM VGen CoGe
295
4
0
28 Apr 2025
MDK12-Bench: A Multi-Discipline Benchmark for Evaluating Reasoning in Multimodal Large Language Models
Pengfei Zhou
Fanrui Zhang
Xiaopeng Peng
Zhaopan Xu
Jiaxin Ai
...
Xiaojiang Peng
Xiaojun Chang
Wenqi Shao
Yang You
Jianchao Tan
ELM LRM
277
6
0
08 Apr 2025
Storybooth: Training-free Multi-Subject Consistency for Improved Visual Storytelling
International Conference on Learning Representations (ICLR), 2025
Jaskirat Singh
Junshen Kevin Chen
Jonas Kohler
Michael Cohen
DiffM VGen
247
2
0
08 Apr 2025
AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction
Junhao Cheng
Yuying Ge
Yixiao Ge
Jing Liao
Mingyu Ding
VGen AI4CE
418
5
0
01 Apr 2025
SCORE: Story Coherence and Retrieval Enhancement for AI Narratives
Qiang Yi
Yangfan He
Jing Wang
Xinyuan Song
Shiyao Qian
...
Menghao Huo
Kuan Lu
Jiaqi Chen
Lewei He
Tianyu Shi
RALM
723
63
0
30 Mar 2025
Unified Dense Prediction of Video Diffusion
Computer Vision and Pattern Recognition (CVPR), 2025
Lehan Yang
Lu Qi
Xianrui Li
Sheng Li
Varun Jampani
Ming-Hsuan Yang
MDE VOS VGen
368
6
0
12 Mar 2025
VisAgent: Narrative-Preserving Story Visualization Framework
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Seungkwon Kim
GyuTae Park
Sangyeon Kim
Seung-Hun Nam
267
2
0
04 Mar 2025
UniMoD: Efficient Unified Multimodal Transformers with Mixture-of-Depths
Weijia Mao
Zhiyong Yang
Mike Zheng Shou
MoE
702
2
0
10 Feb 2025
VideoAuteur: Towards Long Narrative Video Generation
Junfei Xiao
Feng Cheng
Lu Qi
Liangke Gui
Jiepeng Cen
Zhibei Ma
Yaoyao Liu
Lu Jiang
VGen
391
7
0
10 Jan 2025
Generative AI for Cel-Animation: A Survey
Yunlong Tang
Junjia Guo
Pinxin Liu
Zhiyuan Wang
Hang Hua
...
Jing Bi
Mingqian Feng
Xuzhao Li
Zeliang Zhang
Chenliang Xu
VGen
699
17
0
08 Jan 2025
Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos
Haobo Yuan
Xianrui Li
Tao Zhang
Zilong Huang
Shilin Xu
...
Yunhai Tong
Lu Qi
Jiashi Feng
Ming-Hsuan Yang
VLM
608
68
0
07 Jan 2025
IDEA-Bench: How Far are Generative Models from Professional Designing?
Computer Vision and Pattern Recognition (CVPR), 2024
C. Liang
Lianghua Huang
Jingwu Fang
Huanzhang Dou
Wei Wang
Zhi-Fan Wu
Yupeng Shi
Junge Zhang
Xin Zhao
Yu Liu
3DV
305
4
0
16 Dec 2024
SpearBot: Leveraging Large Language Models in a Generative-Critique Framework for Spear-Phishing Email Generation
Information Fusion (Inf. Fusion), 2024
Qinglin Qi
Yun Luo
Yijia Xu
Wenbo Guo
Yong Fang
AAML
269
11
0
15 Dec 2024
Olympus: A Universal Task Router for Computer Vision Tasks
Computer Vision and Pattern Recognition (CVPR), 2024
Yuanze Lin
Yunsheng Li
Dongdong Chen
Weijian Xu
Ronald Clark
Juil Sock
VLM ObjD
1.2K
2
0
12 Dec 2024
Orthus: Autoregressive Interleaved Image-Text Generation with Modality-Specific Heads
Siqi Kou
Jiachun Jin
Chang Liu
Ye Ma
Jian Jia
Quan Chen
Peng Jiang
Zhijie Deng
DiffM VGen VLM
604
28
0
28 Nov 2024
VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection
Computer Vision and Pattern Recognition (CVPR), 2024
Songhao Han
Wei Huang
Hairong Shi
Le Zhuo
Xiu Su
Shifeng Zhang
Xu Zhou
Xiaojuan Qi
Yue Liao
Si Liu
VGen LRM
271
52
0
22 Nov 2024
Autoregressive Models in Vision: A Survey
Jing Xiong
Gongye Liu
Lun Huang
Chengyue Wu
Taiqiang Wu
...
Hao Fei
Guillermo Sapiro
Jiebo Luo
Ping Luo
Ngai Wong
VGen
489
38
0
08 Nov 2024
MaskControl: Spatio-Temporal Control for Masked Motion Synthesis
Ekkasit Pinyoanuntapong
Muhammad Usama Saleem
Korrawe Karunratanakul
Pu Wang
Hongfei Xue
Chong Chen
Chuan Guo
Junli Cao
J. Ren
Sergey Tulyakov
VGen
467
83
0
14 Oct 2024
Trans4D: Realistic Geometry-Aware Transition for Compositional Text-to-4D Synthesis
Bohan Zeng
Ling Yang
Siyu Li
Jiaming Liu
Zixiang Zhang
...
Yongzhen Guo
Fu-Yun Wang
Minkai Xu
Stefano Ermon
Wentao Zhang
VGen AI4CE
208
17
0
09 Oct 2024
ACDC: Autoregressive Coherent Multimodal Generation using Diffusion Correction
Hyungjin Chung
Dohun Lee
Jong Chul Ye
VGen DiffM
195
2
0
07 Oct 2024
A Survey on Multimodal Benchmarks: In the Era of Large AI Models
Lin Li
Guikun Chen
Hanrong Shi
Jun Xiao
Long Chen
343
23
0
21 Sep 2024
SkyScript-100M: 1,000,000,000 Pairs of Scripts and Shooting Scripts for Short Drama
Jing Tang
Quanlu Jia
Yuqiang Xie
Zeyu Gong
Xiang Wen
Jiayi Zhang
Yalong Guo
Guibin Chen
Jiangping Yang
VGen
238
2
0
18 Aug 2024
Domain-invariant Representation Learning via Segment Anything Model for Blood Cell Classification
Yongcheng Li
Lingcong Cai
Ying Lu
Cheng Lin
Yupeng Zhang
...
Genan Dai
Bowen Zhang
Jingzhou Cao
Xiangzhong Zhang
Xiaomao Fan
254
1
0
14 Aug 2024
Chameleon: Mixed-Modal Early-Fusion Foundation Models
Chameleon Team
MLLM
580
629
0
16 May 2024
StoryGPT-V: Large Language Models as Consistent Story Visualizers
Computer Vision and Pattern Recognition (CVPR), 2023
Xiaoqian Shen
Mohamed Elhoseiny
VLM
446
19
0
04 Dec 2023