Free-Bloom: Zero-Shot Text-to-Video Generator with LLM Director and LDM Animator

Neural Information Processing Systems (NeurIPS), 2023
25 September 2023
Hanzhuo Huang
Yufan Feng
Cheng Shi
Lan Xu
Jingyi Yu
Sibei Yang
DiffM
VGen

Papers citing "Free-Bloom: Zero-Shot Text-to-Video Generator with LLM Director and LDM Animator"

50 / 57 papers shown
Hunyuan-GameCraft-2: Instruction-following Interactive Game World Model
J. Tang
J. Liu
Jiaqi Li
Longhuang Wu
Haoyu Yang
Penghao Zhao
Siruis Gong
Xiang Yuan
Shuai Shao
Qinglin Lu
VGen
109
1
0
28 Nov 2025
Intervene-All-Paths: Unified Mitigation of LVLM Hallucinations across Alignment Formats
Jiaye Qian
Ge Zheng
Yuchen Zhu
Sibei Yang
MLLM
265
1
0
21 Nov 2025
RISE-T2V: Rephrasing and Injecting Semantics with LLM for Expansive Text-to-Video Generation
Xiangjun Zhang
Litong Gong
Yinglin Zheng
Yansong Liu
Wentao Jiang
Mingyi Xu
Biao Wang
Tiezheng Ge
Ming Zeng
DiffM
VGen
128
1
0
06 Nov 2025
Why LVLMs Are More Prone to Hallucinations in Longer Responses: The Role of Context
Ge Zheng
Jiaye Qian
Jiajin Tang
Sibei Yang
94
2
0
23 Oct 2025
Augmenting Moment Retrieval: Zero-Dependency Two-Stage Learning
Zhengxuan Wei
Jiajin Tang
Sibei Yang
VLM
152
0
0
22 Oct 2025
Closed-Loop Transfer for Weakly-supervised Affordance Grounding
Jiajin Tang
Zhengxuan Wei
Ge Zheng
Sibei Yang
146
0
0
20 Oct 2025
Bridging Text and Video Generation: A Survey
Nilay Kumar
Priyansh Bhandari
G. Maragatham
VGen
252
0
0
06 Oct 2025
Sim-DETR: Unlock DETR for Temporal Sentence Grounding
Jiajin Tang
Zhengxuan Wei
Yuchen Zhu
Cheng Shi
Guanbin Li
Sibei Yang
PINN
292
1
0
28 Sep 2025
Communicative Agents for Slideshow Storytelling Video Generation based on LLMs
Jingxing Fan
Jinrong Shen
Yusheng Yao
Shuangqing Wang
Qian Wang
Yuling Wang
DiffM
VGen
136
0
0
01 Sep 2025
No More Sibling Rivalry: Debiasing Human-Object Interaction Detection
Bin Yang
Yulin Zhang
Hong-Yu Zhou
Sibei Yang
153
0
0
31 Aug 2025
PersonaVlog: Personalized Multimodal Vlog Generation with Multi-Agent Collaboration and Iterative Self-Correction
Xiaolu Hou
Bing Ma
Jiaxiang Cheng
Xuhua Ren
Kai Yu
Wenyue Li
Tianxiang Zheng
Qinglin Lu
DiffM
VGen
104
0
0
19 Aug 2025
Interactive Video Generation via Domain Adaptation
Ishaan Rawal
Suryansh Kumar
DiffM
VGen
148
0
0
30 May 2025
EIDT-V: Exploiting Intersections in Diffusion Trajectories for Model-Agnostic, Zero-Shot, Training-Free Text-to-Video Generation
Computer Vision and Pattern Recognition (CVPR), 2025
Diljeet Jagpal
Xi Chen
Vinay P. Namboodiri
DiffM
VGen
139
0
0
09 Apr 2025
MG-Gen: Single Image to Motion Graphics Generation
Takahiro Shirakawa
Tomoyuki Suzuki
Takuto Narumoto
Daichi Haraguchi
VGen
555
0
0
03 Apr 2025
Leveraging LLMs with Iterative Loop Structure for Enhanced Social Intelligence in Video Question Answering
Erika Mori
Yue Qiu
Hirokatsu Kataoka
Y. Aoki
276
0
0
27 Mar 2025
Learning to Animate Images from A Few Videos to Portray Delicate Human Actions
Haoxin Li
Yingchen Yu
Qilong Wu
Hanwang Zhang
Boyang Li
Song Bai
3DH
VGen
1.1K
1
0
01 Mar 2025
Visual Large Language Models for Generalized and Specialized Applications
Jiayi Zhang
Zhixin Lai
Wentao Bao
Zhen Tan
Anh Dao
Kewei Sui
Jiayi Shen
Dong Liu
Huan Liu
Yu Kong
VLM
454
33
0
06 Jan 2025
Optical Flow Representation Alignment Mamba Diffusion Model for Medical Video Generation
Zhenbin Wang
Lei Zhang
Lituan Wang
Minjuan Zhu
Zhenwei Zhang
VGen
MedIm
310
6
0
03 Nov 2024
MDSGen: Fast and Efficient Masked Diffusion Temporal-Aware Transformers for Open-Domain Sound Generation
International Conference on Learning Representations (ICLR), 2024
T. Pham
Tri Ton
Chang D. Yoo
278
8
0
03 Oct 2024
Multi-modal Generative AI: Multi-modal LLMs, Diffusions, and the Unification
X. Wang
Yuwei Zhou
Bin Huang
Hong Chen
Wenwu Zhu
DiffM
478
9
0
23 Sep 2024
Compositional 3D-aware Video Generation with LLM Director
Neural Information Processing Systems (NeurIPS), 2024
Hanxin Zhu
Tianyu He
Anni Tang
Junliang Guo
Zhibo Chen
Jiang Bian
DiffM
VGen
192
12
0
31 Aug 2024
Smoothed Energy Guidance: Guiding Diffusion Models with Reduced Energy Curvature of Attention
Neural Information Processing Systems (NeurIPS), 2024
Mengkang Hu
DiffM
263
30
0
01 Aug 2024
Adaptive Pre-training Data Detection for Large Language Models via Surprising Tokens
Anqi Zhang
Chaofeng Wu
245
8
0
30 Jul 2024
Natural Language but Omitted? On the Ineffectiveness of Large Language Models' privacy policy from End-users' Perspective
Shuning Zhang
Haobin Xing
Xin Yi
Hewu Li
PILM
297
0
0
26 Jun 2024
Rethinking Human Evaluation Protocol for Text-to-Video Models: Enhancing Reliability, Reproducibility, and Practicality
Tianle Zhang
Langtian Ma
Yuchen Yan
Yuchen Zhang
Kai Wang
...
Wenqi Shao
Yang You
Yu Qiao
Ping Luo
Kaipeng Zhang
VGen
331
4
0
13 Jun 2024
TC-Bench: Benchmarking Temporal Compositionality in Text-to-Video and Image-to-Video Generation
Weixi Feng
Jiachen Li
Michael Stephen Saxon
Tsu-Jui Fu
Wenhu Chen
William Yang Wang
EGVM
VGen
234
29
0
12 Jun 2024
AID: Adapting Image2Video Diffusion Models for Instruction-guided Video Prediction
Zhen Xing
Jingdong Sun
Zejia Weng
Zuxuan Wu
Yu-Gang Jiang
VGen
288
21
0
10 Jun 2024
Text Prompting for Multi-Concept Video Customization by Autoregressive Generation
D. Kothandaraman
Kihyuk Sohn
Ruben Villegas
P. Voigtlaender
Dinesh Manocha
Mohammad Babaeizadeh
VGen
DiffM
207
3
0
22 May 2024
Image-of-Thought Prompting for Visual Reasoning Refinement in Multimodal Large Language Models
Qiji Zhou
Ruochen Zhou
Zike Hu
Panzhong Lu
Siyang Gao
Yue Zhang
LRM
300
42
0
22 May 2024
DisenStudio: Customized Multi-subject Text-to-Video Generation with Disentangled Spatial Control
Hong Chen
Xin Wang
Yipeng Zhang
Yuwei Zhou
Zeyang Zhang
Siao Tang
Wenwu Zhu
VGen
DiffM
153
17
0
21 May 2024
TALC: Time-Aligned Captions for Multi-Scene Text-to-Video Generation
Hritik Bansal
Yonatan Bitton
Michal Yarom
Idan Szpektor
Aditya Grover
Kai-Wei Chang
DiffM
407
22
0
07 May 2024
The devil is in the object boundary: towards annotation-free instance segmentation using Foundation Models
Cheng Shi
Sibei Yang
VLM
196
8
0
18 Apr 2024
MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators
Shenghai Yuan
Jinfa Huang
Yujun Shi
Yongqi Xu
Ruijie Zhu
Bin Lin
Xinhua Cheng
Li-xin Yuan
Jiebo Luo
VGen
471
55
0
07 Apr 2024
FoC: Figure out the Cryptographic Functions in Stripped Binaries with LLMs
Guoqiang Chen
Xiuwei Shang
Shaoyin Cheng
Yanming Zhang
Weiming Zhang
Neng H. Yu
N. Yu
350
6
0
27 Mar 2024
E4C: Enhance Editability for Text-Based Image Editing by Harnessing Efficient CLIP Guidance
Tianrui Huang
Pu Cao
Chun Liu
Mengjie Hu
Zhiwei Liu
Qing-Huang Song
DiffM
184
0
0
15 Mar 2024
Action Reimagined: Text-to-Pose Video Editing for Dynamic Human Actions
Lan Wang
Vishnu Boddeti
Sernam Lim
VGen
DiffM
153
0
0
11 Mar 2024
Sora as an AGI World Model? A Complete Survey on Text-to-Video Generation
Joseph Cho
Fachrina Dewi Puspitasari
Sheng Zheng
Jingyao Zheng
Lik-Hang Lee
Tae-Ho Kim
Choong Seon Hong
Chaoning Zhang
EGVM
VGen
274
65
0
08 Mar 2024
Intelligent Director: An Automatic Framework for Dynamic Visual Composition using ChatGPT
Sixiao Zheng
Jingyang Huo
Yu Wang
Yanwei Fu
VGen
DiffM
162
1
0
24 Feb 2024
UniEdit: A Unified Tuning-Free Framework for Video Motion and Appearance Editing
Jianhong Bai
Tianyu He
Yuchi Wang
Junliang Guo
Haoji Hu
Zuozhu Liu
Jiang Bian
VGen
299
41
0
20 Feb 2024
Vlogger: Make Your Dream A Vlog
Computer Vision and Pattern Recognition (CVPR), 2024
Shaobin Zhuang
Kunchang Li
Xinyuan Chen
Yaohui Wang
Ziwei Liu
Yu Qiao
Yali Wang
VGen
DiffM
147
62
0
17 Jan 2024
Large Language Models for Robotics: Opportunities, Challenges, and Perspectives
Yuan Liu
Zihao Wu
Yiwei Li
Hanqi Jiang
Peng Shu
...
Lin Zhao
Bao Ge
Xiang Li
Tianming Liu
Shu Zhang
LM&Ro
232
134
0
09 Jan 2024
A Recipe for Scaling up Text-to-Video Generation with Text-free Videos
Xiang Wang
Shiwei Zhang
Hangjie Yuan
Zhiwu Qing
Biao Gong
Yingya Zhang
Yujun Shen
Changxin Gao
Nong Sang
DiffM
VGen
264
49
0
25 Dec 2023
EVP: Enhanced Visual Perception using Inverse Multi-Attentive Feature Refinement and Regularized Image-Text Alignment
M. Lavrenyuk
Shariq Farooq Bhat
Matthias Müller
Peter Wonka
ObjD
MDE
230
13
0
13 Dec 2023
PEEKABOO: Interactive Video Generation via Masked-Diffusion
Computer Vision and Pattern Recognition (CVPR), 2023
Yash Jain
Anshul Nasery
Vibhav Vineet
Harkirat Singh Behl
VGen
276
59
0
12 Dec 2023
Hierarchical Spatio-temporal Decoupling for Text-to-Video Generation
Zhiwu Qing
Shiwei Zhang
Jiayu Wang
Xiang Wang
Yujie Wei
Yingya Zhang
Changxin Gao
Nong Sang
VGen
DiffM
196
54
0
07 Dec 2023
DreamVideo: Composing Your Dream Videos with Customized Subject and Motion
Yujie Wei
Shiwei Zhang
Zhiwu Qing
Hangjie Yuan
Zhiheng Liu
Yu Liu
Yingya Zhang
Jingren Zhou
Hongming Shan
DiffM
VGen
241
150
0
07 Dec 2023
MEVG: Multi-event Video Generation with Text-to-Video Models
Gyeongrok Oh
Jaehwan Jeong
Sieun Kim
Wonmin Byeon
Jinkyu Kim
Sungwoong Kim
Sangpil Kim
VGen
DiffM
291
36
0
07 Dec 2023
WonderJourney: Going from Anywhere to Everywhere
Hong-Xing Yu
Haoyi Duan
Junhwa Hur
Kyle Sargent
Michael Rubinstein
...
Forrester Cole
Deqing Sun
Noah Snavely
Jiajun Wu
Charles Herrmann
VGen
242
115
0
06 Dec 2023
MotionZero: Exploiting Motion Priors for Zero-shot Text-to-Video Generation
Jingkuan Song
Litao Guo
Lianli Gao
Hengtao Shen
Jingkuan Song
VGen
144
6
0
28 Nov 2023
FlowZero: Zero-Shot Text-to-Video Synthesis with LLM-Driven Dynamic Scene Syntax
Yu Lu
Linchao Zhu
Hehe Fan
Yi Yang
VGen
DiffM
376
20
0
27 Nov 2023