Free-Bloom: Zero-Shot Text-to-Video Generator with LLM Director and LDM Animator

Neural Information Processing Systems (NeurIPS), 2023

25 September 2023

Jingyi Yu

Papers citing "Free-Bloom: Zero-Shot Text-to-Video Generator with LLM Director and LDM Animator"

Hunyuan-GameCraft-2: Instruction-following Interactive Game World Model

118

28 Nov 2025

Intervene-All-Paths: Unified Mitigation of LVLM Hallucinations across Alignment Formats

290

21 Nov 2025

RISE-T2V: Rephrasing and Injecting Semantics with LLM for Expansive Text-to-Video Generation

148

06 Nov 2025

Why LVLMs Are More Prone to Hallucinations in Longer Responses: The Role of Context

23 Oct 2025

Augmenting Moment Retrieval: Zero-Dependency Two-Stage Learning

160

22 Oct 2025

Closed-Loop Transfer for Weakly-supervised Affordance Grounding

146

20 Oct 2025

Bridging Text and Video Generation: A Survey

264

06 Oct 2025

Sim-DETR: Unlock DETR for Temporal Sentence Grounding

300

28 Sep 2025

Communicative Agents for Slideshow Storytelling Video Generation based on LLMs

140

01 Sep 2025

No More Sibling Rivalry: Debiasing Human-Object Interaction Detection

165

31 Aug 2025

PersonaVlog: Personalized Multimodal Vlog Generation with Multi-Agent Collaboration and Iterative Self-Correction

128

19 Aug 2025

Interactive Video Generation via Domain Adaptation

Ishaan Rawal

Suryansh Kumar

DiffM VGen

162

30 May 2025

EIDT-V: Exploiting Intersections in Diffusion Trajectories for Model-Agnostic, Zero-Shot, Training-Free Text-to-Video Generation

Computer Vision and Pattern Recognition (CVPR), 2025

143

09 Apr 2025

MG-Gen: Single Image to Motion Graphics Generation

591

03 Apr 2025

Leveraging LLMs with Iterative Loop Structure for Enhanced Social Intelligence in Video Question Answering

288

27 Mar 2025

Learning to Animate Images from A Few Videos to Portray Delicate Human Actions

1.1K

01 Mar 2025

Visual Large Language Models for Generalized and Specialized Applications

458

06 Jan 2025

Optical Flow Representation Alignment Mamba Diffusion Model for Medical Video Generation

314

03 Nov 2024

MDSGen: Fast and Efficient Masked Diffusion Temporal-Aware Transformers for Open-Domain Sound Generation

International Conference on Learning Representations (ICLR), 2024

T. Pham

Tri Ton

Chang D. Yoo

283

03 Oct 2024

Multi-modal Generative AI: Multi-modal LLMs, Diffusions, and the Unification

490

23 Sep 2024

Compositional 3D-aware Video Generation with LLM Director

Neural Information Processing Systems (NeurIPS), 2024

Zhibo Chen

208

31 Aug 2024

Smoothed Energy Guidance: Guiding Diffusion Models with Reduced Energy Curvature of Attention

Neural Information Processing Systems (NeurIPS), 2024

Mengkang Hu

DiffM

279

01 Aug 2024

Adaptive Pre-training Data Detection for Large Language Models via Surprising Tokens

Anqi Zhang

Chaofeng Wu

245

30 Jul 2024

Natural Language but Omitted? On the Ineffectiveness of Large Language Models' privacy policy from End-users' Perspective

310

26 Jun 2024

Rethinking Human Evaluation Protocol for Text-to-Video Models: Enhancing Reliability, Reproducibility, and Practicality

Kai Wang

...

Yu Qiao

Kaipeng Zhang

355

13 Jun 2024

TC-Bench: Benchmarking Temporal Compositionality in Text-to-Video and Image-to-Video Generation

Weixi Feng

Jiachen Li

Michael Stephen Saxon

Tsu-Jui Fu

Wenhu Chen

William Yang Wang

EGVM VGen

238

12 Jun 2024

AID: Adapting Image2Video Diffusion Models for Instruction-guided Video Prediction

Yu-Gang Jiang

292

10 Jun 2024

Text Prompting for Multi-Concept Video Customization by Autoregressive Generation

227

22 May 2024

Image-of-Thought Prompting for Visual Reasoning Refinement in Multimodal Large Language Models

Yue Zhang

313

22 May 2024

DisenStudio: Customized Multi-subject Text-to-Video Generation with Disentangled Spatial Control

164

21 May 2024

TALC: Time-Aligned Captions for Multi-Scene Text-to-Video Generation

419

07 May 2024

The devil is in the object boundary: towards annotation-free instance segmentation using Foundation Models

Cheng Shi

Sibei Yang

VLM

208

18 Apr 2024

MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators

Shenghai Yuan

Bin Lin

483

07 Apr 2024

FoC: Figure out the Cryptographic Functions in Stripped Binaries with LLMs

350

27 Mar 2024

E4C: Enhance Editability for Text-Based Image Editing by Harnessing Efficient CLIP Guidance

192

15 Mar 2024

Action Reimagined: Text-to-Pose Video Editing for Dynamic Human Actions

169

11 Mar 2024

Sora as an AGI World Model? A Complete Survey on Text-to-Video Generation

Joseph Cho

Fachrina Dewi Puspitasari

Lik-Hang Lee

274

08 Mar 2024

Intelligent Director: An Automatic Framework for Dynamic Visual Composition using ChatGPT

163

24 Feb 2024

UniEdit: A Unified Tuning-Free Framework for Video Motion and Appearance Editing

Jianhong Bai

Tianyu He

Yuchi Wang

Junliang Guo

Haoji Hu

Zuozhu Liu

Jiang Bian

VGen

299

20 Feb 2024

Vlogger: Make Your Dream A Vlog

Computer Vision and Pattern Recognition (CVPR), 2024

Ziwei Liu

Yu Qiao

Yali Wang

VGen DiffM

147

17 Jan 2024

Large Language Models for Robotics: Opportunities, Challenges, and Perspectives

...

Xiang Li

Tianming Liu

Shu Zhang

LM&Ro

239

135

09 Jan 2024

A Recipe for Scaling up Text-to-Video Generation with Text-free Videos

268

25 Dec 2023

EVP: Enhanced Visual Perception using Inverse Multi-Attentive Feature Refinement and Regularized Image-Text Alignment

242

13 Dec 2023

PEEKABOO: Interactive Video Generation via Masked-Diffusion

Computer Vision and Pattern Recognition (CVPR), 2023

276

12 Dec 2023

Hierarchical Spatio-temporal Decoupling for Text-to-Video Generation

208

07 Dec 2023

DreamVideo: Composing Your Dream Videos with Customized Subject and Motion

241

153

07 Dec 2023

MEVG: Multi-event Video Generation with Text-to-Video Models

303

07 Dec 2023

WonderJourney: Going from Anywhere to Everywhere

Hong-Xing Yu

Haoyi Duan

Junhwa Hur

Kyle Sargent

Michael Rubinstein

...

Forrester Cole

Deqing Sun

Noah Snavely

Jiajun Wu

Charles Herrmann

VGen

262

118

06 Dec 2023

MotionZero: Exploiting Motion Priors for Zero-shot Text-to-Video Generation

Lianli Gao

Jingkuan Song

156

28 Nov 2023

FlowZero: Zero-Shot Text-to-Video Synthesis with LLM-Driven Dynamic Scene Syntax

Yu Lu

Linchao Zhu

Hehe Fan

Yi Yang

VGen DiffM

378

27 Nov 2023