ResearchTrend.AI
Dysen-VDM: Empowering Dynamics-aware Text-to-Video Diffusion with LLMs


26 August 2023
Hao Fei
Shengqiong Wu
Wei Ji
Hanwang Zhang
Tat-Seng Chua
    VGen
    DiffM

Papers citing "Dysen-VDM: Empowering Dynamics-aware Text-to-Video Diffusion with LLMs"

26 / 26 papers shown
Structure-Accurate Medical Image Translation based on Dynamic Frequency Balance and Knowledge Guidance
Jiahua Xu
Dawei Zhou
Lei Hu
Zaiyi Liu
N. Wang
Xinbo Gao
MedIm
27
0
0
13 Apr 2025
VEGAS: Towards Visually Explainable and Grounded Artificial Social Intelligence
Hao Li
Hao Fei
Zechao Hu
Zhengwei Yang
Zheng Wang
45
0
0
03 Apr 2025
FlowMotion: Target-Predictive Conditional Flow Matching for Jitter-Reduced Text-Driven Human Motion Generation
Manolo Canales Cuba
Vinícius do Carmo Melício
João Paulo Gois
3DH
48
0
0
02 Apr 2025
Multi-Object Sketch Animation by Scene Decomposition and Motion Planning
Jingyu Liu
Zijie Xin
Yuhan Fu
Ruixiang Zhao
Bangxiang Lan
Xirong Li
39
0
0
25 Mar 2025
Human Motion Unlearning
Edoardo De Matteis
Matteo Migliarini
Alessio Sampieri
Indro Spinelli
Fabio Galasso
MU
55
0
0
24 Mar 2025
Learning 4D Panoptic Scene Graph Generation from Rich 2D Visual Scene
Shengqiong Wu
Hao Fei
Jingkang Yang
X. Li
Juncheng Li
H. Zhang
Tat-Seng Chua
46
0
0
19 Mar 2025
Universal Scene Graph Generation
Shengqiong Wu
Hao Fei
Tat-Seng Chua
36
0
0
19 Mar 2025
Text2Story: Advancing Video Storytelling with Text Guidance
Taewon Kang
D. Kothandaraman
Ming C. Lin
DiffM
VGen
57
0
0
08 Mar 2025
Training-free and Adaptive Sparse Attention for Efficient Long Video Generation
Yifei Xia
Suhan Ling
Fangcheng Fu
Y. Wang
Huixia Li
Xuefeng Xiao
Bin Cui
VGen
51
2
0
28 Feb 2025
Vitron: A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing
Hao Fei
Shengqiong Wu
H. Zhang
Tat-Seng Chua
Shuicheng Yan
56
35
0
31 Dec 2024
FlexCache: Flexible Approximate Cache System for Video Diffusion
Desen Sun
Henry Tian
Tim Lu
Sihang Liu
DiffM
28
0
0
18 Dec 2024
Synergistic Dual Spatial-aware Generation of Image-to-Text and Text-to-Image
Yu Zhao
Hao Fei
Xiangtai Li
L. Qin
Jiayi Ji
Hongyuan Zhu
Meishan Zhang
M. Zhang
Jianguo Wei
DiffM
26
1
0
20 Oct 2024
PanoSent: A Panoptic Sextuple Extraction Benchmark for Multimodal Conversational Aspect-based Sentiment Analysis
Meng Luo
Hao Fei
Bobo Li
Shengqiong Wu
Qian Liu
Soujanya Poria
Erik Cambria
M. Lee
W. Hsu
28
7
0
18 Aug 2024
3D-GRES: Generalized 3D Referring Expression Segmentation
Changli Wu
Yihang Liu
Jiayi Ji
Yiwei Ma
Haowei Wang
Gen Luo
Henghui Ding
Xiaoshuai Sun
Rongrong Ji
34
6
0
30 Jul 2024
Bridging the Intent Gap: Knowledge-Enhanced Visual Generation
Yi Cheng
Ziwei Xu
Dongyun Lin
Harry Cheng
Yongkang Wong
Ying Sun
Joo Hwee Lim
Mohan S. Kankanhalli
23
0
0
21 May 2024
WorldGPT: Empowering LLM as Multimodal World Model
Zhiqi Ge
Hongzhe Huang
Mingze Zhou
Juncheng Li
Guoming Wang
Siliang Tang
Yueting Zhuang
29
26
0
28 Apr 2024
X-Dreamer: Creating High-quality 3D Content by Bridging the Domain Gap Between Text-to-2D and Text-to-3D Generation
Yiwei Ma
Yijun Fan
Jiayi Ji
Haowei Wang
Xiaoshuai Sun
Guannan Jiang
Annan Shu
Rongrong Ji
14
7
0
30 Nov 2023
De-fine: Decomposing and Refining Visual Programs with Auto-Feedback
Minghe Gao
Juncheng Li
Hao Fei
Liang Pang
Wei Ji
Guoming Wang
Wenqiao Zhang
Siliang Tang
Yueting Zhuang
14
8
0
21 Nov 2023
VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation
Zhengxiong Luo
Dayou Chen
Yingya Zhang
Yan Huang
Liangsheng Wang
Yujun Shen
Deli Zhao
Jinren Zhou
Tien-Ping Tan
DiffM
VGen
132
215
0
15 Mar 2023
CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers
Wenyi Hong
Ming Ding
Wendi Zheng
Xinghan Liu
Jie Tang
DiffM
235
556
0
29 May 2022
Flexible Diffusion Modeling of Long Videos
William Harvey
Saeid Naderiparizi
Vaden Masrani
Christian Weilbach
Frank D. Wood
DiffM
BDL
VGen
167
284
0
23 May 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,261
0
28 Jan 2022
StyleVideoGAN: A Temporal Generative Model using a Pretrained StyleGAN
Gereon Fox
A. Tewari
Mohamed A. Elgharib
Christian Theobalt
163
51
0
15 Jul 2021
VideoGPT: Video Generation using VQ-VAE and Transformers
Wilson Yan
Yunzhi Zhang
Pieter Abbeel
A. Srinivas
ViT
VGen
237
482
0
20 Apr 2021
Image Generation from Scene Graphs
Justin Johnson
Agrim Gupta
Li Fei-Fei
GNN
208
809
0
04 Apr 2018