CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer


12 August 2024
Zhuoyi Yang
Jiayan Teng
Wendi Zheng
Ming Ding
Shiyu Huang
Jiazheng Xu
Yuanming Yang
Wenyi Hong
Xiaohan Zhang
Guanyu Feng
Da Yin
Yuxuan Zhang
Weihan Wang
Yean Cheng
Xiaotao Gu
Yuxiao Dong
Jie Tang
    DiffM
    VGen

Papers citing "CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer"

50 / 297 papers shown
Decouple and Track: Benchmarking and Improving Video Diffusion Transformers for Motion Transfer
Qingyu Shi
Jianzong Wu
Jinbin Bai
J. Zhang
Lu Qi
X. Li
Yunhai Tong
44
0
0
21 Mar 2025
ETVA: Evaluation of Text-to-Video Alignment via Fine-grained Question Generation and Answering
Kaisi Guan
Zhengfeng Lai
Y. Sun
Peng Zhang
Wei Liu
Kieran Liu
Meng Cao
Ruihua Song
VGen
54
0
0
21 Mar 2025
Position: Interactive Generative Video as Next-Generation Game Engine
Jiwen Yu
Yiran Qin
Haoxuan Che
Quande Liu
Xintao Wang
Pengfei Wan
Di Zhang
Xihui Liu
VGen
42
1
0
21 Mar 2025
Enabling Versatile Controls for Video Diffusion Models
Xu Zhang
Hao Zhou
Haoming Qin
Xiaobin Lu
Jiaxing Yan
Guanzhong Wang
Zeyu Chen
Yi Liu
DiffM
VGen
55
0
0
21 Mar 2025
ScalingNoise: Scaling Inference-Time Search for Generating Infinite Videos
Haolin Yang
Feilong Tang
Ming Hu
Yulong Li
Junjie Guo
Yexin Liu
Zelin Peng
Junjun He
Zongyuan Ge
VGen
DiffM
92
0
0
20 Mar 2025
Animating the Uncaptured: Humanoid Mesh Animation with Video Diffusion Models
Marc Benedí San Millán
Angela Dai
Matthias Nießner
DiffM
64
0
0
20 Mar 2025
MagicMotion: Controllable Video Generation with Dense-to-Sparse Trajectory Guidance
Quanhao Li
Zhen Xing
Rui Wang
Hui Zhang
Qi Dai
Zuxuan Wu
VGen
61
0
0
20 Mar 2025
VideoRFSplat: Direct Scene-Level Text-to-3D Gaussian Splatting Generation with Flexible Pose and Multi-View Joint Modeling
Hyojun Go
Byeongjun Park
Hyelin Nam
Byung-Hoon Kim
Hyungjin Chung
Changick Kim
3DGS
VGen
92
1
0
20 Mar 2025
Temporal Regularization Makes Your Video Generator Stronger
Harold Haodong Chen
Haojian Huang
Xianfeng Wu
Yexin Liu
Yajing Bai
Wen-Jie Shu
Harry Yang
Ser-Nam Lim
VGen
54
2
0
19 Mar 2025
UPME: An Unsupervised Peer Review Framework for Multimodal Large Language Model Evaluation
Qihui Zhang
Munan Ning
Zheyuan Liu
Yanbo Wang
Jiayi Ye
Yue Huang
Shuo Yang
Xiao Chen
Y. Song
Li Yuan
LRM
56
0
0
19 Mar 2025
Visual Persona: Foundation Model for Full-Body Human Customization
Jisu Nam
Soowon Son
Zhan Xu
Jing Shi
Difan Liu
Feng Liu
Aashish Misraa
Seungryong Kim
Yang Zhou
DiffM
34
0
0
19 Mar 2025
Forensics-Bench: A Comprehensive Forgery Detection Benchmark Suite for Large Vision Language Models
Jin Wang
Chenghui Lv
Xian Li
Shichao Dong
Huadong Li
Kelu Yao
Chao Li
Wenqi Shao
Ping Luo
56
0
0
19 Mar 2025
Fast Autoregressive Video Generation with Diagonal Decoding
Yang Ye
Junliang Guo
Haoyu Wu
Tianyu He
Tim Pearce
Tabish Rashid
Katja Hofmann
Jiang Bian
DiffM
VGen
71
1
0
18 Mar 2025
MagicComp: Training-free Dual-Phase Refinement for Compositional Video Generation
Hongyu Zhang
Yufan Deng
Shenghai Yuan
Peng Jin
Zesen Cheng
Yian Zhao
Chang-Shu Liu
Jie Chen
DiffM
VGen
89
0
0
18 Mar 2025
GR00T N1: An Open Foundation Model for Generalist Humanoid Robots
Nvidia
Johan Bjorck
Fernando Castañeda
Nikita Cherniadev
Xingye Da
...
Ao Zhang
Hao Zhang
Yizhou Zhao
Ruijie Zheng
Yuke Zhu
VLM
56
12
0
18 Mar 2025
Concat-ID: Towards Universal Identity-Preserving Video Synthesis
Yong Zhong
Zhuoyi Yang
Jiayan Teng
Xiaotao Gu
Chongxuan Li
VGen
63
0
0
18 Mar 2025
MagicDistillation: Weak-to-Strong Video Distillation for Large-Scale Few-Step Synthesis
Shitong Shao
Hongwei Yi
Hanzhong Guo
Tian Ye
Daquan Zhou
Michael Lingelbach
Zhiqiang Xu
Zeke Xie
VGen
50
0
0
17 Mar 2025
DreamRenderer: Taming Multi-Instance Attribute Control in Large-Scale Text-to-Image Models
Dewei Zhou
Mingwei Li
Zongxin Yang
Yi Yang
87
0
0
17 Mar 2025
TACO: Taming Diffusion for in-the-wild Video Amodal Completion
Ruijie Lu
Yixin Chen
Yu Liu
Jiaxiang Tang
Junfeng Ni
Diwen Wan
Gang Zeng
Siyuan Huang
DiffM
VGen
41
3
0
15 Mar 2025
SteerX: Creating Any Camera-Free 3D and 4D Scenes with Geometric Steering
Byeongjun Park
Hyojun Go
Hyelin Nam
Byung-Hoon Kim
Hyungjin Chung
Changick Kim
VGen
LLMSV
44
1
0
15 Mar 2025
HiTVideo: Hierarchical Tokenizers for Enhancing Text-to-Video Generation with Autoregressive Large Language Models
Ziqin Zhou
Yifan Yang
Y. Yang
Tianyu He
Houwen Peng
Kai Qiu
Qi Dai
Lili Qiu
Chong Luo
Lingqiao Liu
DiffM
VGen
45
1
0
14 Mar 2025
ReCamMaster: Camera-Controlled Generative Rendering from A Single Video
Jianhong Bai
Menghan Xia
Xiao Fu
Xintao Wang
Lianrui Mu
...
Zuozhu Liu
Haoji Hu
Xiang Bai
Pengfei Wan
Di Zhang
DiffM
VGen
43
3
0
14 Mar 2025
Cross-Modal Learning for Music-to-Music-Video Description Generation
Zhuoyuan Mao
Mengjie Zhao
Qiyu Wu
Zhi-Wei Zhong
Wei-Hsiang Liao
Hiromi Wakaki
Yuki Mitsufuji
DiffM
VGen
73
0
0
14 Mar 2025
CINEMA: Coherent Multi-Subject Video Generation via MLLM-Based Guidance
Yufan Deng
Xun Guo
Y. Wang
Jacob Zhiyuan Fang
Angtian Wang
Shenghai Yuan
Yiding Yang
Bo Liu
Haibin Huang
Chongyang Ma
DiffM
VGen
62
0
0
13 Mar 2025
FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality
Zhengyao Lv
Chenyang Si
Junhao Song
Zhenyu Yang
Yu Qiao
Ziwei Liu
Kwan-Yee K. Wong
VGen
DiffM
76
7
0
13 Mar 2025
CameraCtrl II: Dynamic Scene Exploration via Camera-controlled Video Diffusion Models
Hao He
Ceyuan Yang
Shanchuan Lin
Yinghao Xu
Meng Wei
Liangke Gui
Qi Zhao
Gordon Wetzstein
Lu Jiang
Hongsheng Li
DiffM
VGen
95
5
0
13 Mar 2025
VideoMerge: Towards Training-free Long Video Generation
Siyang Zhang
Harry Yang
Ser-Nam Lim
DiffM
VGen
43
0
0
13 Mar 2025
Learning Few-Step Diffusion Models by Trajectory Distribution Matching
Yihong Luo
Tianyang Hu
Jiacheng Sun
Yujun Cai
Jing Tang
DiffM
66
1
0
13 Mar 2025
Long Context Tuning for Video Generation
Yuwei Guo
Ceyuan Yang
Ziyan Yang
Zhibei Ma
Zhijie Lin
Zhenheng Yang
Dahua Lin
Lu Jiang
DiffM
VGen
62
1
0
13 Mar 2025
Cosh-DiT: Co-Speech Gesture Video Synthesis via Hybrid Audio-Visual Diffusion Transformers
Yasheng Sun
Zhiliang Xu
Hang Zhou
Jiazhi Guan
Quanwei Yang
...
Yingying Li
Haocheng Feng
J. Wang
Ziwei Liu
Koike Hideki
VGen
51
0
0
13 Mar 2025
RealGeneral: Unifying Visual Generation via Temporal In-Context Learning with Video Models
Yijing Lin
Mengqi Huang
Shuhan Zhuang
Zhendong Mao
VGen
41
0
0
13 Mar 2025
UVE: Are MLLMs Unified Evaluators for AI-Generated Videos?
Yuanxin Liu
Rui Zhu
Shuhuai Ren
Jiacong Wang
Haoyuan Guo
Xu Sun
Lu Jiang
56
1
0
13 Mar 2025
V2Edit: Versatile Video Diffusion Editor for Videos and 3D Scenes
Yanming Zhang
Jun-Kun Chen
Jipeng Lyu
Yu-Xiong Wang
DiffM
VGen
44
0
0
13 Mar 2025
Unified Dense Prediction of Video Diffusion
Lehan Yang
Lu Qi
X. Li
Sheng Li
Varun Jampani
Ming Yang
MDE
VOS
VGen
58
0
0
12 Mar 2025
WonderVerse: Extendable 3D Scene Generation with Video Generative Models
Hao Feng
Zhi Zuo
Jia-Hui Pan
Ka-Hei Hui
Yihua Shao
Qi Dou
Wei Xie
Zhengzhe Liu
VGen
47
1
0
12 Mar 2025
Reangle-A-Video: 4D Video Generation as Video-to-Video Translation
Hyeonho Jeong
Suhyeon Lee
Jong Chul Ye
VGen
57
0
0
12 Mar 2025
Cockatiel: Ensembling Synthetic and Human Preferenced Training for Detailed Video Caption
Luozheng Qin
Zhiyu Tan
Mengping Yang
Xiaomeng Yang
Hao Li
78
0
0
12 Mar 2025
PISA Experiments: Exploring Physics Post-Training for Video Diffusion Models by Watching Stuff Drop
Chenyu Li
Oscar Michel
Xichen Pan
Sainan Liu
Mike Roberts
Saining Xie
VGen
50
3
0
12 Mar 2025
Error Analyses of Auto-Regressive Video Diffusion Models: A Unified Framework
Jing Wang
Fengzhuo Zhang
Xiaoli Li
Vincent Y. F. Tan
Tianyu Pang
Chao Du
Aixin Sun
Zhuoran Yang
VGen
59
1
0
12 Mar 2025
TPDiff: Temporal Pyramid Video Diffusion Model
L. Ran
Mike Zheng Shou
73
0
0
12 Mar 2025
Accelerating Diffusion Sampling via Exploiting Local Transition Coherence
Shangwen Zhu
Han Zhang
Zhantao Yang
Qianyu Peng
Zhao Pu
H. Wang
Fan Cheng
DiffM
43
0
0
12 Mar 2025
WISA: World Simulator Assistant for Physics-Aware Text-to-Video Generation
Jing Wang
Ao Ma
Ke Cao
Jun Zheng
Zhanjie Zhang
...
Yuhang Ma
Bo Cheng
Dawei Leng
Yuhui Yin
Xiaodan Liang
VGen
79
3
0
11 Mar 2025
REGEN: Learning Compact Video Embedding with (Re-)Generative Decoder
Yitian Zhang
Long Mai
Aniruddha Mahapatra
David Bourgin
Yicong Hong
Jonah Casebeer
Feng Liu
Y. Fu
DiffM
VGen
43
0
0
11 Mar 2025
SARA: Structural and Adversarial Representation Alignment for Training-efficient Diffusion Models
Hesen Chen
Junyan Wang
Zhiyu Tan
Hao Li
53
0
0
11 Mar 2025
VRMDiff: Text-Guided Video Referring Matting Generation of Diffusion
Lehan Yang
Jincen Song
Tianlong Wang
Daiqing Qi
Weili Shi
Yuheng Liu
Sheng Li
DiffM
VOS
VGen
69
0
0
11 Mar 2025
From Reusing to Forecasting: Accelerating Diffusion Models with TaylorSeers
Jiacheng Liu
Chang Zou
Yuanhuiyi Lyu
Junjie Chen
Linfeng Zhang
DiffM
54
0
0
10 Mar 2025
DreamRelation: Relation-Centric Video Customization
Yujie Wei
Shiwei Zhang
Hangjie Yuan
Biao Gong
Longxiang Tang
...
Haonan Qiu
Hengjia Li
Shuai Tan
Y. Zhang
Hongming Shan
VGen
68
1
0
10 Mar 2025
AR-Diffusion: Asynchronous Video Generation with Auto-Regressive Diffusion
Mingzhen Sun
Weining Wang
Gen Li
Jiawei Liu
Jiahui Sun
Wanquan Feng
Shanshan Lao
Siyu Zhou
Qian He
J. Liu
DiffM
VGen
73
2
0
10 Mar 2025
VACE: All-in-One Video Creation and Editing
Zeyinzi Jiang
Zhen Han
Chaojie Mao
J. Zhang
Yulin Pan
Yu Liu
DiffM
VGen
36
4
0
10 Mar 2025
CineBrain: A Large-Scale Multi-Modal Brain Dataset During Naturalistic Audiovisual Narrative Processing
Jianxiong Gao
Yichang Liu
Baofeng Yang
Jianfeng Feng
Yanwei Fu
VGen
53
1
0
10 Mar 2025