ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2408.06072
  4. Cited By
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer

CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer

12 August 2024
Zhuoyi Yang
Jiayan Teng
Wendi Zheng
Ming Ding
Shiyu Huang
Jiazheng Xu
Yuanming Yang
Wenyi Hong
Xiaohan Zhang
Guanyu Feng
Da Yin
Yuxuan Zhang
Weihan Wang
Weihan Wang
Yean Cheng
Xiaotao Gu
Yuxiao Dong
Jie Tang
    DiffM
    VGen
ArXivPDFHTML

Papers citing "CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer"

50 / 297 papers shown
Title
FlashDepth: Real-time Streaming Video Depth Estimation at 2K Resolution
FlashDepth: Real-time Streaming Video Depth Estimation at 2K Resolution
Gene Chou
Wenqi Xian
Guandao Yang
Mohamed Abdelfattah
Bharath Hariharan
Noah Snavely
Ning Yu
P. Debevec
MDE
22
0
0
09 Apr 2025
Patch Matters: Training-free Fine-grained Image Caption Enhancement via Local Perception
Patch Matters: Training-free Fine-grained Image Caption Enhancement via Local Perception
Ruotian Peng
Haiying He
Yake Wei
Yandong Wen
D. Hu
VLM
32
0
0
09 Apr 2025
SIGMAN:Scaling 3D Human Gaussian Generation with Millions of Assets
SIGMAN:Scaling 3D Human Gaussian Generation with Millions of Assets
Yuhang Yang
Fengqi Liu
Yixing Lu
Qin Zhao
Pingyu Wu
...
Ran Yi
Yang Cao
Lizhuang Ma
Zheng-jun Zha
Junting Dong
3DGS
27
0
0
09 Apr 2025
TARO: Timestep-Adaptive Representation Alignment with Onset-Aware Conditioning for Synchronized Video-to-Audio Synthesis
TARO: Timestep-Adaptive Representation Alignment with Onset-Aware Conditioning for Synchronized Video-to-Audio Synthesis
Tri Ton
Ji Woo Hong
Chang D. Yoo
VGen
24
0
0
08 Apr 2025
Video-Bench: Human-Aligned Video Generation Benchmark
Video-Bench: Human-Aligned Video Generation Benchmark
Hui Han
Siyuan Li
Jiaqi Chen
Yiwen Yuan
Yuling Wu
...
Y. Li
J. Zhang
Chi Zhang
Li Li
Yongxin Ni
EGVM
VGen
65
0
0
07 Apr 2025
One-Minute Video Generation with Test-Time Training
One-Minute Video Generation with Test-Time Training
Karan Dalal
Daniel Koceja
Gashon Hussein
Jiarui Xu
Yue Zhao
...
Tatsunori Hashimoto
Sanmi Koyejo
Yejin Choi
Yu Sun
Xiaolong Wang
ViT
75
3
0
07 Apr 2025
FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis
FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis
Mengchao Wang
Qiang Wang
Fan Jiang
Yaqi Fan
Yunpeng Zhang
Yonggang Qi
Kun Zhao
Mu Xu
DiffM
VGen
23
0
0
07 Apr 2025
Multi-identity Human Image Animation with Structural Video Diffusion
Multi-identity Human Image Animation with Structural Video Diffusion
Zhenzhi Wang
Y. Li
Yanhong Zeng
Yuwei Guo
D. Lin
Tianfan Xue
Bo Dai
VGen
17
0
0
05 Apr 2025
Can You Count to Nine? A Human Evaluation Benchmark for Counting Limits in Modern Text-to-Video Models
Can You Count to Nine? A Human Evaluation Benchmark for Counting Limits in Modern Text-to-Video Models
Xuyang Guo
Zekai Huang
Jiayan Huo
Yingyu Liang
Zhenmei Shi
Zhao-quan Song
Jiahao Zhang
ALM
VGen
53
2
0
05 Apr 2025
HumanDreamer-X: Photorealistic Single-image Human Avatars Reconstruction via Gaussian Restoration
HumanDreamer-X: Photorealistic Single-image Human Avatars Reconstruction via Gaussian Restoration
Boyuan Wang
Runqi Ouyang
Xiaofeng Wang
Zheng Zhu
Guosheng Zhao
Chaojun Ni
Guan Huang
Lihong Liu
Xingang Wang
3DGS
63
0
0
04 Apr 2025
MME-Unify: A Comprehensive Benchmark for Unified Multimodal Understanding and Generation Models
MME-Unify: A Comprehensive Benchmark for Unified Multimodal Understanding and Generation Models
Wulin Xie
Y. Zhang
Chaoyou Fu
Yang Shi
Bingyan Nie
Hongkai Chen
Z. Zhang
Liang Wang
T. Tan
31
1
0
04 Apr 2025
Classic Video Denoising in a Machine Learning World: Robust, Fast, and Controllable
Classic Video Denoising in a Machine Learning World: Robust, Fast, and Controllable
Xin Jin
Simon Niklaus
Zhoutong Zhang
Zhihao Xia
Chunle Guo
Yuting Yang
J. Chen
Chongyi Li
VGen
31
0
0
04 Apr 2025
Scene Splatter: Momentum 3D Scene Generation from Single Image with Video Diffusion Model
Scene Splatter: Momentum 3D Scene Generation from Single Image with Video Diffusion Model
Shengjun Zhang
Jinzhao Li
Xin Fei
Hao Liu
Yueqi Duan
DiffM
3DGS
VGen
64
0
0
03 Apr 2025
SkyReels-A2: Compose Anything in Video Diffusion Transformers
SkyReels-A2: Compose Anything in Video Diffusion Transformers
Zhengcong Fei
D. Li
Di Qiu
J. Wang
Yikun Dou
...
J. Xu
Mingyuan Fan
Guibin Chen
Yang Li
Yahui Zhou
DiffM
VGen
63
2
0
03 Apr 2025
Morpheus: Benchmarking Physical Reasoning of Video Generative Models with Real Physical Experiments
Morpheus: Benchmarking Physical Reasoning of Video Generative Models with Real Physical Experiments
Chenyu Zhang
Daniil Cherniavskii
Andrii Zadaianchuk
Antonios Tragoudaras
Antonios Vozikis
Thijmen Nijdam
Derck W. E. Prinzhorn
Mark Bodracska
N. Sebe
E. Gavves
EGVM
VGen
43
0
0
03 Apr 2025
FlowR: Flowing from Sparse to Dense 3D Reconstructions
FlowR: Flowing from Sparse to Dense 3D Reconstructions
Tobias Fischer
Samuel Rota Buló
Yung-Hsu Yang
Nikhil Varma Keetha
Lorenzo Porzi
Norman Muller
Katja Schwarz
Jonathon Luiten
Marc Pollefeys
Peter Kontschieder
3DGS
39
0
0
02 Apr 2025
WorldScore: A Unified Evaluation Benchmark for World Generation
WorldScore: A Unified Evaluation Benchmark for World Generation
Haoyi Duan
Hong-Xing Yu
Sirui Chen
L. Fei-Fei
Jiajun Wu
VGen
60
1
0
01 Apr 2025
AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction
AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction
Junhao Cheng
Yuying Ge
Yixiao Ge
Jing Liao
Ying Shan
VGen
AI4CE
49
0
0
01 Apr 2025
Articulated Kinematics Distillation from Video Diffusion Models
Articulated Kinematics Distillation from Video Diffusion Models
Xuan Li
Qianli Ma
Tsung-Yi Lin
Yongxin Chen
Chenfanfu Jiang
Ming-Yu Liu
Donglai Xiang
VGen
28
0
0
01 Apr 2025
HumanDreamer: Generating Controllable Human-Motion Videos via Decoupled Generation
HumanDreamer: Generating Controllable Human-Motion Videos via Decoupled Generation
Boyuan Wang
Xiaofeng Wang
Chaojun Ni
Guosheng Zhao
Zhiqin Yang
...
Yukun Zhou
Xinze Chen
Guan Huang
Lihong Liu
Xingang Wang
VGen
49
2
0
31 Mar 2025
HOIGen-1M: A Large-scale Dataset for Human-Object Interaction Video Generation
HOIGen-1M: A Large-scale Dataset for Human-Object Interaction Video Generation
Kun Liu
Qi Liu
Xinchen Liu
Jie Li
Yongdong Zhang
Jiebo Luo
Xiaodong He
Wu Liu
VGen
35
0
0
31 Mar 2025
Sim-and-Real Co-Training: A Simple Recipe for Vision-Based Robotic Manipulation
Sim-and-Real Co-Training: A Simple Recipe for Vision-Based Robotic Manipulation
Abhiram Maddukuri
Z. L. Jiang
L. Chen
Soroush Nasiriany
Yuqi Xie
...
Scott Reed
Ken Goldberg
Ajay Mandlekar
Linxi Fan
Yuke Zhu
55
1
0
31 Mar 2025
SketchVideo: Sketch-based Video Generation and Editing
SketchVideo: Sketch-based Video Generation and Editing
Feng-Lin Liu
Hongbo Fu
Xintao Wang
Weicai Ye
Pengfei Wan
Di Zhang
Lin Gao
DiffM
VGen
37
0
0
30 Mar 2025
VLIPP: Towards Physically Plausible Video Generation with Vision and Language Informed Physical Prior
VLIPP: Towards Physically Plausible Video Generation with Vision and Language Informed Physical Prior
Xindi Yang
Baolu Li
Y. Zhang
Zhenfei Yin
Lei Bai
...
Zhiyong Wang
Jianfei Cai
Tien-Tsin Wong
Huchuan Lu
Xu Jia
DiffM
VGen
39
0
0
30 Mar 2025
VideoGen-Eval: Agent-based System for Video Generation Evaluation
VideoGen-Eval: Agent-based System for Video Generation Evaluation
Yuhang Yang
Ke Fan
S.
Hongxiang Li
Ailing Zeng
FeiLin Han
Wei-dong Zhai
W. Liu
Yang Cao
Zheng-jun Zha
EGVM
VGen
73
0
0
30 Mar 2025
MoCha: Towards Movie-Grade Talking Character Synthesis
MoCha: Towards Movie-Grade Talking Character Synthesis
Cong Wei
Bo Sun
Haoyu Ma
Ji Hou
F. Xu
...
Kunpeng Li
Tingbo Hou
Animesh Sinha
Peter Vajda
Wenhu Chen
VGen
36
0
0
30 Mar 2025
DiTFastAttnV2: Head-wise Attention Compression for Multi-Modality Diffusion Transformers
DiTFastAttnV2: Head-wise Attention Compression for Multi-Modality Diffusion Transformers
H. Zhang
R. Su
Zhihang Yuan
Pengtao Chen
Mingzhu Shen Yibo Fan
Shengen Yan
Guohao Dai
Yu Wang
34
0
0
28 Mar 2025
CoGen: 3D Consistent Video Generation via Adaptive Conditioning for Autonomous Driving
CoGen: 3D Consistent Video Generation via Adaptive Conditioning for Autonomous Driving
Yishen Ji
Ziyue Zhu
Zhenxin Zhu
Kaixin Xiong
Ming Lu
Zhiqi Li
Lijun Zhou
Haiyang Sun
Bing Wang
Tong Lu
VGen
41
1
0
28 Mar 2025
DynamiCtrl: Rethinking the Basic Structure and the Role of Text for High-quality Human Image Animation
DynamiCtrl: Rethinking the Basic Structure and the Role of Text for High-quality Human Image Animation
Haoyu Zhao
Zhongang Qi
Cong Wang
Qingping Zheng
Guansong Lu
Fei Chen
Hang Xu
Zuxuan Wu
DiffM
VGen
41
0
0
27 Mar 2025
Exploring the Evolution of Physics Cognition in Video Generation: A Survey
Exploring the Evolution of Physics Cognition in Video Generation: A Survey
Minghui Lin
Xiang Wang
Y. Wang
Shu Wang
Fengqi Dai
...
Cunxiang Wang
Zhengrong Zuo
Nong Sang
Siteng Huang
Donglin Wang
EGVM
VGen
75
3
0
27 Mar 2025
VBench-2.0: Advancing Video Generation Benchmark Suite for Intrinsic Faithfulness
VBench-2.0: Advancing Video Generation Benchmark Suite for Intrinsic Faithfulness
Dian Zheng
Ziqi Huang
Hongbo Liu
Kai Zou
Yinan He
...
Y. Zhang
Jingwen He
Wei-Shi Zheng
Yu Qiao
Ziwei Liu
EGVM
VGen
46
3
0
27 Mar 2025
Evaluating Text-to-Image Synthesis with a Conditional Fréchet Distance
Evaluating Text-to-Image Synthesis with a Conditional Fréchet Distance
Jaywon Koo
J. Hernandez
Moayed Haji-Ali
Ziyan Yang
Vicente Ordonez
EGVM
62
0
0
27 Mar 2025
Synthetic Video Enhances Physical Fidelity in Video Synthesis
Synthetic Video Enhances Physical Fidelity in Video Synthesis
Qi Zhao
Xingyu Ni
Ziyu Wang
Feng Cheng
Ziyan Yang
Lu Jiang
Bohan Wang
VGen
38
2
0
26 Mar 2025
Video Motion Graphs
Video Motion Graphs
Haiyang Liu
Zhan Xu
Fa-Ting Hong
Hsin-Ping Huang
Yi Zhou
Yang Zhou
DiffM
VGen
83
0
0
26 Mar 2025
VPO: Aligning Text-to-Video Generation Models with Prompt Optimization
VPO: Aligning Text-to-Video Generation Models with Prompt Optimization
Jiale Cheng
Ruiliang Lyu
Xiaotao Gu
Xiao-Chang Liu
Jiazheng Xu
...
Zhuoyi Yang
Yuxiao Dong
Jie Tang
H. Wang
Minlie Huang
VGen
72
0
0
26 Mar 2025
Multi-Object Sketch Animation by Scene Decomposition and Motion Planning
Multi-Object Sketch Animation by Scene Decomposition and Motion Planning
Jingyu Liu
Zijie Xin
Yuhan Fu
Ruixiang Zhao
Bangxiang Lan
Xirong Li
34
0
0
25 Mar 2025
AudCast: Audio-Driven Human Video Generation by Cascaded Diffusion Transformers
AudCast: Audio-Driven Human Video Generation by Cascaded Diffusion Transformers
Jiazhi Guan
Kaisiyuan Wang
Zhiliang Xu
Quanwei Yang
Yasheng Sun
...
Errui Ding
J. Wang
Youjian Zhao
Hang Zhou
Ziwei Liu
VGen
37
0
0
25 Mar 2025
AccVideo: Accelerating Video Diffusion Model with Synthetic Dataset
AccVideo: Accelerating Video Diffusion Model with Synthetic Dataset
Haiyu Zhang
Xinyuan Chen
Yaohui Wang
Xihui Liu
Yunhong Wang
Yu Qiao
VGen
59
0
0
25 Mar 2025
Long-Context Autoregressive Video Modeling with Next-Frame Prediction
Long-Context Autoregressive Video Modeling with Next-Frame Prediction
Yuchao Gu
Weijia Mao
Mike Zheng Shou
VGen
67
1
0
25 Mar 2025
Mask$^2$DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long Video Generation
Mask2^22DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long Video Generation
Tianhao Qi
Jianlong Yuan
Wanquan Feng
Shancheng Fang
Jiawei Liu
Siyu Zhou
Qian He
Hongtao Xie
Yongdong Zhang
DiffM
VGen
34
0
0
25 Mar 2025
AdaWorld: Learning Adaptable World Models with Latent Actions
AdaWorld: Learning Adaptable World Models with Latent Actions
Shenyuan Gao
Siyuan Zhou
Yilun Du
Jun Zhang
Chuang Gan
VGen
51
3
0
24 Mar 2025
CFG-Zero*: Improved Classifier-Free Guidance for Flow Matching Models
CFG-Zero*: Improved Classifier-Free Guidance for Flow Matching Models
Weichen Fan
Amber Yijia Zheng
Raymond A. Yeh
Ziwei Liu
41
1
0
24 Mar 2025
Target-Aware Video Diffusion Models
Target-Aware Video Diffusion Models
Taeksoo Kim
Hanbyul Joo
DiffM
VGen
89
1
0
24 Mar 2025
Aether: Geometric-Aware Unified World Modeling
Aether: Geometric-Aware Unified World Modeling
Aether Team
Haoyi Zhu
Y. Wang
Jianjun Zhou
Wenzheng Chang
...
Zizun Li
Junyi Chen
Chunhua Shen
Jiangmiao Pang
Tong He
DiffM
VGen
51
2
0
24 Mar 2025
Can Text-to-Video Generation help Video-Language Alignment?
Can Text-to-Video Generation help Video-Language Alignment?
Luca Zanella
Massimiliano Mancini
Willi Menapace
Sergey Tulyakov
Yiming Wang
Elisa Ricci
DiffM
VGen
50
0
0
24 Mar 2025
Training-free Diffusion Acceleration with Bottleneck Sampling
Training-free Diffusion Acceleration with Bottleneck Sampling
Ye Tian
Xin Xia
Yuxi Ren
Shanchuan Lin
Xing Wang
Xuefeng Xiao
Yunhai Tong
L. Yang
Bin Cui
49
0
0
24 Mar 2025
Video-T1: Test-Time Scaling for Video Generation
Video-T1: Test-Time Scaling for Video Generation
F. Liu
Hanyang Wang
Yimo Cai
Kaiyan Zhang
Xiaohang Zhan
Yueqi Duan
DiffM
VGen
76
1
0
24 Mar 2025
ReconDreamer++: Harmonizing Generative and Reconstructive Models for Driving Scene Representation
ReconDreamer++: Harmonizing Generative and Reconstructive Models for Driving Scene Representation
Guosheng Zhao
Xiaofeng Wang
Chaojun Ni
Zheng Zhu
Wenkang Qin
Guan Huang
Xingang Wang
44
1
0
24 Mar 2025
LongDiff: Training-Free Long Video Generation in One Go
LongDiff: Training-Free Long Video Generation in One Go
Zhuoling Li
Hossein Rahmani
Qiuhong Ke
J. Liu
DiffM
VGen
VLM
54
0
0
23 Mar 2025
InstructVEdit: A Holistic Approach for Instructional Video Editing
InstructVEdit: A Holistic Approach for Instructional Video Editing
Chi Zhang
C. Feng
Feng Yan
Qiming Zhang
Mingjin Zhang
Yujie Zhong
Jing Zhang
Lin Ma
DiffM
VGen
36
0
0
22 Mar 2025
Previous
123456
Next