Video diffusion transformers (vDiTs) have made impressive progress in text-to-video generation, but their high computational demands present major challenges for practical deployment. While existing acceleration methods reduce workload at various granularities, they often rely on heuristics, limiting their applicability.
@article{liu2025_2506.05096,
  title={Astraea: A GPU-Oriented Token-wise Acceleration Framework for Video Diffusion Transformers},
  author={Haosong Liu and Yuge Cheng and Zihan Liu and Aiyue Chen and Jing Lin and Yiwu Yao and Chen Chen and Jingwen Leng and Yu Feng and Minyi Guo},
  journal={arXiv preprint arXiv:2506.05096},
  year={2025}
}