Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2410.10629
Cited By
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers
14 October 2024
Enze Xie
Junsong Chen
Junyu Chen
Han Cai
Haotian Tang
Yujun Lin
Zhekai Zhang
Muyang Li
Ligeng Zhu
Y. Lu
Song Han
VLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers"
36 / 36 papers shown
Title
Ming-Lite-Uni: Advancements in Unified Architecture for Natural Multimodal Interaction
Biao Gong
Cheng Zou
Dandan Zheng
Hu Yu
Jingdong Chen
...
Qingpei Guo
Rui Liu
Weilong Chai
Xinyu Xiao
Ziyuan Huang
MLLM
66
1
0
05 May 2025
WILD: a new in-the-Wild Image Linkage Dataset for synthetic image attribution
Pietro Bongini
S. Mandelli
Andrea Montibeller
Mirko Casu
Orazio Pontorno
...
Paolo Bestagini
Irene Amerini
F. D. De Natale
S. Battiato
Mauro Barni
VLM
76
0
0
28 Apr 2025
Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models
Xu Ma
Peize Sun
Haoyu Ma
Hao Tang
Chih-Yao Ma
...
Matt Feiszli
Peizhao Zhang
Peter Vajda
Sam S. Tsai
Y. Fu
65
1
0
24 Apr 2025
From Reflection to Perfection: Scaling Inference-Time Optimization for Text-to-Image Diffusion Models via Reflection Tuning
Le Zhuo
Liangbing Zhao
Sayak Paul
Yue Liao
Renrui Zhang
Yi Xin
Peng Gao
Mohamed Elhoseiny
H. Li
VLM
63
0
0
22 Apr 2025
Acquire and then Adapt: Squeezing out Text-to-Image Model for Image Restoration
Junyuan Deng
Xinyi Wu
Yongxing Yang
Congchao Zhu
Song Wang
Zhenyao Wu
26
0
0
21 Apr 2025
Turbo2K: Towards Ultra-Efficient and High-Quality 2K Video Synthesis
Jingjing Ren
Wenbo Li
Zhongdao Wang
Haoze Sun
Bangzhen Liu
...
Aoxue Li
Shifeng Zhang
Bin Shao
Yong Guo
Lei Zhu
VGen
29
0
0
20 Apr 2025
SUDO: Enhancing Text-to-Image Diffusion Models with Self-Supervised Direct Preference Optimization
Liang Peng
Boxi Wu
Haoran Cheng
Yibo Zhao
Xiaofei He
19
0
0
20 Apr 2025
SphereDiff: Tuning-free Omnidirectional Panoramic Image and Video Generation via Spherical Latent Representation
Minho Park
Taewoong Kang
Jooyeol Yun
Sungwon Hwang
Jaegul Choo
VGen
MDE
17
0
0
19 Apr 2025
Packing Input Frame Context in Next-Frame Prediction Models for Video Generation
Lvmin Zhang
Maneesh Agrawala
DiffM
VGen
67
0
0
17 Apr 2025
H3AE: High Compression, High Speed, and High Quality AutoEncoder for Video Diffusion Models
Yushu Wu
Yanyu Li
Ivan Skorokhodov
Anil Kag
Willi Menapace
Sharath Girish
Aliaksandr Siarohin
Yanzhi Wang
Sergey Tulyakov
DiffM
VGen
33
0
0
14 Apr 2025
Structure-Accurate Medical Image Translation based on Dynamic Frequency Balance and Knowledge Guidance
Jiahua Xu
Dawei Zhou
Lei Hu
Zaiyi Liu
N. Wang
Xinbo Gao
MedIm
17
0
0
13 Apr 2025
Transfer between Modalities with MetaQueries
Xichen Pan
Satya Narayan Shukla
Aashu Singh
Zhuokai Zhao
Shlok Kumar Mishra
...
Jiuhai Chen
Kunpeng Li
F. Xu
Ji Hou
Saining Xie
DiffM
35
6
0
08 Apr 2025
Lumina-OmniLV: A Unified Multimodal Framework for General Low-Level Vision
Yuandong Pu
Le Zhuo
Kaiwen Zhu
Liangbin Xie
Wenlong Zhang
Xiangyu Chen
Peng Gao
Yu Qiao
Chao Dong
Yihao Liu
MLLM
55
1
0
07 Apr 2025
IntrinsiX: High-Quality PBR Generation using Image Priors
Peter Kocsis
Lukas Höllein
Matthias Nießner
33
0
0
01 Apr 2025
DiT4SR: Taming Diffusion Transformer for Real-World Image Super-Resolution
Zheng-Peng Duan
Jiawei Zhang
Xin Jin
Z. Zhang
Zheng Xiong
Dongqing Zou
Jimmy S. Ren
Chun-Le Guo
Chongyi Li
34
0
0
30 Mar 2025
Can Video Diffusion Model Reconstruct 4D Geometry?
Jinjie Mai
Wenxuan Zhu
Haozhe Liu
Bing Li
Cheng Zheng
Jürgen Schmidhuber
Bernard Ghanem
VGen
MDE
70
0
0
27 Mar 2025
Scaling Down Text Encoders of Text-to-Image Diffusion Models
Lifu Wang
Daqing Liu
Xinchen Liu
Xiaodong He
VLM
38
0
0
25 Mar 2025
AccVideo: Accelerating Video Diffusion Model with Synthetic Dataset
Haiyu Zhang
Xinyuan Chen
Yaohui Wang
Xihui Liu
Yunhong Wang
Yu Qiao
VGen
59
0
0
25 Mar 2025
Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models
Jinjin Zhang
Qiuyu Huang
Junjie Liu
Xiefan Guo
Di Huang
42
1
0
24 Mar 2025
Latent Space Super-Resolution for Higher-Resolution Image Generation with Diffusion Models
Jinho Jeong
Sangmin Han
Jinwoo Kim
Seon Joo Kim
29
0
0
24 Mar 2025
EDiT: Efficient Diffusion Transformers with Linear Compressed Attention
Philipp Becker
Abhinav Mehrotra
Ruchika Chavhan
Malcolm Chadwick
Luca Morreale
Mehdi Noroozi
Alberto Gil C. P. Ramos
Sourav Bhattacharya
38
0
0
20 Mar 2025
FP4DiT: Towards Effective Floating Point Quantization for Diffusion Transformers
Ruichen Chen
Keith G. Mills
Di Niu
MQ
50
0
0
19 Mar 2025
EEdit: Rethinking the Spatial and Temporal Redundancy for Efficient Image Editing
Zexuan Yan
Yue Ma
Chang Zou
Wenteng Chen
Qifeng Chen
Linfeng Zhang
49
0
0
13 Mar 2025
NAMI: Efficient Image Generation via Progressive Rectified Flow Transformers
Yuhang Ma
Bo Cheng
Shanyuan Liu
Ao Ma
Xiaoyu Wu
Liebucha Wu
Dawei Leng
Yuhui Yin
55
0
0
12 Mar 2025
SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation
Junsong Chen
Shuchen Xue
Yuyang Zhao
Jincheng Yu
Sayak Paul
Junyu Chen
Han Cai
E. Xie
Song Han
VLM
56
1
0
12 Mar 2025
PLADIS: Pushing the Limits of Attention in Diffusion Models at Inference Time by Leveraging Sparsity
Kwanyoung Kim
Byeongsu Sim
DiffM
VLM
48
0
0
10 Mar 2025
TIDE : Temporal-Aware Sparse Autoencoders for Interpretable Diffusion Transformers in Image Generation
Victor Shea-Jay Huang
Le Zhuo
Yi Xin
Zhaokai Wang
Peng Gao
Hongsheng Li
DiffM
35
1
0
10 Mar 2025
X2I: Seamless Integration of Multimodal Understanding into Diffusion Transformer via Attention Distillation
Jian Ma
Qirong Peng
Xu Guo
Chen Chen
H. Lu
Zhenyu Yang
VLM
64
1
0
08 Mar 2025
LDGen: Enhancing Text-to-Image Synthesis via Large Language Model-Driven Language Representation
Pengzhi Li
Pengfei Yu
Zide Liu
Wei He
Xuhao Pan
Xudong Rao
Tao Wei
Wei Chen
VLM
53
0
0
25 Feb 2025
DiC: Rethinking Conv3x3 Designs in Diffusion Models
Yuchuan Tian
Jing Han
Chengcheng Wang
Yuchen Liang
Chao Xu
Hanting Chen
DiffM
21
1
0
03 Jan 2025
SnapGen-V: Generating a Five-Second Video within Five Seconds on a Mobile Device
Yushu Wu
Zhixing Zhang
Yanyu Li
Yanwu Xu
Anil Kag
...
Ju Hu
Dimitris N. Metaxas
Yanzhi Wang
Sergey Tulyakov
Jian Ren
DiffM
VGen
90
2
0
13 Dec 2024
Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement
Zhennan Chen
Yajie Li
Haofan Wang
Z. Chen
Zhengkai Jiang
Jun Yu Li
Qian Wang
Jian Yang
Ying Tai
DiffM
45
8
0
10 Nov 2024
Taming Rectified Flow for Inversion and Editing
Jiangshan Wang
Junfu Pu
Zhongang Qi
Jiayi Guo
Yue Ma
Nisha Huang
Yuxin Chen
Xiu Li
Ying Shan
39
22
0
07 Nov 2024
Public Domain 12M: A Highly Aesthetic Image-Text Dataset with Novel Governance Mechanisms
Jordan Meyer
Nick Padgett
Cullen Miller
Laura Exline
26
4
0
30 Oct 2024
DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention
Lianghui Zhu
Zilong Huang
Bencheng Liao
Jun Hao Liew
Hanshu Yan
Jiashi Feng
Xinggang Wang
60
12
0
28 May 2024
Balanced Mixed-Type Tabular Data Synthesis with Diffusion Models
Zeyu Yang
Peikun Guo
Khadija Zanna
Akane Sano
Xiaoxue Yang
Akane Sano
DiffM
30
8
0
12 Apr 2024
1