2505.13219
PiT: Progressive Diffusion Transformer
19 May 2025
Jiafu Wu
Yabiao Wang
Jian Li
Jinlong Peng
Yun Cao
Chengjie Wang
Jiangning Zhang
Papers citing
"PiT: Progressive Diffusion Transformer"
27 / 27 papers shown
Title
EMOv2: Pushing 5M Vision Model Frontier
Jing Zhang
T. Hu
Haoyang He
Zhucun Xue
Yun Wang
Chengjie Wang
Wenshu Fan
Hefei Ling
Dacheng Tao
VLM
228
4
0
09 Dec 2024
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
Patrick Esser
Sumith Kulal
A. Blattmann
Rahim Entezari
Jonas Muller
...
Zion English
Kyle Lacey
Alex Goodwin
Yannik Marek
Robin Rombach
DiffM
2.0K
2,655
0
05 Mar 2024
DiffiT: Diffusion Vision Transformers for Image Generation
European Conference on Computer Vision (ECCV), 2023
Ali Hatamizadeh
Jiaming Song
Guilin Liu
Jan Kautz
Arash Vahdat
311
114
0
04 Dec 2023
PVG: Progressive Vision Graph for Vision Recognition
ACM Multimedia (ACM MM), 2023
Jiafu Wu
Jian Li
Jiangning Zhang
Boshen Zhang
M. Chi
Yabiao Wang
Chengjie Wang
ViT
284
20
0
01 Aug 2023
MDTv2: Masked Diffusion Transformer is a Strong Image Synthesizer
IEEE International Conference on Computer Vision (ICCV), 2023
Shanghua Gao
Pan Zhou
Ming-Ming Cheng
Shuicheng Yan
DiffM
978
241
0
25 Mar 2023
One Transformer Fits All Distributions in Multi-Modal Diffusion at Scale
International Conference on Machine Learning (ICML), 2023
Fan Bao
Shen Nie
Kaiwen Xue
Chongxuan Li
Shiliang Pu
Yaole Wang
Gang Yue
Yue Cao
Hang Su
Jun Zhu
DiffM
525
177
0
12 Mar 2023
Rethinking Mobile Block for Efficient Attention-based Models
IEEE International Conference on Computer Vision (ICCV), 2023
Jiangning Zhang
Xiangtai Li
Jian Li
Liang Liu
Zhucun Xue
Boshen Zhang
Zhe Jiang
Tianxin Huang
Yabiao Wang
Chengjie Wang
MQ
287
189
0
03 Jan 2023
All are Worth Words: A ViT Backbone for Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2022
Fan Bao
Shen Nie
Kaiwen Xue
Yue Cao
Chongxuan Li
Hang Su
Jun Zhu
VLM
529
490
0
25 Sep 2022
Masked-attention Mask Transformer for Universal Image Segmentation
Bowen Cheng
Ishan Misra
Alex Schwing
Alexander Kirillov
Rohit Girdhar
ISeg
1.4K
3,242
0
02 Dec 2021
CoAtNet: Marrying Convolution and Attention for All Data Sizes
Neural Information Processing Systems (NeurIPS), 2021
Zihang Dai
Hanxiao Liu
Quoc V. Le
Mingxing Tan
ViT
434
1,445
0
09 Jun 2021
ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias
Neural Information Processing Systems (NeurIPS), 2021
Yufei Xu
Qiming Zhang
Jing Zhang
Dacheng Tao
ViT
369
389
0
07 Jun 2021
CvT: Introducing Convolutions to Vision Transformers
IEEE International Conference on Computer Vision (ICCV), 2021
Haiping Wu
Bin Xiao
Noel Codella
Mengchen Liu
Xiyang Dai
Lu Yuan
Lei Zhang
ViT
423
2,242
0
29 Mar 2021
CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification
IEEE International Conference on Computer Vision (ICCV), 2021
Chun-Fu Chen
Quanfu Fan
Rameswar Panda
ViT
286
1,868
0
27 Mar 2021
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
IEEE International Conference on Computer Vision (ICCV), 2021
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng Zhang
Stephen Lin
B. Guo
ViT
1.9K
28,004
0
25 Mar 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
IEEE International Conference on Computer Vision (ICCV), 2021
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
842
4,374
0
24 Feb 2021
Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet
IEEE International Conference on Computer Vision (ICCV), 2021
Li-xin Yuan
Yunpeng Chen
Tao Wang
Weihao Yu
Yujun Shi
Zihang Jiang
Francis E. H. Tay
Jiashi Feng
Shuicheng Yan
ViT
549
2,300
0
28 Jan 2021
Training data-efficient image transformers & distillation through attention
International Conference on Machine Learning (ICML), 2020
Hugo Touvron
Matthieu Cord
Matthijs Douze
Francisco Massa
Alexandre Sablayrolles
Hervé Jégou
ViT
621
8,174
0
23 Dec 2020
Pre-Trained Image Processing Transformer
Computer Vision and Pattern Recognition (CVPR), 2020
Hanting Chen
Yunhe Wang
Tianyu Guo
Chang Xu
Yiping Deng
Zhenhua Liu
Siwei Ma
Chunjing Xu
Chao Xu
Wen Gao
VLM
ViT
783
2,018
0
01 Dec 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
1.3K
54,329
0
22 Oct 2020
Denoising Diffusion Probabilistic Models
Jonathan Ho
Ajay Jain
Pieter Abbeel
DiffM
4.6K
25,339
0
19 Jun 2020
End-to-End Object Detection with Transformers
European Conference on Computer Vision (ECCV), 2020
Nicolas Carion
Francisco Massa
Gabriel Synnaeve
Nicolas Usunier
Alexander Kirillov
Sergey Zagoruyko
ViT
3DV
PINN
2.3K
16,241
0
26 May 2020
GLU Variants Improve Transformer
Noam M. Shazeer
523
1,429
0
12 Feb 2020
Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Elena Voita
David Talbot
F. Moiseev
Rico Sennrich
Ivan Titov
611
1,318
0
23 May 2019
Attention Is All You Need
Neural Information Processing Systems (NeurIPS), 2017
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
2.9K
159,241
0
12 Jun 2017
Deep Residual Learning for Image Recognition
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
MedIm
3.6K
215,507
0
10 Dec 2015
U-Net: Convolutional Networks for Biomedical Image Segmentation
Olaf Ronneberger
Philipp Fischer
Thomas Brox
SSeg
3DV
3.1K
88,566
0
18 May 2015
LINE: Large-scale Information Network Embedding
Jian Tang
Meng Qu
Mingzhe Wang
Ming Zhang
Jun Yan
Qiaozhu Mei
GNN
347
5,504
0
12 Mar 2015