2103.11886
DeepViT: Towards Deeper Vision Transformer
22 March 2021
Daquan Zhou
Bingyi Kang
Xiaojie Jin
Linjie Yang
Xiaochen Lian
Zihang Jiang
Qibin Hou
Jiashi Feng
ViT
Papers citing
"DeepViT: Towards Deeper Vision Transformer"
50 / 253 papers shown
Title
The Lighter The Better: Rethinking Transformers in Medical Image Segmentation Through Adaptive Pruning
Xian Lin
Li Yu
Kwang-Ting Cheng
Zengqiang Yan
ViT
MedIm
22
31
0
29 Jun 2022
EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm
Jiangning Zhang
Xiangtai Li
Yabiao Wang
Chengjie Wang
Yibo Yang
Yong Liu
Dacheng Tao
ViT
30
32
0
19 Jun 2022
SP-ViT: Learning 2D Spatial Priors for Vision Transformers
Yuxuan Zhou
Wangmeng Xiang
C. Li
Biao Wang
Xihan Wei
Lei Zhang
M. Keuper
Xia Hua
ViT
29
15
0
15 Jun 2022
EfficientFormer: Vision Transformers at MobileNet Speed
Yanyu Li
Geng Yuan
Yang Wen
Eric Hu
Georgios Evangelidis
Sergey Tulyakov
Yanzhi Wang
Jian Ren
ViT
16
343
0
02 Jun 2022
Unified Recurrence Modeling for Video Action Anticipation
Tsung-Ming Tai
G. Fiameni
Cheng-Kuang Lee
Simon See
O. Lanz
19
8
0
02 Jun 2022
Few-Shot Diffusion Models
Giorgio Giannone
Didrik Nielsen
Ole Winther
DiffM
178
49
0
30 May 2022
HiViT: Hierarchical Vision Transformer Meets Masked Image Modeling
Xiaosong Zhang
Yunjie Tian
Wei Huang
QiXiang Ye
Qi Dai
Lingxi Xie
Qi Tian
52
26
0
30 May 2022
Contrastive Learning Rivals Masked Image Modeling in Fine-tuning via Feature Distillation
Yixuan Wei
Han Hu
Zhenda Xie
Zheng-Wei Zhang
Yue Cao
Jianmin Bao
Dong Chen
B. Guo
CLIP
83
124
0
27 May 2022
Sharpness-Aware Training for Free
Jiawei Du
Daquan Zhou
Jiashi Feng
Vincent Y. F. Tan
Joey Tianyi Zhou
AAML
17
92
0
27 May 2022
A Study on Transformer Configuration and Training Objective
Fuzhao Xue
Jianghai Chen
Aixin Sun
Xiaozhe Ren
Zangwei Zheng
Xiaoxin He
Yongming Chen
Xin Jiang
Yang You
30
7
0
21 May 2022
MSTRIQ: No Reference Image Quality Assessment Based on Swin Transformer with Multi-Stage Fusion
Jing Wang
Haotian Fa
X. Hou
Yitian Xu
Tao Li
X. Lu
Lean Fu
29
21
0
20 May 2022
Uniform Masking: Enabling MAE Pre-training for Pyramid-based Vision Transformers with Locality
Xiang Li
Wenhai Wang
Lingfeng Yang
Jian Yang
102
73
0
20 May 2022
Deep Spectral Methods: A Surprisingly Strong Baseline for Unsupervised Semantic Segmentation and Localization
Luke Melas-Kyriazi
Christian Rupprecht
Iro Laina
Andrea Vedaldi
28
159
0
16 May 2022
Incorporating Prior Knowledge into Neural Networks through an Implicit Composite Kernel
Ziyang Jiang
Tongshu Zheng
Yiling Liu
David Carlson
15
4
0
15 May 2022
EdgeViTs: Competing Light-weight CNNs on Mobile Devices with Vision Transformers
Junting Pan
Adrian Bulat
Fuwen Tan
Xiatian Zhu
L. Dudziak
Hongsheng Li
Georgios Tzimiropoulos
Brais Martínez
ViT
23
180
0
06 May 2022
Sequencer: Deep LSTM for Image Classification
Yuki Tatsunami
Masato Taki
VLM
ViT
16
78
0
04 May 2022
A survey on attention mechanisms for medical applications: are we moving towards better algorithms?
Tiago Gonçalves
Isabel Rio-Torto
Luís F. Teixeira
J. S. Cardoso
OOD
MedIm
24
36
0
26 Apr 2022
High-Efficiency Lossy Image Coding Through Adaptive Neighborhood Information Aggregation
Ming-Tse Lu
Fangdong Chen
Shiliang Pu
Zhan Ma
37
44
0
25 Apr 2022
3D Shuffle-Mixer: An Efficient Context-Aware Vision Learner of Transformer-MLP Paradigm for Dense Prediction in Medical Volume
Jianye Pang
Cheng Jiang
Yihao Chen
Jianbo Chang
M. Feng
Renzhi Wang
Jianhua Yao
ViT
MedIm
28
11
0
14 Apr 2022
DaViT: Dual Attention Vision Transformers
Mingyu Ding
Bin Xiao
Noel Codella
Ping Luo
Jingdong Wang
Lu Yuan
ViT
30
240
0
07 Apr 2022
MixFormer: Mixing Features across Windows and Dimensions
Qiang Chen
Qiman Wu
Jian Wang
Qinghao Hu
T. Hu
Errui Ding
Jian Cheng
Jingdong Wang
MDE
ViT
13
101
0
06 Apr 2022
MaxViT: Multi-Axis Vision Transformer
Zhengzhong Tu
Hossein Talebi
Han Zhang
Feng Yang
P. Milanfar
A. Bovik
Yinxiao Li
ViT
43
636
0
04 Apr 2022
Beyond Masking: Demystifying Token-Based Pre-Training for Vision Transformers
Yunjie Tian
Lingxi Xie
Jiemin Fang
Mengnan Shi
Junran Peng
Xiaopeng Zhang
Jianbin Jiao
Qi Tian
QiXiang Ye
23
19
0
27 Mar 2022
Self-supervised Video-centralised Transformer for Video Face Clustering
Yujiang Wang
Mingzhi Dong
Jie Shen
Yi-Si Luo
Yiming Lin
Pingchuan Ma
Stavros Petridis
M. Pantic
ViT
18
3
0
24 Mar 2022
Transformers Meet Visual Learning Understanding: A Comprehensive Review
Yuting Yang
Licheng Jiao
Xuantong Liu
F. Liu
Shuyuan Yang
Zhixi Feng
Xu Tang
ViT
MedIm
22
28
0
24 Mar 2022
VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
Zhan Tong
Yibing Song
Jue Wang
Limin Wang
ViT
111
1,120
0
23 Mar 2022
Training-free Transformer Architecture Search
Qinqin Zhou
Kekai Sheng
Xiawu Zheng
Ke Li
Xing Sun
Yonghong Tian
Jie Chen
Rongrong Ji
ViT
32
46
0
23 Mar 2022
CodedVTR: Codebook-based Sparse Voxel Transformer with Geometric Guidance
Tianchen Zhao
Niansong Zhang
Xuefei Ning
He-Nan Wang
Li Yi
Yu Wang
3DPC
ViT
20
8
0
18 Mar 2022
LDP: Learnable Dynamic Precision for Efficient Deep Neural Network Training and Inference
Zhongzhi Yu
Y. Fu
Shang Wu
Mengquan Li
Haoran You
Yingyan Lin
24
1
0
15 Mar 2022
The Principle of Diversity: Training Stronger Vision Transformers Calls for Reducing All Levels of Redundancy
Tianlong Chen
Zhenyu (Allen) Zhang
Yu Cheng
Ahmed Hassan Awadallah
Zhangyang Wang
ViT
27
37
0
12 Mar 2022
Anti-Oversmoothing in Deep Vision Transformers via the Fourier Domain Analysis: From Theory to Practice
Peihao Wang
Wenqing Zheng
Tianlong Chen
Zhangyang Wang
ViT
17
127
0
09 Mar 2022
Stepwise Feature Fusion: Local Guides Global
Jinfeng Wang
Qiming Huang
Feilong Tang
Jia Meng
Jionglong Su
Sifan Song
ViT
MedIm
19
179
0
07 Mar 2022
Knowledge Amalgamation for Object Detection with Transformers
Haofei Zhang
Feng Mao
Mengqi Xue
Gongfan Fang
Zunlei Feng
Jie Song
Mingli Song
ViT
108
12
0
07 Mar 2022
Patch Similarity Aware Data-Free Quantization for Vision Transformers
Zhikai Li
Liping Ma
Mengjuan Chen
Junrui Xiao
Qingyi Gu
MQ
ViT
17
43
0
04 Mar 2022
Multi-Tailed Vision Transformer for Efficient Inference
Yunke Wang
Bo Du
Wenyuan Wang
Chang Xu
ViT
208
6
0
03 Mar 2022
Recent Advances in Vision Transformer: A Survey and Outlook of Recent Work
Khawar Islam
ViT
26
44
0
03 Mar 2022
Aggregated Pyramid Vision Transformer: Split-transform-merge Strategy for Image Recognition without Convolutions
Ruikang Ju
Ting-Yu Lin
Jen-Shiun Chiang
Jia-Hao Jian
Yu-Shian Lin
Liu-Rui-Yi Huang
ViT
14
1
0
02 Mar 2022
TransKD: Transformer Knowledge Distillation for Efficient Semantic Segmentation
R. Liu
Kailun Yang
Alina Roitberg
Jiaming Zhang
Kunyu Peng
Huayao Liu
Yaonan Wang
Rainer Stiefelhagen
ViT
39
36
0
27 Feb 2022
Auto-scaling Vision Transformers without Training
Wuyang Chen
Wei Huang
Xianzhi Du
Xiaodan Song
Zhangyang Wang
Denny Zhou
ViT
25
23
0
24 Feb 2022
Not All Patches are What You Need: Expediting Vision Transformers via Token Reorganizations
Youwei Liang
Chongjian Ge
Zhan Tong
Yibing Song
Jue Wang
P. Xie
ViT
12
233
0
16 Feb 2022
AA-TransUNet: Attention Augmented TransUNet For Nowcasting Tasks
Yimin Yang
S. Mehrkanoon
ViT
AI4TS
34
41
0
10 Feb 2022
Syfer: Neural Obfuscation for Private Data Release
Adam Yala
Victor Quach
H. Esfahanizadeh
Rafael G. L. D'Oliveira
K. Duffy
Muriel Médard
Tommi Jaakkola
Regina Barzilay
PICV
11
7
0
28 Jan 2022
O-ViT: Orthogonal Vision Transformer
Yanhong Fei
Yingjie Liu
Xian Wei
Mingsong Chen
ViT
11
7
0
28 Jan 2022
DynaMixer: A Vision MLP Architecture with Dynamic Mixing
Ziyu Wang
Wenhao Jiang
Yiming Zhu
Li Yuan
Yibing Song
Wei Liu
35
43
0
28 Jan 2022
Generalised Image Outpainting with U-Transformer
Penglei Gao
Xi Yang
Rui Zhang
John Y. Goulermas
Yujie Geng
Yuyao Yan
Kaizhu Huang
ViT
14
17
0
27 Jan 2022
Vision Transformer with Deformable Attention
Zhuofan Xia
Xuran Pan
S. Song
Li Erran Li
Gao Huang
ViT
22
452
0
03 Jan 2022
MIA-Former: Efficient and Robust Vision Transformers via Multi-grained Input-Adaptation
Zhongzhi Yu
Y. Fu
Sicheng Li
Chaojian Li
Yingyan Lin
ViT
28
19
0
21 Dec 2021
Lite Vision Transformer with Enhanced Self-Attention
Chenglin Yang
Yilin Wang
Jianming Zhang
He Zhang
Zijun Wei
Zhe-nan Lin
Alan Yuille
ViT
21
112
0
20 Dec 2021
A Simple Single-Scale Vision Transformer for Object Localization and Instance Segmentation
Wuyang Chen
Xianzhi Du
Fan Yang
Lucas Beyer
Xiaohua Zhai
...
Huizhong Chen
Jing Li
Xiaodan Song
Zhangyang Wang
Denny Zhou
ViT
21
20
0
17 Dec 2021
MS-TCT: Multi-Scale Temporal ConvTransformer for Action Detection
Rui Dai
Srijan Das
Kumara Kahatapitiya
Michael S. Ryoo
F. Brémond
ViT
36
73
0
07 Dec 2021