2204.07118
DeiT III: Revenge of the ViT
14 April 2022
Hugo Touvron
Matthieu Cord
Hervé Jégou
ViT
Papers citing "DeiT III: Revenge of the ViT"
22 / 72 papers shown
Title
EVA: Exploring the Limits of Masked Visual Representation Learning at Scale
Yuxin Fang
Wen Wang
Binhui Xie
Quan-Sen Sun
Ledell Yu Wu
Xinggang Wang
Tiejun Huang
Xinlong Wang
Yue Cao
VLM
CLIP
54
673
0
14 Nov 2022
ParCNetV2: Oversized Kernel with Enhanced Attention
Ruihan Xu
Haokui Zhang
Wenze Hu
Shiliang Zhang
Xiaoyu Wang
ViT
17
6
0
14 Nov 2022
BiViT: Extremely Compressed Binary Vision Transformer
Yefei He
Zhenyu Lou
Luoming Zhang
Jing Liu
Weijia Wu
Hong Zhou
Bohan Zhuang
ViT
MQ
18
28
0
14 Nov 2022
Effective Audio Classification Network Based on Paired Inverse Pyramid Structure and Dense MLP Block
Yunhao Chen
Yunjie Zhu
Zihui Yan
Yifan Huang
Zhen Ren
Jianlu Shen
Lifang Chen
20
9
0
05 Nov 2022
Towards Sustainable Self-supervised Learning
Shanghua Gao
Pan Zhou
Ming-Ming Cheng
Shuicheng Yan
CLL
25
7
0
20 Oct 2022
lo-fi: distributed fine-tuning without communication
Mitchell Wortsman
Suchin Gururangan
Shen Li
Ali Farhadi
Ludwig Schmidt
Michael G. Rabbat
Ari S. Morcos
19
24
0
19 Oct 2022
OCD: Learning to Overfit with Conditional Diffusion Models
Shahar Lutati
Lior Wolf
DiffM
18
8
0
02 Oct 2022
ViT-DD: Multi-Task Vision Transformer for Semi-Supervised Driver Distraction Detection
Yunsheng Ma
Ziran Wang
ViT
33
13
0
19 Sep 2022
Accelerating Vision Transformer Training via a Patch Sampling Schedule
Bradley McDanel
C. Huynh
ViT
22
1
0
19 Aug 2022
Vision Transformers: From Semantic Segmentation to Dense Prediction
Li Zhang
Jiachen Lu
Sixiao Zheng
Xinxuan Zhao
Xiatian Zhu
Yanwei Fu
Tao Xiang
Jianfeng Feng
Philip H. S. Torr
ViT
19
7
0
19 Jul 2022
SupMAE: Supervised Masked Autoencoders Are Efficient Vision Learners
Feng Liang
Yangguang Li
Diana Marculescu
SSL
TPM
ViT
40
22
0
28 May 2022
A Closer Look at Self-Supervised Lightweight Vision Transformers
Shaoru Wang
Jin Gao
Zeming Li
Jian-jun Sun
Weiming Hu
ViT
62
41
0
28 May 2022
Vision Transformers in 2022: An Update on Tiny ImageNet
Ethan Huynh
ViT
26
11
0
21 May 2022
Better plain ViT baselines for ImageNet-1k
Lucas Beyer
Xiaohua Zhai
Alexander Kolesnikov
ViT
VLM
22
111
0
03 May 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
258
7,412
0
11 Nov 2021
ResNet strikes back: An improved training procedure in timm
Ross Wightman
Hugo Touvron
Hervé Jégou
AI4TS
207
484
0
01 Oct 2021
MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin
N. Houlsby
Alexander Kolesnikov
Lucas Beyer
Xiaohua Zhai
...
Andreas Steiner
Daniel Keysers
Jakob Uszkoreit
Mario Lucic
Alexey Dosovitskiy
239
2,592
0
04 May 2021
Transformer in Transformer
Kai Han
An Xiao
Enhua Wu
Jianyuan Guo
Chunjing Xu
Yunhe Wang
ViT
282
1,518
0
27 Feb 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
263
3,604
0
24 Feb 2021
Video Transformer Network
Daniel Neimark
Omri Bar
Maya Zohar
Dotan Asselmann
ViT
193
419
0
01 Feb 2021
Fixing the train-test resolution discrepancy: FixEfficientNet
Hugo Touvron
Andrea Vedaldi
Matthijs Douze
Hervé Jégou
AAML
176
110
0
18 Mar 2020
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky
Jia Deng
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLM
ObjD
282
39,170
0
01 Sep 2014