2012.12877
Cited By
Training data-efficient image transformers & distillation through attention
23 December 2020
Hugo Touvron
Matthieu Cord
Matthijs Douze
Francisco Massa
Alexandre Sablayrolles
Hervé Jégou
ViT
Papers citing
"Training data-efficient image transformers & distillation through attention"
50 / 1,106 papers shown
Title
MaskCLIP: Masked Self-Distillation Advances Contrastive Language-Image Pretraining
Xiaoyi Dong
Jianmin Bao
Yinglin Zheng
Ting Zhang
Dongdong Chen
...
Weiming Zhang
Lu Yuan
Dong Chen
Fang Wen
Nenghai Yu
CLIP
VLM
40
157
0
25 Aug 2022
Masked Autoencoders Enable Efficient Knowledge Distillers
Yutong Bai
Zeyu Wang
Junfei Xiao
Chen Wei
Huiyu Wang
Alan Yuille
Yuyin Zhou
Cihang Xie
CLL
24
39
0
25 Aug 2022
Improved Zero-Shot Audio Tagging & Classification with Patchout Spectrogram Transformers
Paul Primus
Gerhard Widmer
VLM
17
5
0
24 Aug 2022
Federated Self-Supervised Contrastive Learning and Masked Autoencoder for Dermatological Disease Diagnosis
Yawen Wu
Dewen Zeng
Zhepeng Wang
Yi Sheng
Lei Yang
A. James
Yiyu Shi
Jingtong Hu
18
7
0
24 Aug 2022
Efficient Attention-free Video Shift Transformers
Adrian Bulat
Brais Martínez
Georgios Tzimiropoulos
ViT
27
1
0
23 Aug 2022
How good are deep models in understanding the generated images?
Ali Borji
OOD
21
6
0
23 Aug 2022
A Unified Analysis of Mixed Sample Data Augmentation: A Loss Function Perspective
Chanwoo Park
Sangdoo Yun
Sanghyuk Chun
AAML
18
32
0
21 Aug 2022
A Multi-Head Model for Continual Learning via Out-of-Distribution Replay
Gyuhak Kim
Zixuan Ke
Bin Liu
VLM
CLL
OODD
15
29
0
20 Aug 2022
Exploring Adversarial Robustness of Vision Transformers in the Spectral Perspective
Gihyun Kim
Juyeop Kim
Jong-Seok Lee
AAML
ViT
18
4
0
20 Aug 2022
Accelerating Vision Transformer Training via a Patch Sampling Schedule
Bradley McDanel
C. Huynh
ViT
25
1
0
19 Aug 2022
Improved Image Classification with Token Fusion
Keong-Hun Choi
Jin-Woo Kim
Yaolong Wang
J. Ha
ViT
19
0
0
19 Aug 2022
GSRFormer: Grounded Situation Recognition Transformer with Alternate Semantic Attention Refinement
Zhi-Qi Cheng
Qianwen Dai
Siyao Li
Teruko Mitamura
Alexander G. Hauptmann
16
34
0
18 Aug 2022
The LAM Dataset: A Novel Benchmark for Line-Level Handwritten Text Recognition
S. Cascianelli
Vittorio Pippi
Martin Maarand
Marcella Cornia
Lorenzo Baraldi
Christopher Kermorvant
Rita Cucchiara
19
7
0
16 Aug 2022
How Well Do Vision Transformers (VTs) Transfer To The Non-Natural Image Domain? An Empirical Study Involving Art Classification
Vincent Tonkes
M. Sabatelli
ViT
25
6
0
09 Aug 2022
Transformers as Meta-Learners for Implicit Neural Representations
Yinbo Chen
Xiaolong Wang
AI4CE
21
60
0
04 Aug 2022
Multi-Feature Vision Transformer via Self-Supervised Representation Learning for Improvement of COVID-19 Diagnosis
Xiao Qi
D. Foran
J. Nosher
I. Hacihaliloglu
ViT
MedIm
22
3
0
03 Aug 2022
Making the Best of Both Worlds: A Domain-Oriented Transformer for Unsupervised Domain Adaptation
Wen-hui Ma
Jinming Zhang
Shuang Li
Chi Harold Liu
Yulin Wang
Wei Li
15
14
0
02 Aug 2022
Pose Uncertainty Aware Movement Synchrony Estimation via Spatial-Temporal Graph Transformer
Jicheng Li
Anjana Bhat
R. Barmaki
ViT
27
5
0
01 Aug 2022
Cross Attention Based Style Distribution for Controllable Person Image Synthesis
Xinyue Zhou
M. Yin
Xinyuan Chen
Li Sun
Changxin Gao
Qingli Li
DiffM
14
54
0
01 Aug 2022
Local Perception-Aware Transformer for Aerial Tracking
Changhong Fu
Wei Peng
Sihang Li
Junjie Ye
Ziang Cao
28
8
0
01 Aug 2022
Momentum Transformer: Closing the Performance Gap Between Self-attention and Its Linearization
T. Nguyen
Richard G. Baraniuk
Robert M. Kirby
Stanley J. Osher
Bao Wang
21
9
0
01 Aug 2022
UAVM: Towards Unifying Audio and Visual Models
Yuan Gong
Alexander H. Liu
Andrew Rouditchenko
James R. Glass
27
20
0
29 Jul 2022
Safety-Enhanced Autonomous Driving Using Interpretable Sensor Fusion Transformer
Hao Shao
Letian Wang
Ruobing Chen
Hongsheng Li
Y. Liu
38
195
0
28 Jul 2022
Jigsaw-ViT: Learning Jigsaw Puzzles in Vision Transformer
Yingyi Chen
Xiaoke Shen
Yahui Liu
Qinghua Tao
Johan A. K. Suykens
AAML
ViT
23
22
0
25 Jul 2022
Behind Every Domain There is a Shift: Adapting Distortion-aware Vision Transformers for Panoramic Semantic Segmentation
Jiaming Zhang
Kailun Yang
Haowen Shi
Simon Reiß
Kunyu Peng
Chaoxiang Ma
Haodong Fu
Philip H. S. Torr
Kaiwei Wang
Rainer Stiefelhagen
ViT
MDE
31
35
0
25 Jul 2022
MAR: Masked Autoencoders for Efficient Action Recognition
Zhiwu Qing
Shiwei Zhang
Ziyuan Huang
Xiang Wang
Yuehuang Wang
Yiliang Lv
Changxin Gao
Nong Sang
21
42
0
24 Jul 2022
High-Resolution Swin Transformer for Automatic Medical Image Segmentation
Chen Wei
Shenghan Ren
Kaitai Guo
Haihong Hu
Jimin Liang
ViT
OOD
MedIm
17
36
0
23 Jul 2022
An Impartial Take to the CNN vs Transformer Robustness Contest
Francesco Pinto
Philip H. S. Torr
P. Dokania
UQCV
AAML
24
48
0
22 Jul 2022
PanGu-Coder: Program Synthesis with Function-Level Language Modeling
Fenia Christopoulou
Gerasimos Lampouras
Milan Gritta
Guchun Zhang
Yinpeng Guo
...
Guangtai Liang
Jia Wei
Xin Jiang
Qianxiang Wang
Qun Liu
ELM
SyDa
ALM
39
74
0
22 Jul 2022
Exploring Fine-Grained Audiovisual Categorization with the SSW60 Dataset
Grant Van Horn
Rui Qian
Kimberly Wilber
Hartwig Adam
Oisin Mac Aodha
Serge J. Belongie
21
10
0
21 Jul 2022
Towards Efficient Adversarial Training on Vision Transformers
Boxi Wu
Jindong Gu
Zhifeng Li
Deng Cai
Xiaofei He
Wei Liu
ViT
AAML
35
37
0
21 Jul 2022
Locality Guidance for Improving Vision Transformers on Tiny Datasets
Kehan Li
Runyi Yu
Zhennan Wang
Li-ming Yuan
Guoli Song
Jie Chen
ViT
24
43
0
20 Jul 2022
Vision Transformers: From Semantic Segmentation to Dense Prediction
Li Zhang
Jiachen Lu
Sixiao Zheng
Xinxuan Zhao
Xiatian Zhu
Yanwei Fu
Tao Xiang
Jianfeng Feng
Philip H. S. Torr
ViT
24
7
0
19 Jul 2022
Assaying Out-Of-Distribution Generalization in Transfer Learning
F. Wenzel
Andrea Dittadi
Peter V. Gehler
Carl-Johann Simon-Gabriel
Max Horn
...
Chris Russell
Thomas Brox
Bernt Schiele
Bernhard Schölkopf
Francesco Locatello
OOD
OODD
AAML
51
71
0
19 Jul 2022
GAFX: A General Audio Feature eXtractor
Zhaoyang Bu
Han Zhang
Xiaohu Zhu
30
0
0
19 Jul 2022
Time Is MattEr: Temporal Self-supervision for Video Transformers
Sukmin Yun
Jaehyung Kim
Dongyoon Han
Hwanjun Song
Jung-Woo Ha
Jinwoo Shin
ViT
15
12
0
19 Jul 2022
Adversarial Pixel Restoration as a Pretext Task for Transferable Perturbations
H. Malik
Shahina Kunhimon
Muzammal Naseer
Salman Khan
F. Khan
AAML
20
8
0
18 Jul 2022
HiFormer: Hierarchical Multi-scale Representations Using Transformers for Medical Image Segmentation
Moein Heidari
A. Kazerouni
Milad Soltany Kadarvish
Reza Azad
Ehsan Khodapanah Aghdam
Julien Cohen-Adad
Dorit Merhof
MedIm
ViT
25
178
0
18 Jul 2022
TokenMix: Rethinking Image Mixing for Data Augmentation in Vision Transformers
Jihao Liu
B. Liu
Hang Zhou
Hongsheng Li
Yu Liu
ViT
12
66
0
18 Jul 2022
Position Prediction as an Effective Pretraining Strategy
Shuangfei Zhai
Navdeep Jaitly
Jason Ramapuram
Dan Busbridge
Tatiana Likhomanenko
Joseph Y. Cheng
Walter A. Talbott
Chen Huang
Hanlin Goh
J. Susskind
ViT
43
23
0
15 Jul 2022
Learning Parallax Transformer Network for Stereo Image JPEG Artifacts Removal
Xuhao Jiang
Weimin Tan
Ri Cheng
Shili Zhou
Bo Yan
ViT
11
6
0
15 Jul 2022
IDET: Iterative Difference-Enhanced Transformers for High-Quality Change Detection
Qingle Guo
Ruofei Wang
Rui Huang
Wei Fan
Yuxiang Zhang
21
14
0
15 Jul 2022
Masked Autoencoders that Listen
Po-Yao (Bernie) Huang
Hu Xu
Juncheng Billy Li
Alexei Baevski
Michael Auli
Wojciech Galuba
Florian Metze
Christoph Feichtenhofer
13
268
0
13 Jul 2022
Eliminating Gradient Conflict in Reference-based Line-Art Colorization
Zekun Li
Zhengyang Geng
Zhao Kang
Wenyu Chen
Yibo Yang
21
35
0
13 Jul 2022
Dual Vision Transformer
Ting Yao
Yehao Li
Yingwei Pan
Yu Wang
Xiaoping Zhang
Tao Mei
ViT
141
75
0
11 Jul 2022
Facilitated machine learning for image-based fruit quality assessment
Manuel Knott
F. Pérez-Cruz
T. Defraeye
16
47
0
10 Jul 2022
Beyond Transfer Learning: Co-finetuning for Action Localisation
Anurag Arnab
Xuehan Xiong
A. Gritsenko
Rob Romijnders
Josip Djolonga
Mostafa Dehghani
Chen Sun
Mario Lucic
Cordelia Schmid
30
8
0
08 Jul 2022
VidConv: A modernized 2D ConvNet for Efficient Video Recognition
Chuong H. Nguyen
Su Huynh
Vinh Nguyen
Ngoc-Khanh Nguyen
ViT
27
3
0
08 Jul 2022
Efficient Lung Cancer Image Classification and Segmentation Algorithm Based on Improved Swin Transformer
Ruinan Sun
Yu Pang
ViT
MedIm
14
18
0
04 Jul 2022
I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference
Zhikai Li
Qingyi Gu
MQ
46
95
0
04 Jul 2022