2106.10270
How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers
18 June 2021
Andreas Steiner
Alexander Kolesnikov
Xiaohua Zhai
Ross Wightman
Jakob Uszkoreit
Lucas Beyer
ViT
Papers citing
"How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers"
50 / 415 papers shown
Title
A New Learning Paradigm for Foundation Model-based Remote Sensing Change Detection
Kaiyu Li
Xiangyong Cao
Deyu Meng
18
48
0
02 Dec 2023
Token Fusion: Bridging the Gap between Token Pruning and Token Merging
Minchul Kim
Shangqian Gao
Yen-Chang Hsu
Yilin Shen
Hongxia Jin
10
29
0
02 Dec 2023
Improve Supervised Representation Learning with Masked Image Modeling
Kaifeng Chen
Daniel M. Salz
Huiwen Chang
Kihyuk Sohn
Dilip Krishnan
Mojtaba Seyedhosseini
SSL
ViT
19
2
0
01 Dec 2023
Initializing Models with Larger Ones
Zhiqiu Xu
Yanjie Chen
Kirill Vishniakov
Yida Yin
Zhiqiang Shen
Trevor Darrell
Lingjie Liu
Zhuang Liu
28
17
0
30 Nov 2023
BioCLIP: A Vision Foundation Model for the Tree of Life
Samuel Stevens
Jiaman Wu
Matthew J Thompson
Elizabeth G Campolongo
Chan Hee Song
...
Wasila M Dahdul
Charles V. Stewart
Tanya Berger-Wolf
Wei-Lun Chao
Yu-Chuan Su
18
62
0
30 Nov 2023
Communication-Efficient Heterogeneous Federated Learning with Generalized Heavy-Ball Momentum
Riccardo Zaccone
Carlo Masone
Marco Ciccone
FedML
13
2
0
30 Nov 2023
Improving Adversarial Transferability via Model Alignment
A. Ma
Amir-massoud Farahmand
Yangchen Pan
Philip H. S. Torr
Jindong Gu
AAML
21
5
0
30 Nov 2023
DiG-IN: Diffusion Guidance for Investigating Networks -- Uncovering Classifier Differences, Neuron Visualisations, and Visual Counterfactual Explanations
Maximilian Augustin
Yannic Neuhaus
Matthias Hein
DiffM
19
3
0
29 Nov 2023
Efficient Stitchable Task Adaptation
Haoyu He
Zizheng Pan
Jing Liu
Jianfei Cai
Bohan Zhuang
16
3
0
29 Nov 2023
Efficient Key-Based Adversarial Defense for ImageNet by Using Pre-trained Model
AprilPyone Maungmaung
Isao Echizen
Hitoshi Kiya
VLM
AAML
21
0
0
28 Nov 2023
REACT: Recognize Every Action Everywhere All At Once
N. V. R. Chappa
Pha Nguyen
P. Dobbs
Khoa Luu
30
6
0
27 Nov 2023
Navigating Scaling Laws: Compute Optimality in Adaptive Model Training
Sotiris Anagnostidis
Gregor Bachmann
Imanol Schlag
Thomas Hofmann
23
2
0
06 Nov 2023
Distilling Out-of-Distribution Robustness from Vision-Language Foundation Models
Andy Zhou
Jindong Wang
Yu-xiong Wang
Haohan Wang
VLM
30
6
0
02 Nov 2023
PAUMER: Patch Pausing Transformer for Semantic Segmentation
Evann Courdier
Prabhu Teja Sivaprasad
F. Fleuret
24
2
0
01 Nov 2023
TiC-CLIP: Continual Training of CLIP Models
Saurabh Garg
Mehrdad Farajtabar
Hadi Pouransari
Raviteja Vemulapalli
Sachin Mehta
Oncel Tuzel
Vaishaal Shankar
Fartash Faghri
VLM
CLIP
31
26
0
24 Oct 2023
Domain-specific optimization and diverse evaluation of self-supervised models for histopathology
Jeremy Lai
Faruk Ahmed
Supriya Vijay
Tiam Jaroensri
Jessica Loo
...
Jonathan Krause
Yun-hui Liu
Po-Hsuan Cameron Chen
Ellery Wulczyn
David F. Steiner
30
7
0
20 Oct 2023
Minimalist and High-Performance Semantic Segmentation with Plain Vision Transformers
Yuanduo Hong
Jue Wang
Weichao Sun
Huihui Pan
VLM
ViT
27
7
0
19 Oct 2023
Context-Aware Meta-Learning
Christopher Fifty
Dennis Duan
Ronald G. Junkins
Ehsan Amid
Jurij Leskovec
Christopher Ré
Sebastian Thrun
LRM
VLM
MLLM
25
9
0
17 Oct 2023
MatFormer: Nested Transformer for Elastic Inference
Devvrit
Sneha Kudugunta
Aditya Kusupati
Tim Dettmers
Kaifeng Chen
...
Yulia Tsvetkov
Hannaneh Hajishirzi
Sham Kakade
Ali Farhadi
Prateek Jain
24
22
0
11 Oct 2023
PriViT: Vision Transformers for Fast Private Inference
Naren Dhyani
Jianqiao Mo
Minsu Cho
Ameya Joshi
Siddharth Garg
Brandon Reagen
Chinmay Hegde
10
4
0
06 Oct 2023
Distilling Inductive Bias: Knowledge Distillation Beyond Model Compression
Gousia Habib
Tausifa Jan Saleem
Brejesh Lall
VLM
14
0
0
30 Sep 2023
Masked Autoencoders are Scalable Learners of Cellular Morphology
Oren Z. Kraus
Kian Kenyon-Dean
Saber Saberian
Maryam Fallah
Peter McLean
...
Chi Vicky Cheng
Kristen Morse
Maureen Makes
Ben Mabey
Berton A. Earnshaw
11
14
0
27 Sep 2023
Improving Facade Parsing with Vision Transformers and Line Integration
Bowen Wang
Jiaxing Zhang
Ran Zhang
Yunqin Li
Liangzhi Li
Yuta Nakashima
ViT
13
17
0
27 Sep 2023
CWCL: Cross-Modal Transfer with Continuously Weighted Contrastive Loss
R. S. Srinivasa
Jaejin Cho
Chouchang Yang
Yashas Malur Saidutta
Ching Hua Lee
Yilin Shen
Hongxia Jin
VLM
16
8
0
26 Sep 2023
Regress Before Construct: Regress Autoencoder for Point Cloud Self-supervised Learning
Yang Liu
C. L. P. Chen
Can Wang
Xulin King
Mengyuan Liu
3DPC
24
7
0
25 Sep 2023
FeCAM: Exploiting the Heterogeneity of Class Distributions in Exemplar-Free Continual Learning
Dipam Goswami
Yuyang Liu
Bartlomiej Twardowski
Joost van de Weijer
CLL
10
52
0
25 Sep 2023
Single Image Test-Time Adaptation for Segmentation
Klara Janouskova
T. Shor
Chaim Baskin
Jirí Matas
TTA
OOD
26
3
0
25 Sep 2023
On Separate Normalization in Self-supervised Transformers
Xiaohui Chen
Yinkai Wang
Yuanqi Du
S. Hassoun
Liping Liu
ViT
19
1
0
22 Sep 2023
NoisyNN: Exploring the Influence of Information Entropy Change in Learning Systems
Xiao-Xing Yu
Zhe Huang
Yao Xue
Lu Zhang
Li Wang
Tianming Liu
Dajiang Zhu
6
6
0
19 Sep 2023
Long-Tail Learning with Foundation Model: Heavy Fine-Tuning Hurts
Jiang-Xin Shi
Tong Wei
Zhi-Hua Zhou
Jiejing Shao
Xin-Yan Han
Yu-Feng Li
19
26
0
18 Sep 2023
Towards Universal Image Embeddings: A Large-Scale Dataset and Challenge for Generic Image Representations
Nikolaos-Antonios Ypsilantis
Kaifeng Chen
Bingyi Cao
Mário Lipovský
Pelin Dogan-Schönberger
Grzegorz Makosa
Boris Bluntschli
Mojtaba Seyedhosseini
Ondrej Chum
André Araujo
SSL
13
13
0
04 Sep 2023
DAT++: Spatially Dynamic Vision Transformer with Deformable Attention
Zhuofan Xia
Xuran Pan
Shiji Song
Li Erran Li
Gao Huang
ViT
19
22
0
04 Sep 2023
Is visual explanation with Grad-CAM more reliable for deeper neural networks? A case study with automatic pneumothorax diagnosis
Zirui Qiu
H. Rivaz
Yiming Xiao
FAtt
6
4
0
29 Aug 2023
Overcoming Generic Knowledge Loss with Selective Parameter Update
Wenxuan Zhang
Paul Janson
Rahaf Aljundi
Mohamed Elhoseiny
KELM
CLL
16
10
0
23 Aug 2023
Local Distortion Aware Efficient Transformer Adaptation for Image Quality Assessment
Kangmin Xu
Liang Liao
Jing Xiao
Chaofeng Chen
Haoning Wu
Qiong Yan
Weisi Lin
ViT
11
5
0
23 Aug 2023
Distributionally Robust Classification on a Data Budget
Ben Feuer
Ameya Joshi
Minh Pham
C. Hegde
OOD
14
2
0
07 Aug 2023
Continual Domain Adaptation on Aerial Images under Gradually Degrading Weather
C. S. Jahan
Andreas E. Savakis
9
1
0
02 Aug 2023
MiDaS v3.1 -- A Model Zoo for Robust Monocular Relative Depth Estimation
R. Birkl
Diana Wofk
Matthias Muller
MDE
18
133
0
26 Jul 2023
Learned Thresholds Token Merging and Pruning for Vision Transformers
Maxim Bonnaerens
J. Dambre
14
15
0
20 Jul 2023
Reverse Knowledge Distillation: Training a Large Model using a Small One for Retinal Image Matching on Limited Data
Sahar Almahfouz Nasser
N. Gupte
A. Sethi
MedIm
9
6
0
20 Jul 2023
Pre-train, Adapt and Detect: Multi-Task Adapter Tuning for Camouflaged Object Detection
Yinghui Xing
Dexuan Kong
Shizhou Zhang
Geng Chen
Lingyan Ran
Peng Wang
Yanning Zhang
31
4
0
20 Jul 2023
An Empirical Study of Pre-trained Model Selection for Out-of-Distribution Generalization and Calibration
Hiroki Naganuma
Ryuichiro Hataya
Kotaro Yoshida
Ioannis Mitliagkas
OODD
81
1
0
17 Jul 2023
FedYolo: Augmenting Federated Learning with Pretrained Transformers
Xuechen Zhang
Mingchen Li
Xiangyu Chang
Jiasi Chen
A. Roy-Chowdhury
A. Suresh
Samet Oymak
FedML
18
7
0
10 Jul 2023
MSViT: Dynamic Mixed-Scale Tokenization for Vision Transformers
Jakob Drachmann Havtorn
Amelie Royer
Tijmen Blankevoort
B. Bejnordi
17
8
0
05 Jul 2023
Stitched ViTs are Flexible Vision Backbones
Zizheng Pan
Jing Liu
Haoyu He
Jianfei Cai
Bohan Zhuang
11
2
0
30 Jun 2023
End-to-End Augmentation Hyperparameter Tuning for Self-Supervised Anomaly Detection
Jaemin Yoo
Lingxiao Zhao
L. Akoglu
17
4
0
21 Jun 2023
ExpPoint-MAE: Better interpretability and performance for self-supervised point cloud transformers
Ioannis Romanelis
Vlassis Fotis
Konstantinos Moustakas
Adrian Munteanu
ViT
3DPC
13
4
0
19 Jun 2023
SegViTv2: Exploring Efficient and Continual Semantic Segmentation with Plain Vision Transformers
Bowen Zhang
Liyang Liu
Minh Hieu Phan
Zhi Tian
Chunhua Shen
Yifan Liu
ViT
19
28
0
09 Jun 2023
Normalization Layers Are All That Sharpness-Aware Minimization Needs
Maximilian Mueller
Tiffany J. Vlaar
David Rolnick
Matthias Hein
8
18
0
07 Jun 2023
Performance-optimized deep neural networks are evolving into worse models of inferotemporal visual cortex
Drew Linsley
I. F. Rodriguez
Thomas Fel
Michael Arcaro
Saloni Sharma
Margaret Livingstone
Thomas Serre
22
18
0
06 Jun 2023