ResearchTrend.AI
How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers

18 June 2021
Andreas Steiner
Alexander Kolesnikov
Xiaohua Zhai
Ross Wightman
Jakob Uszkoreit
Lucas Beyer
    ViT

Papers citing "How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers"

50 / 415 papers shown
Title
Vote&Mix: Plug-and-Play Token Reduction for Efficient Vision Transformer
Shuai Peng
Di Fu
Baole Wei
Yong Cao
Liangcai Gao
Zhi Tang
ViT
30
1
0
30 Aug 2024
Symmetric masking strategy enhances the performance of Masked Image Modeling
Khanh-Binh Nguyen
Chae Jung Park
26
0
0
23 Aug 2024
Supervised Representation Learning towards Generalizable Assembly State Recognition
Tim J. Schoonbeek
Goutham Balachandran
H. Onvlee
Tim Houben
Shao-Hsuan Hung
Jacek Kustra
Peter H. N. de With
Fons van der Sommen
29
1
0
21 Aug 2024
Focus on Focus: Focus-oriented Representation Learning and Multi-view Cross-modal Alignment for Glioma Grading
Li Pan
Yupei Zhang
Qiushi Yang
Tan Li
Xiaohan Xing
Maximus C. F. Yeung
Zhen Chen
27
1
0
16 Aug 2024
Beyond Uniform Query Distribution: Key-Driven Grouped Query Attention
Zohaib Khan
Muhammad Khaquan
Omer Tafveez
Burhanuddin Samiwala
Agha Ali Raza
27
3
0
15 Aug 2024
Downstream Transfer Attack: Adversarial Attacks on Downstream Models with Pre-trained Vision Transformers
Weijie Zheng
Xingjun Ma
Hanxun Huang
Zuxuan Wu
Yu-Gang Jiang
AAML
19
0
0
03 Aug 2024
Privacy-Preserving Split Learning with Vision Transformers using Patch-Wise Random and Noisy CutMix
Yang Jin
Sihun Baek
Lei Zhang
Hyelin Nam
Praneeth Vepakomma
Ramesh Raskar
Mehdi Bennis
Seong-Lyun Kim
18
2
0
02 Aug 2024
Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress?
Richard Ren
Steven Basart
Adam Khoja
Alice Gatti
Long Phan
...
Alexander Pan
Gabriel Mukobi
Ryan H. Kim
Stephen Fitz
Dan Hendrycks
ELM
26
19
0
31 Jul 2024
Mixture of Nested Experts: Adaptive Processing of Visual Tokens
Gagan Jain
Nidhi Hegde
Aditya Kusupati
Arsha Nagrani
Shyamal Buch
Prateek Jain
Anurag Arnab
Sujoy Paul
MoE
33
7
0
29 Jul 2024
Depth-Wise Convolutions in Vision Transformers for Efficient Training on Small Datasets
Tianxiao Zhang
Wenju Xu
Bo Luo
Guanghui Wang
ViT
MDE
31
7
0
28 Jul 2024
A Survey on Cell Nuclei Instance Segmentation and Classification: Leveraging Context and Attention
João D. Nunes
D. Montezuma
Domingos Oliveira
Tania Pereira
Jaime S. Cardoso
31
1
0
26 Jul 2024
Hybrid Deep Learning-Based for Enhanced Occlusion Segmentation in PICU Patient Monitoring
Mario Francisco Munoz
Hoang Vu Huy
Thanh-Dung Le
34
1
0
18 Jul 2024
Turbo: Informativity-Driven Acceleration Plug-In for Vision-Language Large Models
Chen Ju
Haicheng Wang
Haozhe Cheng
Xu Chen
Zhonghua Zhai
Weilin Huang
Jinsong Lan
Shuai Xiao
Bo Zheng
VLM
41
5
0
16 Jul 2024
Adaptive Parametric Activation
Konstantinos Panagiotis Alexandridis
Jiankang Deng
Anh Nguyen
Shan Luo
28
2
0
11 Jul 2024
Fish-Vista: A Multi-Purpose Dataset for Understanding & Identification of Traits from Images
Kazi Sajeed Mehrab
M. Maruf
Arka Daw
Harish Babu Manogaran
Abhilash Neog
...
Paula Mabee
Wasila Dahdul
Anuj Karpatne
20
4
0
10 Jul 2024
CTRL-F: Pairing Convolution with Transformer for Image Classification via Multi-Level Feature Cross-Attention and Representation Learning Fusion
Hosam S. El-Assiouti
Hadeer El-Saadawy
M. Al-Berry
M. Tolba
ViT
47
0
0
09 Jul 2024
Image-Conditional Diffusion Transformer for Underwater Image Enhancement
Xingyang Nie
Su Pan
Xiaoyu Zhai
Shifei Tao
Fengzhong Qu
Biao Wang
Huilin Ge
Guojie Xiao
27
2
0
07 Jul 2024
Precision at Scale: Domain-Specific Datasets On-Demand
Jesús M. Rodríguez-de-Vera
Imanol G. Estepa
Ignacio Sarasúa
Bhalaji Nagarajan
P. Radeva
34
2
0
03 Jul 2024
PathAlign: A vision-language model for whole slide images in histopathology
Faruk Ahmed
Andrew Sellergren
Lin Yang
Shawn Xu
Boris Babenko
...
S. Shetty
Daniel Golden
Yun-hui Liu
David F. Steiner
Ellery Wulczyn
LM&MA
VLM
34
13
0
27 Jun 2024
Towards Efficient and Scalable Training of Differentially Private Deep Learning
Sebastian Rodriguez Beltran
Marlon Tobaben
Niki Loppi
Antti Honkela
14
0
0
25 Jun 2024
A Simple Framework for Open-Vocabulary Zero-Shot Segmentation
Thomas Stegmüller
Tim Lebailly
Nikola Dukic
Behzad Bozorgtabar
Tinne Tuytelaars
Jean-Philippe Thiran
VLM
31
1
0
23 Jun 2024
Potion: Towards Poison Unlearning
Stefan Schoepf
Jack Foster
Alexandra Brintrup
AAML
MU
33
6
0
13 Jun 2024
UDON: Universal Dynamic Online distillatioN for generic image representations
Nikolaos-Antonios Ypsilantis
Kaifeng Chen
André Araujo
Ondřej Chum
30
3
0
12 Jun 2024
Towards Fundamentally Scalable Model Selection: Asymptotically Fast Update and Selection
Wenxiao Wang
Weiming Zhuang
Lingjuan Lyu
27
0
0
11 Jun 2024
Adapters Strike Back
Jan-Martin O. Steitz
Stefan Roth
22
5
0
10 Jun 2024
Adapting Pretrained ViTs with Convolution Injector for Visuo-Motor Control
Dongyoon Hwang
ByungKun Lee
Hojoon Lee
Hyunseung Kim
Jaegul Choo
32
0
0
10 Jun 2024
Nomic Embed Vision: Expanding the Latent Space
Zach Nussbaum
Brandon Duderstadt
Andriy Mulyar
VLM
33
5
0
06 Jun 2024
Parameter-Inverted Image Pyramid Networks
Xizhou Zhu
Xue Yang
Zhaokai Wang
Hao Li
Wenhan Dou
Junqi Ge
Lewei Lu
Yu Qiao
Jifeng Dai
44
0
0
06 Jun 2024
M3LEO: A Multi-Modal, Multi-Label Earth Observation Dataset Integrating Interferometric SAR and RGB Data
Matthew J Allen
Francisco Dorr
Joseph A. Gallego-Mejia
Laura Martínez-Ferrer
Anna Jungbluth
Freddie Kalaitzis
Raúl Ramos-Pollán
31
3
0
06 Jun 2024
Reassessing How to Compare and Improve the Calibration of Machine Learning Models
M. Chidambaram
Rong Ge
63
1
0
06 Jun 2024
SpikeZIP-TF: Conversion is All You Need for Transformer-based SNN
Kang You
Zekai Xu
Chen Nie
Zhijie Deng
Qinghai Guo
Xiang Wang
Zhezhi He
30
10
0
05 Jun 2024
Choice of PEFT Technique in Continual Learning: Prompt Tuning is Not All You Need
Martin Wistuba
Prabhu Teja Sivaprasad
Lukas Balles
Giovanni Zappella
22
0
0
05 Jun 2024
On the Nonlinearity of Layer Normalization
Yunhao Ni
Yuxin Guo
Junlong Jia
Lei Huang
23
4
0
03 Jun 2024
Searching for internal symbols underlying deep learning
J. H. Lee
Sujith Vijayan
AI4CE
16
0
0
31 May 2024
Improving Object Detector Training on Synthetic Data by Starting With a Strong Baseline Methodology
Frank Ruis
Alma M. Liezenga
Friso G. Heslinga
Luca Ballan
Thijs A. Eker
Richard J. M. den Hollander
Martin C. van Leeuwen
Judith Dijk
Wyke Huizinga
23
2
0
30 May 2024
Wavelet-Based Image Tokenizer for Vision Transformers
Zhenhai Zhu
Radu Soricut
ViT
24
3
0
28 May 2024
MSPE: Multi-Scale Patch Embedding Prompts Vision Transformers to Any Resolution
Wenzhuo Liu
Fei Zhu
Shijie Ma
Cheng-Lin Liu
18
4
0
28 May 2024
Activator: GLU Activation Function as the Core Component of a Vision Transformer
Abdullah Nazhat Abdullah
Tarkan Aydin
ViT
28
0
0
24 May 2024
Configuring Data Augmentations to Reduce Variance Shift in Positional Embedding of Vision Transformers
Bum Jun Kim
Sang Woo Kim
ViT
32
1
0
23 May 2024
LookHere: Vision Transformers with Directed Attention Generalize and Extrapolate
A. Fuller
Daniel G. Kyrollos
Yousef Yassin
James R. Green
34
2
0
22 May 2024
Robust Disaster Assessment from Aerial Imagery Using Text-to-Image Synthetic Data
Tarun Kalluri
Jihyeon Janel Lee
Kihyuk Sohn
Sahil Singla
Manmohan Chandraker
Joseph Z. Xu
Jeremiah Liu
29
1
0
22 May 2024
Audio Mamba: Pretrained Audio State Space Model For Audio Tagging
Jiaju Lin
Haoxuan Hu
Mamba
31
7
0
22 May 2024
How to train your ViT for OOD Detection
Maximilian Mueller
Matthias Hein
11
0
0
21 May 2024
Quantum Vision Transformers for Quark-Gluon Classification
Marçal Comajoan Cara
Gopal Ramesh Dahale
Zhongtian Dong
Roy T. Forestano
S. Gleyzer
...
Kyoungchul Kong
Tom Magorsch
Konstantin T. Matchev
Katia Matcheva
Eyup B. Unlu
33
9
0
16 May 2024
Understanding Hyperbolic Metric Learning through Hard Negative Sampling
Yun Yue
Fangzhou Lin
Guanyi Mou
Ziming Zhang
SSL
25
1
0
23 Apr 2024
An Experimental Study on Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-Training
Jin Gao
Shubo Lin
Shaoru Wang
Yutong Kou
Zeming Li
Liang Li
Congxuan Zhang
Xiaoqin Zhang
Yizheng Wang
Weiming Hu
37
1
0
18 Apr 2024
Masked Autoencoders for Microscopy are Scalable Learners of Cellular Biology
Oren Z. Kraus
Kian Kenyon-Dean
Saber Saberian
Maryam Fallah
Peter McLean
...
Chi Vicky Cheng
Kristen Morse
Maureen Makes
Ben Mabey
Berton A. Earnshaw
20
26
0
16 Apr 2024
Probing the 3D Awareness of Visual Foundation Models
Mohamed El Banani
Amit Raj
Kevis-Kokitsi Maninis
Abhishek Kar
Yuanzhen Li
Michael Rubinstein
Deqing Sun
Leonidas J. Guibas
Justin Johnson
Varun Jampani
28
79
0
12 Apr 2024
Struggle with Adversarial Defense? Try Diffusion
Yujie Li
Yanbin Wang
Haitao Xu
Bin Liu
Jianguo Sun
Zhenhao Guo
Wenrui Ma
DiffM
22
1
0
12 Apr 2024
HAPNet: Toward Superior RGB-Thermal Scene Parsing via Hybrid, Asymmetric, and Progressive Heterogeneous Feature Fusion
Jiahang Li
Peng Yun
Qijun Chen
Rui Fan
36
8
0
04 Apr 2024