arXiv: 2211.05187
Training a Vision Transformer from scratch in less than 24 hours with 1 GPU
9 November 2022
Saghar Irandoust, Thibaut Durand, Yunduz Rakhmangulova, Wenjie Zi, Hossein Hajimirsadeghi
Tags: ViT
Papers citing "Training a Vision Transformer from scratch in less than 24 hours with 1 GPU" (8 papers):

- A margin-based replacement for cross-entropy loss. Michael W. Spratling, Heiko H. Schütt. 21 Jan 2025.
- Weight decay induces low-rank attention layers. Seijin Kobayashi, Yassir Akram, J. Oswald. 31 Oct 2024.
- From Coarse to Fine: Efficient Training for Audio Spectrogram Transformers. Jiu Feng, Mehmet Hamza Erol, Joon Son Chung, Arda Senocak. 16 Jan 2024.
- No Train No Gain: Revisiting Efficient Training Algorithms For Transformer-based Language Models. Jean Kaddour, Oscar Key, Piotr Nawrot, Pasquale Minervini, Matt J. Kusner. 12 Jul 2023.
- Training Strategies for Vision Transformers for Object Detection. Apoorv Singh. 5 Apr 2023.
- A Survey of Visual Transformers. Yang Liu, Yao Zhang, Yixin Wang, Feng Hou, Jin Yuan, Jiang Tian, Yang Zhang, Zhongchao Shi, Jianping Fan, Zhiqiang He. 11 Nov 2021. Tags: 3DGS, ViT.
- Primer: Searching for Efficient Transformers for Language Modeling. David R. So, Wojciech Mańke, Hanxiao Liu, Zihang Dai, Noam M. Shazeer, Quoc V. Le. 17 Sep 2021. Tags: VLM.
- Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions. Wenhai Wang, Enze Xie, Xiang Li, Deng-Ping Fan, Kaitao Song, Ding Liang, Tong Lu, Ping Luo, Ling Shao. 24 Feb 2021. Tags: ViT.