Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2101.11986
Cited By
Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet
28 January 2021
Li-xin Yuan
Yunpeng Chen
Tao Wang
Weihao Yu
Yujun Shi
Zihang Jiang
Francis E. H. Tay
Jiashi Feng
Shuicheng Yan
ViT
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet"
50 / 352 papers shown
Title
Lightweight RGB-D Salient Object Detection from a Speed-Accuracy Tradeoff Perspective
Songsong Duan
Xi Yang
Nannan Wang
Xinbo Gao
55
0
0
07 May 2025
Image Recognition with Online Lightweight Vision Transformer: A Survey
Zherui Zhang
Rongtao Xu
Jie Zhou
Changwei Wang
Xingtian Pei
...
Jiguang Zhang
Li Guo
Longxiang Gao
W. Xu
Shibiao Xu
ViT
142
0
0
06 May 2025
Optimal Hyperspectral Undersampling Strategy for Satellite Imaging
Vita V. Vlasova
Vladimir G. Kuzmin
Maria S. Varetsa
Natalia A. Ibragimova
Oleg Y. Rogov
Elena V. Lyapuntsova
19
0
0
27 Apr 2025
TimeCapsule: Solving the Jigsaw Puzzle of Long-Term Time Series Forecasting with Compressed Predictive Representations
Yihang Lu
Yangyang Xu
Qitao Qing
Xianwei Meng
AI4TS
44
0
0
17 Apr 2025
Embedding Radiomics into Vision Transformers for Multimodal Medical Image Classification
Zhenyu Yang
Haiming Zhu
Rihui Zhang
Haipeng Zhang
Jianliang Wang
Chunhao Wang
Minbin Chen
F. Yin
MedIm
38
0
0
15 Apr 2025
HGFormer: Topology-Aware Vision Transformer with HyperGraph Learning
Hao Wang
Shuo Zhang
Biao Leng
ViT
82
0
0
03 Apr 2025
Semi-Supervised 360 Layout Estimation with Panoramic Collaborative Perturbations
Junsong Zhang
Chunyu Lin
Zhijie Shen
Lang Nie
K. Liao
Yao Zhao
35
0
0
03 Mar 2025
VRM: Knowledge Distillation via Virtual Relation Matching
W. Zhang
Fei Xie
Weidong Cai
Chao Ma
76
0
0
28 Feb 2025
Low-Rank Thinning
Annabelle Michael Carrell
Albert Gong
Abhishek Shetty
Raaz Dwivedi
Lester W. Mackey
58
0
0
17 Feb 2025
PolaFormer: Polarity-aware Linear Attention for Vision Transformers
Weikang Meng
Yadan Luo
Xin Li
D. Jiang
Zheng Zhang
145
0
0
25 Jan 2025
Parallel Sequence Modeling via Generalized Spatial Propagation Network
Hongjun Wang
Wonmin Byeon
Jiarui Xu
Jinwei Gu
Ka Chun Cheung
Xiaolong Wang
Kai Han
Jan Kautz
Sifei Liu
149
0
0
21 Jan 2025
VMamba: Visual State Space Model
Yue Liu
Yunjie Tian
Yuzhong Zhao
Hongtian Yu
Lingxi Xie
Yaowei Wang
Qixiang Ye
Jianbin Jiao
Yunfan Liu
Mamba
152
611
0
31 Dec 2024
CMAL: A Novel Cross-Modal Associative Learning Framework for Vision-Language Pre-Training
Zhiyuan Ma
Jianjun Li
Guohui Li
Kaiyan Huang
VLM
56
9
0
16 Oct 2024
MoH: Multi-Head Attention as Mixture-of-Head Attention
Peng Jin
Bo Zhu
Li Yuan
Shuicheng Yan
MoE
31
13
0
15 Oct 2024
FabGPT: An Efficient Large Multimodal Model for Complex Wafer Defect Knowledge Queries
Yuqi Jiang
Xudong Lu
Qian Jin
Qi Sun
Hanming Wu
Cheng Zhuo
36
5
0
15 Jul 2024
Learning Motion Blur Robust Vision Transformers with Dynamic Early Exit for Real-Time UAV Tracking
You Wu
Xucheng Wang
Dan Zeng
Hengzhou Ye
Xiaolan Xie
Qijun Zhao
Shuiwang Li
35
3
0
07 Jul 2024
Improving robustness to corruptions with multiplicative weight perturbations
Trung Trinh
Markus Heinonen
Luigi Acerbi
Samuel Kaski
41
0
0
24 Jun 2024
Predicting Probabilities of Error to Combine Quantization and Early Exiting: QuEE
Florence Regol
Joud Chataoui
Bertrand Charpentier
Mark J. Coates
Pablo Piantanida
Stephan Gunnemann
39
0
0
20 Jun 2024
Adaptively Bypassing Vision Transformer Blocks for Efficient Visual Tracking
Xiangyang Yang
Dan Zeng
Xucheng Wang
You Wu
Hengzhou Ye
Qijun Zhao
Shuiwang Li
59
3
0
12 Jun 2024
A DeNoising FPN With Transformer R-CNN for Tiny Object Detection
Hou-I Liu
Yu-Wen Tseng
Kai-Cheng Chang
Pin-Jyun Wang
Hong-Han Shuai
Wen-Huang Cheng
ViT
ObjD
40
24
0
09 Jun 2024
Enhancing Efficiency in Vision Transformer Networks: Design Techniques and Insights
Moein Heidari
Reza Azad
Sina Ghorbani Kolahi
René Arimond
Leon Niggemeier
...
Afshin Bozorgpour
Ehsan Khodapanah Aghdam
A. Kazerouni
I. Hacihaliloglu
Dorit Merhof
43
7
0
28 Mar 2024
Boosting Transferability in Vision-Language Attacks via Diversification along the Intersection Region of Adversarial Trajectory
Sensen Gao
Xiaojun Jia
Xuhong Ren
Ivor Tsang
Qing-Wu Guo
AAML
38
14
0
19 Mar 2024
HIRI-ViT: Scaling Vision Transformer with High Resolution Inputs
Ting Yao
Yehao Li
Yingwei Pan
Tao Mei
ViT
25
15
0
18 Mar 2024
Segmentation Guided Sparse Transformer for Under-Display Camera Image Restoration
Jingyun Xue
Tao Wang
Jun Wang
Kaihao Zhang
ViT
43
2
0
09 Mar 2024
LUM-ViT: Learnable Under-sampling Mask Vision Transformer for Bandwidth Limited Optical Signal Acquisition
Lingfeng Liu
Dong Ni
Hangjie Yuan
ViT
29
0
0
03 Mar 2024
Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling
Mahdi Karami
Ali Ghodsi
VLM
42
6
0
28 Feb 2024
FViT: A Focal Vision Transformer with Gabor Filter
Yulong Shi
Mingwei Sun
Yongshuai Wang
Rui Wang
52
4
0
17 Feb 2024
Learning Low-Rank Feature for Thorax Disease Classification
Rajeev Goel
Utkarsh Nath
Yancheng Wang
Alvin C. Silva
Teresa Wu
Yingzhen Yang
22
0
0
14 Feb 2024
DeSparsify: Adversarial Attack Against Token Sparsification Mechanisms in Vision Transformers
Oryan Yehezkel
Alon Zolfi
Amit Baras
Yuval Elovici
A. Shabtai
AAML
29
0
0
04 Feb 2024
Parameter-Efficient Fine-Tuning for Pre-Trained Vision Models: A Survey
Yi Xin
Jianjiang Yang
Haodi Zhou
Junlong Du
Junlong Du
Yue Fan
Qing Li
Qing Li
Yuntao Du
VLM
70
75
0
03 Feb 2024
CascadedGaze: Efficiency in Global Context Extraction for Image Restoration
Amirhosein Ghasemabadi
Muhammad Kamran Janjua
Mohammad Salameh
Chunhua Zhou
Fengyu Sun
Di Niu
32
11
0
26 Jan 2024
Setting the Record Straight on Transformer Oversmoothing
G. Dovonon
M. Bronstein
Matt J. Kusner
22
5
0
09 Jan 2024
360 Layout Estimation via Orthogonal Planes Disentanglement and Multi-view Geometric Consistency Perception
Zhijie Shen
Chunyu Lin
Junsong Zhang
Lang Nie
K. Liao
Yao Zhao
28
5
0
26 Dec 2023
A Survey on Open-Set Image Recognition
Jiaying Sun
Qiulei Dong
BDL
ObjD
32
3
0
25 Dec 2023
Video Recognition in Portrait Mode
Mingfei Han
Linjie Yang
Xiaojie Jin
Jiashi Feng
Xiaojun Chang
Heng Wang
30
3
0
21 Dec 2023
Graph Convolutions Enrich the Self-Attention in Transformers!
Jeongwhan Choi
Hyowon Wi
Jayoung Kim
Yehjin Shin
Kookjin Lee
Nathaniel Trask
Noseong Park
25
4
0
07 Dec 2023
SCHEME: Scalable Channel Mixer for Vision Transformers
Deepak Sridhar
Yunsheng Li
Nuno Vasconcelos
33
0
0
01 Dec 2023
QuadraNet: Improving High-Order Neural Interaction Efficiency with Hardware-Aware Quadratic Neural Networks
Chenhui Xu
Fuxun Yu
Zirui Xu
Chenchen Liu
Jinjun Xiong
Xiang Chen
33
4
0
29 Nov 2023
Improved TokenPose with Sparsity
Anning Li
ViT
34
0
0
16 Nov 2023
Rotation Invariant Transformer for Recognizing Object in UAVs
Shuo Chen
Mang Ye
Bo Du
ViT
32
18
0
05 Nov 2023
Improving Robustness for Vision Transformer with a Simple Dynamic Scanning Augmentation
Shashank Kotyan
Danilo Vasconcellos Vargas
ViT
22
2
0
01 Nov 2023
Minimalist and High-Performance Semantic Segmentation with Plain Vision Transformers
Yuanduo Hong
Jue Wang
Weichao Sun
Huihui Pan
VLM
ViT
37
7
0
19 Oct 2023
EViT: An Eagle Vision Transformer with Bi-Fovea Self-Attention
Yulong Shi
Mingwei Sun
Yongshuai Wang
Hui Sun
Zengqiang Chen
31
4
0
10 Oct 2023
Low-Resolution Self-Attention for Semantic Segmentation
Yu-Huan Wu
Shi-Chen Zhang
Yun-Hai Liu
Le Zhang
Xin Zhan
Daquan Zhou
Jiashi Feng
Ming-Ming Cheng
Liangli Zhen
ViT
40
3
0
08 Oct 2023
SCT: A Simple Baseline for Parameter-Efficient Fine-Tuning via Salient Channels
Henry Hengyuan Zhao
Pichao Wang
Yuyang Zhao
Hao Luo
F. Wang
Mike Zheng Shou
ViT
34
14
0
15 Sep 2023
Interpretability-Aware Vision Transformer
Yao Qiang
Chengyin Li
Prashant Khanduri
D. Zhu
ViT
82
7
0
14 Sep 2023
SwinFace: A Multi-task Transformer for Face Recognition, Expression Recognition, Age Estimation and Attribute Estimation
Lixiong Qin
Mei Wang
Chao Deng
K. Wang
Xiangshan Chen
Jiani Hu
Weihong Deng
CVBM
ViT
29
38
0
22 Aug 2023
MGMAE: Motion Guided Masking for Video Masked Autoencoding
Bingkun Huang
Zhiyu Zhao
Guozhen Zhang
Yu Qiao
Limin Wang
28
30
0
21 Aug 2023
A survey on deep learning in medical image registration: new technologies, uncertainty, evaluation metrics, and beyond
Junyu Chen
Yihao Liu
Shuwen Wei
Zhangxing Bian
Shalini Subramanian
A. Carass
Jerry L. Prince
Yong Du
OOD
39
36
0
28 Jul 2023
Set-level Guidance Attack: Boosting Adversarial Transferability of Vision-Language Pre-training Models
Dong Lu
Zhiqiang Wang
Teng Wang
Weili Guan
Hongchang Gao
Feng Zheng
AAML
51
65
0
26 Jul 2023
1
2
3
4
5
6
7
8
Next