Vision Transformer with Super Token Sampling

21 November 2022
Huaibo Huang
Xiaoqiang Zhou
Jie Cao
Xiao-Yu Zhang
Tieniu Tan
    ViT
ArXiv (abs) | PDF | HTML | GitHub (142★)

Papers citing "Vision Transformer with Super Token Sampling"

28 / 28 papers shown
Title
Where Do Tokens Go? Understanding Pruning Behaviors in STEP at High Resolutions
Michal Szczepanski
Martyna Poreba
Karim Haroun
ViT
120
0
0
17 Sep 2025
Rectifying Magnitude Neglect in Linear Attention
Qihang Fan
Huaibo Huang
Yuang Ai
Xiao-Yu Zhang
271
4
0
01 Jul 2025
DiCo: Revitalizing ConvNets for Scalable and Efficient Diffusion Modeling
Yuang Ai
Qihang Fan
Xuefeng Hu
Zhenheng Yang
Xiao-Yu Zhang
Huaibo Huang
DiffM
310
1
0
16 May 2025
Delving Deep into Semantic Relation Distillation
Zhaoyi Yan
Kangjun Liu
Qixiang Ye
197
2
0
27 Mar 2025
FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute
Computer Vision and Pattern Recognition (CVPR), 2025
Sotiris Anagnostidis
Gregor Bachmann
Yeongmin Kim
Jonas Kohler
Markos Georgopoulos
A. Sanakoyeu
Yuming Du
Albert Pumarola
Ali K. Thabet
Edgar Schönfeld
311
3
0
27 Feb 2025
AllRestorer: All-in-One Transformer for Image Restoration under Composite Degradations
J. Mao
Yue Yang
Xuesong Yin
Ling Shao
Hao Tang
196
1
0
16 Nov 2024
Breaking the Low-Rank Dilemma of Linear Attention
Computer Vision and Pattern Recognition (CVPR), 2024
Qihang Fan
Huaibo Huang
Ran He
375
12
0
12 Nov 2024
PViT: Prior-augmented Vision Transformer for Out-of-distribution Detection
Tianhao Zhang
Zhixiang Chen
Lyudmila Mihaylova
679
3
0
27 Oct 2024
STA-Unet: Rethink the semantic redundant for Medical Imaging Segmentation
Vamsi Krishna Vasa
Wenhui Zhu
Xiwen Chen
Peijie Qiu
Xuanzhao Dong
Yalin Wang
ViT
136
1
0
13 Oct 2024
From Pixels to Objects: A Hierarchical Approach for Part and Object Segmentation Using Local and Global Aggregation
European Conference on Computer Vision (ECCV), 2024
Yunfei Xie
Cihang Xie
Alan Yuille
Jieru Mei
OCL
181
2
0
02 Sep 2024
LoG-VMamba: Local-Global Vision Mamba for Medical Image Segmentation
Trung Dang
Huy Hoang Nguyen
A. Tiulpin
Mamba
131
12
0
26 Aug 2024
CAS-ViT: Convolutional Additive Self-attention Vision Transformers for Efficient Mobile Applications
Tianfang Zhang
Lei Li
Yang Zhou
Wentao Liu
Chen Qian
Xiangyang Ji
ViT
179
66
0
07 Aug 2024
Dual-stage Hyperspectral Image Classification Model with Spectral Supertoken
Peifu Liu
Tingfa Xu
Jie Wang
Huan Chen
Huiyan Bai
Jianan Li
245
7
0
10 Jul 2024
ALGM: Adaptive Local-then-Global Token Merging for Efficient Semantic Segmentation with Plain Vision Transformers
Computer Vision and Pattern Recognition (CVPR), 2024
Narges Norouzi
Svetlana Orlova
Daan de Geus
Gijs Dubbelman
ViT
FedML
155
22
0
14 Jun 2024
DiTFastAttn: Attention Compression for Diffusion Transformer Models
Zhihang Yuan
Pu Lu
Hanling Zhang
Xuefei Ning
Linfeng Zhang
Tianchen Zhao
Shengen Yan
Guohao Dai
Yu Wang
216
70
0
12 Jun 2024
Vision Transformer with Sparse Scan Prior
Qihang Fan
Huaibo Huang
Mingrui Chen
ViT
349
7
0
22 May 2024
Semantic Equitable Clustering: A Simple and Effective Strategy for Clustering Vision Tokens
Qihang Fan
Huaibo Huang
Mingrui Chen
Ran He
308
2
0
22 May 2024
AgileFormer: Spatially Agile Transformer UNet for Medical Image Segmentation
Peijie Qiu
Jin Yang
Sayantan Kumar
S. Ghosh
Aristeidis Sotiras
MedIm
193
15
0
29 Mar 2024
ViTAR: Vision Transformer with Any Resolution
Qihang Fan
Quanzeng You
Xiaotian Han
Yongfei Liu
Yunzhe Tao
Huaibo Huang
Ran He
Hongxia Yang
ViT
277
20
0
27 Mar 2024
Heracles: A Hybrid SSM-Transformer Model for High-Resolution Image and Time-Series Analysis
Badri N. Patro
Suhas Ranganath
Vinay P. Namboodiri
Vijay Srinivas Agneeswaran
251
4
0
26 Mar 2024
Subobject-level Image Tokenization
Delong Chen
Samuel Cahyawijaya
Jianfeng Liu
Baoyuan Wang
Pascale Fung
VLM
OCL
577
19
0
22 Feb 2024
FViT: A Focal Vision Transformer with Gabor Filter
Yulong Shi
Mingwei Sun
Yongshuai Wang
Rui Wang
397
8
0
17 Feb 2024
SPFormer: Enhancing Vision Transformer with Superpixel Representation
Jieru Mei
Liang-Chieh Chen
Yaoyao Liu
Cihang Xie
ViT
MDE
256
13
0
05 Jan 2024
EViT: An Eagle Vision Transformer with Bi-Fovea Self-Attention
IEEE Transactions on Cybernetics (IEEE Trans. Cybern.), 2023
Yulong Shi
Mingwei Sun
Yongshuai Wang
Hui Sun
Zengqiang Chen
345
8
0
10 Oct 2023
DualToken-ViT: Position-aware Efficient Vision Transformer with Dual Token Fusion
SDM (SDM), 2023
Zhenzhen Chu
Jiayu Chen
Cen Chen
Chengyu Wang
Ziheng Wu
Jun Huang
Weining Qian
ViT
127
3
0
21 Sep 2023
RMT: Retentive Networks Meet Vision Transformers
Computer Vision and Pattern Recognition (CVPR), 2023
Qihang Fan
Huaibo Huang
Mingrui Chen
Hongmin Liu
Ran He
ViT
509
162
0
20 Sep 2023
DAT++: Spatially Dynamic Vision Transformer with Deformable Attention
Zhuofan Xia
Xuran Pan
Shiji Song
Li Erran Li
Gao Huang
ViT
209
40
0
04 Sep 2023
SpectFormer: Frequency and Attention is what you need in a Vision Transformer
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Badri N. Patro
Vinay P. Namboodiri
Vijay Srinivas Agneeswaran
ViT
160
90
0
13 Apr 2023