ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2203.10790
  4. Cited By
ScalableViT: Rethinking the Context-oriented Generalization of Vision
  Transformer

ScalableViT: Rethinking the Context-oriented Generalization of Vision Transformer

21 March 2022
Rui Yang
Hailong Ma
Jie Wu
Yansong Tang
Xuefeng Xiao
Min Zheng
Xiu Li
    ViT
ArXivPDFHTML

Papers citing "ScalableViT: Rethinking the Context-oriented Generalization of Vision Transformer"

37 / 37 papers shown
Title
FusDreamer: Label-efficient Remote Sensing World Model for Multimodal Data Classification
FusDreamer: Label-efficient Remote Sensing World Model for Multimodal Data Classification
J. Wang
Weiwei Song
Hao Chen
J. Ren
Huimin Zhao
62
0
0
18 Mar 2025
HaploVL: A Single-Transformer Baseline for Multi-Modal Understanding
HaploVL: A Single-Transformer Baseline for Multi-Modal Understanding
Rui Yang
Lin Song
Yicheng Xiao
Runhui Huang
Yixiao Ge
Ying Shan
Hengshuang Zhao
MLLM
62
0
0
12 Mar 2025
Breaking the Low-Rank Dilemma of Linear Attention
Breaking the Low-Rank Dilemma of Linear Attention
Qihang Fan
Huaibo Huang
Ran He
28
0
0
12 Nov 2024
DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing
  Attention
DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention
Nguyen Huu Bao Long
Chenyu Zhang
Yuzhi Shi
Tsubasa Hirakawa
Takayoshi Yamashita
Tohgoroh Matsui
H. Fujiyoshi
26
2
0
11 Oct 2024
Embedding-Free Transformer with Inference Spatial Reduction for
  Efficient Semantic Segmentation
Embedding-Free Transformer with Inference Spatial Reduction for Efficient Semantic Segmentation
Hyunwoo Yu
Yubin Cho
Beoungwoo Kang
Seunghun Moon
Kyeongbo Kong
Suk-Ju Kang
28
2
0
24 Jul 2024
Semantic Equitable Clustering: A Simple, Fast and Effective Strategy for
  Vision Transformer
Semantic Equitable Clustering: A Simple, Fast and Effective Strategy for Vision Transformer
Qihang Fan
Huaibo Huang
Mingrui Chen
Ran He
39
3
0
22 May 2024
Vision Transformer with Sparse Scan Prior
Vision Transformer with Sparse Scan Prior
Qihang Fan
Huaibo Huang
Mingrui Chen
Ran He
ViT
36
4
0
22 May 2024
HIRI-ViT: Scaling Vision Transformer with High Resolution Inputs
HIRI-ViT: Scaling Vision Transformer with High Resolution Inputs
Ting Yao
Yehao Li
Yingwei Pan
Tao Mei
ViT
23
14
0
18 Mar 2024
FViT: A Focal Vision Transformer with Gabor Filter
FViT: A Focal Vision Transformer with Gabor Filter
Yulong Shi
Mingwei Sun
Yongshuai Wang
Rui Wang
47
4
0
17 Feb 2024
Cross-level Attention with Overlapped Windows for Camouflaged Object
  Detection
Cross-level Attention with Overlapped Windows for Camouflaged Object Detection
Jiepan Li
Fangxiao Lu
Nan Xue
Zhuo Li
Hongyan Zhang
Wei He
25
2
0
28 Nov 2023
Bitformer: An efficient Transformer with bitwise operation-based
  attention for Big Data Analytics at low-cost low-precision devices
Bitformer: An efficient Transformer with bitwise operation-based attention for Big Data Analytics at low-cost low-precision devices
Gaoxiang Duan
Junkai Zhang
Xiaoying Zheng
Yongxin Zhu
14
2
0
22 Nov 2023
TransXNet: Learning Both Global and Local Dynamics with a Dual Dynamic Token Mixer for Visual Recognition
TransXNet: Learning Both Global and Local Dynamics with a Dual Dynamic Token Mixer for Visual Recognition
Meng Lou
Hong-Yu Zhou
Sibei Yang
Yizhou Yu
Chuan Wu
Yizhou Yu
ViT
31
35
0
30 Oct 2023
EViT: An Eagle Vision Transformer with Bi-Fovea Self-Attention
EViT: An Eagle Vision Transformer with Bi-Fovea Self-Attention
Yulong Shi
Mingwei Sun
Yongshuai Wang
Hui Sun
Zengqiang Chen
29
3
0
10 Oct 2023
UniHead: Unifying Multi-Perception for Detection Heads
UniHead: Unifying Multi-Perception for Detection Heads
Hantao Zhou
Rui Yang
Yachao Zhang
Haoran Duan
Yawen Huang
R. Hu
Xiu Li
Yefeng Zheng
16
12
0
23 Sep 2023
RMT: Retentive Networks Meet Vision Transformers
RMT: Retentive Networks Meet Vision Transformers
Qihang Fan
Huaibo Huang
Mingrui Chen
Hongmin Liu
Ran He
ViT
30
73
0
20 Sep 2023
DAT++: Spatially Dynamic Vision Transformer with Deformable Attention
DAT++: Spatially Dynamic Vision Transformer with Deformable Attention
Zhuofan Xia
Xuran Pan
Shiji Song
Li Erran Li
Gao Huang
ViT
19
22
0
04 Sep 2023
Vision Backbone Enhancement via Multi-Stage Cross-Scale Attention
Vision Backbone Enhancement via Multi-Stage Cross-Scale Attention
Liang Shang
Yanli Liu
Zhengyang Lou
Shuxue Quan
N. Adluru
Bochen Guan
W. Sethares
16
1
0
10 Aug 2023
Lightweight Vision Transformer with Bidirectional Interaction
Lightweight Vision Transformer with Bidirectional Interaction
Qihang Fan
Huaibo Huang
Xiaoqiang Zhou
Ran He
ViT
27
27
0
01 Jun 2023
GPT4Tools: Teaching Large Language Model to Use Tools via
  Self-instruction
GPT4Tools: Teaching Large Language Model to Use Tools via Self-instruction
Rui Yang
Lin Song
Yanwei Li
Sijie Zhao
Yixiao Ge
Xiu Li
Ying Shan
SyDa
MLLM
21
207
0
30 May 2023
Weakly-Supervised Concealed Object Segmentation with SAM-based Pseudo
  Labeling and Multi-scale Feature Grouping
Weakly-Supervised Concealed Object Segmentation with SAM-based Pseudo Labeling and Multi-scale Feature Grouping
Chunming He
Kai Li
Yachao Zhang
Guoxia Xu
Longxiang Tang
Yulun Zhang
Z. Guo
Xiu Li
12
90
0
18 May 2023
An End-to-End Network for Upright Adjustment of Panoramic Images
An End-to-End Network for Upright Adjustment of Panoramic Images
Heyu Chen
Jianfeng Li
Shigang Li
13
2
0
12 Apr 2023
Spectral Enhanced Rectangle Transformer for Hyperspectral Image
  Denoising
Spectral Enhanced Rectangle Transformer for Hyperspectral Image Denoising
Miaoyu Li
Ji Liu
Ying Fu
Yulun Zhang
Dejing Dou
ViT
8
55
0
03 Apr 2023
APPT : Asymmetric Parallel Point Transformer for 3D Point Cloud
  Understanding
APPT : Asymmetric Parallel Point Transformer for 3D Point Cloud Understanding
Hengjia Li
Tu Zheng
Zhihao Chi
Zheng Yang
Wenxiao Wang
Boxi Wu
Binbin Lin
Deng Cai
3DPC
30
1
0
31 Mar 2023
BoxSnake: Polygonal Instance Segmentation with Box Supervision
BoxSnake: Polygonal Instance Segmentation with Box Supervision
Rui Yang
Lin Song
Yixiao Ge
Xiu Li
ISeg
11
18
0
21 Mar 2023
BiFormer: Vision Transformer with Bi-Level Routing Attention
BiFormer: Vision Transformer with Bi-Level Routing Attention
Lei Zhu
Xinjiang Wang
Zhanghan Ke
Wayne Zhang
Rynson W. H. Lau
123
438
0
15 Mar 2023
CrossFormer++: A Versatile Vision Transformer Hinging on Cross-scale
  Attention
CrossFormer++: A Versatile Vision Transformer Hinging on Cross-scale Attention
Wenxiao Wang
Wei Chen
Qibo Qiu
Long Chen
Boxi Wu
Binbin Lin
Xiaofei He
Wei Liu
22
38
0
13 Mar 2023
SSGD: A smartphone screen glass dataset for defect detection
SSGD: A smartphone screen glass dataset for defect detection
Haonan Han
Rui Yang
Shuyan Li
R. Hu
Xiu Li
19
10
0
12 Mar 2023
Recursive Generalization Transformer for Image Super-Resolution
Recursive Generalization Transformer for Image Super-Resolution
Zheng Chen
Yulun Zhang
Jinjin Gu
L. Kong
Xiaokang Yang
ViT
21
27
0
11 Mar 2023
Masked autoencoders are effective solution to transformer data-hungry
Masked autoencoders are effective solution to transformer data-hungry
Jia-ju Mao
Honggu Zhou
Xuesong Yin
Binling Nie
MedIm
19
5
0
12 Dec 2022
Next-ViT: Next Generation Vision Transformer for Efficient Deployment in
  Realistic Industrial Scenarios
Next-ViT: Next Generation Vision Transformer for Efficient Deployment in Realistic Industrial Scenarios
Jiashi Li
Xin Xia
W. Li
Huixia Li
Xing Wang
Xuefeng Xiao
Rui Wang
Min Zheng
Xin Pan
ViT
8
145
0
12 Jul 2022
UniInst: Unique Representation for End-to-End Instance Segmentation
UniInst: Unique Representation for End-to-End Instance Segmentation
Yimin Ou
Rui Yang
Lufan Ma
Yong Liu
Jiangpeng Yan
Shang Xu
Chengjie Wang
Xiu Li
ISeg
20
7
0
25 May 2022
Scalable and Efficient Training of Large Convolutional Neural Networks
  with Differential Privacy
Scalable and Efficient Training of Large Convolutional Neural Networks with Differential Privacy
Zhiqi Bu
J. Mao
Shiyun Xu
131
47
0
21 May 2022
Transformer in Transformer
Transformer in Transformer
Kai Han
An Xiao
Enhua Wu
Jianyuan Guo
Chunjing Xu
Yunhe Wang
ViT
282
1,518
0
27 Feb 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction
  without Convolutions
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
263
3,604
0
24 Feb 2021
How Much Position Information Do Convolutional Neural Networks Encode?
How Much Position Information Do Convolutional Neural Networks Encode?
Md. Amirul Islam
Sen Jia
Neil D. B. Bruce
SSL
194
343
0
22 Jan 2020
Aggregated Residual Transformations for Deep Neural Networks
Aggregated Residual Transformations for Deep Neural Networks
Saining Xie
Ross B. Girshick
Piotr Dollár
Z. Tu
Kaiming He
261
10,196
0
16 Nov 2016
Semantic Understanding of Scenes through the ADE20K Dataset
Semantic Understanding of Scenes through the ADE20K Dataset
Bolei Zhou
Hang Zhao
Xavier Puig
Tete Xiao
Sanja Fidler
Adela Barriuso
Antonio Torralba
SSeg
249
1,821
0
18 Aug 2016
1