2103.10619
Cited By
Scalable Vision Transformers with Hierarchical Pooling
19 March 2021
Zizheng Pan
Bohan Zhuang
Jing Liu
Haoyu He
Jianfei Cai
ViT
Papers citing "Scalable Vision Transformers with Hierarchical Pooling"
28 / 78 papers shown
Title
Clinically-Inspired Multi-Agent Transformers for Disease Trajectory Forecasting from Multimodal Data
Huy Hoang Nguyen
Matthew B. Blaschko
S. Saarakkala
A. Tiulpin
MedIm
AI4CE
43
15
0
25 Oct 2022
FocusFormer: Focusing on What We Need via Architecture Sampler
Jing Liu
Jianfei Cai
Bohan Zhuang
17
7
0
23 Aug 2022
TransMatting: Enhancing Transparent Objects Matting with Transformers
Huanqia Cai
Fanglei Xue
Lele Xu
Lili Guo
ViT
9
20
0
05 Aug 2022
An Efficient Spatio-Temporal Pyramid Transformer for Action Detection
Yuetian Weng
Zizheng Pan
Mingfei Han
Xiaojun Chang
Bohan Zhuang
ViT
19
25
0
21 Jul 2022
Dynamic Spatial Sparsification for Efficient Vision Transformers and Convolutional Neural Networks
Yongming Rao
Zuyan Liu
Wenliang Zhao
Jie Zhou
Jiwen Lu
ViT
42
36
0
04 Jul 2022
Triformer: Triangular, Variable-Specific Attentions for Long Sequence Multivariate Time Series Forecasting--Full Version
Razvan-Gabriel Cirstea
Chenjuan Guo
B. Yang
Tung Kieu
Xuanyi Dong
Shirui Pan
AI4TS
21
106
0
28 Apr 2022
Attention Mechanism in Neural Networks: Where it Comes and Where it Goes
Derya Soydaner
3DV
28
149
0
27 Apr 2022
DearKD: Data-Efficient Early Knowledge Distillation for Vision Transformers
Xianing Chen
Qiong Cao
Yujie Zhong
Jing Zhang
Shenghua Gao
Dacheng Tao
ViT
15
76
0
27 Apr 2022
ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation
Yufei Xu
Jing Zhang
Qiming Zhang
Dacheng Tao
ViT
17
509
0
26 Apr 2022
Revealing Occlusions with 4D Neural Fields
Basile Van Hoorick
Purva Tendulkar
Dídac Surís
Dennis Park
Simon Stent
Carl Vondrick
16
16
0
22 Apr 2022
Dynamic Focus-aware Positional Queries for Semantic Segmentation
Haoyu He
Jianfei Cai
Zizheng Pan
Jing Liu
Jing Zhang
Dacheng Tao
Bohan Zhuang
29
16
0
04 Apr 2022
Pyramid-BERT: Reducing Complexity via Successive Core-set based Token Selection
Xin Huang
A. Khetan
Rene Bidart
Zohar S. Karnin
17
14
0
27 Mar 2022
Attribute Surrogates Learning and Spectral Tokens Pooling in Transformers for Few-shot Learning
Yang He
Weihan Liang
Dongyang Zhao
Hong-Yu Zhou
Weifeng Ge
Yizhou Yu
Wenqiang Zhang
ViT
17
45
0
17 Mar 2022
Unified Visual Transformer Compression
Shixing Yu
Tianlong Chen
Jiayi Shen
Huan Yuan
Jianchao Tan
Sen Yang
Ji Liu
Zhangyang Wang
ViT
12
91
0
15 Mar 2022
Deep Transformers Thirst for Comprehensive-Frequency Data
R. Xia
Chao Xue
Boyu Deng
Fang Wang
Jingchao Wang
ViT
17
0
0
14 Mar 2022
CF-ViT: A General Coarse-to-Fine Method for Vision Transformer
Mengzhao Chen
Mingbao Lin
Ke Li
Yunhang Shen
Yongjian Wu
Fei Chao
Rongrong Ji
ViT
38
59
0
08 Mar 2022
Multi-class Token Transformer for Weakly Supervised Semantic Segmentation
Lian Xu
Wanli Ouyang
Bennamoun
F. Boussaïd
Dan Xu
ViT
13
209
0
06 Mar 2022
Multi-Tailed Vision Transformer for Efficient Inference
Yunke Wang
Bo Du
Wenyuan Wang
Chang Xu
ViT
206
5
0
03 Mar 2022
SPViT: Enabling Faster Vision Transformers via Soft Token Pruning
Zhenglun Kong
Peiyan Dong
Xiaolong Ma
Xin Meng
Mengshu Sun
...
Geng Yuan
Bin Ren
Minghai Qin
H. Tang
Yanzhi Wang
ViT
21
141
0
27 Dec 2021
Adaptive Token Sampling For Efficient Vision Transformers
Mohsen Fayyaz
Soroush Abbasi Koohpayegani
F. Jafari
Sunando Sengupta
Hamid Reza Vaezi Joze
Eric Sommerlade
Hamed Pirsiavash
Juergen Gall
ViT
16
146
0
30 Nov 2021
Sparse Fusion for Multimodal Transformers
Yi Ding
Alex Rich
Mason Wang
Noah Stier
M. Turk
P. Sen
Tobias Höllerer
ViT
27
7
0
23 Nov 2021
Polyp-PVT: Polyp Segmentation with Pyramid Vision Transformers
B. Dong
Wenhai Wang
Deng-Ping Fan
Jinpeng Li
H. Fu
Ling Shao
ViT
MedIm
19
308
0
16 Aug 2021
Patch Slimming for Efficient Vision Transformers
Yehui Tang
Kai Han
Yunhe Wang
Chang Xu
Jianyuan Guo
Chao Xu
Dacheng Tao
ViT
11
163
0
05 Jun 2021
Gaze Estimation using Transformer
Yihua Cheng
Feng Lu
ViT
14
86
0
30 May 2021
End-to-end One-shot Human Parsing
Haoyu He
Bohan Zhuang
Jing Zhang
Jianfei Cai
Dacheng Tao
VLM
27
8
0
04 May 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
263
3,604
0
24 Feb 2021
Bottleneck Transformers for Visual Recognition
A. Srinivas
Tsung-Yi Lin
Niki Parmar
Jonathon Shlens
Pieter Abbeel
Ashish Vaswani
SLR
270
973
0
27 Jan 2021
Wider or Deeper: Revisiting the ResNet Model for Visual Recognition
Zifeng Wu
Chunhua Shen
A. Hengel
SSeg
243
1,489
0
30 Nov 2016