Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2106.03180
Cited By
Vision Transformers with Hierarchical Attention
6 June 2021
Yun-Hai Liu
Yu-Huan Wu
Guolei Sun
Le Zhang
Ajad Chhatkuli
Luc Van Gool
ViT
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Vision Transformers with Hierarchical Attention"
15 / 15 papers shown
Title
Image Recognition with Online Lightweight Vision Transformer: A Survey
Zherui Zhang
Rongtao Xu
Jie Zhou
Changwei Wang
Xingtian Pei
...
Jiguang Zhang
Li Guo
Longxiang Gao
W. Xu
Shibiao Xu
ViT
130
0
0
06 May 2025
MMTL-UniAD: A Unified Framework for Multimodal and Multi-Task Learning in Assistive Driving Perception
Wenzhuo Liu
Wenshuo Wang
Yicheng Qiao
Qiannan Guo
Jiayin Zhu
...
Huiming Yang
Zhiwei Li
Lening Wang
Tiao Tan
Huaping Liu
48
1
0
03 Apr 2025
Fraesormer: Learning Adaptive Sparse Transformer for Efficient Food Recognition
Shun Zou
Yi Zou
Mingya Zhang
Shipeng Luo
Zhihao Chen
Guangwei Gao
ViT
48
0
0
15 Mar 2025
Treat Stillness with Movement: Remote Sensing Change Detection via Coarse-grained Temporal Foregrounds Mining
Xixi Wang
Zitian Wang
Jingtao Jiang
Lan Chen
Xiao Wang
Bo Jiang
VGen
25
0
0
15 Aug 2024
RingMo-lite: A Remote Sensing Multi-task Lightweight Network with CNN-Transformer Hybrid Framework
Yuelei Wang
Ting Zhang
Liangjin Zhao
Lin Hu
Zhechao Wang
...
Kaiqiang Chen
Xuan Zeng
Zhirui Wang
Hongqi Wang
Xian Sun
22
4
0
16 Sep 2023
Robust Principles: Architectural Design Principles for Adversarially Robust CNNs
Sheng-Hsuan Peng
Weilin Xu
Cory Cornelius
Matthew Hull
Kevin Li
Rahul Duggal
Mansi Phute
Jason Martin
Duen Horng Chau
AAML
13
46
0
30 Aug 2023
Learning Local and Global Temporal Contexts for Video Semantic Segmentation
Guolei Sun
Yun Liu
Henghui Ding
Min Wu
Luc Van Gool
25
32
0
07 Apr 2022
6D-ViT: Category-Level 6D Object Pose Estimation via Transformer-based Instance Representation Learning
Lu Zou
Zhangjin Huang
Naijie Gu
Guoping Wang
ViT
23
45
0
10 Oct 2021
P2T: Pyramid Pooling Transformer for Scene Understanding
Yu-Huan Wu
Yun-Hai Liu
Xin Zhan
Mingg-Ming Cheng
ViT
24
219
0
22 Jun 2021
MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin
N. Houlsby
Alexander Kolesnikov
Lucas Beyer
Xiaohua Zhai
...
Andreas Steiner
Daniel Keysers
Jakob Uszkoreit
Mario Lucic
Alexey Dosovitskiy
259
2,603
0
04 May 2021
Transformer in Transformer
Kai Han
An Xiao
Enhua Wu
Jianyuan Guo
Chunjing Xu
Yunhe Wang
ViT
284
1,524
0
27 Feb 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
272
3,622
0
24 Feb 2021
Bottleneck Transformers for Visual Recognition
A. Srinivas
Tsung-Yi Lin
Niki Parmar
Jonathon Shlens
Pieter Abbeel
Ashish Vaswani
SLR
275
979
0
27 Jan 2021
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
Andrew G. Howard
Menglong Zhu
Bo Chen
Dmitry Kalenichenko
Weijun Wang
Tobias Weyand
M. Andreetto
Hartwig Adam
3DH
950
20,561
0
17 Apr 2017
Aggregated Residual Transformations for Deep Neural Networks
Saining Xie
Ross B. Girshick
Piotr Dollár
Z. Tu
Kaiming He
294
10,216
0
16 Nov 2016
1