2004.13621
Cited By
Exploring Self-attention for Image Recognition
28 April 2020
Hengshuang Zhao
Jiaya Jia
V. Koltun
SSL
Papers citing
"Exploring Self-attention for Image Recognition"
50 / 316 papers shown
Title
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
Hassan Akbari
Liangzhe Yuan
Rui Qian
Wei-Hong Chuang
Shih-Fu Chang
Yin Cui
Boqing Gong
ViT
248
577
0
22 Apr 2021
Variational Relational Point Completion Network
Liang Pan
Xinyi Chen
Zhongang Cai
Junzhe Zhang
Haiyu Zhao
Shuai Yi
Ziwei Liu
3DPC
195
176
0
20 Apr 2021
HoughNet: Integrating near and long-range evidence for visual detection
Nermin Samet
Samet Hicsonmez
Emre Akbas
ObjD
21
10
0
14 Apr 2021
Co-Scale Conv-Attentional Image Transformers
Weijian Xu
Yifan Xu
Tyler A. Chang
Z. Tu
ViT
11
373
0
13 Apr 2021
GAttANet: Global attention agreement for convolutional neural networks
R. V. Rullen
A. Alamia
ViT
13
2
0
12 Apr 2021
Fine-Grained Attention for Weakly Supervised Object Localization
Junghyo Sohn
Eunjin Jeon
Wonsik Jung
Eunsong Kang
Heung-Il Suk
WSOL
16
3
0
11 Apr 2021
SVDistNet: Self-Supervised Near-Field Distance Estimation on Surround View Fisheye Cameras
Varun Ravi Kumar
Marvin Klingner
S. Yogamani
Markus Bach
Stefan Milz
Tim Fingscheidt
Patrick Mäder
MDE
48
37
0
09 Apr 2021
Capturing Multi-Resolution Context by Dilated Self-Attention
Niko Moritz
Takaaki Hori
Jonathan Le Roux
11
7
0
07 Apr 2021
An Empirical Study of Training Self-Supervised Vision Transformers
Xinlei Chen
Saining Xie
Kaiming He
ViT
37
1,801
0
05 Apr 2021
LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference
Ben Graham
Alaaeldin El-Nouby
Hugo Touvron
Pierre Stock
Armand Joulin
Hervé Jégou
Matthijs Douze
ViT
11
768
0
02 Apr 2021
VisQA: X-raying Vision and Language Reasoning in Transformers
Theo Jaunet
Corentin Kervadec
Romain Vuillemot
G. Antipov
M. Baccouche
Christian Wolf
8
26
0
02 Apr 2021
Going deeper with Image Transformers
Hugo Touvron
Matthieu Cord
Alexandre Sablayrolles
Gabriel Synnaeve
Hervé Jégou
ViT
25
986
0
31 Mar 2021
Dual Contrastive Loss and Attention for GANs
Ning Yu
Guilin Liu
Aysegül Dündar
Andrew Tao
Bryan Catanzaro
Larry S. Davis
Mario Fritz
GAN
22
60
0
31 Mar 2021
CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification
Chun-Fu Chen
Quanfu Fan
Rameswar Panda
ViT
28
1,420
0
27 Mar 2021
TS-CAM: Token Semantic Coupled Attention Map for Weakly Supervised Object Localization
Wei Gao
Fang Wan
Xingjia Pan
Zhiliang Peng
Qi Tian
Zhenjun Han
Bolei Zhou
QiXiang Ye
ViT
WSOL
12
198
0
27 Mar 2021
Understanding Robustness of Transformers for Image Classification
Srinadh Bhojanapalli
Ayan Chakrabarti
Daniel Glasner
Daliang Li
Thomas Unterthiner
Andreas Veit
ViT
14
378
0
26 Mar 2021
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng-Wei Zhang
Stephen Lin
B. Guo
ViT
124
20,677
0
25 Mar 2021
Vision Transformers for Dense Prediction
René Ranftl
Alexey Bochkovskiy
V. Koltun
ViT
MDE
36
1,659
0
24 Mar 2021
Scaling Local Self-Attention for Parameter Efficient Visual Backbones
Ashish Vaswani
Prajit Ramachandran
A. Srinivas
Niki Parmar
Blake A. Hechtman
Jonathon Shlens
16
395
0
23 Mar 2021
Instance-level Image Retrieval using Reranking Transformers
Fuwen Tan
Jiangbo Yuan
Vicente Ordonez
ViT
21
89
0
22 Mar 2021
DeepViT: Towards Deeper Vision Transformer
Daquan Zhou
Bingyi Kang
Xiaojie Jin
Linjie Yang
Xiaochen Lian
Zihang Jiang
Qibin Hou
Jiashi Feng
ViT
42
510
0
22 Mar 2021
Incorporating Convolution Designs into Visual Transformers
Kun Yuan
Shaopeng Guo
Ziwei Liu
Aojun Zhou
F. Yu
Wei Wu
ViT
24
467
0
22 Mar 2021
Involution: Inverting the Inherence of Convolution for Visual Recognition
Duo Li
Jie Hu
Changhu Wang
Xiangtai Li
Qi She
Lei Zhu
Tong Zhang
Qifeng Chen
BDL
15
304
0
10 Mar 2021
ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis
Yinan He
Bei Gan
Siyu Chen
Yichun Zhou
Guojun Yin
Luchuan Song
Lu Sheng
Jing Shao
Ziwei Liu
AAML
24
129
0
09 Mar 2021
Perceiver: General Perception with Iterative Attention
Andrew Jaegle
Felix Gimeno
Andrew Brock
Andrew Zisserman
Oriol Vinyals
João Carreira
VLM
ViT
MDE
48
973
0
04 Mar 2021
Generative Adversarial Transformers
Drew A. Hudson
C. L. Zitnick
ViT
23
179
0
01 Mar 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
274
3,622
0
24 Feb 2021
Model-Attentive Ensemble Learning for Sequence Modeling
Victor D. Bourgin
Ioana Bica
M. Schaar
AI4TS
15
0
0
23 Feb 2021
UniT: Multimodal Multitask Learning with a Unified Transformer
Ronghang Hu
Amanpreet Singh
ViT
14
295
0
22 Feb 2021
Hard-Attention for Scalable Image Classification
Athanasios Papadopoulos
Pawel Korus
N. Memon
62
25
0
20 Feb 2021
LambdaNetworks: Modeling Long-Range Interactions Without Attention
Irwan Bello
267
179
0
17 Feb 2021
OmniDet: Surround View Cameras based Multi-task Visual Perception Network for Autonomous Driving
Varun Ravi Kumar
S. Yogamani
Hazem Rashed
Ganesh Sistu
Christian Witt
Isabelle Leang
Stefan Milz
Patrick Mäder
23
90
0
15 Feb 2021
Learning Self-Similarity in Space and Time as Generalized Motion for Video Action Recognition
Heeseung Kwon
Manjin Kim
Suha Kwak
Minsu Cho
TTA
19
39
0
14 Feb 2021
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
280
1,981
0
09 Feb 2021
Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet
Li-xin Yuan
Yunpeng Chen
Tao Wang
Weihao Yu
Yujun Shi
Zihang Jiang
Francis E. H. Tay
Jiashi Feng
Shuicheng Yan
ViT
6
1,904
0
28 Jan 2021
Bottleneck Transformers for Visual Recognition
A. Srinivas
Tsung-Yi Lin
Niki Parmar
Jonathon Shlens
Pieter Abbeel
Ashish Vaswani
SLR
290
979
0
27 Jan 2021
Shape or Texture: Understanding Discriminative Features in CNNs
Md. Amirul Islam
M. Kowal
Patrick Esser
Sen Jia
Bjorn Ommer
Konstantinos G. Derpanis
Neil D. B. Bruce
14
75
0
27 Jan 2021
Weakly-supervised Video Anomaly Detection with Robust Temporal Feature Magnitude Learning
Yu Tian
Guansong Pang
Yuanhong Chen
Rajvinder Singh
Johan W. Verjans
G. Carneiro
AI4TS
13
291
0
25 Jan 2021
SSTVOS: Sparse Spatiotemporal Transformers for Video Object Segmentation
Brendan Duke
Abdalla Ahmed
Christian Wolf
P. Aarabi
Graham W. Taylor
VOS
14
165
0
21 Jan 2021
Context-aware Attentional Pooling (CAP) for Fine-grained Visual Classification
Ardhendu Behera
Zachary Wharton
Pradeep Ruwan Padmasiri Galbokka Hewage
Asish Bera
59
108
0
17 Jan 2021
Transformers in Vision: A Survey
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
F. Khan
M. Shah
ViT
227
2,428
0
04 Jan 2021
Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers
Sixiao Zheng
Jiachen Lu
Hengshuang Zhao
Xiatian Zhu
Zekun Luo
...
Yanwei Fu
Jianfeng Feng
Tao Xiang
Philip H. S. Torr
Li Zhang
ViT
17
2,837
0
31 Dec 2020
Attention-based Image Upsampling
Souvik Kundu
Hesham Mostafa
S. N. Sridhar
Sairam Sundaresan
SupR
11
10
0
17 Dec 2020
Point Transformer
Hengshuang Zhao
Li Jiang
Jiaya Jia
Philip H. S. Torr
V. Koltun
3DPC
ViT
25
11
0
16 Dec 2020
Responsible Disclosure of Generative Models Using Scalable Fingerprinting
Ning Yu
Vladislav Skripniuk
Dingfan Chen
Larry S. Davis
Mario Fritz
WIGM
35
89
0
16 Dec 2020
Fine-grained Angular Contrastive Learning with Coarse Labels
Guy Bukchin
Eli Schwartz
Kate Saenko
Ori Shahar
Rogerio Feris
Raja Giryes
Leonid Karlinsky
27
52
0
07 Dec 2020
Deep Learning and the Global Workspace Theory
R. V. Rullen
Ryota Kanai
37
65
0
04 Dec 2020
Pre-Trained Image Processing Transformer
Hanting Chen
Yunhe Wang
Tianyu Guo
Chang Xu
Yiping Deng
Zhenhua Liu
Siwei Ma
Chunjing Xu
Chao Xu
Wen Gao
VLM
ViT
37
1,632
0
01 Dec 2020
Deeper or Wider Networks of Point Clouds with Self-attention?
Haoxi Ran
Li Lu
3DPC
19
1
0
29 Nov 2020
Reflective-Net: Learning from Explanations
Johannes Schneider
Michalis Vlachos
FAtt
OffRL
LRM
52
18
0
27 Nov 2020