ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.
On the Relationship between Self-Attention and Convolutional Layers
8 November 2019
Jean-Baptiste Cordonnier
Andreas Loukas
Martin Jaggi

Papers citing "On the Relationship between Self-Attention and Convolutional Layers"

50 / 95 papers shown
Vision-LSTM: xLSTM as Generic Vision Backbone
Benedikt Alkin
M. Beck
Korbinian Poppel
Sepp Hochreiter
Johannes Brandstetter
VLM
58
43
0
24 Feb 2025
Enhancing Video Understanding: Deep Neural Networks for Spatiotemporal Analysis
Amir Hosein Fadaei
M. Dehaqani
42
0
0
11 Feb 2025
Unified CNNs and transformers underlying learning mechanism reveals multi-head attention modus vivendi
Ella Koresh
Ronit D. Gross
Yuval Meir
Yarden Tzach
Tal Halevi
Ido Kanter
ViT
46
0
0
22 Jan 2025
Approximation Rate of the Transformer Architecture for Sequence Modeling
Hao Jiang
Qianxiao Li
48
9
0
03 Jan 2025
Semantics Prompting Data-Free Quantization for Low-Bit Vision Transformers
Yunshan Zhong
Yuyao Zhou
Yuxin Zhang
Shen Li
Yong Li
Fei Chao
Zhanpeng Zeng
Rongrong Ji
MQ
94
0
0
31 Dec 2024
LPViT: Low-Power Semi-structured Pruning for Vision Transformers
Kaixin Xu
Zhe Wang
Chunyun Chen
Xue Geng
Jie Lin
Xulei Yang
Min-man Wu
Min Wu
Xiaoli Li
Weisi Lin
ViT
VLM
43
7
0
02 Jul 2024
DistilDoc: Knowledge Distillation for Visually-Rich Document Applications
Jordy Van Landeghem
Subhajit Maity
Ayan Banerjee
Matthew Blaschko
Marie-Francine Moens
Josep Lladós
Sanket Biswas
41
2
0
12 Jun 2024
Grounding Continuous Representations in Geometry: Equivariant Neural Fields
David R. Wessels
David M. Knigge
Samuele Papa
Riccardo Valperga
Sharvaree P. Vadgama
E. Gavves
Erik J. Bekkers
40
7
0
09 Jun 2024
SpiralMLP: A Lightweight Vision MLP Architecture
Haojie Mu
Burhan Ul Tayyab
Nicholas Chua
43
0
0
31 Mar 2024
A Hyper-Transformer model for Controllable Pareto Front Learning with Split Feasibility Constraints
Tran Anh Tuan
Nguyen Viet Dung
Tran Ngoc Thang
31
3
0
04 Feb 2024
Integrating Human Vision Perception in Vision Transformers for Classifying Waste Items
Akshat Shrivastava
Tapan K. Gandhi
21
1
0
19 Dec 2023
Concurrent ischemic lesion age estimation and segmentation of CT brain using a Transformer-based network
A. Marcus
P. Bentley
Daniel Rueckert
MedIm
13
9
0
21 Jun 2023
Preserving Locality in Vision Transformers for Class Incremental Learning
Bowen Zheng
Da-Wei Zhou
Han-Jia Ye
De-Chuan Zhan
CLL
19
5
0
14 Apr 2023
Inductive biases in deep learning models for weather prediction
Jannik Thümmel
Matthias Karlbauer
S. Otte
C. Zarfl
Georg Martius
...
Thomas Scholten
Ulrich Friedrich
V. Wulfmeyer
B. Goswami
Martin Volker Butz
AI4CE
38
5
0
06 Apr 2023
Transformer-based Multi-Instance Learning for Weakly Supervised Object Detection
Zhaofei Wang
Weijia Zhang
Min-Ling Zhang
ViT
WSOD
15
3
0
27 Mar 2023
Self-attention in Vision Transformers Performs Perceptual Grouping, Not Attention
Paria Mehrani
John K. Tsotsos
25
24
0
02 Mar 2023
STOA-VLP: Spatial-Temporal Modeling of Object and Action for Video-Language Pre-training
Weihong Zhong
Mao Zheng
Duyu Tang
Xuan Luo
Heng Gong
Xiaocheng Feng
Bing Qin
27
8
0
20 Feb 2023
VITAL: Vision Transformer Neural Networks for Accurate Smartphone Heterogeneity Resilient Indoor Localization
Danish Gufran
Saideep Tiku
S. Pasricha
17
11
0
18 Feb 2023
From paintbrush to pixel: A review of deep neural networks in AI-generated art
Anne-Sofie Maerten
Derya Soydaner
34
22
0
14 Feb 2023
A Theoretical Understanding of Shallow Vision Transformers: Learning, Generalization, and Sample Complexity
Hongkang Li
M. Wang
Sijia Liu
Pin-Yu Chen
ViT
MLT
35
56
0
12 Feb 2023
Scaling Vision Transformers to 22 Billion Parameters
Mostafa Dehghani
Josip Djolonga
Basil Mustafa
Piotr Padlewski
Jonathan Heek
...
Mario Luvcić
Xiaohua Zhai
Daniel Keysers
Jeremiah Harmsen
N. Houlsby
MLLM
61
569
0
10 Feb 2023
A Survey on Efficient Training of Transformers
Bohan Zhuang
Jing Liu
Zizheng Pan
Haoyu He
Yuetian Weng
Chunhua Shen
20
47
0
02 Feb 2023
Enhancing Face Recognition with Latent Space Data Augmentation and Facial Posture Reconstruction
Soroush Hashemifar
Abdolreza Marefat
Javad Hassannataj Joloudari
H. Hassanpour
CVBM
21
11
0
27 Jan 2023
Part-guided Relational Transformers for Fine-grained Visual Recognition
Yifan Zhao
Jia Li
Xiaowu Chen
Yonghong Tian
ViT
29
34
0
28 Dec 2022
A Close Look at Spatial Modeling: From Attention to Convolution
Xu Ma
Huan Wang
Can Qin
Kunpeng Li
Xing Zhao
Jie Fu
Yun Fu
ViT
3DPC
17
11
0
23 Dec 2022
Complex-Valued Time-Frequency Self-Attention for Speech Dereverberation
Vinay Kothapally
John H. L. Hansen
23
9
0
22 Nov 2022
Prompt Tuning for Parameter-efficient Medical Image Segmentation
Marc Fischer
Alexander Bartler
Bin Yang
SSeg
14
18
0
16 Nov 2022
Attention-based Neural Cellular Automata
Mattie Tesfaldet
Derek Nowrouzezahrai
C. Pal
ViT
29
17
0
02 Nov 2022
Multi-Viewpoint and Multi-Evaluation with Felicitous Inductive Bias Boost Machine Abstract Reasoning Ability
Qinglai Wei
Diancheng Chen
Beiming Yuan
32
10
0
26 Oct 2022
Delving into Masked Autoencoders for Multi-Label Thorax Disease Classification
Junfei Xiao
Yutong Bai
Alan Yuille
Zongwei Zhou
MedIm
ViT
32
58
0
23 Oct 2022
Feature Embedding by Template Matching as a ResNet Block
Ada Gorgun
Y. Z. Gürbüz
Aydin Alatan
20
1
0
03 Oct 2022
Deep is a Luxury We Don't Have
Ahmed Taha
Yen Nhi Truong Vu
Brent Mombourquette
Thomas P. Matthews
Jason Su
Sadanand Singh
ViT
MedIm
18
2
0
11 Aug 2022
A Length Adaptive Algorithm-Hardware Co-design of Transformer on FPGA Through Sparse Attention and Dynamic Pipelining
Hongwu Peng
Shaoyi Huang
Shiyang Chen
Bingbing Li
Tong Geng
...
Weiwen Jiang
Wujie Wen
J. Bi
Hang Liu
Caiwen Ding
45
54
0
07 Aug 2022
Pure Transformers are Powerful Graph Learners
Jinwoo Kim
Tien Dat Nguyen
Seonwoo Min
Sungjun Cho
Moontae Lee
Honglak Lee
Seunghoon Hong
38
187
0
06 Jul 2022
EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm
Jiangning Zhang
Xiangtai Li
Yabiao Wang
Chengjie Wang
Yibo Yang
Yong Liu
Dacheng Tao
ViT
30
32
0
19 Jun 2022
A Survey on Deep Learning for Skin Lesion Segmentation
Z. Mirikharaji
Kumar Abhishek
Alceu Bissoto
Catarina Barata
Sandra Avila
Eduardo Valle
M. Celebi
Ghassan Hamarneh
31
82
0
01 Jun 2022
Attention Mechanism in Neural Networks: Where it Comes and Where it Goes
Derya Soydaner
3DV
36
149
0
27 Apr 2022
VL-InterpreT: An Interactive Visualization Tool for Interpreting Vision-Language Transformers
Estelle Aflalo
Meng Du
Shao-Yen Tseng
Yongfei Liu
Chenfei Wu
Nan Duan
Vasudev Lal
23
45
0
30 Mar 2022
Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs
Xiaohan Ding
X. Zhang
Yi Zhou
Jungong Han
Guiguang Ding
Jian-jun Sun
VLM
47
528
0
13 Mar 2022
The Principle of Diversity: Training Stronger Vision Transformers Calls for Reducing All Levels of Redundancy
Tianlong Chen
Zhenyu (Allen) Zhang
Yu Cheng
Ahmed Hassan Awadallah
Zhangyang Wang
ViT
33
37
0
12 Mar 2022
ChiTransformer: Towards Reliable Stereo from Cues
Qing Su
Shihao Ji
MDE
ViT
16
12
0
09 Mar 2022
Characterizing Renal Structures with 3D Block Aggregate Transformers
Xin Yu
Yucheng Tang
Yinchi Zhou
Riqiang Gao
Qi Yang
...
Yuankai Huo
Zhoubing Xu
Thomas A. Lasko
R. Abramson
Bennett A. Landman
MedIm
ViT
21
3
0
04 Mar 2022
How Do Vision Transformers Work?
Namuk Park
Songkuk Kim
ViT
25
465
0
14 Feb 2022
Query Efficient Decision Based Sparse Attacks Against Black-Box Deep Learning Models
Viet Vo
Ehsan Abbasnejad
D. Ranasinghe
AAML
22
14
0
31 Jan 2022
UniFormer: Unifying Convolution and Self-attention for Visual Recognition
Kunchang Li
Yali Wang
Junhao Zhang
Peng Gao
Guanglu Song
Yu Liu
Hongsheng Li
Yu Qiao
ViT
144
361
0
24 Jan 2022
Representing Long-Range Context for Graph Neural Networks with Global Attention
Zhanghao Wu
Paras Jain
Matthew A. Wright
Azalia Mirhoseini
Joseph E. Gonzalez
Ion Stoica
GNN
35
258
0
21 Jan 2022
Real-World Graph Convolution Networks (RW-GCNs) for Action Recognition in Smart Video Surveillance
Justin Sanchez
Christopher Neff
Hamed Tabkhi
GNN
30
9
0
15 Jan 2022
UniFormer: Unified Transformer for Efficient Spatiotemporal Representation Learning
Kunchang Li
Yali Wang
Peng Gao
Guanglu Song
Yu Liu
Hongsheng Li
Yu Qiao
ViT
35
238
0
12 Jan 2022
Deep ViT Features as Dense Visual Descriptors
Shirzad Amir
Yossi Gandelsman
Shai Bagon
Tali Dekel
MDE
ViT
36
271
0
10 Dec 2021
Couplformer: Rethinking Vision Transformer with Coupling Attention Map
Hai Lan
Xihao Wang
Xian Wei
ViT
26
3
0
10 Dec 2021