arXiv: 2007.04825
Fast Transformers with Clustered Attention
9 July 2020
Apoorv Vyas, Angelos Katharopoulos, François Fleuret
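As a rough illustration of the clustered-attention idea named in the title, the sketch below groups the N queries into C clusters, computes ordinary softmax attention once per cluster centroid, and broadcasts each centroid's output to the cluster's members, reducing the attention cost from O(N^2) to O(C·N). This is a minimal sketch under assumptions: plain k-means stands in for the paper's clustering procedure, and the paper's improved variant (re-attending over the strongest keys per cluster) is not shown.

```python
# Minimal NumPy sketch of clustered attention (illustrative, not the
# paper's exact algorithm): attend once per query-cluster centroid,
# then broadcast the centroid's output to every query in the cluster.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def kmeans(X, n_clusters, iters=10, seed=0):
    """Plain k-means; stands in for the paper's clustering step."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), n_clusters, replace=False)].copy()
    for _ in range(iters):
        # Assign each row to its nearest centroid.
        dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assign = dist.argmin(1)
        for j in range(n_clusters):
            if (assign == j).any():
                centroids[j] = X[assign == j].mean(0)
    return assign, centroids

def clustered_attention(Q, K, V, n_clusters=8):
    """O(C*N) attention: one softmax attention per centroid, not per query."""
    d = Q.shape[-1]
    assign, centroids = kmeans(Q, n_clusters)
    A = softmax(centroids @ K.T / np.sqrt(d))   # (C, N) attention weights
    out_c = A @ V                               # (C, d_v) one output per cluster
    return out_c[assign]                        # broadcast to cluster members

Q = np.random.randn(128, 32)
K = np.random.randn(128, 32)
V = np.random.randn(128, 32)
print(clustered_attention(Q, K, V, n_clusters=8).shape)  # (128, 32)
```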
Papers citing "Fast Transformers with Clustered Attention" (showing 50 of 58)
CATANet: Efficient Content-Aware Token Aggregation for Lightweight Image Super-Resolution
Xin Liu, Jie Liu, J. Tang, Gangshan Wu · SupR, ViT · 10 Mar 2025
Rethinking Transformer for Long Contextual Histopathology Whole Slide Image Analysis
Honglin Li, Yunlong Zhang, Pingyi Chen, Zhongyi Shui, Chenglu Zhu, Lin Yang · MedIm · 18 Oct 2024
ENACT: Entropy-based Clustering of Attention Input for Reducing the Computational Needs of Object Detection Transformers
Giorgos Savathrakis, Antonis Argyros · ViT · 11 Sep 2024
CLIP-Decoder: ZeroShot Multilabel Classification using Multimodal CLIP Aligned Representation
Muhammad Ali, Salman Khan · VLM · 21 Jun 2024
Computation and Parameter Efficient Multi-Modal Fusion Transformer for Cued Speech Recognition
Lei Liu, Li Liu, Haizhou Li · 31 Jan 2024
Fast Multipole Attention: A Divide-and-Conquer Attention Mechanism for Long Sequences
Yanming Kang, Giang Tran, H. Sterck · 18 Oct 2023
MSViT: Dynamic Mixed-Scale Tokenization for Vision Transformers
Jakob Drachmann Havtorn, Amelie Royer, Tijmen Blankevoort, B. Bejnordi · 05 Jul 2023
The emergence of clusters in self-attention dynamics
Borjan Geshkovski, Cyril Letrouit, Yury Polyanskiy, Philippe Rigollet · 09 May 2023
Unlimiformer: Long-Range Transformers with Unlimited Length Input
Amanda Bertsch, Uri Alon, Graham Neubig, Matthew R. Gormley · RALM · 02 May 2023
Efficient Long Sequence Modeling via State Space Augmented Transformer
Simiao Zuo, Xiaodong Liu, Jian Jiao, Denis Xavier Charles, Eren Manavoglu, Tuo Zhao, Jianfeng Gao · 15 Dec 2022
A Survey on Artificial Intelligence for Music Generation: Agents, Domains and Perspectives
Carlos Hernandez-Olivan, Javier Hernandez-Olivan, J. R. Beltrán · MGen · 25 Oct 2022
Diffuser: Efficient Transformers with Multi-hop Attention Diffusion for Long Sequences
Aosong Feng, Irene Li, Yuang Jiang, Rex Ying · 21 Oct 2022
Museformer: Transformer with Fine- and Coarse-Grained Attention for Music Generation
Botao Yu, Peiling Lu, Rui Wang, Wei Hu, Xu Tan, Wei Ye, Shikun Zhang, Tao Qin, Tie-Yan Liu · MGen · 19 Oct 2022
CAB: Comprehensive Attention Benchmarking on Long Sequence Modeling
Jinchao Zhang, Shuyang Jiang, Jiangtao Feng, Lin Zheng, Dianbo Sui · 3DV · 14 Oct 2022
Expediting Large-Scale Vision Transformer for Dense Prediction without Fine-tuning
Weicong Liang, Yuhui Yuan, Henghui Ding, Xiao Luo, Weihong Lin, Ding Jia, Zheng Zhang, Chao Zhang, Han Hu · 03 Oct 2022
TagRec++: Hierarchical Label Aware Attention Network for Question Categorization
Venktesh V, Mukesh Mohania, Vikram Goyal · BDL · 10 Aug 2022
Momentum Transformer: Closing the Performance Gap Between Self-attention and Its Linearization
T. Nguyen, Richard G. Baraniuk, Robert M. Kirby, Stanley J. Osher, Bao Wang · 01 Aug 2022
Attention and Self-Attention in Random Forests
Lev V. Utkin, A. Konstantinov · 09 Jul 2022
FL-Tuning: Layer Tuning for Feed-Forward Network in Transformer
Jingping Liu, Yuqiu Song, Kui Xue, Hongli Sun, Chao Wang, Lihan Chen, Haiyun Jiang, Jiaqing Liang, Tong Ruan · 30 Jun 2022
Long Range Language Modeling via Gated State Spaces
Harsh Mehta, Ankit Gupta, Ashok Cutkosky, Behnam Neyshabur · Mamba · 27 Jun 2022
Online Segmentation of LiDAR Sequences: Dataset and Algorithm
Romain Loiseau, Mathieu Aubry, Loïc Landrieu · 3DPC · 16 Jun 2022
Separable Self-attention for Mobile Vision Transformers
Sachin Mehta, Mohammad Rastegari · ViT, MQ · 06 Jun 2022
OnePose: One-Shot Object Pose Estimation without CAD Models
Jiaming Sun, Zihao Wang, Siyu Zhang, Xingyi He, Hongcheng Zhao, Guofeng Zhang, Xiaowei Zhou · 24 May 2022
Attention Mechanism in Neural Networks: Where it Comes and Where it Goes
Derya Soydaner · 3DV · 27 Apr 2022
Visual Attention Methods in Deep Learning: An In-Depth Survey
Mohammed Hassanin, Saeed Anwar, Ibrahim Radwan, Fahad Shahbaz Khan, Ajmal Mian · 16 Apr 2022
A Call for Clarity in Beam Search: How It Works and When It Stops
Jungo Kasai, Keisuke Sakaguchi, Ronan Le Bras, Dragomir R. Radev, Yejin Choi, Noah A. Smith · 11 Apr 2022
Voxel Set Transformer: A Set-to-Set Approach to 3D Object Detection from Point Clouds
Chenhang He, Ruihuang Li, Shuai Li, Lei Zhang · ViT, 3DPC · 19 Mar 2022
cosFormer: Rethinking Softmax in Attention
Zhen Qin, Weixuan Sun, Huicai Deng, Dongxu Li, Yunshen Wei, Baohong Lv, Junjie Yan, Lingpeng Kong, Yiran Zhong · 17 Feb 2022
Flowformer: Linearizing Transformers with Conservation Flows
Haixu Wu, Jialong Wu, Jiehui Xu, Jianmin Wang, Mingsheng Long · 13 Feb 2022
GLassoformer: A Query-Sparse Transformer for Post-Fault Power Grid Voltage Prediction
Yunling Zheng, Carson Hu, Guang Lin, Meng Yue, Bao Wang, Jack Xin · 22 Jan 2022
Transformer Uncertainty Estimation with Hierarchical Stochastic Attention
Jiahuan Pei, Cheng-Yu Wang, Gyuri Szarvas · 27 Dec 2021
Efficient Visual Tracking with Exemplar Transformers
Philippe Blatter, Menelaos Kanakis, Martin Danelljan, Luc Van Gool · ViT · 17 Dec 2021
A deep language model to predict metabolic network equilibria
François Charton, Amaury Hayat, Sean T. McQuade, Nathaniel J. Merrill, B. Piccoli · GNN · 07 Dec 2021
Linear algebra with transformers
François Charton · AIMat · 03 Dec 2021
Octree Transformer: Autoregressive 3D Shape Generation on Hierarchically Structured Sequences
Moritz Ibing, Gregor Kobsik, Leif Kobbelt · 24 Nov 2021
Token Pooling in Vision Transformers
D. Marin, Jen-Hao Rick Chang, Anurag Ranjan, Anish K. Prabhu, Mohammad Rastegari, Oncel Tuzel · ViT · 08 Oct 2021
ABC: Attention with Bounded-memory Control
Hao Peng, Jungo Kasai, Nikolaos Pappas, Dani Yogatama, Zhaofeng Wu, Lingpeng Kong, Roy Schwartz, Noah A. Smith · 06 Oct 2021
Predicting Attention Sparsity in Transformers
Marcos Vinícius Treviso, António Góis, Patrick Fernandes, E. Fonseca, André F. T. Martins · 24 Sep 2021
∞-former: Infinite Memory Transformer
Pedro Henrique Martins, Zita Marinho, André F. T. Martins · 01 Sep 2021
FMMformer: Efficient and Flexible Transformer via Decomposed Near-field and Far-field Attention
T. Nguyen, Vai Suliafu, Stanley J. Osher, Long Chen, Bao Wang · 05 Aug 2021
Grid Partitioned Attention: Efficient Transformer Approximation with Inductive Bias for High Resolution Detail Generation
Nikolay Jetchev, Gökhan Yildirim, Christian Bracher, Roland Vollgraf · 08 Jul 2021
Learned Token Pruning for Transformers
Sehoon Kim, Sheng Shen, D. Thorsley, A. Gholami, Woosuk Kwon, Joseph Hassoun, Kurt Keutzer · 02 Jul 2021
Stable, Fast and Accurate: Kernelized Attention with Relative Positional Encoding
Shengjie Luo, Shanda Li, Tianle Cai, Di He, Dinglan Peng, Shuxin Zheng, Guolin Ke, Liwei Wang, Tie-Yan Liu · 23 Jun 2021
Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation
Lei Ke, Xia Li, Martin Danelljan, Yu-Wing Tai, Chi-Keung Tang, Feng Yu · VOS · 22 Jun 2021
Memory-efficient Transformers via Top-k Attention
Ankit Gupta, Guy Dar, Shaya Goodman, David Ciprut, Jonathan Berant · MQ · 13 Jun 2021
A Survey of Transformers
Tianyang Lin, Yuxin Wang, Xiangyang Liu, Xipeng Qiu · ViT · 08 Jun 2021
On the Expressive Power of Self-Attention Matrices
Valerii Likhosherstov, K. Choromanski, Adrian Weller · 07 Jun 2021
Container: Context Aggregation Network
Peng Gao, Jiasen Lu, Hongsheng Li, Roozbeh Mottaghi, Aniruddha Kembhavi · ViT · 02 Jun 2021
FNet: Mixing Tokens with Fourier Transforms
James Lee-Thorp, Joshua Ainslie, Ilya Eckstein, Santiago Ontanon · 09 May 2021
Attention for Image Registration (AiR): an unsupervised Transformer approach
Zihao Wang, H. Delingette · ViT, MedIm · 05 May 2021