2006.04768
Cited By
Linformer: Self-Attention with Linear Complexity
8 June 2020
Sinong Wang
Belinda Z. Li
Madian Khabsa
Han Fang
Hao Ma
Papers citing "Linformer: Self-Attention with Linear Complexity"
50 / 645 papers shown
WavSpA: Wavelet Space Attention for Boosting Transformers' Long Sequence Learning Ability
Yufan Zhuang
Zihan Wang
Fangbo Tao
Jingbo Shang
ViT
AI4TS
30
3
0
05 Oct 2022
Grouped self-attention mechanism for a memory-efficient Transformer
Bumjun Jung
Yusuke Mukuta
Tatsuya Harada
AI4TS
12
3
0
02 Oct 2022
E-Branchformer: Branchformer with Enhanced merging for speech recognition
Kwangyoun Kim
Felix Wu
Yifan Peng
Jing Pan
Prashant Sridhar
Kyu Jeong Han
Shinji Watanabe
50
105
0
30 Sep 2022
ConvRNN-T: Convolutional Augmented Recurrent Neural Network Transducers for Streaming Speech Recognition
Martin H. Radfar
Rohit Barnwal
R. Swaminathan
Feng-Ju Chang
Grant P. Strimel
Nathan Susanj
Athanasios Mouchtaris
26
13
0
29 Sep 2022
Dynamic MDETR: A Dynamic Multimodal Transformer Decoder for Visual Grounding
Fengyuan Shi
Ruopeng Gao
Weilin Huang
Limin Wang
17
23
0
28 Sep 2022
Liquid Structural State-Space Models
Ramin Hasani
Mathias Lechner
Tsun-Hsuan Wang
Makram Chahine
Alexander Amini
Daniela Rus
AI4TS
97
95
0
26 Sep 2022
Adapting Pretrained Text-to-Text Models for Long Text Sequences
Wenhan Xiong
Anchit Gupta
Shubham Toshniwal
Yashar Mehdad
Wen-tau Yih
RALM
VLM
49
30
0
21 Sep 2022
Adaptable Butterfly Accelerator for Attention-based NNs via Hardware and Algorithm Co-design
Hongxiang Fan
Thomas C. P. Chau
Stylianos I. Venieris
Royson Lee
Alexandros Kouris
Wayne Luk
Nicholas D. Lane
Mohamed S. Abdelfattah
29
56
0
20 Sep 2022
Integrative Feature and Cost Aggregation with Transformers for Dense Correspondence
Sunghwan Hong
Seokju Cho
Seung Wook Kim
Stephen Lin
3DV
42
4
0
19 Sep 2022
Efficient Quantized Sparse Matrix Operations on Tensor Cores
Shigang Li
Kazuki Osawa
Torsten Hoefler
72
31
0
14 Sep 2022
Learning with Local Gradients at the Edge
M. Lomnitz
Z. Daniels
David C. Zhang
M. Piacentino
18
1
0
17 Aug 2022
Deep is a Luxury We Don't Have
Ahmed Taha
Yen Nhi Truong Vu
Brent Mombourquette
Thomas P. Matthews
Jason Su
Sadanand Singh
ViT
MedIm
18
2
0
11 Aug 2022
Sublinear Time Algorithm for Online Weighted Bipartite Matching
Han Hu
Zhao-quan Song
Runzhou Tao
Zhaozhuo Xu
Junze Yin
Danyang Zhuo
16
7
0
05 Aug 2022
Momentum Transformer: Closing the Performance Gap Between Self-attention and Its Linearization
T. Nguyen
Richard G. Baraniuk
Robert M. Kirby
Stanley J. Osher
Bao Wang
21
9
0
01 Aug 2022
Cost Aggregation with 4D Convolutional Swin Transformer for Few-Shot Segmentation
Sunghwan Hong
Seokju Cho
Jisu Nam
Stephen Lin
Seung Wook Kim
ViT
19
122
0
22 Jul 2022
Recurrent Memory Transformer
Aydar Bulatov
Yuri Kuratov
Mikhail Burtsev
CLL
11
101
0
14 Jul 2022
Eliminating Gradient Conflict in Reference-based Line-Art Colorization
Zekun Li
Zhengyang Geng
Zhao Kang
Wenyu Chen
Yibo Yang
13
35
0
13 Jul 2022
Multi-Behavior Hypergraph-Enhanced Transformer for Sequential Recommendation
Yuhao Yang
Chao Huang
Lianghao Xia
Yuxuan Liang
Yanwei Yu
Chenliang Li
HAI
13
118
0
12 Jul 2022
Beyond Transfer Learning: Co-finetuning for Action Localisation
Anurag Arnab
Xuehan Xiong
A. Gritsenko
Rob Romijnders
Josip Djolonga
Mostafa Dehghani
Chen Sun
Mario Lucic
Cordelia Schmid
25
8
0
08 Jul 2022
Device-Cloud Collaborative Recommendation via Meta Controller
Jiangchao Yao
Feng Wang
Xichen Ding
Shaohu Chen
Bo Han
Jingren Zhou
Hongxia Yang
23
17
0
07 Jul 2022
Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding
Yifan Peng
Siddharth Dalmia
Ian Lane
Shinji Watanabe
19
142
0
06 Jul 2022
Don't Pay Attention to the Noise: Learning Self-supervised Representations of Light Curves with a Denoising Time Series Transformer
M. Morvan
N. Nikolaou
K. H. Yip
Ingo P. Waldmann
AI4TS
86
9
0
06 Jul 2022
Pure Transformers are Powerful Graph Learners
Jinwoo Kim
Tien Dat Nguyen
Seonwoo Min
Sungjun Cho
Moontae Lee
Honglak Lee
Seunghoon Hong
32
187
0
06 Jul 2022
Long Range Language Modeling via Gated State Spaces
Harsh Mehta
Ankit Gupta
Ashok Cutkosky
Behnam Neyshabur
Mamba
26
231
0
27 Jun 2022
EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm
Jiangning Zhang
Xiangtai Li
Yabiao Wang
Chengjie Wang
Yibo Yang
Yong Liu
Dacheng Tao
ViT
30
32
0
19 Jun 2022
CMT-DeepLab: Clustering Mask Transformers for Panoptic Segmentation
Qihang Yu
Huiyu Wang
Dahun Kim
Siyuan Qiao
Maxwell D. Collins
Yukun Zhu
Hartwig Adam
Alan Yuille
Liang-Chieh Chen
ViT
MedIm
27
89
0
17 Jun 2022
SimA: Simple Softmax-free Attention for Vision Transformers
Soroush Abbasi Koohpayegani
Hamed Pirsiavash
14
25
0
17 Jun 2022
MMMNA-Net for Overall Survival Time Prediction of Brain Tumor Patients
Wen Tang
Haoyue Zhang
Pengxin Yu
Han Kang
Rongguo Zhang
28
6
0
13 Jun 2022
Bootstrapping Multi-view Representations for Fake News Detection
Qichao Ying
Xiaoxiao Hu
Yangming Zhou
Zhenxing Qian
Dan Zeng
Shiming Ge
22
45
0
12 Jun 2022
GateHUB: Gated History Unit with Background Suppression for Online Action Detection
Junwen Chen
Gaurav Mittal
Ye Yu
Yu Kong
Mei Chen
33
33
0
09 Jun 2022
Dynamic Linear Transformer for 3D Biomedical Image Segmentation
Zheyu Zhang
Ulas Bagci
ViT
MedIm
17
12
0
01 Jun 2022
Fair Comparison between Efficient Attentions
Jiuk Hong
Chaehyeon Lee
Soyoun Bang
Heechul Jung
17
1
0
01 Jun 2022
Chefs' Random Tables: Non-Trigonometric Random Features
Valerii Likhosherstov
K. Choromanski
Kumar Avinava Dubey
Frederick Liu
Tamás Sarlós
Adrian Weller
31
17
0
30 May 2022
Temporal Latent Bottleneck: Synthesis of Fast and Slow Processing Mechanisms in Sequence Learning
Aniket Didolkar
Kshitij Gupta
Anirudh Goyal
Nitesh B. Gundavarapu
Alex Lamb
Nan Rosemary Ke
Yoshua Bengio
AI4CE
110
17
0
30 May 2022
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
Tri Dao
Daniel Y. Fu
Stefano Ermon
Atri Rudra
Christopher Ré
VLM
56
2,020
0
27 May 2022
Leveraging Locality in Abstractive Text Summarization
Yixin Liu
Ansong Ni
Linyong Nan
Budhaditya Deb
Chenguang Zhu
Ahmed Hassan Awadallah
Dragomir R. Radev
21
18
0
25 May 2022
Recipe for a General, Powerful, Scalable Graph Transformer
Ladislav Rampášek
Mikhail Galkin
Vijay Prakash Dwivedi
A. Luu
Guy Wolf
Dominique Beaini
48
511
0
25 May 2022
Dynamic Query Selection for Fast Visual Perceiver
Corentin Dancette
Matthieu Cord
25
1
0
22 May 2022
Sampling Is All You Need on Modeling Long-Term User Behaviors for CTR Prediction
Yue Cao
Xiaojiang Zhou
Jiaqi Feng
Peihao Huang
Yao Xiao
Dayao Chen
Sheng Chen
82
39
0
20 May 2022
Vision Transformer: Vit and its Derivatives
Zujun Fu
ViT
33
6
0
12 May 2022
EdgeViTs: Competing Light-weight CNNs on Mobile Devices with Vision Transformers
Junting Pan
Adrian Bulat
Fuwen Tan
Xiatian Zhu
L. Dudziak
Hongsheng Li
Georgios Tzimiropoulos
Brais Martínez
ViT
23
180
0
06 May 2022
LayoutBERT: Masked Language Layout Model for Object Insertion
Kerem Turgutlu
Sanatan Sharma
J. Kumar
VLM
DiffM
17
2
0
30 Apr 2022
Depth Estimation with Simplified Transformer
John Yang
Le An
Anurag Dixit
Jinkyu Koo
Su Inn Park
MDE
28
21
0
28 Apr 2022
ClusterGNN: Cluster-based Coarse-to-Fine Graph Neural Network for Efficient Feature Matching
Yanxing Shi
Junxiong Cai
Yoli Shavit
Tai-Jiang Mu
Wensen Feng
Kai Zhang
GNN
19
77
0
25 Apr 2022
Paramixer: Parameterizing Mixing Links in Sparse Factors Works Better than Dot-Product Self-Attention
Tong Yu
Ruslan Khalitov
Lei Cheng
Zhirong Yang
MoE
21
10
0
22 Apr 2022
Efficient Linear Attention for Fast and Accurate Keypoint Matching
Suwichaya Suwanwimolkul
S. Komorita
3DPC
3DV
19
11
0
16 Apr 2022
A Call for Clarity in Beam Search: How It Works and When It Stops
Jungo Kasai
Keisuke Sakaguchi
Ronan Le Bras
Dragomir R. Radev
Yejin Choi
Noah A. Smith
24
6
0
11 Apr 2022
Few-Shot Forecasting of Time-Series with Heterogeneous Channels
L. Brinkmeyer
Rafael Rêgo Drumond
Johannes Burchert
Lars Schmidt-Thieme
AI4TS
17
7
0
07 Apr 2022
Accelerating Attention through Gradient-Based Learned Runtime Pruning
Zheng Li
Soroush Ghodrati
Amir Yazdanbakhsh
H. Esmaeilzadeh
Mingu Kang
17
16
0
07 Apr 2022
ECLIPSE: Efficient Long-range Video Retrieval using Sight and Sound
Yan-Bo Lin
Jie Lei
Mohit Bansal
Gedas Bertasius
31
39
0
06 Apr 2022