Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2002.11296
Cited By
Sparse Sinkhorn Attention
26 February 2020
Yi Tay
Dara Bahri
Liu Yang
Donald Metzler
Da-Cheng Juan
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Sparse Sinkhorn Attention"
29 / 79 papers shown
Title
Flowformer: Linearizing Transformers with Conservation Flows
Haixu Wu
Jialong Wu
Jiehui Xu
Jianmin Wang
Mingsheng Long
14
90
0
13 Feb 2022
FEDformer: Frequency Enhanced Decomposed Transformer for Long-term Series Forecasting
Tian Zhou
Ziqing Ma
Qingsong Wen
Xue Wang
Liang Sun
Rong Jin
AI4TS
21
1,303
0
30 Jan 2022
Classification of Long Sequential Data using Circular Dilated Convolutional Neural Networks
Lei Cheng
Ruslan Khalitov
Tong Yu
Zhirong Yang
25
32
0
06 Jan 2022
LongT5: Efficient Text-To-Text Transformer for Long Sequences
Mandy Guo
Joshua Ainslie
David C. Uthus
Santiago Ontanon
Jianmo Ni
Yun-hsuan Sung
Yinfei Yang
VLM
31
307
0
15 Dec 2021
AdaViT: Adaptive Tokens for Efficient Vision Transformer
Hongxu Yin
Arash Vahdat
J. Álvarez
Arun Mallya
Jan Kautz
Pavlo Molchanov
ViT
35
314
0
14 Dec 2021
Sinkformers: Transformers with Doubly Stochastic Attention
Michael E. Sander
Pierre Ablin
Mathieu Blondel
Gabriel Peyré
29
76
0
22 Oct 2021
Transformer Acceleration with Dynamic Sparse Attention
Liu Liu
Zheng Qu
Zhaodong Chen
Yufei Ding
Yuan Xie
19
20
0
21 Oct 2021
Token Pooling in Vision Transformers
D. Marin
Jen-Hao Rick Chang
Anurag Ranjan
Anish K. Prabhu
Mohammad Rastegari
Oncel Tuzel
ViT
76
66
0
08 Oct 2021
Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers
Yi Tay
Mostafa Dehghani
J. Rao
W. Fedus
Samira Abnar
Hyung Won Chung
Sharan Narang
Dani Yogatama
Ashish Vaswani
Donald Metzler
206
110
0
22 Sep 2021
Greenformers: Improving Computation and Memory Efficiency in Transformer Models via Low-Rank Approximation
Samuel Cahyawijaya
26
12
0
24 Aug 2021
Learning to Match Features with Seeded Graph Matching Network
Hongkai Chen
Zixin Luo
Jiahui Zhang
Lei Zhou
Xuyang Bai
Zeyu Hu
Chiew-Lan Tai
Long Quan
17
111
0
19 Aug 2021
Learned Token Pruning for Transformers
Sehoon Kim
Sheng Shen
D. Thorsley
A. Gholami
Woosuk Kwon
Joseph Hassoun
Kurt Keutzer
14
145
0
02 Jul 2021
IA-RED
2
^2
2
: Interpretability-Aware Redundancy Reduction for Vision Transformers
Bowen Pan
Rameswar Panda
Yi Ding
Zhangyang Wang
Rogerio Feris
A. Oliva
VLM
ViT
39
153
0
23 Jun 2021
A Survey of Transformers
Tianyang Lin
Yuxin Wang
Xiangyang Liu
Xipeng Qiu
ViT
50
1,088
0
08 Jun 2021
KVT: k-NN Attention for Boosting Vision Transformers
Pichao Wang
Xue Wang
F. Wang
Ming Lin
Shuning Chang
Hao Li
R. L. Jin
ViT
51
105
0
28 May 2021
T-EMDE: Sketching-based global similarity for cross-modal retrieval
Barbara Rychalska
Mikolaj Wieczorek
Jacek Dąbrowski
33
0
0
10 May 2021
FNet: Mixing Tokens with Fourier Transforms
James Lee-Thorp
Joshua Ainslie
Ilya Eckstein
Santiago Ontanon
26
517
0
09 May 2021
Multiscale Vision Transformers
Haoqi Fan
Bo Xiong
K. Mangalam
Yanghao Li
Zhicheng Yan
Jitendra Malik
Christoph Feichtenhofer
ViT
63
1,224
0
22 Apr 2021
Multi-Scale Vision Longformer: A New Vision Transformer for High-Resolution Image Encoding
Pengchuan Zhang
Xiyang Dai
Jianwei Yang
Bin Xiao
Lu Yuan
Lei Zhang
Jianfeng Gao
ViT
29
329
0
29 Mar 2021
Transformers in Vision: A Survey
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
F. Khan
M. Shah
ViT
227
2,430
0
04 Jan 2021
ERNIE-Doc: A Retrospective Long-Document Modeling Transformer
Siyu Ding
Junyuan Shang
Shuohuan Wang
Yu Sun
Hao Tian
Hua-Hong Wu
Haifeng Wang
71
52
0
31 Dec 2020
SMYRF: Efficient Attention using Asymmetric Clustering
Giannis Daras
Nikita Kitaev
Augustus Odena
A. Dimakis
28
44
0
11 Oct 2020
Efficient Transformers: A Survey
Yi Tay
Mostafa Dehghani
Dara Bahri
Donald Metzler
VLM
106
1,102
0
14 Sep 2020
Cluster-Former: Clustering-based Sparse Transformer for Long-Range Dependency Encoding
Shuohang Wang
Luowei Zhou
Zhe Gan
Yen-Chun Chen
Yuwei Fang
S. Sun
Yu Cheng
Jingjing Liu
43
28
0
13 Sep 2020
Sparsifying Transformer Models with Trainable Representation Pooling
Michal Pietruszka
Łukasz Borchmann
Lukasz Garncarek
17
10
0
10 Sep 2020
Conformer-Kernel with Query Term Independence for Document Retrieval
Bhaskar Mitra
Sebastian Hofstatter
Hamed Zamani
Nick Craswell
19
21
0
20 Jul 2020
S2RMs: Spatially Structured Recurrent Modules
Nasim Rahaman
Anirudh Goyal
Muhammad Waleed Gondal
M. Wuthrich
Stefan Bauer
Yash Sharma
Yoshua Bengio
Bernhard Schölkopf
21
14
0
13 Jul 2020
Data Movement Is All You Need: A Case Study on Optimizing Transformers
A. Ivanov
Nikoli Dryden
Tal Ben-Nun
Shigang Li
Torsten Hoefler
36
131
0
30 Jun 2020
Effective Approaches to Attention-based Neural Machine Translation
Thang Luong
Hieu H. Pham
Christopher D. Manning
218
7,926
0
17 Aug 2015
Previous
1
2