Efficient Transformers: A Survey
Yi Tay, Mostafa Dehghani, Dara Bahri, Donald Metzler
arXiv:2009.06732 · 14 September 2020

Papers citing "Efficient Transformers: A Survey" (showing 50 of 633)
SOFT: Softmax-free Transformer with Linear Complexity
Jiachen Lu, Jinghan Yao, Junge Zhang, Martin Danelljan, Hang Xu, Weiguo Gao, Chunjing Xu, Thomas B. Schön, Li Zhang
22 Oct 2021

Transformer Acceleration with Dynamic Sparse Attention
Liu Liu, Zheng Qu, Zhaodong Chen, Yufei Ding, Yuan Xie
21 Oct 2021

Compositional Attention: Disentangling Search and Retrieval
Sarthak Mittal, Sharath Chandra Raparthy, Irina Rish, Yoshua Bengio, Guillaume Lajoie
18 Oct 2021

Energon: Towards Efficient Acceleration of Transformers Using Dynamic Sparse Attention
Zhe Zhou, Junling Liu, Zhenyu Gu, Guangyu Sun
18 Oct 2021

Sparse Distillation: Speeding Up Text Classification by Using Bigger Student Models
Qinyuan Ye, Madian Khabsa, M. Lewis, Sinong Wang, Xiang Ren, Aaron Jaech
16 Oct 2021

On Learning the Transformer Kernel
Sankalan Pal Chowdhury, Adamos Solomou, Kumar Avinava Dubey, Mrinmaya Sachan
ViT · 15 Oct 2021
DYLE: Dynamic Latent Extraction for Abstractive Long-Input Summarization
Ziming Mao, Chen Henry Wu, Ansong Ni, Yusen Zhang, Rui Zhang, Tao Yu, Budhaditya Deb, Chenguang Zhu, Ahmed Hassan Awadallah, Dragomir R. Radev
15 Oct 2021

StreaMulT: Streaming Multimodal Transformer for Heterogeneous and Arbitrary Long Sequential Data
Victor Pellegrain, Myriam Tami, M. Batteux, Céline Hudelot
AI4TS · 15 Oct 2021

How Does Momentum Benefit Deep Neural Networks Architecture Design? A Few Case Studies
Bao Wang, Hedi Xia, T. Nguyen, Stanley Osher
AI4CE · 13 Oct 2021

Leveraging redundancy in attention with Reuse Transformers
Srinadh Bhojanapalli, Ayan Chakrabarti, Andreas Veit, Michal Lukasik, Himanshu Jain, Frederick Liu, Yin-Wen Chang, Sanjiv Kumar
13 Oct 2021

Pre-trained Language Models in Biomedical Domain: A Systematic Survey
Benyou Wang, Qianqian Xie, Jiahuan Pei, Zhihong Chen, Prayag Tiwari, Zhao Li, Jie Fu
LM&MA · AI4CE · 11 Oct 2021

Token Pooling in Vision Transformers
D. Marin, Jen-Hao Rick Chang, Anurag Ranjan, Anish K. Prabhu, Mohammad Rastegari, Oncel Tuzel
ViT · 08 Oct 2021
ABC: Attention with Bounded-memory Control
Hao Peng, Jungo Kasai, Nikolaos Pappas, Dani Yogatama, Zhaofeng Wu, Lingpeng Kong, Roy Schwartz, Noah A. Smith
06 Oct 2021

Ripple Attention for Visual Perception with Sub-quadratic Complexity
Lin Zheng, Huijie Pan, Lingpeng Kong
06 Oct 2021

PoNet: Pooling Network for Efficient Token Mixing in Long Sequences
Chao-Hong Tan, Qian Chen, Wen Wang, Qinglin Zhang, Siqi Zheng, Zhenhua Ling
ViT · 06 Oct 2021

Understanding and Overcoming the Challenges of Efficient Transformer Quantization
Yelysei Bondarenko, Markus Nagel, Tijmen Blankevoort
MQ · 27 Sep 2021

Vision Transformer Hashing for Image Retrieval
S. Dubey, S. Singh, Wei Chu
ViT · 26 Sep 2021

Long-Range Transformers for Dynamic Spatiotemporal Forecasting
J. E. Grigsby, Zhe Wang, Nam Nguyen, Yanjun Qi
AI4TS · 24 Sep 2021

Named Entity Recognition and Classification on Historical Documents: A Survey
Maud Ehrmann, Ahmed Hamdi, Elvys Linhares Pontes, Matteo Romanello, A. Doucet
23 Sep 2021
Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers
Yi Tay, Mostafa Dehghani, J. Rao, W. Fedus, Samira Abnar, Hyung Won Chung, Sharan Narang, Dani Yogatama, Ashish Vaswani, Donald Metzler
22 Sep 2021

Audiomer: A Convolutional Transformer For Keyword Spotting
Surya Kant Sahu, Sai Mitheran, Juhi Kamdar, Meet Gandhi
21 Sep 2021

Survey: Transformer based Video-Language Pre-training
Ludan Ruan, Qin Jin
VLM · ViT · 21 Sep 2021

General Cross-Architecture Distillation of Pretrained Language Models into Matrix Embeddings
Lukas Galke, Isabelle Cuber, Christophe Meyer, Henrik Ferdinand Nolscher, Angelina Sonderecker, A. Scherp
17 Sep 2021

SHAPE: Shifted Absolute Position Embedding for Transformers
Shun Kiyono, Sosuke Kobayashi, Jun Suzuki, Kentaro Inui
13 Sep 2021

Query-driven Segment Selection for Ranking Long Documents
Youngwoo Kim, Razieh Rahimi, Hamed Bonab, James Allan
RALM · 10 Sep 2021
MATE: Multi-view Attention for Table Transformer Efficiency
Julian Martin Eisenschlos, Maharshi Gor, Thomas Müller, William W. Cohen
LMTD · 09 Sep 2021

Sparsity and Sentence Structure in Encoder-Decoder Attention of Summarization Systems
Potsawee Manakul, Mark J. F. Gales
08 Sep 2021

Memory and Knowledge Augmented Language Models for Inferring Salience in Long-Form Stories
David Wilmot, Frank Keller
RALM · KELM · 08 Sep 2021

PermuteFormer: Efficient Relative Position Encoding for Long Sequences
Peng-Jen Chen
06 Sep 2021

∞-former: Infinite Memory Transformer
Pedro Henrique Martins, Zita Marinho, André F. T. Martins
01 Sep 2021
SHIFT15M: Fashion-specific dataset for set-to-set matching with several distribution shifts
Masanari Kimura, Takuma Nakamura, Yuki Saito
OOD · 30 Aug 2021

A Web Scale Entity Extraction System
Xuanting Cai, Quanbin Ma, Pan Li, Jianyu Liu, Qi Zeng, Zhengkan Yang, Pushkar Tripathi
27 Aug 2021

Greenformers: Improving Computation and Memory Efficiency in Transformer Models via Low-Rank Approximation
Samuel Cahyawijaya
24 Aug 2021

Smart Bird: Learnable Sparse Attention for Efficient and Effective Transformer
Chuhan Wu, Fangzhao Wu, Tao Qi, Binxing Jiao, Daxin Jiang, Yongfeng Huang, Xing Xie
20 Aug 2021

Fastformer: Additive Attention Can Be All You Need
Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang, Xing Xie
20 Aug 2021

FMMformer: Efficient and Flexible Transformer via Decomposed Near-field and Far-field Attention
T. Nguyen, Vai Suliafu, Stanley J. Osher, Long Chen, Bao Wang
05 Aug 2021
Evo-ViT: Slow-Fast Token Evolution for Dynamic Vision Transformer
Yifan Xu, Zhijie Zhang, Mengdan Zhang, Kekai Sheng, Ke Li, Weiming Dong, Liqing Zhang, Changsheng Xu, Xing Sun
ViT · 03 Aug 2021

Representation learning for neural population activity with Neural Data Transformers
Joel Ye, C. Pandarinath
AI4TS · AI4CE · 02 Aug 2021

A Survey of Human-in-the-loop for Machine Learning
Xingjiao Wu, Luwei Xiao, Yixuan Sun, Junhang Zhang, Tianlong Ma, Liangbo He
SyDa · 02 Aug 2021

Perceiver IO: A General Architecture for Structured Inputs & Outputs
Andrew Jaegle, Sebastian Borgeaud, Jean-Baptiste Alayrac, Carl Doersch, Catalin Ionescu, ..., Olivier J. Hénaff, M. Botvinick, Andrew Zisserman, Oriol Vinyals, João Carreira
MLLM · VLM · GNN · 30 Jul 2021
H-Transformer-1D: Fast One-Dimensional Hierarchical Attention for Sequences
Zhenhai Zhu, Radu Soricut
25 Jul 2021

Clinical Relation Extraction Using Transformer-based Models
Xi Yang, Zehao Yu, Yi Guo, Jiang Bian, Yonghui Wu
LM&MA · MedIm · 19 Jul 2021

Video Crowd Localization with Multi-focus Gaussian Neighborhood Attention and a Large-Scale Benchmark
Haopeng Li, Lingbo Liu, Kunlin Yang, Shinan Liu, Junyuan Gao, Bin Zhao, Rui Zhang, Jun Hou
19 Jul 2021

STAR: Sparse Transformer-based Action Recognition
Feng Shi, Chonghan Lee, Liang Qiu, Yizhou Zhao, Tianyi Shen, Shivran Muralidhar, Tian Han, Song-Chun Zhu, V. Narayanan
ViT · 15 Jul 2021

Efficient Transformer for Direct Speech Translation
Belen Alastruey, Gerard I. Gállego, Marta R. Costa-jussà
07 Jul 2021

Poly-NL: Linear Complexity Non-local Layers with Polynomials
F. Babiloni, Ioannis Marras, Filippos Kokkinos, Jiankang Deng, Grigorios G. Chrysos, S. Zafeiriou
06 Jul 2021
Clustering and attention model based for intelligent trading
Mimansa Rana, Nanxiang Mao, Ming Ao, Xiaohui Wu, Poning Liang, Matloob Khushi
06 Jul 2021

Vision Xformers: Efficient Attention for Image Classification
Pranav Jeevan, Amit Sethi
ViT · 05 Jul 2021

A Primer on Pretrained Multilingual Language Models
Sumanth Doddapaneni, Gowtham Ramesh, Mitesh M. Khapra, Anoop Kunchukuttan, Pratyush Kumar
LRM · 01 Jul 2021

Improving the Efficiency of Transformers for Resource-Constrained Devices
Hamid Tabani, Ajay Balasubramaniam, Shabbir Marzban, Elahe Arani, Bahram Zonooz
30 Jun 2021