Efficient Transformers: A Survey
Yi Tay, Mostafa Dehghani, Dara Bahri, Donald Metzler
arXiv:2009.06732, 14 September 2020. Tags: VLM.

Papers citing "Efficient Transformers: A Survey" (50 of 633 citing papers shown):
- Multimodal Contrastive Learning with LIMoE: the Language-Image Mixture of Experts. Basil Mustafa, C. Riquelme, J. Puigcerver, Rodolphe Jenatton, N. Houlsby. Tags: VLM, MoE. 06 Jun 2022.
- Exploring Transformers for Behavioural Biometrics: A Case Study in Gait Recognition. Paula Delgado-Santos, Ruben Tolosana, R. Guest, F. Deravi, R. Vera-Rodríguez. 03 Jun 2022.
- BayesFormer: Transformer with Uncertainty Estimation. Karthik Abinav Sankararaman, Sinong Wang, Han Fang. Tags: UQCV, BDL. 02 Jun 2022.
- Fair Comparison between Efficient Attentions. Jiuk Hong, Chaehyeon Lee, Soyoun Bang, Heechul Jung. 01 Jun 2022.
- Transformer with Fourier Integral Attentions. T. Nguyen, Minh Pham, Tam Nguyen, Khai Nguyen, Stanley J. Osher, Nhat Ho. 01 Jun 2022.
- Transformers for Multi-Object Tracking on Point Clouds. Felicia Ruppel, F. Faion, Claudius Gläser, Klaus C. J. Dietmayer. Tags: 3DPC. 31 May 2022.
- Prompt Injection: Parameterization of Fixed Inputs. Eunbi Choi, Yongrae Jo, Joel Jang, Minjoon Seo. 31 May 2022.
- FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness. Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, Christopher Ré. Tags: VLM. 27 May 2022.
- Probabilistic Transformer: Modelling Ambiguities and Distributions for RNA Folding and Molecule Design. Jörg K.H. Franke, Frederic Runge, Frank Hutter. 27 May 2022.
- Training Language Models with Memory Augmentation. Zexuan Zhong, Tao Lei, Danqi Chen. Tags: RALM. 25 May 2022.
- ASSET: Autoregressive Semantic Scene Editing with Transformers at High Resolutions. Difan Liu, Sandesh Shetty, Tobias Hinz, Matthew Fisher, Richard Y. Zhang, Taesung Park, E. Kalogerakis. Tags: ViT. 24 May 2022.
- Self-Supervised Speech Representation Learning: A Review. Abdel-rahman Mohamed, Hung-yi Lee, Lasse Borgholt, Jakob Drachmann Havtorn, Joakim Edin, ..., Shang-Wen Li, Karen Livescu, Lars Maaløe, Tara N. Sainath, Shinji Watanabe. Tags: SSL, AI4TS. 21 May 2022.
- Exploring Extreme Parameter Compression for Pre-trained Language Models. Yuxin Ren, Benyou Wang, Lifeng Shang, Xin Jiang, Qun Liu. 20 May 2022.
- Transformer with Memory Replay. R. Liu, Barzan Mozafari. Tags: OffRL. 19 May 2022.
- Text Detection & Recognition in the Wild for Robot Localization. Z. Raisi, John S. Zelek. 17 May 2022.
- Transkimmer: Transformer Learns to Layer-wise Skim. Yue Guan, Zhengyi Li, Jingwen Leng, Zhouhan Lin, Minyi Guo. 15 May 2022.
- Symphony Generation with Permutation Invariant Language Model. Jiafeng Liu, Yuanliang Dong, Zehua Cheng, Xinran Zhang, Xiaobing Li, Feng Yu, Maosong Sun. 10 May 2022.
- Transformer-Empowered 6G Intelligent Networks: From Massive MIMO Processing to Semantic Communication. Yang Wang, Zhen Gao, Dezhi Zheng, Sheng Chen, Deniz Gündüz, H. Vincent Poor. Tags: AI4CE. 08 May 2022.
- Transformers in Time-series Analysis: A Tutorial. Sabeen Ahmed, Ian E. Nielsen, Aakash Tripathi, Shamoon Siddiqui, Ghulam Rasool, R. Ramachandran. Tags: AI4TS. 28 Apr 2022.
- Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications. Han Cai, Ji Lin, Yujun Lin, Zhijian Liu, Haotian Tang, Hanrui Wang, Ligeng Zhu, Song Han. 25 Apr 2022.
- ChapterBreak: A Challenge Dataset for Long-Range Language Models. Simeng Sun, Katherine Thai, Mohit Iyyer. 22 Apr 2022.
- On the Locality of Attention in Direct Speech Translation. Belen Alastruey, Javier Ferrando, Gerard I. Gállego, Marta R. Costa-jussá. 19 Apr 2022.
- Exploring Dimensionality Reduction Techniques in Multilingual Transformers. Álvaro Huertas-García, Alejandro Martín, Javier Huertas-Tato, David Camacho. 18 Apr 2022.
- Usage of specific attention improves change point detection. Anna Dmitrienko, Evgenia Romanenkova, Alexey Zaytsev. 18 Apr 2022.
- Multi-Frame Self-Supervised Depth with Transformers. Vitor Campagnolo Guizilini, Rares Ambrus, Di Chen, Sergey Zakharov, Adrien Gaidon. Tags: ViT, MDE. 15 Apr 2022.
- Characterizing the Efficiency vs. Accuracy Trade-off for Long-Context NLP Models. Phyllis Ang, Bhuwan Dhingra, Lisa Wu Wills. 15 Apr 2022.
- Revisiting Transformer-based Models for Long Document Classification. Xiang Dai, Ilias Chalkidis, S. Darkner, Desmond Elliott. Tags: VLM. 14 Apr 2022.
- Malceiver: Perceiver with Hierarchical and Multi-modal Features for Android Malware Detection. Niall McLaughlin. 12 Apr 2022.
- A Call for Clarity in Beam Search: How It Works and When It Stops. Jungo Kasai, Keisuke Sakaguchi, Ronan Le Bras, Dragomir R. Radev, Yejin Choi, Noah A. Smith. 11 Apr 2022.
- Linear Complexity Randomized Self-attention Mechanism. Lin Zheng, Chong-Jun Wang, Lingpeng Kong. 10 Apr 2022.
- BERTuit: Understanding Spanish language in Twitter through a native transformer. Javier Huertas-Tato, Alejandro Martín, David Camacho. 07 Apr 2022.
- Scaling Language Model Size in Cross-Device Federated Learning. Jae Hun Ro, Theresa Breiner, Lara McConnaughey, Mingqing Chen, A. Suresh, Shankar Kumar, Rajiv Mathews. Tags: FedML. 31 Mar 2022.
- MAE-AST: Masked Autoencoding Audio Spectrogram Transformer. Alan Baade, Puyuan Peng, David F. Harwath. 30 Mar 2022.
- Diagonal State Spaces are as Effective as Structured State Spaces. Ankit Gupta, Albert Gu, Jonathan Berant. 27 Mar 2022.
- A General Survey on Attention Mechanisms in Deep Learning. Gianni Brauwers, Flavius Frasincar. 27 Mar 2022.
- Transformers Meet Visual Learning Understanding: A Comprehensive Review. Yuting Yang, Licheng Jiao, Xuantong Liu, F. Liu, Shuyuan Yang, Zhixi Feng, Xu Tang. Tags: ViT, MedIm. 24 Mar 2022.
- Linearizing Transformer with Key-Value Memory. Yizhe Zhang, Deng Cai. 23 Mar 2022.
- PaCa-ViT: Learning Patch-to-Cluster Attention in Vision Transformers. Ryan Grainger, Thomas Paniagua, Xi Song, Naresh P. Cuntoor, Mun Wai Lee, Tianfu Wu. Tags: ViT. 22 Mar 2022.
- Mask Usage Recognition using Vision Transformer with Transfer Learning and Data Augmentation. Hensel Donato Jahja, N. Yudistira, Sutrisno. Tags: ViT. 22 Mar 2022.
- Efficient Classification of Long Documents Using Transformers. Hyunji Hayley Park, Yogarshi Vyas, Kashif Shah. 21 Mar 2022.
- Memorizing Transformers. Yuhuai Wu, M. Rabe, DeLesley S. Hutchins, Christian Szegedy. Tags: RALM. 16 Mar 2022.
- Do BERTs Learn to Use Browser User Interface? Exploring Multi-Step Tasks with Unified Vision-and-Language BERTs. Taichi Iki, Akiko Aizawa. Tags: LLMAG. 15 Mar 2022.
- Block-Recurrent Transformers. DeLesley S. Hutchins, Imanol Schlag, Yuhuai Wu, Ethan Dyer, Behnam Neyshabur. 11 Mar 2022.
- WaveMix: Resource-efficient Token Mixing for Images. Pranav Jeevan, A. Sethi. 07 Mar 2022.
- Dynamic N:M Fine-grained Structured Sparse Attention Mechanism. Zhaodong Chen, Yuying Quan, Zheng Qu, L. Liu, Yufei Ding, Yuan Xie. 28 Feb 2022.
- PMC-Patients: A Large-scale Dataset of Patient Summaries and Relations for Benchmarking Retrieval-based Clinical Decision Support Systems. Zhengyun Zhao, Qiao Jin, Fangyuan Chen, Tuorui Peng, Sheng Yu. Tags: PINN. 28 Feb 2022.
- A Differential Attention Fusion Model Based on Transformer for Time Series Forecasting. Benhan Li, Shengdong Du, Tianrui Li. Tags: AI4TS. 23 Feb 2022.
- Ligandformer: A Graph Neural Network for Predicting Compound Property with Robust Interpretation. Jinjiang Guo, Qi Liu, Han Guo, Xi Lu. Tags: AI4CE. 21 Feb 2022.
- Deep Learning for Hate Speech Detection: A Comparative Study. Jitendra Malik, Hezhe Qiao, Guansong Pang, A. Hengel. 19 Feb 2022.
- The NLP Task Effectiveness of Long-Range Transformers. Guanghui Qin, Yukun Feng, Benjamin Van Durme. 16 Feb 2022.