arXiv: 2009.06732
Efficient Transformers: A Survey
14 September 2020
Yi Tay
Mostafa Dehghani
Dara Bahri
Donald Metzler
VLM
Papers citing "Efficient Transformers: A Survey"
50 / 633 papers shown
MuLD: The Multitask Long Document Benchmark
G. Hudson
Noura Al Moubayed
8
10
0
15 Feb 2022
Hindi/Bengali Sentiment Analysis Using Transfer Learning and Joint Dual Input Learning with Self Attention
Shahrukh Khan
Mahnoor Shahid
20
1
0
11 Feb 2022
Deep Learning for Computational Cytology: A Survey
Hao Jiang
Yanning Zhou
Yi-Mou Lin
R. Chan
Jiangshu Liu
Hao Chen
11
74
0
10 Feb 2022
Universal Hopfield Networks: A General Framework for Single-Shot Associative Memory Models
Beren Millidge
Tommaso Salvatori
Yuhang Song
Thomas Lukasiewicz
Rafal Bogacz
VLM
14
52
0
09 Feb 2022
pNLP-Mixer: an Efficient all-MLP Architecture for Language
Francesco Fusco
Damian Pascual
Peter W. J. Staar
Diego Antognini
21
29
0
09 Feb 2022
Structured Time Series Prediction without Structural Prior
Darko Drakulic
J. Andreoli
DiffM
AI4TS
9
3
0
07 Feb 2022
Patch-Based Stochastic Attention for Image Editing
Nicolas Cherel
Andrés Almansa
Y. Gousseau
A. Newson
17
6
0
07 Feb 2022
Structure-Aware Transformer for Graph Representation Learning
Dexiong Chen
Leslie O’Bray
Karsten M. Borgwardt
26
236
0
07 Feb 2022
GatorTron: A Large Clinical Language Model to Unlock Patient Information from Unstructured Electronic Health Records
Xi Yang
Aokun Chen
Nima M. Pournejatian
Hoo-Chang Shin
Kaleb E. Smith
...
Duane A. Mitchell
W. Hogan
E. Shenkman
Jiang Bian
Yonghui Wu
AI4MH
LM&MA
22
499
0
02 Feb 2022
Corpus for Automatic Structuring of Legal Documents
Prathamesh Kalamkar
Aman Tiwari
Astha Agarwal
S. Karn
Smita Gupta
Vivek Raghavan
Ashutosh Modi
AILaw
31
48
0
31 Jan 2022
BOAT: Bilateral Local Attention Vision Transformer
Tan Yu
Gangming Zhao
Ping Li
Yizhou Yu
ViT
20
27
0
31 Jan 2022
Fast Monte-Carlo Approximation of the Attention Mechanism
Hyunjun Kim
Jeonggil Ko
6
2
0
30 Jan 2022
Transformers in Medical Imaging: A Survey
Fahad Shamshad
Salman Khan
Syed Waqas Zamir
Muhammad Haris Khan
Munawar Hayat
F. Khan
H. Fu
ViT
LM&MA
MedIm
106
653
0
24 Jan 2022
Table Pre-training: A Survey on Model Architectures, Pre-training Objectives, and Downstream Tasks
Haoyu Dong
Zhoujun Cheng
Xinyi He
Mengyuan Zhou
Anda Zhou
Fan Zhou
Ao Liu
Shi Han
Dongmei Zhang
LMTD
60
62
0
24 Jan 2022
Continual Transformers: Redundancy-Free Attention for Online Inference
Lukas Hedegaard
Arian Bakhtiarnia
Alexandros Iosifidis
CLL
14
11
0
17 Jan 2022
Control of Dual-Sourcing Inventory Systems using Recurrent Neural Networks
Lucas Böttcher
Thomas Asikis
I. Fragkos
BDL
6
10
0
16 Jan 2022
Video Transformers: A Survey
Javier Selva
A. S. Johansen
Sergio Escalera
Kamal Nasrollahi
T. Moeslund
Albert Clapés
ViT
20
101
0
16 Jan 2022
GateFormer: Speeding Up News Feed Recommendation with Input Gated Transformers
Peitian Zhang
Zheng Liu
AI4TS
8
1
0
12 Jan 2022
SCROLLS: Standardized CompaRison Over Long Language Sequences
Uri Shaham
Elad Segal
Maor Ivgi
Avia Efrat
Ori Yoran
...
Ankit Gupta
Wenhan Xiong
Mor Geva
Jonathan Berant
Omer Levy
RALM
20
132
0
10 Jan 2022
Classification of Long Sequential Data using Circular Dilated Convolutional Neural Networks
Lei Cheng
Ruslan Khalitov
Tong Yu
Zhirong Yang
20
32
0
06 Jan 2022
Transformer Uncertainty Estimation with Hierarchical Stochastic Attention
Jiahuan Pei
Cheng-Yu Wang
Gyuri Szarvas
6
22
0
27 Dec 2021
Video Joint Modelling Based on Hierarchical Transformer for Co-summarization
Haopeng Li
Qiuhong Ke
Mingming Gong
Rui Zhang
ViT
15
22
0
27 Dec 2021
Domain Adaptation with Pre-trained Transformers for Query Focused Abstractive Text Summarization
Md Tahmid Rahman Laskar
Enamul Hoque
J. Huang
28
44
0
22 Dec 2021
Efficient Visual Tracking with Exemplar Transformers
Philippe Blatter
Menelaos Kanakis
Martin Danelljan
Luc Van Gool
ViT
8
79
0
17 Dec 2021
Block-Skim: Efficient Question Answering for Transformer
Yue Guan
Zhengyi Li
Jingwen Leng
Zhouhan Lin
Minyi Guo
Yuhao Zhu
14
30
0
16 Dec 2021
Mask-combine Decoding and Classification Approach for Punctuation Prediction with real-time Inference Constraints
Christoph Minixhofer
Ondřej Klejch
P. Bell
AI4CE
15
0
0
15 Dec 2021
LongT5: Efficient Text-To-Text Transformer for Long Sequences
Mandy Guo
Joshua Ainslie
David C. Uthus
Santiago Ontanon
Jianmo Ni
Yun-hsuan Sung
Yinfei Yang
VLM
29
306
0
15 Dec 2021
Simple Local Attentions Remain Competitive for Long-Context Tasks
Wenhan Xiong
Barlas Oğuz
Anchit Gupta
Xilun Chen
Diana Liskovich
Omer Levy
Wen-tau Yih
Yashar Mehdad
25
29
0
14 Dec 2021
Couplformer: Rethinking Vision Transformer with Coupling Attention Map
Hai Lan
Xihao Wang
Xian Wei
ViT
15
3
0
10 Dec 2021
Sketching as a Tool for Understanding and Accelerating Self-attention for Long Sequences
Yifan Chen
Qi Zeng
Dilek Z. Hakkani-Tür
Di Jin
Heng Ji
Yun Yang
17
4
0
10 Dec 2021
Recurrent Glimpse-based Decoder for Detection with Transformer
Zhe Chen
Jing Zhang
Dacheng Tao
ViT
11
27
0
09 Dec 2021
FastSGD: A Fast Compressed SGD Framework for Distributed Machine Learning
Keyu Yang
Lu Chen
Zhihao Zeng
Yunjun Gao
13
9
0
08 Dec 2021
Attention-Based Model and Deep Reinforcement Learning for Distribution of Event Processing Tasks
A. Mazayev
F. Al-Tam
N. Correia
10
5
0
07 Dec 2021
Graph Conditioned Sparse-Attention for Improved Source Code Understanding
Junyan Cheng
Iordanis Fostiropoulos
Barry W. Boehm
11
1
0
01 Dec 2021
Score Transformer: Generating Musical Score from Note-level Representation
Masahiro Suzuki
16
8
0
01 Dec 2021
Sparse is Enough in Scaling Transformers
Sebastian Jaszczur
Aakanksha Chowdhery
Afroz Mohiuddin
Lukasz Kaiser
Wojciech Gajewski
Henryk Michalewski
Jonni Kanerva
MoE
16
100
0
24 Nov 2021
Octree Transformer: Autoregressive 3D Shape Generation on Hierarchically Structured Sequences
Moritz Ibing
Gregor Kobsik
Leif Kobbelt
10
37
0
24 Nov 2021
Adaptive Fourier Neural Operators: Efficient Token Mixers for Transformers
John Guibas
Morteza Mardani
Zong-Yi Li
Andrew Tao
Anima Anandkumar
Bryan Catanzaro
17
222
0
24 Nov 2021
SimpleTRON: Simple Transformer with O(N) Complexity
Uladzislau Yorsh
Alexander Kovalenko
Vojtěch Vančura
Daniel Vašata
Pavel Kordík
Tomáš Mikolov
17
1
0
23 Nov 2021
PointMixer: MLP-Mixer for Point Cloud Understanding
Jaesung Choe
Chunghyun Park
François Rameau
Jaesik Park
In So Kweon
3DPC
32
98
0
22 Nov 2021
Quality and Cost Trade-offs in Passage Re-ranking Task
P. Podberezko
V. Mitskevich
Raman Makouski
P. Goncharov
Andrei Khobnia
Nikolay A Bushkov
Marina Chernyshevich
14
0
0
18 Nov 2021
HiRID-ICU-Benchmark -- A Comprehensive Machine Learning Benchmark on High-resolution ICU Data
Hugo Yèche
Rita Kuznetsova
M. Zimmermann
Matthias Hüser
Xinrui Lyu
M. Faltys
Gunnar Rätsch
29
40
0
16 Nov 2021
A Survey of Visual Transformers
Yang Liu
Yao Zhang
Yixin Wang
Feng Hou
Jin Yuan
Jiang Tian
Yang Zhang
Zhongchao Shi
Jianping Fan
Zhiqiang He
3DGS
ViT
66
325
0
11 Nov 2021
Are we ready for a new paradigm shift? A Survey on Visual Deep MLP
Ruiyang Liu
Yinghui Li
Li Tao
Dun Liang
Haitao Zheng
77
96
0
07 Nov 2021
CoreLM: Coreference-aware Language Model Fine-Tuning
Nikolaos Stylianou
I. Vlahavas
6
2
0
04 Nov 2021
Transformers for prompt-level EMA non-response prediction
Supriya Nagesh
Alexander Moreno
Stephanie M Carpenter
Jamie Yap
Soujanya Chatterjee
...
Santosh Kumar
Cho Lam
D. Wetter
Inbal Nahum-Shani
James M. Rehg
4
0
0
01 Nov 2021
Skyformer: Remodel Self-Attention with Gaussian Kernel and Nyström Method
Yifan Chen
Qi Zeng
Heng Ji
Yun Yang
8
49
0
29 Oct 2021
Delayed Propagation Transformer: A Universal Computation Engine towards Practical Control in Cyber-Physical Systems
Wenqing Zheng
Qiangqiang Guo
H. Yang
Peihao Wang
Zhangyang Wang
AI4CE
6
11
0
29 Oct 2021
Scatterbrain: Unifying Sparse and Low-rank Attention Approximation
Beidi Chen
Tri Dao
Eric Winsor
Zhao Song
Atri Rudra
Christopher Ré
18
125
0
28 Oct 2021
The Efficiency Misnomer
Mostafa Dehghani
Anurag Arnab
Lucas Beyer
Ashish Vaswani
Yi Tay
23
96
0
25 Oct 2021