Reformer: The Efficient Transformer
Nikita Kitaev, Lukasz Kaiser, Anselm Levskaya · 13 January 2020 · arXiv: 2001.04451
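
For context on the paper itself: Reformer replaces dense softmax attention with locality-sensitive hashing (LSH) attention over shared query/key vectors, so each token attends only within its hash bucket, and pairs this with reversible residual layers to cut activation memory. The NumPy sketch below illustrates only the bucketing idea; the function names are mine, and the real model adds multi-round hashing, sorted chunking, causal masking, and attention across neighboring chunks.

```python
import numpy as np

def lsh_buckets(x, n_buckets=8, seed=0):
    """Angular LSH as used by Reformer: project onto random directions
    and take the argmax over the concatenation [xR, -xR]."""
    rng = np.random.default_rng(seed)
    r = rng.normal(size=(x.shape[-1], n_buckets // 2))
    h = x @ r                                  # (seq_len, n_buckets // 2)
    return np.argmax(np.concatenate([h, -h], axis=-1), axis=-1)

def lsh_attention(qk, v, n_buckets=8):
    """Toy LSH attention: tokens attend only within their hash bucket.
    qk: (seq_len, d) shared query/key vectors (Reformer ties Q and K);
    v:  (seq_len, d_v) values. Omits multi-round hashing, chunking,
    and causal masking from the real model."""
    buckets = lsh_buckets(qk, n_buckets)
    out = np.zeros_like(v, dtype=float)
    for b in np.unique(buckets):
        idx = np.flatnonzero(buckets == b)
        scores = qk[idx] @ qk[idx].T / np.sqrt(qk.shape[-1])
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        out[idx] = (w / w.sum(axis=-1, keepdims=True)) @ v[idx]
    return out

# Example: 128 tokens, 16-dim head; output has the same shape as v.
rng = np.random.default_rng(1)
print(lsh_attention(rng.normal(size=(128, 16)), rng.normal(size=(128, 16))).shape)
```

Because buckets are small and roughly balanced, the per-bucket softmax replaces the single O(n^2) score matrix with many small ones; the paper reports O(n log n) overall cost once sorting into buckets is included.
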
Papers citing "Reformer: The Efficient Transformer" (showing 50 of 375)

Preformer: Predictive Transformer with Multi-Scale Segment-wise Correlations for Long-Term Time Series Forecasting
Dazhao Du, Bing-Huang Su, Zhewei Wei · AI4TS · 23 Feb 2022

Transformer Quality in Linear Time
Weizhe Hua, Zihang Dai, Hanxiao Liu, Quoc V. Le · 21 Feb 2022

cosFormer: Rethinking Softmax in Attention
Zhen Qin, Weixuan Sun, Huicai Deng, Dongxu Li, Yunshen Wei, Baohong Lv, Junjie Yan, Lingpeng Kong, Yiran Zhong · 17 Feb 2022

Graph Masked Autoencoders with Transformers
Sixiao Zhang, Hongxu Chen, Haoran Yang, Xiangguo Sun, Philip S. Yu, Guandong Xu · 17 Feb 2022

General-purpose, long-context autoregressive modeling with Perceiver AR
Curtis Hawthorne, Andrew Jaegle, Cătălina Cangea, Sebastian Borgeaud, C. Nash, ..., Hannah R. Sheahan, Neil Zeghidour, Jean-Baptiste Alayrac, João Carreira, Jesse Engel · 15 Feb 2022

CATs++: Boosting Cost Aggregation with Convolutions and Transformers
Seokju Cho, Sunghwan Hong, Seung Wook Kim · ViT · 14 Feb 2022

Flowformer: Linearizing Transformers with Conservation Flows
Haixu Wu, Jialong Wu, Jiehui Xu, Jianmin Wang, Mingsheng Long · 13 Feb 2022

ACORT: A Compact Object Relation Transformer for Parameter Efficient Image Captioning
J. Tan, Y. Tan, C. Chan, Joon Huang Chuah · VLM, ViT · 11 Feb 2022

Exploiting Spatial Sparsity for Event Cameras with Visual Transformers
Zuowen Wang, Yuhuang Hu, Shih-Chii Liu · ViT · 10 Feb 2022

Particle Transformer for Jet Tagging
H. Qu, Congqiao Li, Sitian Qian · ViT, MedIm · 08 Feb 2022

Fast Monte-Carlo Approximation of the Attention Mechanism
Hyunjun Kim, Jeonggil Ko · 30 Jan 2022

Clinical-Longformer and Clinical-BigBird: Transformers for long clinical sequences
Yikuan Li, R. M. Wehbe, F. Ahmad, Hanyin Wang, Yuan Luo · VLM, MedIm · 27 Jan 2022

CsFEVER and CTKFacts: Acquiring Czech data for fact verification
Herbert Ullrich, Jan Drchal, Martin Rýpar, Hana Vincourová, Václav Moravec · HILM · 26 Jan 2022

GLassoformer: A Query-Sparse Transformer for Post-Fault Power Grid Voltage Prediction
Yunling Zheng, Carson Hu, Guang Lin, Meng Yue, Bao Wang, Jack Xin · 22 Jan 2022

Representing Long-Range Context for Graph Neural Networks with Global Attention
Zhanghao Wu, Paras Jain, Matthew A. Wright, Azalia Mirhoseini, Joseph E. Gonzalez, Ion Stoica · GNN · 21 Jan 2022

A Literature Survey of Recent Advances in Chatbots
Guendalina Caldarini, Sardar F. Jaf, K. McGarry · AI4CE · 17 Jan 2022

Continual Transformers: Redundancy-Free Attention for Online Inference
Lukas Hedegaard, Arian Bakhtiarnia, Alexandros Iosifidis · CLL · 17 Jan 2022

Video Transformers: A Survey
Javier Selva, A. S. Johansen, Sergio Escalera, Kamal Nasrollahi, T. Moeslund, Albert Clapés · ViT · 16 Jan 2022

A Survey of Controllable Text Generation using Transformer-based Pre-trained Language Models
Hanqing Zhang, Haolin Song, Shaoyu Li, Ming Zhou, Dawei Song · 14 Jan 2022

Classification of Long Sequential Data using Circular Dilated Convolutional Neural Networks
Lei Cheng, Ruslan Khalitov, Tong Yu, Zhirong Yang · 06 Jan 2022

ChunkFormer: Learning Long Time Series with Multi-stage Chunked Transformer
Yue Ju, Alka Isac, Yimin Nie · AI4TS · 30 Dec 2021

A Lightweight and Accurate Spatial-Temporal Transformer for Traffic Forecasting
Guanyao Li, Shuhan Zhong, Shueng-Han Gary Chan, Ruiyuan Li, Chih-Chieh Hung, Wen-Chih Peng · AI4TS · 30 Dec 2021

ELSA: Enhanced Local Self-Attention for Vision Transformer
Jingkai Zhou, Pichao Wang, Fan Wang, Qiong Liu, Hao Li, Rong Jin · ViT · 23 Dec 2021

Domain Adaptation with Pre-trained Transformers for Query Focused Abstractive Text Summarization
Md Tahmid Rahman Laskar, Enamul Hoque, J. Huang · 22 Dec 2021

Efficient Visual Tracking with Exemplar Transformers
Philippe Blatter, Menelaos Kanakis, Martin Danelljan, Luc Van Gool · ViT · 17 Dec 2021

Self-attention Does Not Need O(n^2) Memory
M. Rabe, Charles Staats · LRM · 10 Dec 2021 (a sketch of this paper's chunking trick follows the list)

Couplformer: Rethinking Vision Transformer with Coupling Attention Map
Hai Lan, Xihao Wang, Xian Wei · ViT · 10 Dec 2021

3D Medical Point Transformer: Introducing Convolution to Attention Networks for Medical Point Cloud Analysis
Jianhui Yu, Chaoyi Zhang, Heng Wang, Dingxin Zhang, Yang Song, Tiange Xiang, Dongnan Liu, Weidong (Tom) Cai · ViT, MedIm · 09 Dec 2021

STJLA: A Multi-Context Aware Spatio-Temporal Joint Linear Attention Network for Traffic Forecasting
Yuchen Fang, Yanjun Qin, Haiyong Luo, Fang Zhao, Chenxing Wang · GNN, AI4TS · 04 Dec 2021

Multi-View Stereo with Transformer
Jie Zhu, Bo Peng, Wanqing Li, Haifeng Shen, Zhe Zhang, Jianjun Lei · ViT · 01 Dec 2021

Pixelated Butterfly: Simple and Efficient Sparse training for Neural Network Models
Tri Dao, Beidi Chen, Kaizhao Liang, Jiaming Yang, Zhao-quan Song, Atri Rudra, Christopher Ré · 30 Nov 2021

Sparse DETR: Efficient End-to-End Object Detection with Learnable Sparsity
Byungseok Roh, Jaewoong Shin, Wuhyun Shin, Saehoon Kim · ViT · 29 Nov 2021

Sparse Fusion for Multimodal Transformers
Yi Ding, Alex Rich, Mason Wang, Noah Stier, M. Turk, P. Sen, Tobias Höllerer · ViT · 23 Nov 2021

Efficient Video Transformers with Spatial-Temporal Token Selection
Junke Wang, Xitong Yang, Hengduo Li, Li Liu, Zuxuan Wu, Yu-Gang Jiang · ViT · 23 Nov 2021

ProSTformer: Pre-trained Progressive Space-Time Self-attention Model for Traffic Flow Forecasting
Xiao Yan, X. Gan, Jingjing Tang, Rui Wang · ViT, AI4TS · 03 Nov 2021

Comparative Study of Long Document Classification
Vedangi Wagh, Snehal Khandve, Isha Joshi, Apurva Wani, Geetanjali Kale, Raviraj Joshi · 01 Nov 2021

SOFT: Softmax-free Transformer with Linear Complexity
Jiachen Lu, Jinghan Yao, Junge Zhang, Martin Danelljan, Hang Xu, Weiguo Gao, Chunjing Xu, Thomas B. Schön, Li Zhang · 22 Oct 2021

Transformer Acceleration with Dynamic Sparse Attention
Liu Liu, Zheng Qu, Zhaodong Chen, Yufei Ding, Yuan Xie · 21 Oct 2021

Speech Summarization using Restricted Self-Attention
Roshan S. Sharma, Shruti Palaskar, A. Black, Florian Metze · 12 Oct 2021

Does Preprocessing Help Training Over-parameterized Neural Networks?
Zhao-quan Song, Shuo Yang, Ruizhe Zhang · 09 Oct 2021

Token Pooling in Vision Transformers
D. Marin, Jen-Hao Rick Chang, Anurag Ranjan, Anish K. Prabhu, Mohammad Rastegari, Oncel Tuzel · ViT · 08 Oct 2021

Ripple Attention for Visual Perception with Sub-quadratic Complexity
Lin Zheng, Huijie Pan, Lingpeng Kong · 06 Oct 2021

MoEfication: Transformer Feed-forward Layers are Mixtures of Experts
Zhengyan Zhang, Yankai Lin, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou · MoE · 05 Oct 2021

Classification of hierarchical text using geometric deep learning: the case of clinical trials corpus
Sohrab Ferdowsi, Nikolay Borissov, J. Knafou, P. Amini, Douglas Teodoro · 04 Oct 2021

UFO-ViT: High Performance Linear Vision Transformer without Softmax
Jeonggeun Song · ViT · 29 Sep 2021

Digital Signal Processing Using Deep Neural Networks
Brian Shevitski, Y. Watkins, Nicole Man, Michael Girard · AI4CE · 21 Sep 2021

Do Long-Range Language Models Actually Use Long-Range Context?
Simeng Sun, Kalpesh Krishna, Andrew Mattarella-Micke, Mohit Iyyer · RALM · 19 Sep 2021

PnP-DETR: Towards Efficient Visual Analysis with Transformers
Tao Wang, Li Yuan, Yunpeng Chen, Jiashi Feng, Shuicheng Yan · ViT · 15 Sep 2021

SHAPE: Shifted Absolute Position Embedding for Transformers
Shun Kiyono, Sosuke Kobayashi, Jun Suzuki, Kentaro Inui · 13 Sep 2021

A Strong Baseline for Query Efficient Attacks in a Black Box Setting
Rishabh Maheshwary, Saket Maheshwary, Vikram Pudi · AAML · 10 Sep 2021
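
One recurring theme in the list above is reducing attention's memory rather than its arithmetic; "Self-attention Does Not Need O(n^2) Memory" (Rabe & Staats) in particular notes that softmax attention can be accumulated over key/value chunks with a running maximum and running normalizer, so the full n x n score matrix is never materialized. Below is a minimal NumPy sketch of that online-softmax chunking, under my own naming and without the paper's JAX-specific details:

```python
import numpy as np

def chunked_attention(q, k, v, chunk=64):
    """Attention computed over key/value chunks with a running max (m)
    and running normalizer (s), so no (n, n) score matrix is stored."""
    n, d = q.shape
    out = np.zeros((n, v.shape[-1]))
    m = np.full((n, 1), -np.inf)   # running row-wise max of scores
    s = np.zeros((n, 1))           # running sum of exp(score - m)
    for i in range(0, k.shape[0], chunk):
        kc, vc = k[i:i + chunk], v[i:i + chunk]
        scores = q @ kc.T / np.sqrt(d)                        # (n, chunk)
        m_new = np.maximum(m, scores.max(-1, keepdims=True))
        corr = np.exp(m - m_new)       # rescale the old accumulators
        p = np.exp(scores - m_new)
        out = out * corr + p @ vc
        s = s * corr + p.sum(-1, keepdims=True)
        m = m_new
    return out / s

# Sanity check against naive quadratic attention.
rng = np.random.default_rng(0)
q, k, v = rng.normal(size=(3, 256, 32))
p = np.exp(q @ k.T / np.sqrt(32))
assert np.allclose(chunked_attention(q, k, v), (p / p.sum(-1, keepdims=True)) @ v)
```

The result matches naive attention up to floating-point error; only peak memory changes, since each step touches an (n, chunk) block instead of the full (n, n) matrix.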