Reformer: The Efficient Transformer
arXiv: 2001.04451
13 January 2020
Nikita Kitaev, Lukasz Kaiser, Anselm Levskaya
Papers citing "Reformer: The Efficient Transformer" (50 of 381 papers shown)
OLinear: A Linear Model for Time Series Forecasting in Orthogonally Transformed Domain
Wenzhen Yue, Y. Liu, Haoxuan Li, Hao Wang, Xianghua Ying, Ruohao Guo, Bowei Xing, Ji Shi (AI4TS, OOD), 12 May 2025

Hierarchical Sparse Attention Framework for Computationally Efficient Classification of Biological Cells
Elad Yoshai, Dana Yagoda-Aharoni, Eden Dotan, N. Shaked, 12 May 2025

Graph Laplacian Wavelet Transformer via Learnable Spectral Decomposition
Andrew Kiruluta, Eric Lundy, Priscilla Burity, 09 May 2025

Accurate and Efficient Multivariate Time Series Forecasting via Offline Clustering
Yiming Niu, Jinliang Deng, L. Zhang, Zimu Zhou, Yongxin Tong (AI4TS), 09 May 2025

Image Recognition with Online Lightweight Vision Transformer: A Survey
Zherui Zhang, Rongtao Xu, Jie Zhou, Changwei Wang, Xingtian Pei, ..., Jiguang Zhang, Li Guo, Longxiang Gao, W. Xu, Shibiao Xu (ViT), 06 May 2025

SCFormer: Structured Channel-wise Transformer with Cumulative Historical State for Multivariate Time Series Forecasting
Shiwei Guo, Z. Chen, Yupeng Ma, Yunfei Han, Yi Wang (AI4TS), 05 May 2025

Mixture of Sparse Attention: Content-Based Learnable Sparse Attention via Expert-Choice Routing
Piotr Piekos, Róbert Csordás, Jürgen Schmidhuber (MoE, VLM), 01 May 2025

Scalable Meta-Learning via Mixed-Mode Differentiation
Iurii Kemaev, Dan A Calian, Luisa M Zintgraf, Gregory Farquhar, H. V. Hasselt, 01 May 2025

From Attention to Atoms: Spectral Dictionary Learning for Fast, Interpretable Language Models
Andrew Kiruluta, 29 Apr 2025

SFi-Former: Sparse Flow Induced Attention for Graph Transformer
Z. Li, J. Q. Shi, X. Zhang, Miao Zhang, B. Li, 29 Apr 2025

Multimodal Conditioned Diffusive Time Series Forecasting
Chen Su, Yuanhe Tian, Yan Song (DiffM, AI4TS), 28 Apr 2025

CANet: ChronoAdaptive Network for Enhanced Long-Term Time Series Forecasting under Non-Stationarity
Mert Sonmezer, Seyda Ertekin (AI4TS), 24 Apr 2025

Pets: General Pattern Assisted Architecture For Time Series Analysis
Xiangkai Ma, Xiaobin Hong, Wenzhong Li, Sanglu Lu (AI4TS), 19 Apr 2025

Cognitive Memory in Large Language Models
Lianlei Shan, Shixian Luo, Zezhou Zhu, Yu Yuan, Yong Wu (LLMAG, KELM), 03 Apr 2025

Predicting Team Performance from Communications in Simulated Search-and-Rescue
Ali Jalal-Kamali, Nikolos Gurney, David Pynadath (AI4TS), 05 Mar 2025

Attention Condensation via Sparsity Induced Regularized Training
Eli Sason, Darya Frolova, Boris Nazarov, Felix Goldberd, 03 Mar 2025

PFformer: A Position-Free Transformer Variant for Extreme-Adaptive Multivariate Time Series Forecasting
Yanhong Li, D. Anastasiu (AI4TS), 27 Feb 2025

Low-Rank Thinning
Annabelle Michael Carrell, Albert Gong, Abhishek Shetty, Raaz Dwivedi, Lester W. Mackey, 17 Feb 2025

Vision-Enhanced Time Series Forecasting via Latent Diffusion Models
Weilin Ruan, Siru Zhong, Haomin Wen, Yuxuan Liang (AI4TS), 16 Feb 2025

Enhancing Video Understanding: Deep Neural Networks for Spatiotemporal Analysis
Amir Hosein Fadaei, M. Dehaqani, 11 Feb 2025

LV-XAttn: Distributed Cross-Attention for Long Visual Inputs in Multimodal Large Language Models
Tzu-Tao Chang, Shivaram Venkataraman (VLM), 04 Feb 2025

ZETA: Leveraging Z-order Curves for Efficient Top-k Attention
Qiuhao Zeng, Jerry Huang, Peng Lu, Gezheng Xu, Boxing Chen, Charles X. Ling, Boyu Wang, 24 Jan 2025

Unified CNNs and transformers underlying learning mechanism reveals multi-head attention modus vivendi
Ella Koresh, Ronit D. Gross, Yuval Meir, Yarden Tzach, Tal Halevi, Ido Kanter (ViT), 22 Jan 2025

Harnessing the Potential of Large Language Models in Modern Marketing Management: Applications, Future Directions, and Strategic Recommendations
Raha Aghaei, Ali A. Kiaei, Mahnaz Boush, Javad Vahidi, Mohammad Zavvar, Zeynab Barzegar, Mahan Rofoosheh (OffRL), 18 Jan 2025

RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval
Di Liu, Meng Chen, Baotong Lu, Huiqiang Jiang, Zhenhua Han, ..., K. Zhang, C. L. P. Chen, Fan Yang, Y. Yang, Lili Qiu, 03 Jan 2025

TOTEM: TOkenized Time Series EMbeddings for General Time Series Analysis
Sabera Talukder, Yisong Yue, Georgia Gkioxari (AI4TS), 03 Jan 2025

A Simple Recipe for Contrastively Pre-training Video-First Encoders Beyond 16 Frames
Pinelopi Papalampidi, Skanda Koppula, Shreya Pathak, Justin T Chiu, Joseph Heyward, Viorica Patraucean, Jiajun Shen, Antoine Miech, Andrew Zisserman, Aida Nematzdeh (VLM), 31 Dec 2024

DRDM: A Disentangled Representations Diffusion Model for Synthesizing Realistic Person Images
Enbo Huang, Yuan Zhang, Faliang Huang, Guangyu Zhang, Y. Liu (DiffM), 25 Dec 2024

Does Self-Attention Need Separate Weights in Transformers?
Md. Kowsher, Nusrat Jahan Prottasha, Chun-Nam Yu, O. Garibay, Niloofar Yousefi, 30 Nov 2024

MAS-Attention: Memory-Aware Stream Processing for Attention Acceleration on Resource-Constrained Edge Devices
Mohammadali Shakerdargah, Shan Lu, Chao Gao, Di Niu, 20 Nov 2024

PSformer: Parameter-efficient Transformer with Segment Attention for Time Series Forecasting
Yanlong Wang, J. Xu, Fei Ma, Shao-Lun Huang, Danny Dongning Sun, Xiao-Ping Zhang (AI4TS), 03 Nov 2024

RAM: Replace Attention with MLP for Efficient Multivariate Time Series Forecasting
Suhan Guo, Jiahong Deng, Yi Wei, Hui Dou, F. Shen, Jian Zhao (AI4TS), 31 Oct 2024

LiNo: Advancing Recursive Residual Decomposition of Linear and Nonlinear Patterns for Robust Time Series Forecasting
Guoqi Yu, Yaoming Li, Xiaoyu Guo, Dayu Wang, Zirui Liu, Shujun Wang, Tong Yang (AI4TS), 22 Oct 2024

Do Robot Snakes Dream like Electric Sheep? Investigating the Effects of Architectural Inductive Biases on Hallucination
Jerry Huang, Prasanna Parthasarathi, Mehdi Rezagholizadeh, Boxing Chen, Sarath Chandar, 22 Oct 2024

TimeMixer++: A General Time Series Pattern Machine for Universal Predictive Analysis
Shiyu Wang, Jiawei Li, X. Shi, Zhou Ye, Baichuan Mo, Wenze Lin, Shengtong Ju, Zhixuan Chu, Ming Jin (AI4TS), 21 Oct 2024

Rethinking Transformer for Long Contextual Histopathology Whole Slide Image Analysis
Honglin Li, Yunlong Zhang, Pingyi Chen, Zhongyi Shui, Chenglu Zhu, Lin Yang (MedIm), 18 Oct 2024

SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs
Yizhao Gao, Zhichen Zeng, Dayou Du, Shijie Cao, Hayden Kwok-Hay So, ..., Junjie Lai, Mao Yang, Ting Cao, Fan Yang, M. Yang, 17 Oct 2024

In-context KV-Cache Eviction for LLMs via Attention-Gate
Zihao Zeng, Bokai Lin, Tianqi Hou, Hao Zhang, Zhijie Deng, 15 Oct 2024

HART: Efficient Visual Generation with Hybrid Autoregressive Transformer
Haotian Tang, Yecheng Wu, Shang Yang, Enze Xie, Junsong Chen, Junyu Chen, Zhuoyang Zhang, Han Cai, Y. Lu, Song Han, 14 Oct 2024

Token Pruning using a Lightweight Background Aware Vision Transformer
Sudhakar Sah, Ravish Kumar, Honnesh Rohmetra, Ehsan Saboori (ViT), 12 Oct 2024

Joint Fine-tuning and Conversion of Pretrained Speech and Language Models towards Linear Complexity
Mutian He, Philip N. Garner, 09 Oct 2024

Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective
Jinhao Li, Jiaming Xu, Shan Huang, Yonghua Chen, Wen Li, ..., Jiayi Pan, Li Ding, Hao Zhou, Yu Wang, Guohao Dai, 06 Oct 2024

S7: Selective and Simplified State Space Layers for Sequence Modeling
Taylan Soydan, Nikola Zubić, Nico Messikommer, Siddhartha Mishra, Davide Scaramuzza, 04 Oct 2024

Local Attention Mechanism: Boosting the Transformer Architecture for Long-Sequence Time Series Forecasting
Ignacio Aguilera-Martos, Andrés Herrera-Poyatos, Julián Luengo, Francisco Herrera (AI4TS), 04 Oct 2024

Oscillatory State-Space Models
T. Konstantin Rusch, Daniela Rus (AI4TS), 04 Oct 2024

Tuning Frequency Bias of State Space Models
Annan Yu, Dongwei Lyu, S. H. Lim, Michael W. Mahoney, N. Benjamin Erichson, 02 Oct 2024

FlashMask: Efficient and Rich Mask Extension of FlashAttention
Guoxia Wang, Jinle Zeng, Xiyuan Xiao, Siming Wu, Jiabin Yang, Lujing Zheng, Zeyu Chen, Jiang Bian, Dianhai Yu, Haifeng Wang, 02 Oct 2024

Training Ultra Long Context Language Model with Fully Pipelined Distributed Transformer
Jinghan Yao, Sam Ade Jacobs, Masahiro Tanaka, Olatunji Ruwase, A. Shafi, D. Panda, 30 Aug 2024

Ex3: Automatic Novel Writing by Extracting, Excelsior and Expanding
Lei Huang, Jiaming Guo, Guanhua He, Xishan Zhang, Rui Zhang, Shaohui Peng, Shaoli Liu, Tianshi Chen, 16 Aug 2024

Sampling Foundational Transformer: A Theoretical Perspective
Viet Anh Nguyen, Minh Lenhat, Khoa Nguyen, Duong Duc Hieu, Dao Huu Hung, Truong Son-Hy, 11 Aug 2024