RAM: Replace Attention with MLP for Efficient Multivariate Time Series Forecasting

Abstract

Attention-based architectures have become ubiquitous in time series forecasting tasks, including spatio-temporal forecasting (STF) and long-term time series forecasting (LTSF). Yet our understanding of why they are effective remains limited. In this work, we propose a novel pruning strategy, Replace Attention with MLP (RAM), that approximates the attention mechanism using only feedforward layers, residual connections, and layer normalization for temporal and/or spatial modeling in multivariate time series forecasting. Specifically, the Q, K, and V projections, the attention score computation, the dot product between the attention scores and V, and the final output projection can all be removed from attention-based networks without significantly degrading performance, so that the pruned network remains top-tier compared to other SOTA methods. RAM achieves a 62.579% reduction in FLOPs for spatio-temporal models with less than a 2.5% performance drop, and a 42.233% FLOPs reduction for LTSF models with less than a 2% performance drop.
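
Concretely, with the attention sub-layer removed, each temporal or spatial mixing block collapses to a feedforward network with a residual connection and layer normalization. Below is a minimal PyTorch sketch of such a block, assuming a pre-norm layout; the RAMBlock name and the d_model/d_ff parameters are illustrative assumptions, not the authors' released implementation.

import torch
import torch.nn as nn

class RAMBlock(nn.Module):
    # Hypothetical RAM-style block: the attention sub-layer (Q/K/V
    # projections, score computation, the score-V product, and the
    # output projection) is dropped; only LayerNorm, a feedforward
    # network, and a residual connection remain.
    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.ff = nn.Sequential(
            nn.Linear(d_model, d_ff),
            nn.GELU(),
            nn.Linear(d_ff, d_model),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, sequence length, d_model); pre-norm residual update
        return x + self.ff(self.norm(x))

block = RAMBlock(d_model=512, d_ff=2048)
y = block(torch.randn(8, 96, 512))  # e.g., 8 series, 96 time steps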

@article{guo2025_2410.24023,
  title={RAM: Replace Attention with MLP for Efficient Multivariate Time Series Forecasting},
  author={Suhan Guo and Jiahong Deng and Yi Wei and Hui Dou and Furao Shen and Jian Zhao},
  journal={arXiv preprint arXiv:2410.24023},
  year={2025}
}