Deep learning-based multivariate and multistep-ahead traffic forecasting models are typically trained with the mean squared error (MSE) or mean absolute error (MAE) as the loss function in a sequence-to-sequence setting, simply assuming that the errors follow an independent and isotropic Gaussian or Laplacian distributions. However, such assumptions are often unrealistic for real-world traffic forecasting tasks, where the probabilistic distribution of spatiotemporal forecasting is very complex with strong concurrent correlations across both sensors and forecasting horizons in a time-varying manner. In this paper, we model the time-varying distribution for the matrix-variate error process as a dynamic mixture of zero-mean Gaussian distributions. To achieve efficiency, flexibility, and scalability, we parameterize each mixture component using a matrix normal distribution and allow the mixture weight to change and be predictable over time. The proposed method can be seamlessly integrated into existing deep-learning frameworks with only a few additional parameters to be learned. We evaluate the performance of the proposed method on a traffic speed forecasting task and find that our method not only improves model performance but also provides interpretable spatiotemporal correlation structures.
View on arXiv@article{choi2025_2212.06653, title={ Scalable Dynamic Mixture Model with Full Covariance for Probabilistic Traffic Forecasting }, author={ Seongjin Choi and Nicolas Saunier and Vincent Zhihao Zheng and Martin Trepanier and Lijun Sun }, journal={arXiv preprint arXiv:2212.06653}, year={ 2025 } }