STDiff: A State Transition Diffusion Framework for Time Series Imputation in Industrial Systems

Incomplete sensor data is a major obstacle in industrial time-series analytics. In wastewater treatment plants (WWTPs), key sensors show long, irregular gaps caused by fouling, maintenance, and outages. We introduce STDiff and STDiff-W, diffusion-based imputers that cast gap filling as state-space simulation under partial observability, where targets, controls, and exogenous signals may all be intermittently missing. STDiff learns a one-step transition model conditioned on observed values and masks, while STDiff-W extends this with a context encoder that jointly inpaints contiguous blocks, combining long-range consistency with short-term detail. On two WWTP datasets (Agtrup, with synthetic block gaps, and Avedøre, with natural outages), STDiff-W achieves state-of-the-art accuracy compared with strong neural baselines such as SAITS, BRITS, and CSDI. Beyond point-error metrics, its reconstructions preserve realistic dynamics, including oscillations, spikes, and regime shifts, and they achieve top or tied-top downstream one-step forecasting performance relative to the same baselines, indicating that preserving dynamics does not come at the expense of predictive utility. Ablation studies that drop, shuffle, or add noise to control or exogenous inputs consistently degrade NH4 and PO4 performance, with the largest deterioration observed when exogenous signals are removed, showing that the model captures meaningful dependencies. We conclude with practical guidance for deployment: evaluate performance beyond MAE using task-oriented and visual checks, include exogenous drivers, and balance computational cost against robustness to structured outages.
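To make the idea of gap filling as state-space simulation concrete, the following is a minimal sketch of a masked one-step rollout. It is not the STDiff model: the learned diffusion transition is replaced by a hypothetical placeholder, `persistence_step`, and the names `rollout_impute` and `step_fn`, the mask convention (True = observed), and the column-mean initialization are all assumptions made for illustration.

```python
import numpy as np


def persistence_step(prev_state, obs_t, mask_t):
    """Hypothetical stand-in for a learned one-step transition model.

    A real model would condition on obs_t and mask_t; this placeholder
    simply carries the previous state forward.
    """
    return prev_state.copy()


def rollout_impute(x, mask, step_fn):
    """Fill missing entries by simulating forward one step at a time.

    x       : (T, D) array; entries where mask is False may be NaN.
    mask    : (T, D) boolean array, True = observed, False = missing.
    step_fn : callable (prev_state, obs_t, mask_t) -> predicted state (D,).
    """
    x = np.asarray(x, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    obs = np.where(mask, x, 0.0)  # zero out missing entries (drops NaNs)

    # Assumed initialization: column means of the observed values.
    counts = np.maximum(mask.sum(axis=0), 1)
    col_means = obs.sum(axis=0) / counts

    filled = np.empty_like(obs)
    filled[0] = np.where(mask[0], obs[0], col_means)

    for t in range(1, len(obs)):
        pred = step_fn(filled[t - 1], obs[t], mask[t])
        # Observed values always override the simulated state.
        filled[t] = np.where(mask[t], obs[t], pred)
    return filled


# Toy series with two channels and a short gap in each.
x = np.array([[1.0, 10.0], [np.nan, 20.0], [np.nan, np.nan], [4.0, 40.0]])
mask = ~np.isnan(x)
filled = rollout_impute(x, mask, persistence_step)
```

In this sketch the simulated state is used only where the mask marks a value as missing, so observed targets, controls, or exogenous signals are never overwritten. Swapping `persistence_step` for a learned transition would change the values inside the gaps but not the rollout logic.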