Transformer-based Modeling of Physical Systems: Improved Latent Representations
- AI4TS
Many phenomena in physics and engineering require highly flexible models and come with ample data for fitting them. However, this data is often irregularly sampled and cannot be processed as is by standard deep learning architectures. We propose a transformer-based model for forecasting physical processes at arbitrary spatial points, given information on a related process at possibly different points. This architecture is particularly well suited to high-altitude wind forecasting, because it can leverage large volumes of data recorded along plane trajectories, which are sparse in space. We evaluate the model at different scales on two dynamical systems previously studied in the literature: the Poisson equation and the Darcy flow equation. In both cases, our transformer-based model outperforms alternative methods. We hypothesize that this superior performance is due to a more flexible latent representation. To support this hypothesis, we design a simple synthetic experiment showing that the latent representation of the other models suffers from excessive bottlenecking that, in some cases, prevents efficient use of the information and slows training.
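As an illustration of the general idea of predicting at arbitrary points from irregularly placed observations, the following minimal sketch uses a single cross-attention step in which each query location attends over all observed (location, value) pairs. It is not the architecture of the paper: the function names, the single-head layout, the dimensions, and the random untrained weights are all assumptions made for the example.

```python
import math
import random


def linear(x, w):
    """Multiply a vector x (length d_in) by a weight matrix w (d_in x d_out)."""
    return [sum(xi * w[i][j] for i, xi in enumerate(x)) for j in range(len(w[0]))]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rng, rows, cols):
    """Random (untrained) weights, scaled by 1/sqrt(rows)."""
    return [[rng.gauss(0.0, 1.0 / math.sqrt(rows)) for _ in range(cols)]
            for _ in range(rows)]


def cross_attention(query_points, context_points, context_values, params):
    """Hypothetical single-head cross-attention from query locations to observations.

    Each observation becomes one token (x, y, value), so the number and
    placement of observations is free to vary between samples.
    """
    w_q, w_k, w_v = params
    tokens = [list(p) + [v] for p, v in zip(context_points, context_values)]
    keys = [linear(t, w_k) for t in tokens]
    vals = [linear(t, w_v) for t in tokens]
    d = len(keys[0])

    outputs, weights = [], []
    for q in query_points:
        q_vec = linear(list(q), w_q)
        # Scaled dot-product score between this query and every observation.
        scores = [sum(a * b for a, b in zip(q_vec, k)) / math.sqrt(d) for k in keys]
        attn = softmax(scores)
        # Attention-weighted average of the value vectors.
        out = [sum(a * v[j] for a, v in zip(attn, vals)) for j in range(d)]
        outputs.append(out)
        weights.append(attn)
    return outputs, weights


rng = random.Random(0)
d_model = 4  # assumed latent width, chosen only for the example
params = (
    random_matrix(rng, 2, d_model),  # query projection: (x, y) -> d_model
    random_matrix(rng, 3, d_model),  # key projection: (x, y, value) -> d_model
    random_matrix(rng, 3, d_model),  # value projection: (x, y, value) -> d_model
)

# Five irregularly placed observations of a toy field, and three query points.
context_points = [(rng.random(), rng.random()) for _ in range(5)]
context_values = [math.sin(x) * math.cos(y) for x, y in context_points]
query_points = [(0.25, 0.75), (0.9, 0.1), (0.5, 0.5)]

outputs, weights = cross_attention(query_points, context_points, context_values, params)
```

Because every observation is kept as its own token, the latent representation in this sketch grows with the number of observations instead of being compressed into a fixed-size vector, which is one simple way to picture the contrast with a bottlenecked representation.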