TerraMesh: A Planetary Mosaic of Multimodal Earth Observation Data

Large-scale foundation models in Earth Observation can learn versatile, label-efficient representations by leveraging massive amounts of unlabeled data. However, existing public datasets are often limited in scale, geographic coverage, or sensor variety. We introduce TerraMesh, a new globally diverse, multimodal dataset combining optical, synthetic aperture radar, elevation, and land-cover modalities in an Analysis-Ready Data format. TerraMesh includes over 9 million samples with eight spatiotemporally aligned modalities, enabling large-scale pre-training and fostering robust cross-modal correlation learning. We provide detailed data processing steps, comprehensive statistics, and empirical evidence demonstrating improved model performance when pre-trained on TerraMesh. The dataset will be made publicly available under a permissive license.
@article{blumenstiel2025_2504.11172,
  title={TerraMesh: A Planetary Mosaic of Multimodal Earth Observation Data},
  author={Benedikt Blumenstiel and Paolo Fraccaro and Valerio Marsocci and Johannes Jakubik and Stefano Maurogiovanni and Mikolaj Czerkawski and Rocco Sedona and Gabriele Cavallaro and Thomas Brunschwiler and Juan Bernabe-Moreno and Nicolas Longépé},
  journal={arXiv preprint arXiv:2504.11172},
  year={2025}
}