MESSI: A Multi-Elevation Semantic Segmentation Image Dataset of an Urban Environment

This paper presents the Multi-Elevation Semantic Segmentation Image (MESSI) dataset, comprising 2525 images taken by a drone flying over dense urban environments. MESSI is unique in two main respects. First, it contains images captured at various altitudes, allowing us to investigate the effect of depth on semantic segmentation. Second, it includes images taken over several different urban regions (at different altitudes). This variety is important because it captures the visual richness of a drone's 3D flight, which combines horizontal and vertical maneuvers. Each image in MESSI is annotated with the drone's location and orientation and with the camera's intrinsic parameters, so the dataset can be used to train a deep neural network for semantic segmentation or for other applications of interest (e.g., localization, navigation, and tracking). This paper describes the dataset and provides annotation details. It also explains how semantic segmentation was performed using several neural network models and reports several relevant statistics. MESSI will be published in the public domain to serve as an evaluation benchmark for semantic segmentation using images captured by a drone or a similar vehicle flying over a dense urban environment.
@article{pinkovich2025_2505.08589,
  title   = {MESSI: A Multi-Elevation Semantic Segmentation Image Dataset of an Urban Environment},
  author  = {Barak Pinkovich and Boaz Matalon and Ehud Rivlin and Hector Rotstein},
  journal = {arXiv preprint arXiv:2505.08589},
  year    = {2025}
}
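The abstract notes that each image is annotated with the camera's pose and intrinsic parameters, which is what makes tasks such as localization and tracking possible. As a minimal sketch of why these annotations are useful, the snippet below projects a 3D point expressed in the camera frame into pixel coordinates using the standard pinhole model. The function name, the annotation layout, and the intrinsic values (`fx`, `fy`, `cx`, `cy`) are illustrative assumptions, not the dataset's actual format.

```python
def project_point(point_cam, fx, fy, cx, cy):
    """Project a 3D point in camera coordinates to pixel coordinates
    using the standard pinhole model with intrinsics (fx, fy, cx, cy).

    This is a generic sketch; MESSI's actual annotation schema may
    store the intrinsics differently (e.g., as a full 3x3 matrix K).
    """
    x, y, z = point_cam
    if z <= 0:
        # Points at or behind the image plane have no valid projection.
        raise ValueError("point is behind the camera")
    u = fx * x / z + cx  # horizontal pixel coordinate
    v = fy * y / z + cy  # vertical pixel coordinate
    return u, v

# Example: a point 10 m ahead and 1 m to the right of the camera,
# with illustrative intrinsics for a 1280x720 image.
u, v = project_point((1.0, 0.0, 10.0), fx=800.0, fy=800.0, cx=640.0, cy=360.0)
print(u, v)  # 720.0 360.0
```

Combined with the annotated drone location and orientation, the same model can be inverted to associate image pixels with world locations, which is the basis for the localization and tracking applications the abstract mentions.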