Accurate hyperspectral image (HSI) interpretation is critical for providing valuable insights into various earth observation-related applications such as urban planning, precision agriculture, and environmental monitoring. However, existing HSI processing methods are predominantly task-specific and scene-dependent, which severely limits their ability to transfer knowledge across tasks and scenes and thereby reduces their practicality in real-world applications. To address these challenges, we present HyperSIGMA, a vision transformer-based foundation model, scalable to over one billion parameters, that unifies HSI interpretation across tasks and scenes. To overcome the spectral and spatial redundancy inherent in HSIs, we introduce a novel sparse sampling attention (SSA) mechanism, which effectively promotes the learning of diverse contextual features and serves as the basic block of HyperSIGMA. HyperSIGMA integrates spatial and spectral features using a specially designed spectral enhancement module. In addition, we construct a large-scale hyperspectral dataset, HyperGlobal-450K, for pre-training, which contains about 450K hyperspectral images, significantly surpassing existing datasets in scale. Extensive experiments on various high-level and low-level HSI tasks demonstrate HyperSIGMA's versatility and superior representational capability compared to current state-of-the-art methods. Moreover, HyperSIGMA shows significant advantages in scalability, robustness, cross-modal transfer capability, real-world applicability, and computational efficiency. The code and models will be released at this https URL.


@article{wang2025_2406.11519,
  title={HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model},
  author={Di Wang and Meiqi Hu and Yao Jin and Yuchun Miao and Jiaqi Yang and Yichu Xu and Xiaolei Qin and Jiaqi Ma and Lingyu Sun and Chenxing Li and Chuan Fu and Hongruixuan Chen and Chengxi Han and Naoto Yokoya and Jing Zhang and Minqiang Xu and Lin Liu and Lefei Zhang and Chen Wu and Bo Du and Dacheng Tao and Liangpei Zhang},
  journal={arXiv preprint arXiv:2406.11519},
  year={2025}
}
