We present a self-supervised framework that learns population-level codes for arbitrary ensembles of neural recordings at scale. We address key challenges in scaling models with neural time-series data, namely, sparse and variable electrode distribution across subjects and datasets. The Population Transformer (PopT) stacks on top of pretrained temporal embeddings and enhances downstream decoding by enabling learned aggregation of multiple spatially-sparse data channels. The pretrained PopT lowers the amount of data required for downstream decoding experiments, while increasing accuracy, even on held-out subjects and tasks. Compared to end-to-end methods, this approach is computationally lightweight, while achieving similar or better decoding performance. We further show how our framework is generalizable to multiple time-series embeddings and neural data modalities. Beyond decoding, we interpret the pretrained and fine-tuned PopT models to show how they can be used to extract neuroscience insights from large amounts of data. We release our code as well as a pretrained PopT to enable off-the-shelf improvements in multi-channel intracranial data decoding and interpretability. Code is available atthis https URL.
View on arXiv@article{chau2025_2406.03044, title={ Population Transformer: Learning Population-level Representations of Neural Activity }, author={ Geeling Chau and Christopher Wang and Sabera Talukder and Vighnesh Subramaniam and Saraswati Soedarmadji and Yisong Yue and Boris Katz and Andrei Barbu }, journal={arXiv preprint arXiv:2406.03044}, year={ 2025 } }