BiGSCoder: State Space Model for Code Understanding

We present BiGSCoder, a novel encoder-only bidirectional state-space model (SSM) with a gated architecture, pre-trained for code understanding on a code dataset using masked language modeling. BiGSCoder is built to systematically evaluate the capabilities of SSMs on coding tasks relative to traditional transformer architectures. Through comprehensive experiments across diverse pre-training configurations and code understanding benchmarks, we demonstrate that BiGSCoder outperforms transformer-based models despite using simpler pre-training strategies and far less training data. Our results indicate that BiGSCoder can serve as a more sample-efficient alternative to conventional transformer models. Furthermore, our study shows that SSMs perform better without positional embeddings and can effectively extrapolate to longer sequences during fine-tuning.
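
To make the architectural description concrete, below is a minimal sketch of a bidirectional gated SSM encoder block. It assumes a simplified diagonal linear state-space recurrence and illustrative layer names and dimensions; it is not the paper's actual BiGSCoder implementation or SSM kernel, only an approximation of the general structure (forward and reversed SSM passes fused by multiplicative gating, with no positional embeddings).

import torch
import torch.nn as nn


class DiagonalSSM(nn.Module):
    """Unidirectional diagonal linear SSM: h_t = a * h_{t-1} + b * x_t, y_t = sum(c * h_t)."""

    def __init__(self, d_model: int, d_state: int = 16):
        super().__init__()
        # Learnable diagonal state transition, squashed into (0, 1) for stability.
        self.log_a = nn.Parameter(torch.randn(d_model, d_state))
        self.b = nn.Parameter(torch.randn(d_model, d_state) * 0.1)
        self.c = nn.Parameter(torch.randn(d_model, d_state) * 0.1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        batch, seq_len, _ = x.shape
        a = torch.sigmoid(self.log_a)           # (d_model, d_state)
        h = x.new_zeros(batch, a.shape[0], a.shape[1])
        outputs = []
        for t in range(seq_len):                # naive sequential scan, for clarity only
            h = a * h + self.b * x[:, t, :, None]
            outputs.append((h * self.c).sum(-1))
        return torch.stack(outputs, dim=1)      # (batch, seq_len, d_model)


class BiGatedSSMBlock(nn.Module):
    """Bidirectional gated block: forward and reversed SSM passes combined by gating."""

    def __init__(self, d_model: int, d_state: int = 16):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.fwd_ssm = DiagonalSSM(d_model, d_state)
        self.bwd_ssm = DiagonalSSM(d_model, d_state)
        self.gate = nn.Linear(d_model, d_model)
        self.proj = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        residual = x
        x = self.norm(x)
        fwd = self.fwd_ssm(x)                              # left-to-right context
        bwd = self.bwd_ssm(x.flip(1)).flip(1)              # right-to-left context
        gated = torch.sigmoid(self.gate(x)) * (fwd + bwd)  # multiplicative gating
        return residual + self.proj(gated)


if __name__ == "__main__":
    block = BiGatedSSMBlock(d_model=64)
    tokens = torch.randn(2, 128, 64)   # (batch, seq_len, hidden); no positional embeddings added
    print(block(tokens).shape)         # torch.Size([2, 128, 64])

In an encoder-only setup like the one described, stacking such blocks and training with masked language modeling over code tokens yields a bidirectional representation without attention; the sequential scan here would be replaced by an efficient convolutional or parallel-scan SSM kernel in practice.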
@article{verma2025_2505.01475,
  title   = {BiGSCoder: State Space Model for Code Understanding},
  author  = {Shweta Verma and Abhinav Anand and Mira Mezini},
  journal = {arXiv preprint arXiv:2505.01475},
  year    = {2025}
}