NeuronSpark: A Spiking Neural Network Language Model with Selective State Space Dynamics

Zhengzheng Tang
Main: 10 pages · 6 figures · 7 tables · Bibliography: 2 pages · Appendix: 1 page
Abstract

We ask whether a pure spiking backbone can learn large-scale language modeling from random initialization, without Transformer distillation. We introduce NeuronSpark, a 0.9B-parameter spiking neural network (SNN) language model trained with next-token prediction and surrogate gradients. The model combines selective state-space spiking dynamics, leakage-current inter-layer communication, PonderNet adaptive timesteps, fused Triton PLIF kernels, and stabilization techniques (residual centering, lateral-inhibition normalization, and natural-gradient compensation). Under a constrained budget (about 1.4B pretraining tokens and 6.5K SFT steps), NeuronSpark-0.9B reaches a pretraining loss of 3.6 and exhibits early multi-turn dialogue behavior after SFT. These results support the feasibility of end-to-end language modeling with a pure SNN architecture at this scale.
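
As a rough illustration of the spiking unit and surrogate-gradient training the abstract refers to, the sketch below (plain PyTorch, with assumed class and variable names, not the authors' fused Triton implementation) shows a parametric leaky integrate-and-fire (PLIF) neuron whose hard threshold is made differentiable through a sigmoid-shaped surrogate derivative. The selective state-space dynamics, leakage-current communication, PonderNet halting, and stabilization layers described in the paper are not reproduced here.

    # Minimal illustrative sketch, not the authors' code.
    import torch
    import torch.nn as nn

    class SurrogateSpike(torch.autograd.Function):
        # Heaviside step in the forward pass; sigmoid-shaped surrogate
        # derivative in the backward pass so gradients flow through spikes.
        alpha = 4.0  # surrogate sharpness (assumed value)

        @staticmethod
        def forward(ctx, v):
            ctx.save_for_backward(v)
            return (v >= 0.0).to(v.dtype)

        @staticmethod
        def backward(ctx, grad_output):
            (v,) = ctx.saved_tensors
            sig = torch.sigmoid(SurrogateSpike.alpha * v)
            # d/dv sigmoid(alpha * v) = alpha * sig * (1 - sig)
            return grad_output * SurrogateSpike.alpha * sig * (1.0 - sig)

    class PLIFNeuron(nn.Module):
        # Leaky integrate-and-fire neuron with a learnable leak (the
        # "parametric" part of PLIF): v[t] = v[t-1] + (x[t] - v[t-1]) / tau,
        # emitting a spike and resetting whenever v crosses threshold 1.0.
        def __init__(self):
            super().__init__()
            self.w = nn.Parameter(torch.tensor(0.0))  # controls 1/tau

        def forward(self, x):  # x: (time, batch, features)
            v = torch.zeros_like(x[0])
            inv_tau = torch.sigmoid(self.w)  # 1/tau constrained to (0, 1)
            spikes = []
            for t in range(x.shape[0]):
                v = v + (x[t] - v) * inv_tau        # leaky integration
                s = SurrogateSpike.apply(v - 1.0)   # fire at threshold 1.0
                v = v * (1.0 - s)                   # hard reset after a spike
                spikes.append(s)
            return torch.stack(spikes)

    if __name__ == "__main__":
        neuron = PLIFNeuron()
        out = neuron(torch.randn(8, 2, 16))  # 8 timesteps, batch 2, 16 features
        print(out.shape, out.mean().item())  # spike tensor shape and mean firing rate

In practice such a per-timestep Python loop is slow, which is presumably why the paper fuses the PLIF update into Triton kernels; this sketch only conveys the neuron dynamics and the surrogate-gradient trick.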
