
Budgeted Broadcast: An Activity-Dependent Pruning Rule for Neural Network Efficiency

Main: 10 pages, 8 figures, 1 table
Appendix: 3 pages
Abstract

Most pruning methods remove parameters ranked by impact on loss (e.g., magnitude or gradient). We propose Budgeted Broadcast (BB), which gives each unit a local traffic budget (the product of its long-term on-rate $a_i$ and fan-out $k_i$). A constrained-entropy analysis shows that maximizing coding entropy under a global traffic budget yields a selectivity-audience balance, $\log\frac{1-a_i}{a_i}=\beta k_i$. BB enforces this balance with simple local actuators that prune either fan-in (to lower activity) or fan-out (to reduce broadcast). In practice, BB increases coding entropy and decorrelation and improves accuracy at matched sparsity across Transformers for ASR, ResNets for face identification, and 3D U-Nets for synapse prediction, sometimes exceeding dense baselines. On electron microscopy images, it attains state-of-the-art F1 and PR-AUC under our evaluation protocol. BB is easy to integrate and suggests a path toward learning more diverse and efficient representations.
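The balance condition and the local actuators can be illustrated with a minimal sketch. This is not the authors' implementation: the data layout (units as dicts with an on-rate and connection lists), the tolerance parameter, and the prune-one-edge-at-a-time policy are all assumptions made here for clarity; only the residual formula follows the balance equation from the abstract.

```python
import math

def bb_prune_step(units, beta, tol=0.5):
    """One Budgeted Broadcast-style pruning step (hypothetical sketch).

    Each unit is a dict with:
      'a'       : long-term on-rate in (0, 1)
      'fan_in'  : list of incoming connection ids
      'fan_out' : list of outgoing connection ids

    At the entropy-maximizing optimum, log((1 - a_i) / a_i) = beta * k_i,
    where k_i is the unit's fan-out.
    """
    pruned = []
    for uid, u in units.items():
        a, k = u['a'], len(u['fan_out'])
        # Residual of the selectivity-audience balance condition.
        r = math.log((1.0 - a) / a) - beta * k
        if r < -tol:
            # Unit's traffic exceeds its budget: prune one input
            # (to lower activity) or one output (to reduce broadcast).
            if u['fan_in']:
                pruned.append((uid, 'in', u['fan_in'].pop()))
            elif u['fan_out']:
                pruned.append((uid, 'out', u['fan_out'].pop()))
    return pruned

# Toy usage: unit 0 is far too active for its audience, unit 1 is in balance.
units = {
    0: {'a': 0.90, 'fan_in': ['w0', 'w1'], 'fan_out': ['w2']},
    1: {'a': 0.05, 'fan_in': ['w3'], 'fan_out': ['w4']},
}
actions = bb_prune_step(units, beta=1.0)
# Only unit 0 is pruned (an incoming connection is removed first).
```

In a real training loop, a step like this would run periodically, with `a_i` tracked as a running average of post-activation firing and `beta` tuned to hit the desired global traffic budget.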
