QuarterMap: Efficient Post-Training Token Pruning for Visual State Space Models

13 July 2025

Tien-Yu Chi

Hung-Yueh Chiang

Diana Marculescu

Kai-Chiang Wu

Mamba

Main:4 Pages

7 Figures

Bibliography:3 Pages

9 Tables

Appendix:7 Pages

Abstract

State space models (SSMs) reduce the quadratic complexity of transformers by leveraging linear recurrence. Recently, VMamba has emerged as a strong SSM-based vision backbone, yet it remains bottlenecked by spatial redundancy in its four-directional scan. We propose QuarterMap, a post-training activation pruning method that removes redundant spatial activations before scanning and restores the original dimensions via nearest-neighbor upsampling. Our method improves throughput without retraining. On ImageNet-1K, QuarterMap achieves a speedup of up to 11% on VMamba with an accuracy drop of less than 0.9%, and it yields similar gains on ADE20K segmentation. Beyond VMamba, we validate QuarterMap on MedMamba, a domain-specific model that shares the same four-directional scanning structure, where it consistently improves throughput while preserving accuracy across multiple medical imaging tasks. Compared to token merging methods like ToMe, QuarterMap is tailored for SSMs and avoids costly merge-unmerge operations. Our method offers a plug-and-play tool for deployment-time efficiency without compromising transferability.
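
The prune-scan-restore idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the pruning is a stride-2 spatial subsampling that keeps one activation out of every four (an assumption suggested by the name QuarterMap, not stated in the abstract), and it uses a hypothetical `scan` callable as a stand-in for the four-directional scan. All function names are illustrative only.

```python
from typing import Callable, List

Grid = List[List[float]]  # a single-channel H x W activation map


def downsample_stride2(x: Grid) -> Grid:
    """Keep every second row and column (assumed pruning pattern)."""
    return [row[::2] for row in x[::2]]


def upsample_nearest(x: Grid, height: int, width: int) -> Grid:
    """Nearest-neighbor upsampling back to (height, width)."""
    return [[x[i // 2][j // 2] for j in range(width)] for i in range(height)]


def prune_scan_restore(x: Grid, scan: Callable[[Grid], Grid]) -> Grid:
    """Prune activations, run the (placeholder) scan, restore the shape."""
    height, width = len(x), len(x[0])
    pruned = downsample_stride2(x)   # the scan now sees about 1/4 of the activations
    scanned = scan(pruned)           # hypothetical stand-in for the 4-directional scan
    return upsample_nearest(scanned, height, width)


# Toy 4 x 4 map with values 0..15 and an identity "scan".
feature_map = [[float(4 * r + c) for c in range(4)] for r in range(4)]
restored = prune_scan_restore(feature_map, lambda g: g)
for row in restored:
    print(row)
```

With an identity scan, each 2 x 2 block of the restored map repeats its top-left activation, which shows where spatial detail is lost in exchange for a shorter scan.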
