ZipR1: Reinforcing Token Sparsity in MLLMs
Comments: 9 pages main text, 3 pages bibliography, 3 pages appendix; 7 figures, 8 tables
Abstract
Sparse attention mechanisms aim to reduce computational overhead by selectively processing a subset of salient tokens while preserving model performance. Despite the effectiveness of such designs, how to actively encourage token sparsity in well-trained MLLMs remains under-explored, which fundamentally limits the acceleration achievable at inference time. In this paper, we propose a simple RL-based post-training method named ZipR1 that treats the token reduction ratio as an efficiency reward and answer accuracy as a performance reward.
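The abstract describes a reward that combines a token-reduction (efficiency) term with an answer-accuracy (performance) term. A minimal sketch of such a combined reward is shown below; the function name, the linear weighting scheme, and the `alpha` parameter are illustrative assumptions, not the paper's exact formulation.

```python
def combined_reward(kept_tokens: int, total_tokens: int,
                    answer_correct: bool, alpha: float = 0.5) -> float:
    """Illustrative combined reward: weighted sum of an efficiency term
    (fraction of tokens dropped) and a performance term (answer accuracy).
    The linear combination and alpha weight are assumptions for this sketch."""
    efficiency = 1.0 - kept_tokens / total_tokens  # token reduction ratio
    accuracy = 1.0 if answer_correct else 0.0      # binary answer reward
    return alpha * efficiency + (1.0 - alpha) * accuracy

# Example: keeping half the tokens with a correct answer
# yields an efficiency term of 0.5 and an accuracy term of 1.0.
r = combined_reward(kept_tokens=50, total_tokens=100, answer_correct=True)
```

In an RL post-training loop (e.g., a GRPO- or PPO-style setup), this scalar would score each sampled rollout, so the policy is pushed toward dropping more tokens only when answers remain correct.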
