
ZipR1: Reinforcing Token Sparsity in MLLMs

Main: 9 pages, 7 figures, 8 tables; Bibliography: 3 pages; Appendix: 3 pages
Abstract

Sparse attention mechanisms aim to reduce computational overhead by selectively processing a subset of salient tokens while preserving model performance. Despite the effectiveness of such designs, how to actively encourage token sparsity in well-posed MLLMs remains under-explored, which fundamentally limits the acceleration achievable during inference. In this paper, we propose a simple RL-based post-training method named ZipR1 that treats the token reduction ratio as the efficiency reward and answer accuracy as the performance reward.
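The abstract describes a two-term reward combining efficiency (token reduction ratio) and performance (answer accuracy). Below is a minimal sketch of such a reward in Python; the function name `zipr1_reward`, the weighting parameter `alpha`, and the exact-match accuracy check are illustrative assumptions, since the abstract does not specify the paper's precise formulation.

```python
def zipr1_reward(answer: str, reference: str, kept_tokens: int,
                 total_tokens: int, alpha: float = 0.5) -> float:
    """Combined reward: answer accuracy plus token-reduction ratio.

    `alpha` is a hypothetical balance coefficient; the paper's actual
    weighting and accuracy metric are not given in the abstract.
    """
    # Performance reward: 1 if the generated answer matches the reference.
    accuracy = 1.0 if answer.strip() == reference.strip() else 0.0
    # Efficiency reward: fraction of tokens pruned from the input.
    sparsity = 1.0 - kept_tokens / total_tokens
    return alpha * accuracy + (1.0 - alpha) * sparsity
```

In an RL post-training loop, this scalar would score each rollout so the policy is pushed toward answers that stay correct while attending to fewer tokens.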
