ICE-Pruning: An Iterative Cost-Efficient Pruning Pipeline for Deep Neural Networks

12 May 2025

Wenhao Hu

Paul Henderson

José Cano

7 pages (main), 1 page (bibliography), 3 figures, 3 tables

Abstract

Pruning is a widely used method for compressing Deep Neural Networks (DNNs), in which less relevant parameters are removed from a DNN model to reduce its size. However, removing parameters reduces model accuracy, so pruning is typically combined with fine-tuning, and sometimes with other operations such as rewinding weights, to recover accuracy. A common approach is to repeatedly prune and then fine-tune, removing a larger share of the model's parameters at each step. While straightforward to implement, pruning pipelines that follow this approach are computationally expensive because fine-tuning must be repeated after every pruning step.
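
To make the cost of the common prune-then-fine-tune loop concrete, the following is a minimal sketch of that baseline approach only, not of the ICE-Pruning pipeline itself. Everything in it is an illustrative assumption: the toy linear model, the magnitude-based pruning criterion, the plain SGD fine-tuning, the sparsity schedule, and all function names (`magnitude_mask`, `fine_tune`, `iterative_prune`) are hypothetical and do not come from the paper.

```python
import random

def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask that zeroes the smallest-magnitude weights.

    Assumption: magnitude is used as the relevance criterion.
    """
    n_prune = int(round(sparsity * len(weights)))
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])
    return [0.0 if i in pruned else 1.0 for i in range(len(weights))]

def fine_tune(weights, mask, data, lr=0.05, epochs=20):
    """SGD on squared error for a linear model; masked weights stay at zero."""
    w = [wi * m for wi, m in zip(weights, mask)]
    for _ in range(epochs):
        for x, y in data:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            w = [(wi - lr * err * xi) * m for wi, xi, m in zip(w, x, mask)]
    return w

def iterative_prune(weights, data, sparsities, epochs=20):
    """Prune to each target sparsity in turn, fine-tuning after every step."""
    total_epochs = 0
    for s in sparsities:
        mask = magnitude_mask(weights, s)
        weights = fine_tune(weights, mask, data, epochs=epochs)
        total_epochs += epochs  # the cost grows with the number of steps
    return weights, total_epochs

# Toy regression data (hypothetical, for illustration only).
random.seed(0)
true_w = [2.0, -1.5, 1.0, 0.5, 0.1, -0.1, 0.05, 0.0]
data = []
for _ in range(50):
    x = [random.uniform(-1, 1) for _ in true_w]
    y = sum(t * xi for t, xi in zip(true_w, x))
    data.append((x, y))

# Stand-in for a trained dense model: train once with nothing masked.
init_w = [random.uniform(-1, 1) for _ in true_w]
dense_w = fine_tune(init_w, [1.0] * len(init_w), data)

final_w, total_epochs = iterative_prune(dense_w, data, [0.25, 0.5, 0.75])
print("pruned weights:", final_w)
print("fine-tuning epochs spent:", total_epochs)
```

In this sketch the fine-tuning budget scales linearly with the number of pruning steps, which is the expense the abstract attributes to pipelines of this kind.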
