ICE-Pruning: An Iterative Cost-Efficient Pruning Pipeline for Deep Neural Networks

Main: 7 Pages
3 Figures
Bibliography: 1 Page
3 Tables
Abstract

Pruning is a widely used method for compressing Deep Neural Networks (DNNs), in which less relevant parameters are removed from a model to reduce its size. However, removing parameters reduces model accuracy, so pruning is typically combined with fine-tuning, and sometimes with other operations such as weight rewinding, to recover accuracy. A common approach is to prune and fine-tune repeatedly, with more of the model's parameters removed at each step. While straightforward to implement, pruning pipelines that follow this approach are computationally expensive because fine-tuning must be repeated after every pruning step.
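
The common prune-then-fine-tune loop described above can be illustrated with a minimal sketch. This is the baseline approach the abstract criticizes, not the ICE-Pruning pipeline itself. Several things here are assumptions made only for illustration: the magnitude-based pruning criterion (the abstract says only "less relevant parameters"), the toy linear model trained with plain SGD, the sparsity schedule, and all function names (`magnitude_mask`, `fine_tune`, `iterative_prune`).

```python
import random


def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask that zeroes the `sparsity` fraction of
    smallest-magnitude weights (assumed pruning criterion)."""
    n_prune = int(round(sparsity * len(weights)))
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    mask = [1.0] * len(weights)
    for i in order[:n_prune]:
        mask[i] = 0.0
    return mask


def fine_tune(weights, mask, data, lr=0.05, epochs=50):
    """Plain SGD on squared error for a toy linear model.
    Multiplying by the mask keeps pruned weights at zero."""
    w = [wi * mi for wi, mi in zip(weights, mask)]
    for _ in range(epochs):
        for x, y in data:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            w = [(wi - lr * err * xi) * mi for wi, xi, mi in zip(w, x, mask)]
    return w


def iterative_prune(weights, data, schedule):
    """Prune to each target sparsity in turn, fine-tuning after every step.
    Returns the final weights and the number of fine-tuning runs performed."""
    fine_tune_runs = 0
    for sparsity in schedule:
        mask = magnitude_mask(weights, sparsity)
        weights = fine_tune(weights, mask, data)
        fine_tune_runs += 1
    return weights, fine_tune_runs


# Toy data: targets come from a sparse linear model (hypothetical values).
random.seed(0)
true_w = [2.0, 0.0, -1.5, 0.0, 0.5, 0.0]
data = []
for _ in range(40):
    x = [random.uniform(-1.0, 1.0) for _ in true_w]
    data.append((x, sum(t * xi for t, xi in zip(true_w, x))))

init_w = [random.uniform(-1.0, 1.0) for _ in true_w]
pruned_w, runs = iterative_prune(init_w, data, schedule=[0.2, 0.4, 0.5])
print("fine-tuning runs:", runs)
print("pruned weights:", pruned_w)
```

Each entry in the schedule triggers one full fine-tuning run, so the cost grows with the number of pruning steps. That repeated fine-tuning is the expense the abstract points to.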