A Novel Structure-Agnostic Multi-Objective Approach for Weight-Sharing Compression in Deep Neural Networks

6 January 2025
Rasa Khosrowshahli
Shahryar Rahnamayan
Beatrice Ombuki-Berman
Abstract

Deep neural networks store millions to billions of weights after training, making memory-intensive models challenging to deploy on embedded devices. Weight sharing is a popular compression approach that uses a small set of weight values and shares them across connections in the network. In this paper, we propose a multi-objective evolutionary algorithm (MOEA) based compression framework that is independent of neural network architecture, dimension, task, and dataset. We use uniformly sized bins to quantize network weights into a single codebook (lookup table) for efficient weight representation. Using the MOEA, we search for Pareto-optimal $k$ bins by optimizing two objectives. We then apply an iterative merge technique to the non-dominated Pareto-frontier solutions, combining neighboring bins without degrading performance, in order to decrease the number of bins and increase the compression ratio. Our approach is model- and layer-independent, meaning that weights from any layer may be mixed within a cluster, and the uniform quantization used in this work has $O(N)$ complexity, in contrast to non-uniform quantization methods such as k-means with $O(Nkt)$ complexity. In addition, we use the cluster centers as the shared weight values instead of retraining the shared weights, which is computationally expensive. The advantage of evolutionary multi-objective optimization is that it yields non-dominated Pareto-frontier solutions with respect to performance and the number of shared weights. Experimental results show that the proposed framework reduces neural network memory by $13.72 \sim 14.98\times$ on CIFAR-10, $11.61 \sim 12.99\times$ on CIFAR-100, and $7.44 \sim 8.58\times$ on ImageNet, demonstrating the effectiveness of the proposed deep neural network compression framework.
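As a rough illustration of the pipeline sketched in the abstract (uniform binning of all weights into a single shared codebook, bin centers reused as the shared values, and an iterative merge of neighboring bins that does not degrade a quality measure), the following NumPy snippet is a minimal mock-up. It is not the authors' implementation: the function names, the reconstruction-error proxy standing in for the real task objective, and the merge tolerance are illustrative assumptions.

```python
# Illustrative sketch (not the paper's code): structure-agnostic weight sharing
# via uniform binning into one codebook, plus a greedy neighbor-bin merge.
# `uniform_codebook` and `merge_neighbor_bins` are hypothetical helper names.
import numpy as np

def uniform_codebook(weights, k):
    """Quantize a flat weight vector into k equal-width bins (O(N)).

    Returns an integer code per weight and a codebook of bin centers,
    which serve as the shared weight values (no retraining of shared weights).
    """
    w_min, w_max = weights.min(), weights.max()
    edges = np.linspace(w_min, w_max, k + 1)
    # np.digitize maps each weight to a bin index in [1, k+1]; shift and clip to [0, k-1].
    codes = np.clip(np.digitize(weights, edges) - 1, 0, k - 1)
    centers = (edges[:-1] + edges[1:]) / 2.0  # shared values = bin centers
    return codes, centers

def merge_neighbor_bins(codes, centers, eval_fn, tol=0.0):
    """Greedily merge adjacent bins while the evaluation score does not drop
    by more than `tol` (a stand-in for the paper's iterative merge step)."""
    baseline = eval_fn(centers[codes])
    i = 0
    while i < len(centers) - 1:
        merged_centers = np.delete(centers, i + 1)
        merged_centers[i] = (centers[i] + centers[i + 1]) / 2.0
        merged_codes = np.where(codes > i, codes - 1, codes)
        if eval_fn(merged_centers[merged_codes]) >= baseline - tol:
            centers, codes = merged_centers, merged_codes  # accept the merge
        else:
            i += 1  # keep this boundary and try the next pair
    return codes, centers

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=10_000).astype(np.float32)  # weights pooled across all layers
    codes, centers = uniform_codebook(w, k=64)
    # Toy "performance" proxy: negative reconstruction error of the shared weights.
    score = lambda shared: -np.mean((shared - w) ** 2)
    codes, centers = merge_neighbor_bins(codes, centers, score, tol=1e-4)
    print(f"bins after merge: {len(centers)}, "
          f"mse: {np.mean((centers[codes] - w) ** 2):.6f}")
```

In the paper, the merge is applied to the non-dominated solutions returned by the MOEA and performance is measured on the actual task (e.g., classification accuracy), not on a reconstruction-error proxy as in this toy example.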
