ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2305.08117
13
1

MBQuant: A Novel Multi-Branch Topology Method for Arbitrary Bit-width Network Quantization

14 May 2023
Yunshan Zhong
Yuyao Zhou
Fei Chao
Rongrong Ji
    MQ
ArXivPDFHTML
Abstract

Arbitrary bit-width network quantization has received significant attention due to its high adaptability to various bit-width requirements during runtime. However, in this paper, we investigate existing methods and observe a significant accumulation of quantization errors caused by switching weight and activations bit-widths, leading to limited performance. To address this issue, we propose MBQuant, a novel method that utilizes a multi-branch topology for arbitrary bit-width quantization. MBQuant duplicates the network body into multiple independent branches, where the weights of each branch are quantized to a fixed 2-bit and the activations remain in the input bit-width. The computation of a desired bit-width is completed by selecting an appropriate number of branches that satisfy the original computational constraint. By fixing the weight bit-width, this approach substantially reduces quantization errors caused by switching weight bit-widths. Additionally, we introduce an amortization branch selection strategy to distribute quantization errors caused by switching activation bit-widths among branches to improve performance. Finally, we adopt an in-place distillation strategy that facilitates guidance between branches to further enhance MBQuant's performance. Extensive experiments demonstrate that MBQuant achieves significant performance gains compared to existing arbitrary bit-width quantization methods. Code is at https://github.com/zysxmu/MultiQuant.

View on arXiv
Comments on this paper