ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2405.14024
25
0

Two Heads are Better Than One: Neural Networks Quantization with 2D Hilbert Curve-based Output Representation

22 May 2024
Mykhail M. Uss
Ruslan Yermolenko
Olena Kolodiazhna
Oleksii Shashko
Ivan Safonov
Volodymyr Savin
Yoonjae Yeo
Seowon Ji
Jaeyun Jeong
    MQ
ArXivPDFHTML
Abstract

Quantization is widely used to increase deep neural networks' (DNN) memory, computation, and power efficiency. Various techniques, such as post-training quantization and quantization-aware training, have been proposed to improve quantization quality. We introduce a novel approach for DNN quantization that uses a redundant representation of DNN's output. We represent the target quantity as a point on a 2D parametric curve. The DNN model is modified to predict 2D points that are mapped back to the target quantity at a post-processing stage. We demonstrate that this mapping can reduce quantization error. For the low-order parametric Hilbert curve, Depth-From-Stereo task, and two models represented by U-Net architecture and vision transformer, we achieved a quantization error reduction by about 5 times for the INT8 model at both CPU and DSP delegates. This gain comes with a minimal inference time increase (less than 7%). Our approach can be applied to other tasks, including segmentation, object detection, and key-points prediction.

View on arXiv
Comments on this paper