Deep Neural Network Compression with Single and Multiple Level Quantization
arXiv:1803.03289, 6 March 2018
Yuhui Xu, Yongzhuang Wang, Aojun Zhou, Weiyao Lin, H. Xiong

Papers citing "Deep Neural Network Compression with Single and Multiple Level Quantization"

42 citing papers:
Neuronal Group Communication for Efficient Neural Representation.
Zhengqi Pei, Qingming Huang, Shuhui Wang. 19 Oct 2025.
Deep Lookup Network. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025.
Yulan Guo, Longguang Wang, Wendong Mao, Xiaoyu Dong, Yingqian Wang, Li Liu, W. An. 17 Sep 2025.
Universality of Layer-Level Entropy-Weighted Quantization Beyond Model Architecture and Size.
Alireza Behtash, Marijan Fofonjka, Ethan Baird, Tyler Mauer, Hossein Moghimifam, David Stout, Joel Dennison. 06 Mar 2025.
SKIM: Any-bit Quantization Pushing The Limits of Post-Training Quantization.
Runsheng Bai, Qiang Liu, B. Liu. 05 Dec 2024.
On GNN explainability with activation rules.
Luca Veyrin-Forrer, Ataollah Kamal, Stefan Duffner, Marc Plantevit, C. Robardet. 17 Jun 2024.
One-Step Forward and Backtrack: Overcoming Zig-Zagging in Loss-Aware Quantization Training.
Lianbo Ma, Yuee Zhou, Jianlun Ma, Guo-Ding Yu, Qing Li. 30 Jan 2024.
Automated Heterogeneous Low-Bit Quantization of Multi-Model Deep Learning Inference Pipeline.
Jayeeta Mondal, Swarnava Dey, Arijit Mukherjee. 10 Nov 2023.
SqueezeLLM: Dense-and-Sparse Quantization. International Conference on Machine Learning (ICML), 2023.
Sehoon Kim, Coleman Hooper, A. Gholami, Zhen Dong, Xiuyu Li, Sheng Shen, Michael W. Mahoney, Kurt Keutzer. 13 Jun 2023.
On Model Compression for Neural Networks: Framework, Algorithm, and Convergence Guarantee.
Chenyang Li, Jihoon Chung, Mengnan Du, Haimin Wang, Xianlian Zhou, Bohao Shen. 13 Mar 2023.
LAB: Learnable Activation Binarizer for Binary Neural Networks. IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022.
Sieger Falkena, Hadi Jamali Rad, Jan van Gemert. 25 Oct 2022.
Context-Aware Streaming Perception in Dynamic Environments. European Conference on Computer Vision (ECCV), 2022.
Gur-Eyal Sela, Ionel Gog, J. Wong, Kumar Krishna Agrawal, Xiangxi Mo, ..., Eric Leong, Xin Wang, Bharathan Balaji, Joseph E. Gonzalez, Ion Stoica. 16 Aug 2022.
STN: Scalable Tensorizing Networks via Structure-Aware Training and Adaptive Compression.
Chang Nie, Haiquan Wang, Lu Zhao. 30 May 2022.
A Comprehensive Survey on Model Quantization for Deep Neural Networks in Image Classification. ACM Transactions on Intelligent Systems and Technology (ACM TIST), 2022.
Babak Rokh, A. Azarpeyvand, Alireza Khanteymoori. 14 May 2022.
Croesus: Multi-Stage Processing and Transactions for Video-Analytics in Edge-Cloud Systems. IEEE International Conference on Data Engineering (ICDE), 2021.
Samaa Gazzaz, Vishal Chakraborty, Faisal Nawab. 31 Dec 2021.
Low-rank Tensor Decomposition for Compression of Convolutional Neural Networks Using Funnel Regularization.
Bo-Shiuan Chu, Che-Rung Lee. 07 Dec 2021.
LVAC: Learned Volumetric Attribute Compression for Point Clouds using Coordinate Based Networks.
Berivan Isik, P. Chou, S. Hwang, Nick Johnston, G. Toderici. 17 Nov 2021.
CHIP: CHannel Independence-based Pruning for Compact Neural Networks.
Yang Sui, Miao Yin, Yi Xie, Huy Phan, S. Zonouz, Bo Yuan. 26 Oct 2021.
Adapt to Adaptation: Learning Personalization for Cross-Silo Federated Learning.
Jun Luo, Shandong Wu. 15 Oct 2021.
FIDNet: LiDAR Point Cloud Semantic Segmentation with Fully Interpolation Decoding. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021.
Yiming Zhao, Lin Bai, Xinming Huang. 08 Sep 2021.
Towards Efficient Tensor Decomposition-Based DNN Model Compression with Optimization Framework. Computer Vision and Pattern Recognition (CVPR), 2021.
Miao Yin, Yang Sui, Siyu Liao, Bo Yuan. 26 Jul 2021.
HEMP: High-order Entropy Minimization for neural network comPression.
Enzo Tartaglione, Stéphane Lathuilière, Attilio Fiandrotti, Marco Cagnazzo, Marco Grangetto. 12 Jul 2021.
Model compression as constrained optimization, with application to neural nets. Part V: combining compressions.
Miguel Á. Carreira-Perpiñán, Yerlan Idelbayev. 09 Jul 2021.
Learnable Companding Quantization for Accurate Low-bit Neural Networks. Computer Vision and Pattern Recognition (CVPR), 2021.
Kohei Yamamoto. 12 Mar 2021.
Personalized Federated Learning with First Order Model Optimization. International Conference on Learning Representations (ICLR), 2020.
Michael Zhang, Karan Sapra, Sanja Fidler, Serena Yeung, J. Álvarez. 15 Dec 2020.
TRP: Trained Rank Pruning for Efficient Deep Neural Networks. International Joint Conference on Artificial Intelligence (IJCAI), 2020.
Yuhui Xu, Yuxi Li, Shuai Zhang, W. Wen, Botao Wang, Y. Qi, Yiran Chen, Weiyao Lin, H. Xiong. 30 Apr 2020.
A Generic Network Compression Framework for Sequential Recommender Systems.
Yang Sun, Fajie Yuan, Ming Yang, Guoao Wei, Zhou Zhao, Duo Liu. 21 Apr 2020.
Learning Sparse & Ternary Neural Networks with Entropy-Constrained Trained Ternarization (EC2T).
Arturo Marbán, Daniel Becking, Simon Wiedemann, Wojciech Samek. 02 Apr 2020.
Cluster Pruning: An Efficient Filter Pruning Method for Edge AI Vision Applications. IEEE Journal of Selected Topics in Signal Processing (JSTSP), 2020.
Chinthaka Gamanayake, Lahiru Jayasinghe, Benny Kai Kiat Ng, Chau Yuen. 05 Mar 2020.
FQ-Conv: Fully Quantized Convolution for Efficient and Accurate Inference.
Bram-Ernst Verhoef, Nathan Laubeuf, S. Cosemans, P. Debacker, Ioannis A. Papistas, A. Mallik, D. Verkest. 19 Dec 2019.
Progressive Compressed Records: Taking a Byte out of Deep Learning Data. Proceedings of the VLDB Endowment (PVLDB), 2019.
Michael Kuchnik, George Amvrosiadis, Virginia Smith. 01 Nov 2019.
Differentiable Soft Quantization: Bridging Full-Precision and Low-Bit Neural Networks. IEEE International Conference on Computer Vision (ICCV), 2019.
Yazhe Niu, Xianglong Liu, Shenghu Jiang, Tian-Hao Li, Peng Hu, Jiazhen Lin, F. Yu, Junjie Yan. 14 Aug 2019.
Multi-loss-aware Channel Pruning of Deep Networks. IEEE International Conference on Image Processing (ICIP), 2019.
Yiming Hu, Siyang Sun, Jianquan Li, Jiagang Zhu, Xingang Wang, Qingyi Gu. 27 Feb 2019.
Tensorized Embedding Layers for Efficient Model Compression.
Oleksii Hrinchuk, Valentin Khrulkov, L. Mirvakhabova, Elena Orlova, Ivan Oseledets. 30 Jan 2019.
DAC: Data-free Automatic Acceleration of Convolutional Networks.
Xin Li, Shuai Zhang, Bolan Jiang, Y. Qi, Mooi Choo Chuah, N. Bi. 20 Dec 2018.
Trained Rank Pruning for Efficient Deep Neural Networks.
Yuhui Xu, Yuxi Li, Shuai Zhang, W. Wen, Botao Wang, Y. Qi, Yiran Chen, Weiyao Lin, H. Xiong. 06 Dec 2018.
DNQ: Dynamic Network Quantization.
Yuhui Xu, Shuai Zhang, Y. Qi, Jiaxian Guo, Weiyao Lin, H. Xiong. 06 Dec 2018.
Efficient non-uniform quantizer for quantized neural network targeting reconfigurable hardware.
Natan Liss, Chaim Baskin, A. Mendelson, A. Bronstein, Raja Giryes. 27 Nov 2018.
NICE: Noise Injection and Clamping Estimation for Neural Network Quantization.
Chaim Baskin, Natan Liss, Yoav Chai, Evgenii Zheltonozhskii, Eli Schwartz, Raja Giryes, A. Mendelson, A. Bronstein. 29 Sep 2018.
Discovering Low-Precision Networks Close to Full-Precision Networks for Efficient Embedded Inference.
J. McKinstry, S. K. Esser, R. Appuswamy, Deepika Bablani, John V. Arthur, Izzet B. Yildiz, D. Modha. 11 Sep 2018.
GroupReduce: Block-Wise Low-Rank Approximation for Neural Language Model Shrinking.
Patrick H. Chen, Si Si, Yang Li, Ciprian Chelba, Cho-Jui Hsieh. 18 Jun 2018.
Convolutional neural network compression for natural language processing.
Krzysztof Wróbel, Marcin Pietroń, Maciej Wielgosz, Michał Karwatowski, K. Wiatr. 28 May 2018.
UNIQ: Uniform Noise Injection for Non-Uniform Quantization of Neural Networks.
Chaim Baskin, Eli Schwartz, Evgenii Zheltonozhskii, Natan Liss, Raja Giryes, A. Bronstein, A. Mendelson. 29 Apr 2018.