1612.01543
Cited By
Towards the Limit of Network Quantization
5 December 2016
Yoojin Choi
Mostafa El-Khamy
Jungwon Lee
MQ
Papers citing "Towards the Limit of Network Quantization"
24 / 24 papers shown
Title
GuidedQuant: Large Language Model Quantization via Exploiting End Loss Guidance
Jinuk Kim
Marwa El Halabi
W. Park
Clemens JS Schaefer
Deokjae Lee
Yeonhong Park
Jae W. Lee
Hyun Oh Song
MQ
29
0
0
11 May 2025
BackSlash: Rate Constrained Optimized Training of Large Language Models
Jun Wu
Jiangtao Wen
Yuxing Han
34
0
0
23 Apr 2025
Object Motion Sensitivity: A Bio-inspired Solution to the Ego-motion Problem for Event-based Cameras
Shay Snyder
Hunter Thompson
Md. Abdullah-Al Kaiser
Gregory Schwartz
Akhilesh R. Jaiswal
Maryam Parsa
33
2
0
24 Mar 2023
PAC-Bayes Compression Bounds So Tight That They Can Explain Generalization
Sanae Lotfi
Marc Finzi
Sanyam Kapoor
Andres Potapczynski
Micah Goldblum
A. Wilson
BDL
MLT
AI4CE
19
51
0
24 Nov 2022
HFedMS: Heterogeneous Federated Learning with Memorable Data Semantics in Industrial Metaverse
Shenglai Zeng
Zonghang Li
Hongfang Yu
Zhihao Zhang
Long Luo
Bo-wen Li
Dusit Niyato
32
42
0
07 Nov 2022
Mixed-Precision Neural Networks: A Survey
M. Rakka
M. Fouda
Pramod P. Khargonekar
Fadi J. Kurdahi
MQ
18
11
0
11 Aug 2022
Beyond Transmitting Bits: Context, Semantics, and Task-Oriented Communications
Deniz Gunduz
Zhijin Qin
Iñaki Estella Aguerri
Harpreet S. Dhillon
Zhaohui Yang
Aylin Yener
Kai‐Kit Wong
C. Chae
16
431
0
19 Jul 2022
Low-rank Tensor Decomposition for Compression of Convolutional Neural Networks Using Funnel Regularization
Bo-Shiuan Chu
Che-Rung Lee
18
11
0
07 Dec 2021
Multi-Glimpse Network: A Robust and Efficient Classification Architecture based on Recurrent Downsampled Attention
S. Tan
Runpei Dong
Kaisheng Ma
22
2
0
03 Nov 2021
Qu-ANTI-zation: Exploiting Quantization Artifacts for Achieving Adversarial Outcomes
Sanghyun Hong
Michael-Andrei Panaitescu-Liess
Yigitcan Kaya
Tudor Dumitras
MQ
52
13
0
26 Oct 2021
Training Deep Neural Networks with Joint Quantization and Pruning of Weights and Activations
Xinyu Zhang
Ian Colbert
Ken Kreutz-Delgado
Srinjoy Das
MQ
24
11
0
15 Oct 2021
Compacting Deep Neural Networks for Internet of Things: Methods and Applications
Ke Zhang
Hanbo Ying
Hongning Dai
Lin Li
Yuangyuang Peng
Keyi Guo
Hongfang Yu
16
38
0
20 Mar 2021
Fixed-point Quantization of Convolutional Neural Networks for Quantized Inference on Embedded Platforms
Rishabh Goyal
Joaquin Vanschoren
V. V. Acht
S. Nijssen
MQ
14
23
0
03 Feb 2021
A Survey of Convolutional Neural Networks: Analysis, Applications, and Prospects
Zewen Li
Wenjie Yang
Shouheng Peng
Fan Liu
HAI
3DV
54
2,595
0
01 Apr 2020
A Survey of Methods for Low-Power Deep Learning and Computer Vision
Abhinav Goel
Caleb Tung
Yung-Hsiang Lu
George K. Thiruvathukal
VLM
10
92
0
24 Mar 2020
Impact of Low-bitwidth Quantization on the Adversarial Robustness for Embedded Neural Networks
Rémi Bernhard
Pierre-Alain Moëllic
J. Dutertre
AAML
MQ
19
18
0
27 Sep 2019
DeepCABAC: A Universal Compression Algorithm for Deep Neural Networks
Simon Wiedemann
H. Kirchhoffer
Stefan Matlage
Paul Haase
Arturo Marbán
...
Ahmed Osman
D. Marpe
H. Schwarz
Thomas Wiegand
Wojciech Samek
41
92
0
27 Jul 2019
And the Bit Goes Down: Revisiting the Quantization of Neural Networks
Pierre Stock
Armand Joulin
Rémi Gribonval
Benjamin Graham
Hervé Jégou
MQ
29
149
0
12 Jul 2019
A Targeted Acceleration and Compression Framework for Low bit Neural Networks
Biao Qian
Yang Wang
MQ
21
0
0
09 Jul 2019
Quantization for Rapid Deployment of Deep Neural Networks
J. Lee
Sangwon Ha
Saerom Choi
Won-Jo Lee
Seungwon Lee
MQ
14
48
0
12 Oct 2018
Rate Distortion For Model Compression: From Theory To Practice
Weihao Gao
Yu-Han Liu
Chong-Jun Wang
Sewoong Oh
25
31
0
09 Oct 2018
A Survey on Methods and Theories of Quantized Neural Networks
Yunhui Guo
MQ
27
230
0
13 Aug 2018
BitNet: Bit-Regularized Deep Neural Networks
Aswin Raghavan
Mohamed R. Amer
S. Chai
Graham Taylor
MQ
27
10
0
16 Aug 2017
ShiftCNN: Generalized Low-Precision Architecture for Inference of Convolutional Neural Networks
Denis A. Gudovskiy
Luca Rigazio
MQ
19
52
0
07 Jun 2017