1911.07190
Loss Aware Post-training Quantization
17 November 2019
Yury Nahshan
Brian Chmiel
Chaim Baskin
Evgenii Zheltonozhskii
Ron Banner
A. Bronstein
A. Mendelson
MQ
Papers citing
"Loss Aware Post-training Quantization"
50 / 84 papers shown
Title
Radio: Rate-Distortion Optimization for Large Language Model Compression
Sean I. Young
MQ
21
0
0
05 May 2025
FP4DiT: Towards Effective Floating Point Quantization for Diffusion Transformers
Ruichen Chen
Keith G. Mills
Di Niu
MQ
54
0
0
19 Mar 2025
Task Vector Quantization for Memory-Efficient Model Merging
Youngeun Kim
Seunghwan Lee
Aecheon Jung
Bogon Ryu
Sungeun Hong
MQ
MoMe
52
0
0
10 Mar 2025
UPAQ: A Framework for Real-Time and Energy-Efficient 3D Object Detection in Autonomous Vehicles
Abhishek Balasubramaniam
Febin P. Sunny
S. Pasricha
3DPC
39
0
0
08 Jan 2025
Improving Quantization-aware Training of Low-Precision Network via Block Replacement on Full-Precision Counterpart
Chengting Yu
Shu Yang
Fengzhao Zhang
Hanzhi Ma
Aili Wang
Er-ping Li
MQ
77
2
0
20 Dec 2024
IntLoRA: Integral Low-rank Adaptation of Quantized Diffusion Models
Hang Guo
Yawei Li
Tao Dai
Shu-Tao Xia
Luca Benini
MQ
29
1
0
29 Oct 2024
Data Generation for Hardware-Friendly Post-Training Quantization
Lior Dikstein
Ariel Lapid
Arnon Netzer
H. Habi
MQ
145
0
0
29 Oct 2024
DAQ: Density-Aware Post-Training Weight-Only Quantization For LLMs
Yingsong Luo
Ling Chen
MQ
21
0
0
16 Oct 2024
Scaling laws for post-training quantized large language models
Zifei Xu
Alexander Lan
W. Yazar
T. Webb
Sayeh Sharify
Xin Eric Wang
MQ
28
0
0
15 Oct 2024
Error Diffusion: Post Training Quantization with Block-Scaled Number Formats for Neural Networks
Alireza Khodamoradi
K. Denolf
Eric Dellinger
MQ
32
0
0
15 Oct 2024
QEFT: Quantization for Efficient Fine-Tuning of LLMs
Changhun Lee
Jun-gyu Jin
Younghyun Cho
Eunhyeok Park
MQ
40
1
0
11 Oct 2024
Q-VLM: Post-training Quantization for Large Vision-Language Models
Changyuan Wang
Ziwei Wang
Xiuwei Xu
Yansong Tang
Jie Zhou
Jiwen Lu
MQ
32
1
0
10 Oct 2024
On Efficient Variants of Segment Anything Model: A Survey
Xiaorui Sun
J. Liu
H. Shen
Xiaofeng Zhu
Ping Hu
VLM
45
4
0
07 Oct 2024
PTQ4RIS: Post-Training Quantization for Referring Image Segmentation
Xiaoyan Jiang
Hang Yang
Kaiying Zhu
Xihe Qiu
Shibo Zhao
Sifan Zhou
MQ
26
0
0
25 Sep 2024
Foundations of Large Language Model Compression -- Part 1: Weight Quantization
Sean I. Young
MQ
40
1
0
03 Sep 2024
MetaAug: Meta-Data Augmentation for Post-Training Quantization
Cuong Pham
Hoang Anh Dung
Cuong C. Nguyen
Trung Le
Dinh Q. Phung
Gustavo Carneiro
Thanh-Toan Do
MQ
40
0
0
20 Jul 2024
decoupleQ: Towards 2-bit Post-Training Uniform Quantization via decoupling Parameters into Integer and Floating Points
Yi Guo
Fanliu Kong
Xiaoyang Li
Hui Li
Wei-Neng Chen
Xiaogang Tian
Jinping Cai
Yang Zhang
Shouda Liu
MQ
24
6
0
19 Apr 2024
SQUAT: Stateful Quantization-Aware Training in Recurrent Spiking Neural Networks
Sreyes P. Venkatesh
Razvan Marinescu
Jason Eshraghian
MQ
33
5
0
15 Apr 2024
Frame Quantization of Neural Networks
Wojciech Czaja
Sanghoon Na
32
1
0
11 Apr 2024
Instance-Aware Group Quantization for Vision Transformers
Jaehyeon Moon
Dohyung Kim
Junyong Cheon
Bumsub Ham
MQ
ViT
27
6
0
01 Apr 2024
Self-Supervised Quantization-Aware Knowledge Distillation
Kaiqi Zhao
Ming Zhao
MQ
33
2
0
17 Mar 2024
Achieving Pareto Optimality using Efficient Parameter Reduction for DNNs in Resource-Constrained Edge Environment
Atah Nuh Mih
Alireza Rahimi
Asfia Kawnine
Francis Palma
Monica Wachowicz
R. Dubay
Hung Cao
21
0
0
14 Mar 2024
RQP-SGD: Differential Private Machine Learning through Noisy SGD and Randomized Quantization
Ce Feng
Parv Venkitasubramaniam
27
1
0
09 Feb 2024
QuEST: Low-bit Diffusion Model Quantization via Efficient Selective Finetuning
Haoxuan Wang
Yuzhang Shang
Zhihang Yuan
Junyi Wu
Yan Yan
DiffM
MQ
11
28
0
06 Feb 2024
GenQ: Quantization in Low Data Regimes with Generative Synthetic Data
Yuhang Li
Youngeun Kim
Donghyun Lee
Souvik Kundu
Priyadarshini Panda
MQ
25
2
0
07 Dec 2023
Zero-Shot Sharpness-Aware Quantization for Pre-trained Language Models
Miaoxi Zhu
Qihuang Zhong
Li Shen
Liang Ding
Juhua Liu
Bo Du
Dacheng Tao
MQ
VLM
29
1
0
20 Oct 2023
AI/ML-based Load Prediction in IEEE 802.11 Enterprise Networks
Francesc Wilhelmi
Dariush Salami
Gianluca Fontanesi
Lorenzo Galati-Giordano
Mika Kasslin
11
1
0
11 Oct 2023
QuATON: Quantization Aware Training of Optical Neurons
Hasindu Kariyawasam
Ramith Hettiarachchi
Quansan Yang
Alex Matlock
Takahiro Nambara
Hiroyuki Kusaka
Yuichiro Kunai
Peter T C So
Edward S Boyden
D. Wadduwage
MQ
24
1
0
04 Oct 2023
MixQuant: Mixed Precision Quantization with a Bit-width Optimization Search
Yichen Xie
Wei Le
MQ
16
4
0
29 Sep 2023
EPTQ: Enhanced Post-Training Quantization via Label-Free Hessian
Ofir Gordon
H. Habi
Arnon Netzer
MQ
33
1
0
20 Sep 2023
Jumping through Local Minima: Quantization in the Loss Landscape of Vision Transformers
N. Frumkin
Dibakar Gope
Diana Marculescu
MQ
33
16
0
21 Aug 2023
EQ-Net: Elastic Quantization Neural Networks
Ke Xu
Lei Han
Ye Tian
Shangshang Yang
Xingyi Zhang
MQ
37
7
0
15 Aug 2023
Model Compression Methods for YOLOv5: A Review
Mohammad Jani
Jamil Fayyad
Younes Al Younes
H. Najjaran
31
14
0
21 Jul 2023
Augmenting Hessians with Inter-Layer Dependencies for Mixed-Precision Post-Training Quantization
Clemens J. S. Schaefer
Navid Lambert-Shirzad
Xiaofan Zhang
Chia-Wei Chou
T. Jablin
Jian Li
Elfie Guo
Caitlin Stanton
S. Joshi
Yu Emma Wang
MQ
28
2
0
08 Jun 2023
OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Models
Changhun Lee
Jungyu Jin
Taesu Kim
Hyungjun Kim
Eunhyeok Park
MQ
11
49
0
04 Jun 2023
FlexRound: Learnable Rounding based on Element-wise Division for Post-Training Quantization
J. H. Lee
Jeonghoon Kim
S. Kwon
Dongsoo Lee
MQ
22
33
0
01 Jun 2023
Towards Accurate Post-training Quantization for Diffusion Models
Changyuan Wang
Ziwei Wang
Xiuwei Xu
Yansong Tang
Jie Zhou
Jiwen Lu
MQ
30
20
0
30 May 2023
Improving Post-Training Quantization on Object Detection with Task Loss-Guided Lp Metric
Lin Niu
Jia-Wen Liu
Zhihang Yuan
Dawei Yang
Xinggang Wang
Wenyu Liu
MQ
33
2
0
19 Apr 2023
Solving Oscillation Problem in Post-Training Quantization Through a Theoretical Perspective
Yuexiao Ma
Huixia Li
Xiawu Zheng
Xuefeng Xiao
Rui Wang
Shilei Wen
Xin Pan
Fei Chao
Rongrong Ji
MQ
10
12
0
21 Mar 2023
Mixed Precision Post Training Quantization of Neural Networks with Sensitivity Guided Search
Clemens J. S. Schaefer
Elfie Guo
Caitlin Stanton
Xiaofan Zhang
T. Jablin
Navid Lambert-Shirzad
Jian Li
Chia-Wei Chou
Siddharth Joshi
Yu Wang
MQ
21
3
0
02 Feb 2023
ACQ: Improving Generative Data-free Quantization Via Attention Correction
Jixing Li
Xiaozhou Guo
Benzhe Dai
Guoliang Gong
Min Jin
Gang Chen
Wenyu Mao
Huaxiang Lu
MQ
30
4
0
18 Jan 2023
PD-Quant: Post-Training Quantization based on Prediction Difference Metric
Jiawei Liu
Lin Niu
Zhihang Yuan
Dawei Yang
Xinggang Wang
Wenyu Liu
MQ
96
68
0
14 Dec 2022
Error-aware Quantization through Noise Tempering
Zheng Wang
Juncheng Billy Li
Shuhui Qu
Florian Metze
Emma Strubell
MQ
11
2
0
11 Dec 2022
Vertical Layering of Quantized Neural Networks for Heterogeneous Inference
Hai Wu
Ruifei He
Hao Hao Tan
Xiaojuan Qi
Kaibin Huang
MQ
21
2
0
10 Dec 2022
Post-training Quantization on Diffusion Models
Yuzhang Shang
Zhihang Yuan
Bin Xie
Bingzhe Wu
Yan Yan
DiffM
MQ
15
157
0
28 Nov 2022
CPT-V: A Contrastive Approach to Post-Training Quantization of Vision Transformers
N. Frumkin
Dibakar Gope
Diana Marculescu
ViT
MQ
21
1
0
17 Nov 2022
GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers
Elias Frantar
Saleh Ashkboos
Torsten Hoefler
Dan Alistarh
MQ
19
882
0
31 Oct 2022
Sub-8-bit quantization for on-device speech recognition: a regularization-free approach
Kai Zhen
Martin H. Radfar
Hieu Duy Nguyen
Grant P. Strimel
Nathan Susanj
Athanasios Mouchtaris
MQ
15
8
0
17 Oct 2022
SQuAT: Sharpness- and Quantization-Aware Training for BERT
Zheng Wang
Juncheng Billy Li
Shuhui Qu
Florian Metze
Emma Strubell
MQ
18
7
0
13 Oct 2022
Outlier Suppression: Pushing the Limit of Low-bit Transformer Language Models
Xiuying Wei
Yunchen Zhang
Xiangguo Zhang
Ruihao Gong
Shanghang Zhang
Qi Zhang
F. Yu
Xianglong Liu
MQ
22
145
0
27 Sep 2022