Layer-Wise Quantization: A Pragmatic and Effective Method for Quantizing LLMs Beyond Integer Bit-Levels

25 June 2024

Vikas Yadav

Papers citing "Layer-Wise Quantization: A Pragmatic and Effective Method for Quantizing LLMs Beyond Integer Bit-Levels"

2 / 2 papers shown

Title
LSAQ: Layer-Specific Adaptive Quantization for Large Language Model Deployment Binrui Zeng Bin Ji Xiaodong Liu Jie Yu Shasha Li Jun Ma Xiaopeng Li Shangwen Wang Xinran Hong Yongtao Tang MQ 36 1 0 24 Dec 2024
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT Sheng Shen Zhen Dong Jiayu Ye Linjian Ma Z. Yao A. Gholami Michael W. Mahoney Kurt Keutzer MQ 217 571 0 12 Sep 2019