Permute, Quantize, and Fine-tune: Efficient Compression of Neural Networks

29 October 2020

Papers citing "Permute, Quantize, and Fine-tune: Efficient Compression of Neural Networks"


Title
Generalizing Weather Forecast to Fine-grained Temporal Scales via Physics-AI Hybrid Modeling | Wanghan Xu, Fenghua Ling, Wenlong Zhang, Tao Han, Hao Chen, Wanli Ouyang, Lei Bai | AI4CE | 34 5 0 | 22 May 2024
GPTVQ: The Blessing of Dimensionality for LLM Quantization | M. V. Baalen, Andrey Kuzmin, Markus Nagel, Peter Couperus, Cédric Bastoul, E. Mahurin, Tijmen Blankevoort, Paul N. Whatmough | MQ | 34 28 0 | 23 Feb 2024
Hyperspherical Quantization: Toward Smaller and More Accurate Models | Dan Liu, X. Chen, Chen-li Ma, Xue Liu | MQ | 22 3 0 | 24 Dec 2022
Deep learning model compression using network sensitivity and gradients | M. Sakthi, N. Yadla, Raj Pawate | | 19 2 0 | 11 Oct 2022
Multi-modal Streaming 3D Object Detection | Mazen Abdelfattah, Kaiwen Yuan, Z. J. Wang, Rabab Ward | 3DPC | 21 7 0 | 12 Sep 2022
Enabling On-Device Smartphone GPU based Training: Lessons Learned | Anish Das, Young D. Kwon, Jagmohan Chauhan, Cecilia Mascolo | 3DH | 27 10 0 | 21 Feb 2022
Toward Compact Parameter Representations for Architecture-Agnostic Neural Network Compression | Yuezhou Sun, Wenlong Zhao, Lijun Zhang, Xiao Liu, Hui Guan, Matei A. Zaharia | | 21 0 0 | 19 Nov 2021
Fine-grained Data Distribution Alignment for Post-Training Quantization | Yunshan Zhong, Mingbao Lin, Mengzhao Chen, Ke Li, Yunhang Shen, Fei Chao, Yongjian Wu, Rongrong Ji | MQ | 84 19 0 | 09 Sep 2021