Hardware-Aware DNN Compression via Diverse Pruning and Mixed-Precision Quantization
arXiv:2312.15322
23 December 2023
K. Balaskas, Andreas Karatzas, Christos Sad, K. Siozios, Iraklis Anagnostopoulos, Georgios Zervakis, Jörg Henkel
Papers citing "Hardware-Aware DNN Compression via Diverse Pruning and Mixed-Precision Quantization" (7 of 7 papers shown)
Automatic Joint Structured Pruning and Quantization for Efficient Neural Network Training and Compression
Xiaoyi Qu, David Aponte, Colby R. Banbury, Daniel P. Robinson, Tianyu Ding, K. Koishida, Ilya Zharkov, Tianyi Chen
23 Feb 2025
Membership Inference Risks in Quantized Models: A Theoretical and Empirical Study
Eric Aubinais, Philippe Formont, Pablo Piantanida, Elisabeth Gassiat
10 Feb 2025
Applications of Knowledge Distillation in Remote Sensing: A Survey
Yassine Himeur, N. Aburaed, O. Elharrouss, Iraklis Varlamis, Shadi Atalla, W. Mansoor, Hussain Al Ahmad
18 Sep 2024
Mixed-precision Neural Networks on RISC-V Cores: ISA extensions for Multi-Pumped Soft SIMD Operations
Giorgos Armeniakos, Alexis Maras, S. Xydis, Dimitrios Soudris
19 Jul 2024
An LLM-Tool Compiler for Fused Parallel Function Calling
Simranjit Singh, Andreas Karatzas, Michael Fore, Iraklis Anagnostopoulos, Dimitrios Stamoulis
07 May 2024
OPQ: Compressing Deep Neural Networks with One-shot Pruning-Quantization
Peng Hu, Xi Peng, Hongyuan Zhu, M. Aly, Jie Lin
23 May 2022
NetAdapt: Platform-Aware Neural Network Adaptation for Mobile Applications
Tien-Ju Yang, Andrew G. Howard, Bo Chen, Xiao Zhang, Alec Go, Mark Sandler, Vivienne Sze, Hartwig Adam
09 Apr 2018