Neural Network Compression Framework for fast model inference
arXiv:2002.08679 · v4 (latest) · 20 February 2020
Alexander Kozlov, Ivan Lazarevich, Vasily Shamporov, N. Lyalyushkin, Yury Gorbachev
Links: arXiv (abs) · PDF · HTML · GitHub (1034★)

Papers citing "Neural Network Compression Framework for fast model inference" (22 papers shown)
Quantization Range Estimation for Convolutional Neural Networks
Bingtao Yang, Yujia Wang, Mengzhi Jiao, Hongwei Huo
MQ · 05 Oct 2025

Side-Channel Analysis of OpenVINO-based Neural Network Models
Dirmanto Jap, J. Breier, Zdenko Lehocký, S. Bhasin, Xiaolu Hou
FedML · 23 Jul 2024

PowerYOLO: Mixed Precision Model for Hardware Efficient Object Detection with Event Data
Dominika Przewlocka-Rus, T. Kryjak, M. Gorgon
11 Jul 2024

Effective Interplay between Sparsity and Quantization: From Theory to Practice
Simla Burcu Harma, Ayan Chakraborty, Elizaveta Kostenok, Danila Mishin, Dongho Ha, ..., Martin Jaggi, Ming Liu, Yunho Oh, Suvinay Subramanian, Amir Yazdanbakhsh
MQ · 31 May 2024

FlexNN: A Dataflow-aware Flexible Deep Learning Accelerator for Energy-Efficient Edge Devices
Arnab Raha, Deepak A. Mathaikutty, Soumendu Kumar Ghosh, Shamik Kundu
14 Mar 2024

Benchmarking Adversarial Robustness of Compressed Deep Learning Models
Brijesh Vora, Kartik Patwari, Syed Mahbub Hafiz, Zubair Shafiq, Chen-Nee Chuah
AAML · 16 Aug 2023

EfficientBioAI: Making Bioimaging AI Models Efficient in Energy, Latency and Representation
Yu Zhou, Justin Sonneck, Sweta Banerjee, Stefanie Dorr, Anika Gruneboom, Kristina Lorenz, Jianxu Chen
MedIm · 09 Jun 2023

QFT: Post-training quantization via fast joint finetuning of all degrees of freedom
Alexander Finkelstein, Ella Fuchs, Idan Tal, Mark Grobman, Niv Vosco, Eldad Meller
MQ · 05 Dec 2022
CheckINN: Wide Range Neural Network Verification in Imandra (Extended)
ACM SIGPLAN International Conference on Principles and Practice of Declarative Programming (PPDP), 2022
Remi Desmartin, Grant Passmore, Ekaterina Komendantskaya, M. Daggitt
21 Jul 2022

Anomalib: A Deep Learning Library for Anomaly Detection
IEEE International Conference on Image Processing (ICIP), 2022
S. Akçay, Dick Ameln, Ashwin Vaidya, B. Lakshmanan, Nilesh A. Ahuja, Ergin Utku Genc
16 Feb 2022
Enabling NAS with Automated Super-Network Generation
J. P. Muñoz, N. Lyalyushkin, Yash Akhauri, A. Senina, Alexander Kozlov, Nilesh Jain
20 Dec 2021

Predicting the success of Gradient Descent for a particular Dataset-Architecture-Initialization (DAI)
Umang Jain, H. G. Ramaswamy
AI4CE · 25 Nov 2021

MQBench: Towards Reproducible and Deployable Model Quantization Benchmark
Yuhang Li, Mingzhu Shen, Jian Ma, Yan Ren, Mingxin Zhao, Tao Gui, Yazhe Niu, F. Yu, Junjie Yan
MQ · 05 Nov 2021

Neuroevolution-Enhanced Multi-Objective Optimization for Mixed-Precision Quantization
Annual Conference on Genetic and Evolutionary Computation (GECCO), 2021
Santiago Miret, Vui Seng Chua, Mattias Marder, Mariano Phielipp, Nilesh Jain, Somdeb Majumdar
14 Jun 2021

Model Compression
Arhum Ishtiaq, Sara Mahmood, M. Anees, Neha Mumtaz
20 May 2021

Environmental Sound Classification on the Edge: A Pipeline for Deep Acoustic Networks on Extremely Resource-Constrained Devices
Pattern Recognition (Pattern Recogn.), 2021
Md Mohaimenuzzaman, Christoph Bergmeir, I. West, B. Meyer
05 Mar 2021

SparseDNN: Fast Sparse Deep Learning Inference on CPUs
Ziheng Wang
MQ · 20 Jan 2021

Generalized Operating Procedure for Deep Learning: an Unconstrained Optimal Design Perspective
Shen Chen, Mingwei Zhang, Jiamin Cui, Wei Yao
CVBM · 31 Dec 2020

Paralinguistic Privacy Protection at the Edge
Ranya Aloufi, Hamed Haddadi, David E. Boyle
04 Nov 2020

A flexible, extensible software framework for model compression based on the LC algorithm
Yerlan Idelbayev, Miguel Á. Carreira-Perpiñán
15 May 2020

Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation
Hao Wu, Patrick Judd, Xiaojie Zhang, Mikhail Isaev, Paulius Micikevicius
MQ · 20 Apr 2020

LadaBERT: Lightweight Adaptation of BERT through Hybrid Model Compression
International Conference on Computational Linguistics (COLING), 2020
Yihuan Mao, Yujing Wang, Chufan Wu, Chen Zhang, Yang-Feng Wang, Yaming Yang, Quanlu Zhang, Yunhai Tong, Jing Bai
08 Apr 2020