Neural Network Compression Framework for fast model inference
arXiv:2002.08679 · v4 (latest) · 20 February 2020
Alexander Kozlov, Ivan Lazarevich, Vasily Shamporov, N. Lyalyushkin, Yury Gorbachev
Links: arXiv (abs) · PDF · HTML · GitHub (1034★)

Papers citing "Neural Network Compression Framework for fast model inference" (22 papers shown)
Quantization Range Estimation for Convolutional Neural Networks
Bingtao Yang, Yujia Wang, Mengzhi Jiao, Hongwei Huo
MQ · 05 Oct 2025

Side-Channel Analysis of OpenVINO-based Neural Network Models
Dirmanto Jap, J. Breier, Zdenko Lehocký, S. Bhasin, Xiaolu Hou
FedML · 23 Jul 2024

PowerYOLO: Mixed Precision Model for Hardware Efficient Object Detection with Event Data
Dominika Przewlocka-Rus, T. Kryjak, M. Gorgon
11 Jul 2024

Effective Interplay between Sparsity and Quantization: From Theory to Practice
Simla Burcu Harma, Ayan Chakraborty, Elizaveta Kostenok, Danila Mishin, Dongho Ha, ..., Martin Jaggi, Ming Liu, Yunho Oh, Suvinay Subramanian, Amir Yazdanbakhsh
MQ · 31 May 2024

FlexNN: A Dataflow-aware Flexible Deep Learning Accelerator for Energy-Efficient Edge Devices
Arnab Raha, Deepak A. Mathaikutty, Soumendu Kumar Ghosh, Shamik Kundu
14 Mar 2024

Benchmarking Adversarial Robustness of Compressed Deep Learning Models
Brijesh Vora, Kartik Patwari, Syed Mahbub Hafiz, Zubair Shafiq, Chen-Nee Chuah
AAML · 16 Aug 2023

EfficientBioAI: Making Bioimaging AI Models Efficient in Energy, Latency and Representation
Yu Zhou, Justin Sonneck, Sweta Banerjee, Stefanie Dorr, Anika Gruneboom, Kristina Lorenz, Jianxu Chen
MedIm · 09 Jun 2023

QFT: Post-training quantization via fast joint finetuning of all degrees of freedom
Alexander Finkelstein, Ella Fuchs, Idan Tal, Mark Grobman, Niv Vosco, Eldad Meller
MQ · 05 Dec 2022
CheckINN: Wide Range Neural Network Verification in Imandra (Extended)
ACM SIGPLAN International Conference on Principles and Practice of Declarative Programming (PPDP), 2022
Remi Desmartin, Grant Passmore, Ekaterina Komendantskaya, M. Daggitt
21 Jul 2022

Anomalib: A Deep Learning Library for Anomaly Detection
IEEE International Conference on Image Processing (ICIP), 2022
S. Akçay, Dick Ameln, Ashwin Vaidya, B. Lakshmanan, Nilesh A. Ahuja, Ergin Utku Genc
16 Feb 2022
Enabling NAS with Automated Super-Network Generation
J. P. Muñoz, N. Lyalyushkin, Yash Akhauri, A. Senina, Alexander Kozlov, Nilesh Jain
20 Dec 2021

Predicting the success of Gradient Descent for a particular Dataset-Architecture-Initialization (DAI)
Umang Jain, H. G. Ramaswamy
AI4CE · 25 Nov 2021

MQBench: Towards Reproducible and Deployable Model Quantization Benchmark
Yuhang Li, Mingzhu Shen, Jian Ma, Yan Ren, Mingxin Zhao, Tao Gui, Yazhe Niu, F. Yu, Junjie Yan
MQ · 05 Nov 2021

Neuroevolution-Enhanced Multi-Objective Optimization for Mixed-Precision Quantization
Annual Conference on Genetic and Evolutionary Computation (GECCO), 2021
Santiago Miret, Vui Seng Chua, Mattias Marder, Mariano Phielipp, Nilesh Jain, Somdeb Majumdar
14 Jun 2021

Model Compression
Arhum Ishtiaq, Sara Mahmood, M. Anees, Neha Mumtaz
20 May 2021

Environmental Sound Classification on the Edge: A Pipeline for Deep Acoustic Networks on Extremely Resource-Constrained Devices
Pattern Recognition (Pattern Recogn.), 2021
Md Mohaimenuzzaman, Christoph Bergmeir, I. West, B. Meyer
05 Mar 2021

SparseDNN: Fast Sparse Deep Learning Inference on CPUs
Ziheng Wang
MQ · 20 Jan 2021

Generalized Operating Procedure for Deep Learning: an Unconstrained Optimal Design Perspective
Shen Chen, Mingwei Zhang, Jiamin Cui, Wei Yao
CVBM · 31 Dec 2020

Paralinguistic Privacy Protection at the Edge
Ranya Aloufi, Hamed Haddadi, David E. Boyle
04 Nov 2020

A flexible, extensible software framework for model compression based on the LC algorithm
Yerlan Idelbayev, Miguel Á. Carreira-Perpiñán
15 May 2020

Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation
Hao Wu, Patrick Judd, Xiaojie Zhang, Mikhail Isaev, Paulius Micikevicius
MQ · 20 Apr 2020

LadaBERT: Lightweight Adaptation of BERT through Hybrid Model Compression
International Conference on Computational Linguistics (COLING), 2020
Yihuan Mao, Yujing Wang, Chufan Wu, Chen Zhang, Yang-Feng Wang, Yaming Yang, Quanlu Zhang, Yunhai Tong, Jing Bai
08 Apr 2020