Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding

1 October 2015
Song Han
Huizi Mao
W. Dally
    3DGS

Papers citing "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding"

50 / 3,631 papers shown
Developmental Plasticity-inspired Adaptive Pruning for Deep Spiking and Artificial Neural Networks
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Bing Han
Feifei Zhao
Yi Zeng
Guobin Shen
163
11
0
23 Nov 2022
FedDCT: Federated Learning of Large Convolutional Neural Networks on Resource Constrained Devices using Divide and Collaborative Training
IEEE Transactions on Network and Service Management (IEEE TNSM), 2022
Quan Nguyen
Hieu H. Pham
Kok-Seng Wong
Phi Le Nguyen
Truong Thao Nguyen
Minh N. Do
FedML
261
9
0
20 Nov 2022
SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
International Conference on Machine Learning (ICML), 2022
Guangxuan Xiao
Ji Lin
Mickael Seznec
Hao Wu
Julien Demouth
Song Han
MQ
801
1,207
0
18 Nov 2022
Structured Pruning Adapters
Pattern Recognition (Pattern Recogn.), 2022
Lukas Hedegaard
Aman Alok
Juby Jose
Alexandros Iosifidis
277
15
0
17 Nov 2022
Is Smaller Always Faster? Tradeoffs in Compressing Self-Supervised Speech Transformers
Tzu-Quan Lin
Tsung-Huan Yang
Chun-Yao Chang
Kuang-Ming Chen
Tzu-hsun Feng
Hung-yi Lee
Hao Tang
295
6
0
17 Nov 2022
Structured Knowledge Distillation Towards Efficient and Compact Multi-View 3D Detection
Linfeng Zhang
Yukang Shi
Hung-Shuo Tai
Zhipeng Zhang
Yuan He
Ke Wang
Kaisheng Ma
264
4
0
14 Nov 2022
Pruning Very Deep Neural Network Channels for Efficient Inference
Yihui He
235
2
0
14 Nov 2022
Robust Training of Graph Neural Networks via Noise Governance
Web Search and Data Mining (WSDM), 2022
Siyi Qian
Haochao Ying
Renjun Hu
Jingbo Zhou
Jintai Chen
Benlin Liu
Jian Wu
NoLa
269
55
0
12 Nov 2022
Streaming, fast and accurate on-device Inverse Text Normalization for Automatic Speech Recognition
Spoken Language Technology Workshop (SLT), 2022
Yashesh Gaur
Nick Kibre
Jian Xue
Kangyuan Shu
Yuhui Wang
Issac Alphonso
Jinyu Li
Jiawei Liu
118
7
0
07 Nov 2022
RUBICON: A Framework for Designing Efficient Deep Learning-Based Genomic Basecallers
Genome Biology (GB), 2022
Gagandeep Singh
M. Alser
K. Denolf
Can Firtina
Alireza Khodamoradi
Meryem Banu Cavlak
Henk Corporaal
O. Mutlu
371
17
0
06 Nov 2022
Multi-Objective Evolutionary for Object Detection Mobile Architectures Search
Haichao Zhang
Jiashi Li
Xin Xia
K. Hao
Xuefeng Xiao
183
2
0
05 Nov 2022
Intriguing Properties of Compression on Multilingual Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Kelechi Ogueji
Orevaoghene Ahia
Gbemileke Onilude
Sebastian Gehrmann
Sara Hooker
Julia Kreutzer
302
15
0
04 Nov 2022
Soft Masking for Cost-Constrained Channel Pruning
European Conference on Computer Vision (ECCV), 2022
Ryan Humble
Maying Shen
J. Latorre
Eric Darve
J. Álvarez
174
18
0
04 Nov 2022
Efficient Spatially Sparse Inference for Conditional GANs and Diffusion Models
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Zhekai Zhang
Ji Lin
Chenlin Meng
Stefano Ermon
Song Han
Jun-Yan Zhu
DiffM
497
61
0
03 Nov 2022
Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing
Neural Information Processing Systems (NeurIPS), 2022
Yonggan Fu
Yang Zhang
Kaizhi Qian
Zhifan Ye
Zhongzhi Yu
Cheng-I Jeff Lai
Yingyan Lin
354
10
0
02 Nov 2022
Model Compression for DNN-based Speaker Verification Using Weight Quantization
Interspeech (Interspeech), 2022
Jingyu Li
W. Liu
Zhaoyang Zhang
Jiong Wang
Tan Lee
MQ
381
3
0
31 Oct 2022
FusionFormer: Fusing Operations in Transformer for Efficient Streaming Speech Recognition
Xingcheng Song
Di Wu
Binbin Zhang
Zhiyong Wu
Wenpeng Li
...
Peng Zhang
Zhendong Peng
Fuping Pan
Changbao Zhu
Zhongqin Wu
133
2
0
31 Oct 2022
LearningGroup: A Real-Time Sparse Training on FPGA via Learnable Weight Grouping for Multi-Agent Reinforcement Learning
International Conference on Field-Programmable Technology (ICFPT), 2022
Jenny Yang
Jaeuk Kim
Joo-Young Kim
196
2
0
29 Oct 2022
LOFT: Finding Lottery Tickets through Filter-wise Training
International Conference on Artificial Intelligence and Statistics (AISTATS), 2022
Qihan Wang
Chen Dun
Fangshuo Liao
C. Jermaine
Anastasios Kyrillidis
181
3
0
28 Oct 2022
Class Based Thresholding in Early Exit Semantic Segmentation Networks
IEEE Signal Processing Letters (SPL), 2022
Alperen Görmez
Erdem Koyuncu
152
6
0
27 Oct 2022
Efficient ECG-based Atrial Fibrillation Detection via Parameterised Hypercomplex Neural Networks
European Signal Processing Conference (EUSIPCO), 2022
Leonie Basso
Zhao Ren
Wolfgang Nejdl
320
3
0
27 Oct 2022
Gradient-based Weight Density Balancing for Robust Dynamic Sparse Training
Mathias Parger
Alexander Ertl
Paul Eibensteiner
J. H. Mueller
Martin Winter
M. Steinberger
151
1
0
25 Oct 2022
Pruning's Effect on Generalization Through the Lens of Training and Regularization
Neural Information Processing Systems (NeurIPS), 2022
Tian Jin
Michael Carbin
Daniel M. Roy
Jonathan Frankle
Gintare Karolina Dziugaite
236
33
0
25 Oct 2022
Pushing the Efficiency Limit Using Structured Sparse Convolutions
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
Vinay Kumar Verma
Nikhil Mehta
Shijing Si
Ricardo Henao
Lawrence Carin
183
3
0
23 Oct 2022
Towards Global Neural Network Abstractions with Locally-Exact Reconstruction
Neural Networks (NN), 2022
Edoardo Manino
I. Bessa
Lucas C. Cordeiro
225
3
0
21 Oct 2022
When Expressivity Meets Trainability: Fewer than $n$ Neurons Can Work
Neural Information Processing Systems (NeurIPS), 2022
Jiawei Zhang
Yushun Zhang
Mingyi Hong
Tian Ding
Jianfeng Yao
326
10
0
21 Oct 2022
Learning Robust Dynamics through Variational Sparse Gating
Neural Information Processing Systems (NeurIPS), 2022
A. Jain
Shivakanth Sujit
S. Joshi
Vincent Michalski
Danijar Hafner
Samira Ebrahimi Kahou
154
11
0
21 Oct 2022
Pruning by Active Attention Manipulation
Z. Babaiee
Lucas Liebenwein
Ramin Hasani
Daniela Rus
Radu Grosu
156
1
0
20 Oct 2022
Attaining Class-level Forgetting in Pretrained Model using Few Samples
European Conference on Computer Vision (ECCV), 2022
Pravendra Singh
Pratik Mazumder
M. A. Karim
VLM
CLL
MU
96
3
0
19 Oct 2022
Tempo: Accelerating Transformer-Based Model Training through Memory Footprint Reduction
Neural Information Processing Systems (NeurIPS), 2022
Muralidhar Andoorveedu
Zhanda Zhu
Bojian Zheng
Gennady Pekhimenko
185
8
0
19 Oct 2022
Approximating Continuous Convolutions for Deep Network Compression
British Machine Vision Conference (BMVC), 2022
Theo W. Costain
V. Prisacariu
175
0
0
17 Oct 2022
Packed-Ensembles for Efficient Uncertainty Estimation
International Conference on Learning Representations (ICLR), 2022
Olivier Laurent
Adrien Lafage
Enzo Tartaglione
Geoffrey Daniel
Jean-Marc Martinez
Andrei Bursuc
Gianni Franchi
OODD
464
40
0
17 Oct 2022
HQNAS: Auto CNN deployment framework for joint quantization and architecture search
Hongjiang Chen
Yang Wang
Leibo Liu
Shaojun Wei
Shouyi Yin
MQ
111
3
0
16 Oct 2022
The Effects of Partitioning Strategies on Energy Consumption in Distributed CNN Inference at The Edge
Erqian Tang
Xiaotian Guo
T. Stefanov
120
1
0
15 Oct 2022
Deep Differentiable Logic Gate Networks
Neural Information Processing Systems (NeurIPS), 2022
Felix Petersen
Christian Borgelt
Hilde Kuehne
Oliver Deussen
AI4CE
191
61
0
15 Oct 2022
Post-Training Quantization for Energy Efficient Realization of Deep Neural Networks
International Conference on Machine Learning and Applications (ICMLA), 2022
Cecilia Latotzke
Batuhan Balim
T. Gemmeke
MQ
75
3
0
14 Oct 2022
CAP: Correlation-Aware Pruning for Highly-Accurate Sparse Vision Models
Neural Information Processing Systems (NeurIPS), 2022
Denis Kuznedelev
Eldar Kurtic
Elias Frantar
Dan Alistarh
VLM
ViT
174
21
0
14 Oct 2022
Parameter-Efficient Masking Networks
Neural Information Processing Systems (NeurIPS), 2022
Yue Bai
Huan Wang
Xu Ma
Yitian Zhang
Zhiqiang Tao
Yun Fu
148
11
0
13 Oct 2022
Structural Pruning via Latency-Saliency Knapsack
Neural Information Processing Systems (NeurIPS), 2022
Maying Shen
Hongxu Yin
Pavlo Molchanov
Lei Mao
Jianna Liu
J. Álvarez
340
60
0
13 Oct 2022
SeKron: A Decomposition Method Supporting Many Factorization Structures
Marawan Gamal Abdel Hameed
A. Mosleh
Marzieh S. Tahaei
V. Nia
164
1
0
12 Oct 2022
SaiT: Sparse Vision Transformers through Adaptive Token Pruning
Ling Li
D. Thorsley
Joseph Hassoun
ViT
135
20
0
11 Oct 2022
Edge-Cloud Cooperation for DNN Inference via Reinforcement Learning and Supervised Learning
Tinghao Zhang
Zhijun Li
Yongrui Chen
Kwok-Yan Lam
Jun Zhao
189
5
0
11 Oct 2022
Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach
Neural Information Processing Systems (NeurIPS), 2022
Peng Mi
Li Shen
Tianhe Ren
Weihao Ye
Xiaoshuai Sun
Rongrong Ji
Dacheng Tao
AAML
270
84
0
11 Oct 2022
Deep learning model compression using network sensitivity and gradients
M. Sakthi
N. Yadla
Raj Pawate
172
2
0
11 Oct 2022
DeepPerform: An Efficient Approach for Performance Testing of Resource-Constrained Neural Networks
International Conference on Automated Software Engineering (ASE), 2022
Simin Chen
Mirazul Haque
Cong Liu
Wei Yang
209
24
0
10 Oct 2022
Advancing Model Pruning via Bi-level Optimization
Neural Information Processing Systems (NeurIPS), 2022
Yihua Zhang
Yuguang Yao
Parikshit Ram
Pu Zhao
Tianlong Chen
Min-Fong Hong
Yanzhi Wang
Sijia Liu
449
86
0
08 Oct 2022
Demand Layering for Real-Time DNN Inference with Minimized Memory Usage
IEEE Real-Time Systems Symposium (RTSS), 2022
Min-Zhi Ji
Saehanseul Yi
Chang-Mo Koo
Sol Ahn
Dongjoo Seo
N. Dutt
Jong-Chan Kim
244
24
0
08 Oct 2022
In-situ Model Downloading to Realize Versatile Edge AI in 6G Mobile Networks
IEEE Wireless Communications (IEEE Wireless Commun.), 2022
Kaibin Huang
Hai Wu
Zhiyan Liu
Xiaojuan Qi
193
14
0
07 Oct 2022
Small Character Models Match Large Word Models for Autocomplete Under Memory Constraints
Ganesh Jawahar
Subhabrata Mukherjee
Debadeepta Dey
Muhammad Abdul-Mageed
L. Lakshmanan
C. C. T. Mendes
Gustavo de Rosa
S. Shah
115
0
0
06 Oct 2022
Communication-Efficient and Drift-Robust Federated Learning via Elastic Net
Seonhyeon Kim
Jiheon Woo
Daewon Seo
Yongjune Kim
FedML
224
3
0
06 Oct 2022