Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding

1 October 2015

Song Han

Papers citing "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding"

50 / 3,631 papers shown

Developmental Plasticity-inspired Adaptive Pruning for Deep Spiking and Artificial Neural Networks

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022

Bing Han

Feifei Zhao

Yi Zeng

Guobin Shen

163

23 Nov 2022

FedDCT: Federated Learning of Large Convolutional Neural Networks on Resource Constrained Devices using Divide and Collaborative Training

IEEE Transactions on Network and Service Management (IEEE TNSM), 2022

261

20 Nov 2022

SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

International Conference on Machine Learning (ICML), 2022

Song Han

801

1,207

18 Nov 2022

Structured Pruning Adapters

Pattern Recognition (Pattern Recogn.), 2022

277

17 Nov 2022

Is Smaller Always Faster? Tradeoffs in Compressing Self-Supervised Speech Transformers

295

17 Nov 2022

Structured Knowledge Distillation Towards Efficient and Compact Multi-View 3D Detection

264

14 Nov 2022

Pruning Very Deep Neural Network Channels for Efficient Inference

Yihui He

235

14 Nov 2022

Robust Training of Graph Neural Networks via Noise Governance

Web Search and Data Mining (WSDM), 2022

Jintai Chen

269

12 Nov 2022

Streaming, fast and accurate on-device Inverse Text Normalization for Automatic Speech Recognition

Spoken Language Technology Workshop (SLT), 2022

118

07 Nov 2022

RUBICON: A Framework for Designing Efficient Deep Learning-Based Genomic Basecallers

Genome Biology (GB), 2022

371

06 Nov 2022

Multi-Objective Evolutionary for Object Detection Mobile Architectures Search

183

05 Nov 2022

Intriguing Properties of Compression on Multilingual Models

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

302

04 Nov 2022

Soft Masking for Cost-Constrained Channel Pruning

European Conference on Computer Vision (ECCV), 2022

174

04 Nov 2022

Efficient Spatially Sparse Inference for Conditional GANs and Diffusion Models

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022

Song Han

Jun-Yan Zhu

DiffM

497

03 Nov 2022

Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing

Neural Information Processing Systems (NeurIPS), 2022

Kaizhi Qian

354

02 Nov 2022

Model Compression for DNN-based Speaker Verification Using Weight Quantization

Interspeech (Interspeech), 2022

381

31 Oct 2022

FusionFormer: Fusing Operations in Transformer for Efficient Streaming Speech Recognition

Xingcheng Song

Di Wu

Binbin Zhang

Zhiyong Wu

...

133

31 Oct 2022

LearningGroup: A Real-Time Sparse Training on FPGA via Learnable Weight Grouping for Multi-Agent Reinforcement Learning

International Conference on Field-Programmable Technology (ICFPT), 2022

Jenny Yang

Jaeuk Kim

Joo-Young Kim

196

29 Oct 2022

LOFT: Finding Lottery Tickets through Filter-wise Training

International Conference on Artificial Intelligence and Statistics (AISTATS), 2022

Qihan Wang

Chen Dun

Fangshuo Liao

C. Jermaine

Anastasios Kyrillidis

181

28 Oct 2022

Class Based Thresholding in Early Exit Semantic Segmentation Networks

IEEE Signal Processing Letters (SPL), 2022

Alperen Görmez

Erdem Koyuncu

152

27 Oct 2022

Efficient ECG-based Atrial Fibrillation Detection via Parameterised Hypercomplex Neural Networks

European Signal Processing Conference (EUSIPCO), 2022

Leonie Basso

Zhao Ren

Wolfgang Nejdl

320

27 Oct 2022

Gradient-based Weight Density Balancing for Robust Dynamic Sparse Training

151

25 Oct 2022

Pruning's Effect on Generalization Through the Lens of Training and Regularization

Neural Information Processing Systems (NeurIPS), 2022

Gintare Karolina Dziugaite

236

25 Oct 2022

Pushing the Efficiency Limit Using Structured Sparse Convolutions

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022

Lawrence Carin

183

23 Oct 2022

Towards Global Neural Network Abstractions with Locally-Exact Reconstruction

Neural Networks (NN), 2022

Edoardo Manino

I. Bessa

Lucas C. Cordeiro

225

21 Oct 2022

When Expressivity Meets Trainability: Fewer than n Neurons Can Work

Neural Information Processing Systems (NeurIPS), 2022

326

21 Oct 2022

Learning Robust Dynamics through Variational Sparse Gating

Neural Information Processing Systems (NeurIPS), 2022

Samira Ebrahimi Kahou

154

21 Oct 2022

Pruning by Active Attention Manipulation

Z. Babaiee

Lucas Liebenwein

Ramin Hasani

Daniela Rus

Radu Grosu

156

20 Oct 2022

Attaining Class-level Forgetting in Pretrained Model using Few Samples

European Conference on Computer Vision (ECCV), 2022

19 Oct 2022

Tempo: Accelerating Transformer-Based Model Training through Memory Footprint Reduction

Neural Information Processing Systems (NeurIPS), 2022

Muralidhar Andoorveedu

Zhanda Zhu

Bojian Zheng

Gennady Pekhimenko

185

19 Oct 2022

Approximating Continuous Convolutions for Deep Network Compression

British Machine Vision Conference (BMVC), 2022

Theo W. Costain

V. Prisacariu

175

17 Oct 2022

Packed-Ensembles for Efficient Uncertainty Estimation

International Conference on Learning Representations (ICLR), 2022

464

17 Oct 2022

HQNAS: Auto CNN deployment framework for joint quantization and architecture search

111

16 Oct 2022

The Effects of Partitioning Strategies on Energy Consumption in Distributed CNN Inference at The Edge

Erqian Tang

Xiaotian Guo

T. Stefanov

120

15 Oct 2022

Deep Differentiable Logic Gate Networks

Neural Information Processing Systems (NeurIPS), 2022

191

15 Oct 2022

Post-Training Quantization for Energy Efficient Realization of Deep Neural Networks

International Conference on Machine Learning and Applications (ICMLA), 2022

Cecilia Latotzke

Batuhan Balim

T. Gemmeke

14 Oct 2022

CAP: Correlation-Aware Pruning for Highly-Accurate Sparse Vision Models

Neural Information Processing Systems (NeurIPS), 2022

Denis Kuznedelev

Eldar Kurtic

Elias Frantar

Dan Alistarh

VLM ViT

174

14 Oct 2022

Parameter-Efficient Masking Networks

Neural Information Processing Systems (NeurIPS), 2022

Huan Wang

148

13 Oct 2022

Structural Pruning via Latency-Saliency Knapsack

Neural Information Processing Systems (NeurIPS), 2022

340

13 Oct 2022

SeKron: A Decomposition Method Supporting Many Factorization Structures

Marawan Gamal Abdel Hameed

A. Mosleh

Marzieh S. Tahaei

V. Nia

164

12 Oct 2022

SaiT: Sparse Vision Transformers through Adaptive Token Pruning

135

11 Oct 2022

Edge-Cloud Cooperation for DNN Inference via Reinforcement Learning and Supervised Learning

Tinghao Zhang

Zhijun Li

Yongrui Chen

Kwok-Yan Lam

Jun Zhao

189

11 Oct 2022

Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach

Neural Information Processing Systems (NeurIPS), 2022

Li Shen

270

11 Oct 2022

Deep learning model compression using network sensitivity and gradients

M. Sakthi

N. Yadla

Raj Pawate

172

11 Oct 2022

DeepPerform: An Efficient Approach for Performance Testing of Resource-Constrained Neural Networks

International Conference on Automated Software Engineering (ASE), 2022

209

10 Oct 2022

Advancing Model Pruning via Bi-level Optimization

Neural Information Processing Systems (NeurIPS), 2022

449

08 Oct 2022

Demand Layering for Real-Time DNN Inference with Minimized Memory Usage

IEEE Real-Time Systems Symposium (RTSS), 2022

244

08 Oct 2022

In-situ Model Downloading to Realize Versatile Edge AI in 6G Mobile Networks

IEEE Wireless Communications (IEEE Wireless Commun.), 2022

Kaibin Huang

Hai Wu

Zhiyan Liu

Xiaojuan Qi

193

07 Oct 2022

Small Character Models Match Large Word Models for Autocomplete Under Memory Constraints

Ganesh Jawahar

Subhabrata Mukherjee

Debadeepta Dey

Muhammad Abdul-Mageed

115

06 Oct 2022

Communication-Efficient and Drift-Robust Federated Learning via Elastic Net

224

06 Oct 2022