1510.00149
Cited By
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
1 October 2015
Song Han
Huizi Mao
W. Dally
3DGS
Papers citing
"Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding"
50 / 3,631 papers shown
Developmental Plasticity-inspired Adaptive Pruning for Deep Spiking and Artificial Neural Networks
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Bing Han
Feifei Zhao
Yi Zeng
Guobin Shen
163
11
0
23 Nov 2022
FedDCT: Federated Learning of Large Convolutional Neural Networks on Resource Constrained Devices using Divide and Collaborative Training
IEEE Transactions on Network and Service Management (IEEE TNSM), 2022
Quan Nguyen
Hieu H. Pham
Kok-Seng Wong
Phi Le Nguyen
Truong Thao Nguyen
Minh N. Do
FedML
261
9
0
20 Nov 2022
SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
International Conference on Machine Learning (ICML), 2022
Guangxuan Xiao
Ji Lin
Mickael Seznec
Hao Wu
Julien Demouth
Song Han
MQ
801
1,207
0
18 Nov 2022
Structured Pruning Adapters
Pattern Recognition (Pattern Recogn.), 2022
Lukas Hedegaard
Aman Alok
Juby Jose
Alexandros Iosifidis
277
15
0
17 Nov 2022
Is Smaller Always Faster? Tradeoffs in Compressing Self-Supervised Speech Transformers
Tzu-Quan Lin
Tsung-Huan Yang
Chun-Yao Chang
Kuang-Ming Chen
Tzu-hsun Feng
Hung-yi Lee
Hao Tang
295
6
0
17 Nov 2022
Structured Knowledge Distillation Towards Efficient and Compact Multi-View 3D Detection
Linfeng Zhang
Yukang Shi
Hung-Shuo Tai
Zhipeng Zhang
Yuan He
Ke Wang
Kaisheng Ma
264
4
0
14 Nov 2022
Pruning Very Deep Neural Network Channels for Efficient Inference
Yihui He
235
2
0
14 Nov 2022
Robust Training of Graph Neural Networks via Noise Governance
Web Search and Data Mining (WSDM), 2022
Siyi Qian
Haochao Ying
Renjun Hu
Jingbo Zhou
Jintai Chen
Benlin Liu
Jian Wu
NoLa
269
55
0
12 Nov 2022
Streaming, fast and accurate on-device Inverse Text Normalization for Automatic Speech Recognition
Spoken Language Technology Workshop (SLT), 2022
Yashesh Gaur
Nick Kibre
Jian Xue
Kangyuan Shu
Yuhui Wang
Issac Alphonso
Jinyu Li
Jiawei Liu
118
7
0
07 Nov 2022
RUBICON: A Framework for Designing Efficient Deep Learning-Based Genomic Basecallers
Genome Biology (GB), 2022
Gagandeep Singh
M. Alser
K. Denolf
Can Firtina
Alireza Khodamoradi
Meryem Banu Cavlak
Henk Corporaal
O. Mutlu
371
17
0
06 Nov 2022
Multi-Objective Evolutionary for Object Detection Mobile Architectures Search
Haichao Zhang
Jiashi Li
Xin Xia
K. Hao
Xuefeng Xiao
183
2
0
05 Nov 2022
Intriguing Properties of Compression on Multilingual Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Kelechi Ogueji
Orevaoghene Ahia
Gbemileke Onilude
Sebastian Gehrmann
Sara Hooker
Julia Kreutzer
302
15
0
04 Nov 2022
Soft Masking for Cost-Constrained Channel Pruning
European Conference on Computer Vision (ECCV), 2022
Ryan Humble
Maying Shen
J. Latorre
Eric Darve
J. Álvarez
174
18
0
04 Nov 2022
Efficient Spatially Sparse Inference for Conditional GANs and Diffusion Models
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Zhekai Zhang
Ji Lin
Chenlin Meng
Stefano Ermon
Song Han
Jun-Yan Zhu
DiffM
497
61
0
03 Nov 2022
Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing
Neural Information Processing Systems (NeurIPS), 2022
Yonggan Fu
Yang Zhang
Kaizhi Qian
Zhifan Ye
Zhongzhi Yu
Cheng-I Jeff Lai
Yingyan Lin
354
10
0
02 Nov 2022
Model Compression for DNN-based Speaker Verification Using Weight Quantization
Interspeech (Interspeech), 2022
Jingyu Li
W. Liu
Zhaoyang Zhang
Jiong Wang
Tan Lee
MQ
381
3
0
31 Oct 2022
FusionFormer: Fusing Operations in Transformer for Efficient Streaming Speech Recognition
Xingcheng Song
Di Wu
Binbin Zhang
Zhiyong Wu
Wenpeng Li
...
Peng Zhang
Zhendong Peng
Fuping Pan
Changbao Zhu
Zhongqin Wu
133
2
0
31 Oct 2022
LearningGroup: A Real-Time Sparse Training on FPGA via Learnable Weight Grouping for Multi-Agent Reinforcement Learning
International Conference on Field-Programmable Technology (ICFPT), 2022
Jenny Yang
Jaeuk Kim
Joo-Young Kim
196
2
0
29 Oct 2022
LOFT: Finding Lottery Tickets through Filter-wise Training
International Conference on Artificial Intelligence and Statistics (AISTATS), 2022
Qihan Wang
Chen Dun
Fangshuo Liao
C. Jermaine
Anastasios Kyrillidis
181
3
0
28 Oct 2022
Class Based Thresholding in Early Exit Semantic Segmentation Networks
IEEE Signal Processing Letters (SPL), 2022
Alperen Görmez
Erdem Koyuncu
152
6
0
27 Oct 2022
Efficient ECG-based Atrial Fibrillation Detection via Parameterised Hypercomplex Neural Networks
European Signal Processing Conference (EUSIPCO), 2022
Leonie Basso
Zhao Ren
Wolfgang Nejdl
320
3
0
27 Oct 2022
Gradient-based Weight Density Balancing for Robust Dynamic Sparse Training
Mathias Parger
Alexander Ertl
Paul Eibensteiner
J. H. Mueller
Martin Winter
M. Steinberger
151
1
0
25 Oct 2022
Pruning's Effect on Generalization Through the Lens of Training and Regularization
Neural Information Processing Systems (NeurIPS), 2022
Tian Jin
Michael Carbin
Daniel M. Roy
Jonathan Frankle
Gintare Karolina Dziugaite
236
33
0
25 Oct 2022
Pushing the Efficiency Limit Using Structured Sparse Convolutions
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
Vinay Kumar Verma
Nikhil Mehta
Shijing Si
Ricardo Henao
Lawrence Carin
183
3
0
23 Oct 2022
Towards Global Neural Network Abstractions with Locally-Exact Reconstruction
Neural Networks (NN), 2022
Edoardo Manino
I. Bessa
Lucas C. Cordeiro
225
3
0
21 Oct 2022
When Expressivity Meets Trainability: Fewer than n Neurons Can Work
Neural Information Processing Systems (NeurIPS), 2022
Jiawei Zhang
Yushun Zhang
Mingyi Hong
Tian Ding
Jianfeng Yao
326
10
0
21 Oct 2022
Learning Robust Dynamics through Variational Sparse Gating
Neural Information Processing Systems (NeurIPS), 2022
A. Jain
Shivakanth Sujit
S. Joshi
Vincent Michalski
Danijar Hafner
Samira Ebrahimi Kahou
154
11
0
21 Oct 2022
Pruning by Active Attention Manipulation
Z. Babaiee
Lucas Liebenwein
Ramin Hasani
Daniela Rus
Radu Grosu
156
1
0
20 Oct 2022
Attaining Class-level Forgetting in Pretrained Model using Few Samples
European Conference on Computer Vision (ECCV), 2022
Pravendra Singh
Pratik Mazumder
M. A. Karim
VLM
CLL
MU
96
3
0
19 Oct 2022
Tempo: Accelerating Transformer-Based Model Training through Memory Footprint Reduction
Neural Information Processing Systems (NeurIPS), 2022
Muralidhar Andoorveedu
Zhanda Zhu
Bojian Zheng
Gennady Pekhimenko
185
8
0
19 Oct 2022
Approximating Continuous Convolutions for Deep Network Compression
British Machine Vision Conference (BMVC), 2022
Theo W. Costain
V. Prisacariu
175
0
0
17 Oct 2022
Packed-Ensembles for Efficient Uncertainty Estimation
International Conference on Learning Representations (ICLR), 2022
Olivier Laurent
Adrien Lafage
Enzo Tartaglione
Geoffrey Daniel
Jean-Marc Martinez
Andrei Bursuc
Gianni Franchi
OODD
464
40
0
17 Oct 2022
HQNAS: Auto CNN deployment framework for joint quantization and architecture search
Hongjiang Chen
Yang Wang
Leibo Liu
Shaojun Wei
Shouyi Yin
MQ
111
3
0
16 Oct 2022
The Effects of Partitioning Strategies on Energy Consumption in Distributed CNN Inference at The Edge
Erqian Tang
Xiaotian Guo
T. Stefanov
120
1
0
15 Oct 2022
Deep Differentiable Logic Gate Networks
Neural Information Processing Systems (NeurIPS), 2022
Felix Petersen
Christian Borgelt
Hilde Kuehne
Oliver Deussen
AI4CE
191
61
0
15 Oct 2022
Post-Training Quantization for Energy Efficient Realization of Deep Neural Networks
International Conference on Machine Learning and Applications (ICMLA), 2022
Cecilia Latotzke
Batuhan Balim
T. Gemmeke
MQ
75
3
0
14 Oct 2022
CAP: Correlation-Aware Pruning for Highly-Accurate Sparse Vision Models
Neural Information Processing Systems (NeurIPS), 2022
Denis Kuznedelev
Eldar Kurtic
Elias Frantar
Dan Alistarh
VLM
ViT
174
21
0
14 Oct 2022
Parameter-Efficient Masking Networks
Neural Information Processing Systems (NeurIPS), 2022
Yue Bai
Huan Wang
Xu Ma
Yitian Zhang
Zhiqiang Tao
Yun Fu
148
11
0
13 Oct 2022
Structural Pruning via Latency-Saliency Knapsack
Neural Information Processing Systems (NeurIPS), 2022
Maying Shen
Hongxu Yin
Pavlo Molchanov
Lei Mao
Jianna Liu
J. Álvarez
340
60
0
13 Oct 2022
SeKron: A Decomposition Method Supporting Many Factorization Structures
Marawan Gamal Abdel Hameed
A. Mosleh
Marzieh S. Tahaei
V. Nia
164
1
0
12 Oct 2022
SaiT: Sparse Vision Transformers through Adaptive Token Pruning
Ling Li
D. Thorsley
Joseph Hassoun
ViT
135
20
0
11 Oct 2022
Edge-Cloud Cooperation for DNN Inference via Reinforcement Learning and Supervised Learning
Tinghao Zhang
Zhijun Li
Yongrui Chen
Kwok-Yan Lam
Jun Zhao
189
5
0
11 Oct 2022
Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach
Neural Information Processing Systems (NeurIPS), 2022
Peng Mi
Li Shen
Tianhe Ren
Weihao Ye
Xiaoshuai Sun
Rongrong Ji
Dacheng Tao
AAML
270
84
0
11 Oct 2022
Deep learning model compression using network sensitivity and gradients
M. Sakthi
N. Yadla
Raj Pawate
172
2
0
11 Oct 2022
DeepPerform: An Efficient Approach for Performance Testing of Resource-Constrained Neural Networks
International Conference on Automated Software Engineering (ASE), 2022
Simin Chen
Mirazul Haque
Cong Liu
Wei Yang
209
24
0
10 Oct 2022
Advancing Model Pruning via Bi-level Optimization
Neural Information Processing Systems (NeurIPS), 2022
Yihua Zhang
Yuguang Yao
Parikshit Ram
Pu Zhao
Tianlong Chen
Min-Fong Hong
Yanzhi Wang
Sijia Liu
449
86
0
08 Oct 2022
Demand Layering for Real-Time DNN Inference with Minimized Memory Usage
IEEE Real-Time Systems Symposium (RTSS), 2022
Min-Zhi Ji
Saehanseul Yi
Chang-Mo Koo
Sol Ahn
Dongjoo Seo
N. Dutt
Jong-Chan Kim
244
24
0
08 Oct 2022
In-situ Model Downloading to Realize Versatile Edge AI in 6G Mobile Networks
IEEE Wireless Communications (IEEE Wireless Commun.), 2022
Kaibin Huang
Hai Wu
Zhiyan Liu
Xiaojuan Qi
193
14
0
07 Oct 2022
Small Character Models Match Large Word Models for Autocomplete Under Memory Constraints
Ganesh Jawahar
Subhabrata Mukherjee
Debadeepta Dey
Muhammad Abdul-Mageed
L. Lakshmanan
C. C. T. Mendes
Gustavo de Rosa
S. Shah
115
0
0
06 Oct 2022
Communication-Efficient and Drift-Robust Federated Learning via Elastic Net
Seonhyeon Kim
Jiheon Woo
Daewon Seo
Yongjune Kim
FedML
224
3
0
06 Oct 2022