1510.00149
Cited By
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
1 October 2015
Song Han
Huizi Mao
W. Dally
3DGS
Papers citing
"Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding"
50 / 3,628 papers shown
Title
Data-Free Quantization via Mixed-Precision Compensation without Fine-Tuning
Pattern Recognition (Pattern Recogn.), 2023
Jun Chen
Shipeng Bai
Tianxin Huang
Mengmeng Wang
Guanzhong Tian
Y. Liu
MQ
277
24
0
02 Jul 2023
Filter Pruning for Efficient CNNs via Knowledge-driven Differential Filter Sampler
Shaohui Lin
Hao Wu
Jiao Xie
Baochang Zhang
Chunjiang Ge
Zhou Yu
Jungong Han
David Doermann
162
4
0
01 Jul 2023
Miniaturized Graph Convolutional Networks with Topologically Consistent Pruning
H. Sahbi
127
0
0
30 Jun 2023
Systematic Investigation of Sparse Perturbed Sharpness-Aware Minimization Optimizer
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Peng Mi
Li Shen
Tianhe Ren
Weihao Ye
Tianshuo Xu
Xiaoshuai Sun
Tongliang Liu
Rongrong Ji
Dacheng Tao
AAML
210
2
0
30 Jun 2023
OSP: Boosting Distributed Model Training with 2-stage Synchronization
International Conference on Parallel Processing (ICPP), 2023
Zixuan Chen
Lei Shi
Xuandong Liu
Jiahui Li
Sen Liu
Yang Xu
267
5
0
29 Jun 2023
DNA-TEQ: An Adaptive Exponential Quantization of Tensors for DNN Inference
International Conference on High Performance Computing (HiPC), 2023
Bahareh Khabbazan
Marc Riera
Antonio González
MQ
175
3
0
28 Jun 2023
SparseOptimizer: Sparsify Language Models through Moreau-Yosida Regularization and Accelerate via Compiler Co-design
Fu-Ming Guo
MoE
194
0
0
27 Jun 2023
When Foundation Model Meets Federated Learning: Motivations, Challenges, and Future Directions
Weiming Zhuang
Chen Chen
Lingjuan Lyu
Chong Chen
Yaochu Jin
AIFin
AI4CE
629
119
0
27 Jun 2023
Adaptive Sharpness-Aware Pruning for Robust Sparse Networks
International Conference on Learning Representations (ICLR), 2023
Anna Bair
Hongxu Yin
Maying Shen
Pavlo Molchanov
J. Álvarez
248
17
0
25 Jun 2023
A Survey on Graph Neural Network Acceleration: Algorithms, Systems, and Customized Hardware
Shichang Zhang
Atefeh Sohrabizadeh
Cheng Wan
Zijie Huang
Ziniu Hu
Yewen Wang
Yingyan Lin
Jason Cong
Luke Huan
GNN
AI4CE
193
32
0
24 Jun 2023
H₂O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models
Neural Information Processing Systems (NeurIPS), 2023
Zhenyu Zhang
Ying Sheng
Wanrong Zhu
Tianlong Chen
Lianmin Zheng
...
Yuandong Tian
Christopher Ré
Clark W. Barrett
Zinan Lin
Beidi Chen
VLM
683
463
0
24 Jun 2023
Maintaining Plasticity in Deep Continual Learning
Shibhansh Dohare
J. F. Hernandez-Garcia
Parash Rahman
A. Rupam Mahmood
Richard S. Sutton
KELM
CLL
349
36
0
23 Jun 2023
Swin-Free: Achieving Better Cross-Window Attention and Efficiency with Size-varying Window
Jinkyu Koo
John Yang
Le An
Gwenaelle Cunha Sergio
Su Inn Park
ViT
106
0
0
23 Jun 2023
Binary domain generalization for sparsifying binary neural networks
Riccardo Schiavone
Francesco Galati
Maria A. Zuluaga
MQ
181
1
0
23 Jun 2023
Neural Network Pruning for Real-time Polyp Segmentation
Annual Conference on Medical Image Understanding and Analysis (MIUA), 2023
Suman Sapkota
Pranav Poudel
Sudarshan Regmi
Bibek Panthi
Binod Bhattarai
MedIm
167
0
0
22 Jun 2023
MultiTASC: A Multi-Tenancy-Aware Scheduler for Cascaded DNN Inference at the Consumer Edge
International Symposium on Computers and Communications (ISCC), 2023
Sokratis Nikolaidis
Stylianos I. Venieris
I. Venieris
169
2
0
22 Jun 2023
A Simple and Effective Pruning Approach for Large Language Models
International Conference on Learning Representations (ICLR), 2023
Mingjie Sun
Zhuang Liu
Anna Bair
J. Zico Kolter
461
637
0
20 Jun 2023
Towards Environmentally Equitable AI via Geographical Load Balancing
Energy-Efficient Computing and Networking (e-Energy), 2023
Pengfei Li
Jianyi Yang
Adam Wierman
Shaolei Ren
212
22
0
20 Jun 2023
Dynamic Perceiver for Efficient Visual Recognition
IEEE International Conference on Computer Vision (ICCV), 2023
Yizeng Han
Dongchen Han
Zeyu Liu
Yulin Wang
Xuran Pan
Yifan Pu
Chaorui Deng
Junlan Feng
Qing Xiao
Gao Huang
265
41
0
20 Jun 2023
AI Clinics on Mobile (AICOM): Universal AI Doctors for the Underserved and Hard-to-Reach
Tim Tianyi Yang
T. Yang
Na An
Ao Kong
Shaoshan Liu
Xue Liu
78
2
0
17 Jun 2023
HiNeRV: Video Compression with Hierarchical Encoding-based Neural Representation
Neural Information Processing Systems (NeurIPS), 2023
Ho Man Kwan
Ge Gao
Fan Zhang
Andrew Gower
David Bull
230
75
0
16 Jun 2023
Lightweight Attribute Localizing Models for Pedestrian Attribute Recognition
IEEE Intelligent Systems (IEEE Intell. Syst.), 2023
Ashish Jha
Dimitrii Ermilov
Konstantin Sobolev
Anh-Huy Phan
S. Ahmadi-Asl
...
Imran N. Junejo
Z. Aghbari
Thar Baker
A. Khedr
A. Cichocki
CVBM
127
3
0
16 Jun 2023
[Experiments & Analysis] Evaluating the Feasibility of Sampling-Based Techniques for Training Multilayer Perceptrons
International Conference on Extending Database Technology (EDBT), 2023
Sana Ebrahimi
Rishi Advani
Abolfazl Asudeh
199
0
0
15 Jun 2023
Understanding Parameter Sharing in Transformers
Ye Lin
Mingxuan Wang
Zhexi Zhang
Xiaohui Wang
Tong Xiao
Jingbo Zhu
MoE
203
4
0
15 Jun 2023
Neural Network Compression using Binarization and Few Full-Precision Weights
Information Sciences (Inf. Sci.), 2023
F. M. Nardini
Cosimo Rulli
Salvatore Trani
Rossano Venturini
MQ
236
1
0
15 Jun 2023
High-performance deep spiking neural networks with 0.3 spikes per neuron
Nature Communications (Nat. Commun.), 2023
A. Stanojević
Stanislaw Woźniak
G. Bellec
G. Cherubini
A. Pantazi
W. Gerstner
268
47
0
14 Jun 2023
SqueezeLLM: Dense-and-Sparse Quantization
International Conference on Machine Learning (ICML), 2023
Sehoon Kim
Coleman Hooper
A. Gholami
Zhen Dong
Xiuyu Li
Sheng Shen
Michael W. Mahoney
Kurt Keutzer
MQ
438
252
0
13 Jun 2023
RAMAN: A Re-configurable and Sparse tinyML Accelerator for Inference on Edge
IEEE Internet of Things Journal (IEEE IoT J.), 2023
Adithya Krishna
Srikanth Rohit Nudurupati
Chandana D G
Pritesh Dwivedi
André van Schaik
M. Mehendale
Chetan Singh Thakur
101
22
0
10 Jun 2023
FalconNet: Factorization for the Light-weight ConvNets
International Conference on Neural Information Processing (ICONIP), 2023
Zhicheng Cai
Qiu Shen
308
20
0
10 Jun 2023
MobileNMT: Enabling Translation in 15MB and 30ms
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Ye Lin
Xiaohui Wang
Zhexi Zhang
Mingxuan Wang
Tong Xiao
Jingbo Zhu
MQ
163
4
0
07 Jun 2023
CFDP: Common Frequency Domain Pruning
Samir Khaki
Weihan Luo
3DV
219
6
0
07 Jun 2023
The Emergence of Essential Sparsity in Large Pre-trained Models: The Weights that Matter
Neural Information Processing Systems (NeurIPS), 2023
Ajay Jaiswal
Shiwei Liu
Tianlong Chen
Zinan Lin
VLM
239
44
0
06 Jun 2023
Unleashing Mask: Explore the Intrinsic Out-of-Distribution Detection Capability
International Conference on Machine Learning (ICML), 2023
Jianing Zhu
Hengzhuang Li
Jiangchao Yao
Tongliang Liu
Jianliang Xu
Bo Han
OODD
184
18
0
06 Jun 2023
AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Conference on Machine Learning and Systems (MLSys), 2023
Ji Lin
Jiaming Tang
Haotian Tang
Shang Yang
Wei-Ming Chen
Wei-Chen Wang
Guangxuan Xiao
Xingyu Dang
Chuang Gan
Song Han
EDL
MQ
799
927
0
01 Jun 2023
FlexRound: Learnable Rounding based on Element-wise Division for Post-Training Quantization
International Conference on Machine Learning (ICML), 2023
J. H. Lee
Jeonghoon Kim
S. Kwon
Dongsoo Lee
MQ
395
49
0
01 Jun 2023
Accurate and Structured Pruning for Efficient Automatic Speech Recognition
Interspeech (Interspeech), 2023
Huiqiang Jiang
Li Zhang
Yuang Li
Yu-Huan Wu
Shijie Cao
Ting Cao
Yuqing Yang
Jinyu Li
Mao Yang
Lili Qiu
CVBM
176
17
0
31 May 2023
Vision Transformers for Mobile Applications: A Short Survey
Nahid Alam
Steven Kolawole
S. Sethi
Nishant Bansali
Karina Nguyen
ViT
145
4
0
30 May 2023
Budget-Aware Graph Convolutional Network Design using Probabilistic Magnitude Pruning
H. Sahbi
131
0
0
30 May 2023
Compact Real-time Radiance Fields with Neural Codebook
IEEE International Conference on Multimedia and Expo (ICME), 2023
Lingzhi Li
Zhongshu Wang
Zhen Shen
Li Shen
Ping Tan
127
8
0
29 May 2023
A Transfer Learning and Explainable Solution to Detect mpox from Smartphones images
Pervasive and Mobile Computing (PMC), 2023
M. Campana
Marco Colussi
Franca Delmastro
S. Mascetti
Elena Pagani
148
16
0
29 May 2023
CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers
International Conference on Machine Learning (ICML), 2023
Dachuan Shi
Chaofan Tao
Anyi Rao
Zhendong Yang
Chun Yuan
Yuan Liu
VLM
414
36
0
27 May 2023
Input-Aware Dynamic Timestep Spiking Neural Networks for Efficient In-Memory Computing
Design Automation Conference (DAC), 2023
Yuhang Li
Abhishek Moitra
Tamar Geller
Priyadarshini Panda
121
26
0
27 May 2023
COMCAT: Towards Efficient Compression and Customization of Attention-Based Vision Models
International Conference on Machine Learning (ICML), 2023
Jinqi Xiao
Miao Yin
Yu Gong
Xiao Zang
Jian Ren
Bo Yuan
VLM
ViT
299
16
0
26 May 2023
Improving Knowledge Distillation via Regularizing Feature Norm and Direction
Yuzhu Wang
Lechao Cheng
Manni Duan
Yongheng Wang
Zunlei Feng
Shu Kong
186
28
0
26 May 2023
CUEING: a lightweight model to Capture hUman attEntion In driviNG
Linfeng Liang
Yao Deng
Yang Zhang
Jianchao Lu
Chen Wang
Quan Z. Sheng
Xi Zheng
228
3
0
25 May 2023
RAND: Robustness Aware Norm Decay For Quantized Seq2seq Models
David Qiu
David Rim
Shaojin Ding
Oleg Rybakov
Yanzhang He
MQ
175
4
0
24 May 2023
PruMUX: Augmenting Data Multiplexing with Model Compression
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Yushan Su
Vishvak Murahari
Karthik Narasimhan
Keqin Li
210
3
0
24 May 2023
Combining Multi-Objective Bayesian Optimization with Reinforcement Learning for TinyML
ACM Transactions on Evolutionary Learning and Optimization (TELO), 2023
M. Deutel
G. Kontes
Christopher Mutschler
Jürgen Teich
458
3
0
23 May 2023
Layer-adaptive Structured Pruning Guided by Latency
Siyuan Pan
Linna Zhang
Jie Zhang
Xiaoshuang Li
Liang Hou
Xiaobing Tu
159
2
0
23 May 2023
Revisiting Data Augmentation in Model Compression: An Empirical and Comprehensive Study
IEEE International Joint Conference on Neural Network (IJCNN), 2023
Muzhou Yu
Linfeng Zhang
Kaisheng Ma
172
2
0
22 May 2023