1510.00149
Cited By
v1
v2
v3
v4
v5 (latest)
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
1 October 2015
Song Han
Huizi Mao
W. Dally
3DGS
Papers citing
"Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding"
50 / 3,629 papers shown
Title
Leveraging Angular Distributions for Improved Knowledge Distillation
Eunyeong Jeon
Hongjun Choi
Ankita Shukla
Pavan Turaga
135
8
0
27 Feb 2023
Efficient Multitask Learning on Resource-Constrained Systems
Yubo Luo
Le Zhang
Zhenyu Wang
S. Nirjon
163
9
0
25 Feb 2023
A Unified Framework for Soft Threshold Pruning
International Conference on Learning Representations (ICLR), 2023
Yanqing Chen
Zhengyu Ma
Wei Fang
Xiawu Zheng
Zhaofei Yu
Yonghong Tian
243
23
0
25 Feb 2023
Debiased Distillation by Transplanting the Last Layer
Jiwoon Lee
Jaeho Lee
135
5
0
22 Feb 2023
Device Tuning for Multi-Task Large Model
Penghao Jiang
Xuanchen Hou
Y. Zhou
87
0
0
21 Feb 2023
Oriented Object Detection in Optical Remote Sensing Images using Deep Learning: A Survey
Artificial Intelligence Review (AIR), 2023
Kunlin Wang
Zi Wang
Zhang Li
Ang Su
Xichao Teng
Minhao Liu
Qifeng Yu
ObjD
629
27
0
21 Feb 2023
HomoDistil: Homotopic Task-Agnostic Distillation of Pre-trained Transformers
International Conference on Learning Representations (ICLR), 2023
Chen Liang
Haoming Jiang
Zheng Li
Xianfeng Tang
Bin Yin
Tuo Zhao
VLM
275
29
0
19 Feb 2023
Rethinking Data-Free Quantization as a Zero-Sum Game
AAAI Conference on Artificial Intelligence (AAAI), 2023
Biao Qian
Yang Wang
Richang Hong
Meng Wang
MQ
218
23
0
19 Feb 2023
Moby: Empowering 2D Models for Efficient Point Cloud Analytics on the Edge
ACM Multimedia (ACM MM), 2023
Jingzong Li
Yik Hong Cai
Libin Liu
Yushun Mao
Chun Jason Xue
Hongchang Xu
258
5
0
18 Feb 2023
VEGETA: Vertically-Integrated Extensions for Sparse/Dense GEMM Tile Acceleration on CPUs
International Symposium on High-Performance Computer Architecture (HPCA), 2023
Geonhwa Jeong
S. Damani
Abhimanyu Bambhaniya
Eric Qin
C. Hughes
S. Subramoney
Hyesoon Kim
T. Krishna
MoE
283
33
0
17 Feb 2023
Hardware-aware training for large-scale and diverse deep learning inference workloads using in-memory computing-based accelerators
Nature Communications (Nat. Commun.), 2023
Malte J. Rasch
C. Mackin
Corey Lammie
An Chen
A. Fasoli
...
P. Narayanan
H. Tsai
G. Burr
Abu Sebastian
Vijay Narayanan
195
132
0
16 Feb 2023
TFormer: A Transmission-Friendly ViT Model for IoT Devices
IEEE Transactions on Parallel and Distributed Systems (TPDS), 2023
Zhichao Lu
Chuntao Ding
Felix Juefei Xu
Vishnu Boddeti
Shangguang Wang
Yun Yang
180
21
0
15 Feb 2023
Towards Optimal Compression: Joint Pruning and Quantization
Ben Zandonati
Glenn Bucagu
Adrian Alan Pol
M. Pierini
Olya Sirkin
Tal Kopetz
MQ
313
5
0
15 Feb 2023
Workload-Balanced Pruning for Sparse Spiking Neural Networks
IEEE Transactions on Emerging Topics in Computational Intelligence (TETCI), 2023
Ruokai Yin
Youngeun Kim
Yuhang Li
Abhishek Moitra
Nitin Satpute
Anna Hambitzer
Priyadarshini Panda
224
26
0
13 Feb 2023
Simple Hardware-Efficient Long Convolutions for Sequence Modeling
International Conference on Machine Learning (ICML), 2023
Daniel Y. Fu
Elliot L. Epstein
Eric N. D. Nguyen
A. Thomas
Michael Zhang
Tri Dao
Atri Rudra
Christopher Ré
190
66
0
13 Feb 2023
Sneaky Spikes: Uncovering Stealthy Backdoor Attacks in Spiking Neural Networks with Neuromorphic Data
Network and Distributed System Security Symposium (NDSS), 2023
Gorka Abad
Oguzhan Ersoy
S. Picek
A. Urbieta
AAML
167
26
0
13 Feb 2023
Autoselection of the Ensemble of Convolutional Neural Networks with Second-Order Cone Programming
Social Science Research Network (SSRN), 2023
Buse Çisil Güldoğuş
Abdullah Nazhat Abdullah
Muhammad Ammar Ali
Süreyya Özögür-Akyüz
111
1
0
12 Feb 2023
Pruning Deep Neural Networks from a Sparsity Perspective
International Conference on Learning Representations (ICLR), 2023
Enmao Diao
G. Wang
Jiawei Zhan
Yuhong Yang
Jie Ding
Vahid Tarokh
323
41
0
11 Feb 2023
Offsite-Tuning: Transfer Learning without Full Model
Guangxuan Xiao
Ji Lin
Song Han
198
98
0
09 Feb 2023
SparseProp: Efficient Sparse Backpropagation for Faster Training of Neural Networks
International Conference on Machine Learning (ICML), 2023
Mahdi Nikdan
Tommaso Pegolotti
Eugenia Iofinova
Eldar Kurtic
Dan Alistarh
178
12
0
09 Feb 2023
Towards Fairer and More Efficient Federated Learning via Multidimensional Personalized Edge Models
IEEE International Joint Conference on Neural Networks (IJCNN), 2023
Yingchun Wang
Jingcai Guo
Jie Zhang
Song Guo
Weizhan Zhang
Qinghua Zheng
FedML
362
11
0
09 Feb 2023
Data Quality-aware Mixed-precision Quantization via Hybrid Reinforcement Learning
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2023
Yingchun Wang
Jingcai Guo
Song Guo
Weizhan Zhang
MQ
185
23
0
09 Feb 2023
PFGM++: Unlocking the Potential of Physics-Inspired Generative Models
International Conference on Machine Learning (ICML), 2023
Yilun Xu
Ziming Liu
Yonglong Tian
Shangyuan Tong
Max Tegmark
Tommi Jaakkola
AI4CE
DiffM
206
88
0
08 Feb 2023
CRAFT: Criticality-Aware Fault-Tolerance Enhancement Techniques for Emerging Memories-Based Deep Neural Networks
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (IEEE TCAD), 2023
Thai-Hoang Nguyen
Muhammad Imran
Jaehyuk Choi
Joongseob Yang
52
3
0
08 Feb 2023
What Matters In The Structured Pruning of Generative Language Models?
Michael Santacroce
Zixin Wen
Yelong Shen
Yuanzhi Li
177
36
0
07 Feb 2023
Ten Lessons We Have Learned in the New "Sparseland": A Short Handbook for Sparse Neural Network Researchers
Shiwei Liu
Zinan Lin
381
34
0
06 Feb 2023
Towards Implementing Energy-aware Data-driven Intelligence for Smart Health Applications on Mobile Platforms
G. D. Samaraweera
Hung Nguyen
Hadi Zanddizari
Behnam Zeinali
Jerome Chang
120
0
0
01 Feb 2023
UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers
International Conference on Machine Learning (ICML), 2023
Dachuan Shi
Chaofan Tao
Ying Jin
Zhendong Yang
Chun Yuan
Yuan Liu
VLM
ViT
337
54
0
31 Jan 2023
Self-Compressing Neural Networks
Szabolcs Cséfalvay
J. Imber
157
3
0
30 Jan 2023
DepGraph: Towards Any Structural Pruning
Computer Vision and Pattern Recognition (CVPR), 2023
Gongfan Fang
Xinyin Ma
Weilong Dai
Michael Bi Mi
Xinchao Wang
GNN
379
404
0
30 Jan 2023
Towards Inference Efficient Deep Ensemble Learning
AAAI Conference on Artificial Intelligence (AAAI), 2023
Ziyue Li
Kan Ren
Yifan Yang
Xinyang Jiang
Yuqing Yang
Dongsheng Li
BDL
134
17
0
29 Jan 2023
Understanding INT4 Quantization for Transformer Models: Latency Speedup, Composability, and Failure Cases
International Conference on Machine Learning (ICML), 2023
Xiaoxia Wu
Cheng-rong Li
Reza Yazdani Aminabadi
Z. Yao
Yuxiong He
MQ
187
33
0
27 Jan 2023
SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient
International Conference on Machine Learning (ICML), 2023
Max Ryabinin
Tim Dettmers
Michael Diskin
Alexander Borzunov
MoE
316
55
0
27 Jan 2023
Voting from Nearest Tasks: Meta-Vote Pruning of Pre-trained Models for Downstream Tasks
Haiyan Zhao
Tianyi Zhou
Guodong Long
Jing Jiang
Chengqi Zhang
162
1
0
27 Jan 2023
Open Problems in Applied Deep Learning
M. Raissi
AI4CE
219
3
0
26 Jan 2023
BiBench: Benchmarking and Analyzing Network Binarization
International Conference on Machine Learning (ICML), 2023
Haotong Qin
Mingyuan Zhang
Yifu Ding
Aoyu Li
Zhongang Cai
Ziwei Liu
Feng Yu
Xianglong Liu
MQ
AAML
265
49
0
26 Jan 2023
Low-Rank Winograd Transformation for 3D Convolutional Neural Networks
Science China Information Sciences (Sci China Inf Sci), 2023
Ziran Qin
Mingbao Lin
Weiyao Lin
3DPC
149
3
0
26 Jan 2023
Rewarded meta-pruning: Meta Learning with Rewards for Channel Pruning
Athul Shibu
Abhishek Kumar
Heechul Jung
Dong-Gyu Lee
186
2
0
26 Jan 2023
PowerQuant: Automorphism Search for Non-Uniform Quantization
International Conference on Learning Representations (ICLR), 2023
Edouard Yvinec
Arnaud Dapogny
Matthieu Cord
Kévin Bailly
MQ
156
22
0
24 Jan 2023
Ensemble Transfer Learning for Multilingual Coreference Resolution
T. Lai
Heng Ji
147
3
0
22 Jan 2023
Accelerating and Compressing Deep Neural Networks for Massive MIMO CSI Feedback
O. Erak
H. Abou-zeid
141
8
0
20 Jan 2023
Getting Away with More Network Pruning: From Sparsity to Geometry and Linear Regions
Integration of AI and OR Techniques in Constraint Programming (CPAIOR), 2023
Junyang Cai
Khai-Nguyen Nguyen
Nishant Shrestha
Aidan Good
Ruisen Tu
Xin Yu
Shandian Zhe
Thiago Serra
MLT
268
11
0
19 Jan 2023
Quantum HyperNetworks: Training Binary Neural Networks in Quantum Superposition
Juan Carrasquilla
Mohamed Hibat-Allah
E. Inack
Alireza Makhzani
Kirill Neklyudov
Graham Taylor
G. Torlai
MQ
228
11
0
19 Jan 2023
Scaling Deep Networks with the Mesh Adaptive Direct Search algorithm
Dounia Lakhmiri
Mahdi Zolnouri
V. Nia
C. Tribes
Sébastien Le Digabel
106
0
0
17 Jan 2023
RedBit: An End-to-End Flexible Framework for Evaluating the Accuracy of Quantized CNNs
A. M. Ribeiro-dos-Santos
João Dinis Ferreira
O. Mutlu
G. Falcão
MQ
232
3
0
15 Jan 2023
GOHSP: A Unified Framework of Graph and Optimization-based Heterogeneous Structured Pruning for Vision Transformer
AAAI Conference on Artificial Intelligence (AAAI), 2023
Miao Yin
Burak Uzkent
Yilin Shen
Hongxia Jin
Bo Yuan
ViT
272
20
0
13 Jan 2023
Why is the State of Neural Network Pruning so Confusing? On the Fairness, Comparison Setup, and Trainability in Network Pruning
Huan Wang
Can Qin
Yue Bai
Yun Fu
241
25
0
12 Jan 2023
AnycostFL: Efficient On-Demand Federated Learning over Heterogeneous Edge Devices
IEEE Conference on Computer Communications (IEEE INFOCOM), 2023
Peichun Li
Guoliang Cheng
Xumin Huang
Jiawen Kang
Rong Yu
Yuan Wu
Miao Pan
FedML
225
27
0
08 Jan 2023
Does compressing activations help model parallel training?
Conference on Machine Learning and Systems (MLSys), 2023
S. Bian
Dacheng Li
Hongyi Wang
Eric P. Xing
Shivaram Venkataraman
195
12
0
06 Jan 2023
A Theory of I/O-Efficient Sparse Neural Network Inference
Niels Gleinig
Tal Ben-Nun
Torsten Hoefler
137
0
0
03 Jan 2023