1510.00149
v5 (latest)
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
1 October 2015
Song Han
Huizi Mao
W. Dally
3DGS
Papers citing "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding"
50 / 3,628 papers shown
Title
FusionBench: A Unified Library and Comprehensive Benchmark for Deep Model Fusion
Anke Tang
Li Shen
Yong Luo
Enneng Yang
Di Lin
Dacheng Tao
Bo Du
Dacheng Tao
ELM
MoMe
VLM
366
38
0
05 Jun 2024
Toward Efficient Deep Spiking Neuron Networks: A Survey On Compression
Hui Xie
Ge Yang
Wenjuan Gao
203
1
0
03 Jun 2024
Effective Interplay between Sparsity and Quantization: From Theory to Practice
Simla Burcu Harma
Ayan Chakraborty
Elizaveta Kostenok
Danila Mishin
Dongho Ha
...
Martin Jaggi
Ming Liu
Yunho Oh
Suvinay Subramanian
Amir Yazdanbakhsh
MQ
301
18
0
31 May 2024
Investigating Calibration and Corruption Robustness of Post-hoc Pruned Perception CNNs: An Image Classification Benchmark Study
Pallavi Mitra
Gesina Schwalbe
Nadja Klein
AAML
169
3
0
31 May 2024
Self-degraded contrastive domain adaptation for industrial fault diagnosis with bi-imbalanced data
Gecheng Chen
Zeyu Yang
Chengwen Luo
Jian-qiang Li
184
1
0
31 May 2024
LCQ: Low-Rank Codebook based Quantization for Large Language Models
Wen-Pu Cai
Wu-Jun Li
Wu-Jun Li
MQ
296
0
0
31 May 2024
Dual sparse training framework: inducing activation map sparsity via Transformed ℓ1 regularization
Xiaolong Yu
Cong Tian
172
3
0
30 May 2024
CiliaGraph: Enabling Expression-enhanced Hyper-Dimensional Computation in Ultra-Lightweight and One-Shot Graph Classification on Edge
Yuxi Han
Jihe Wang
Danghui Wang
201
1
0
29 May 2024
Efficient Model Compression for Hierarchical Federated Learning
Xi Zhu
Songcan Yu
Junbo Wang
Qinglin Yang
FedML
41
3
0
27 May 2024
Scorch: A Library for Sparse Deep Learning
Bobby Yan
Alexander J. Root
Trevor Gale
David Broman
Fredrik Kjolstad
198
2
0
27 May 2024
Extreme Compression of Adaptive Neural Images
Leo Hoshikawa
Marcos V. Conde
Takeshi Ohashi
Atsushi Irie
348
1
0
27 May 2024
A Provably Effective Method for Pruning Experts in Fine-tuned Sparse Mixture-of-Experts
Mohammed Nowaz Rabbani Chowdhury
Meng Wang
Kaoutar El Maghraoui
Naigang Wang
Pin-Yu Chen
Christopher Carothers
MoE
340
10
0
26 May 2024
Pruning for Robust Concept Erasing in Diffusion Models
Tianyun Yang
Juan Cao
Chang Xu
279
24
0
26 May 2024
Online Resource Allocation for Edge Intelligence with Colocated Model Retraining and Inference
Huaiguang Cai
Zhi Zhou
Qianyi Huang
136
6
0
25 May 2024
BOLD: Boolean Logic Deep Learning
Van Minh Nguyen
Cristian Ocampo
Aymen Askri
Louis Leconte
Ba-Hien Tran
AI4CE
329
2
0
25 May 2024
PatchProt: Hydrophobic patch prediction using protein foundation models
Dea Gogishvili
Emmanuel Minois-Genin
Jan van Eck
Sanne Abeln
107
4
0
24 May 2024
Sparse maximal update parameterization: A holistic approach to sparse training dynamics
Nolan Dey
Shane Bergsma
Joel Hestness
220
7
0
24 May 2024
Embedding Compression for Efficient Re-Identification
Luke McDermott
74
0
0
23 May 2024
CoMERA: Computing- and Memory-Efficient Training via Rank-Adaptive Tensor Optimization
Neural Information Processing Systems (NeurIPS), 2024
Zi Yang
Samridhi Choudhary
Xinfeng Xie
Cao Gao
Siegfried Kunzmann
Zheng Zhang
VLM
312
15
0
23 May 2024
Efficient Multitask Dense Predictor via Binarization
Computer Vision and Pattern Recognition (CVPR), 2024
Yuzhang Shang
Dan Xu
Gaowen Liu
Ramana Rao Kompella
Yan Yan
MQ
AAML
307
6
0
23 May 2024
Distill-then-prune: An Efficient Compression Framework for Real-time Stereo Matching Network on Edge Devices
Baiyu Pan
Jichao Jiao
Jianxin Pang
Jun Cheng
151
6
0
20 May 2024
Efficiency optimization of large-scale language models based on deep learning in natural language processing tasks
Taiyuan Mei
Yun Zi
X. Cheng
Zijun Gao
Qi Wang
Haowei Yang
208
25
0
20 May 2024
Cost-Effective Fault Tolerance for CNNs Using Parameter Vulnerability Based Hardening and Pruning
IEEE International Symposium on On-Line Testing and Robust System Design (IOLTS), 2024
Mohammad Hasan Ahmadilivani
Seyedhamidreza Mousavi
J. Raik
Masoud Daneshtalab
M. Jenihhin
AAML
162
6
0
17 May 2024
Memory-efficient Energy-adaptive Inference of Pre-Trained Models on Batteryless Embedded Systems
Pietro Farina
Subrata Biswas
Eren Yildiz
Khakim Akhunov
Saad Ahmed
Bashima Islam
K. Yıldırım
183
7
0
16 May 2024
Unmasking Efficiency: Learning Salient Sparse Models in Non-IID Federated Learning
Riyasat Ohib
Bishal Thapaliya
Gintare Karolina Dziugaite
Jingyu Liu
Vince D. Calhoun
Sergey Plis
FedML
183
1
0
15 May 2024
Neural Network Compression for Reinforcement Learning Tasks
Scientific Reports (Sci Rep), 2024
Dmitry A. Ivanov
D. Larionov
Oleg V. Maslennikov
V. Voevodin
OffRL
AI4CE
225
7
0
13 May 2024
From Algorithm to Hardware: A Survey on Efficient and Safe Deployment of Deep Neural Networks
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2024
Xue Geng
Zhe Wang
Chunyun Chen
Qing Xu
Kaixin Xu
...
Zhenghua Chen
M. Aly
Jie Lin
Ruibing Jin
Xiaoli Li
279
8
0
09 May 2024
Aux-NAS: Exploiting Auxiliary Labels with Negligibly Extra Inference Cost
International Conference on Learning Representations (ICLR), 2024
Yuan Gao
Weizhong Zhang
Tong Lu
Lin Ma
Jin-Gang Yu
Gui-Song Xia
Jiayi Ma
168
1
0
09 May 2024
Communication-Efficient Collaborative Perception via Information Filling with Codebook
Yue Hu
Juntong Peng
Si Liu
Junhao Ge
Si Liu
Siheng Chen
281
42
0
08 May 2024
Switchable Decision: Dynamic Neural Generation Networks
Shujian Zhang
Korawat Tanwisuth
Chengyue Gong
Pengcheng He
Mi Zhou
BDL
166
0
0
07 May 2024
Collage: Light-Weight Low-Precision Strategy for LLM Training
International Conference on Machine Learning (ICML), 2024
Tao Yu
Gaurav Gupta
Karthick Gopalswamy
Amith R. Mamidala
Hao Zhou
Jeffrey Huynh
Youngsuk Park
Ron Diamant
Hao Ding
Jun Huan
MQ
214
7
0
06 May 2024
Iterative Filter Pruning for Concatenation-based CNN Architectures
IEEE International Joint Conference on Neural Network (IJCNN), 2024
Svetlana Pavlitska
Oliver Bagge
Federico Nicolás Peccia
Toghrul Mammadov
J. Marius Zöllner
VLM
3DPC
146
7
0
04 May 2024
Random Masking Finds Winning Tickets for Parameter Efficient Fine-tuning
International Conference on Machine Learning (ICML), 2024
Jing Xu
Jingzhao Zhang
210
10
0
04 May 2024
Torch2Chip: An End-to-end Customizable Deep Neural Network Compression and Deployment Toolkit for Prototype Hardware Accelerator Design
Conference on Machine Learning and Systems (MLSys), 2024
Jian Meng
Yuan Liao
Anupreetham Anupreetham
Ahmed Hassan
Shixing Yu
Han-Sok Suh
Xiaofeng Hu
Jae-sun Seo
MQ
158
2
0
02 May 2024
Enhancing User Experience in On-Device Machine Learning with Gated Compression Layers
Haiguang Li
Usama Pervaiz
Joseph Antognini
Michal Matuszak
Lawrence Au
Gilles Roux
T. Thormundsson
220
1
0
02 May 2024
COPAL: Continual Pruning in Large Language Generative Models
International Conference on Machine Learning (ICML), 2024
Srikanth Malla
Joon Hee Choi
Chiho Choi
VLM
CLL
163
6
0
02 May 2024
Learning a Sparse Neural Network using IHT
S. Damadi
Soroush Zolfaghari
Mahdi Rezaie
Jinglai Shen
187
2
0
29 Apr 2024
On TinyML and Cybersecurity: Electric Vehicle Charging Infrastructure Use Case
Fatemeh Dehrouyeh
Li Yang
F. Badrkhani Ajaei
Abdallah Shami
243
22
0
25 Apr 2024
AdaQAT: Adaptive Bit-Width Quantization-Aware Training
Cédric Gernigon
Silviu-Ioan Filip
Olivier Sentieys
Clément Coggiola
Mickael Bruno
137
6
0
22 Apr 2024
QCore: Data-Efficient, On-Device Continual Calibration for Quantized Models -- Extended Version
David Campos
Bin Yang
Tung Kieu
Miao Zhang
Chenjuan Guo
Christian S. Jensen
243
10
0
22 Apr 2024
Data-independent Module-aware Pruning for Hierarchical Vision Transformers
Yang He
Qiufeng Wang
ViT
203
7
0
21 Apr 2024
Parallel Decoding via Hidden Transfer for Lossless Large Language Model Acceleration
Pengfei Wu
Jiahao Liu
Zhuocheng Gong
Qifan Wang
Jinpeng Li
Jingang Wang
Xunliang Cai
Dongyan Zhao
179
3
0
18 Apr 2024
SNP: Structured Neuron-level Pruning to Preserve Attention Scores
Kyunghwan Shim
Jaewoong Yun
Shinkook Choi
126
2
0
18 Apr 2024
Energy-Efficient Uncertainty-Aware Biomass Composition Prediction at the Edge
Muhammad Zawish
Paul Albert
Flavio Esposito
Steven Davy
Lizy Abraham
169
1
0
17 Apr 2024
Efficient and accurate neural field reconstruction using resistive memory
Yifei Yu
Shaocong Wang
Woyu Zhang
Xinyuan Zhang
Xiuzhe Wu
...
Zhongrui Wang
Dashan Shang
Qi Liu
Kwang-Ting Cheng
Ming-Yuan Liu
164
1
0
15 Apr 2024
Hybrid FedGraph: An efficient hybrid federated learning algorithm using graph convolutional neural network
Jaeyeon Jang
Diego Klabjan
Veena Mendiratta
Fanfei Meng
FedML
171
1
0
15 Apr 2024
SQUAT: Stateful Quantization-Aware Training in Recurrent Spiking Neural Networks
Sreyes P. Venkatesh
Razvan Marinescu
Nhan Duy Truong
MQ
240
8
0
15 Apr 2024
Bullion: A Column Store for Machine Learning
Gang Liao
Ye Liu
Jianjun Chen
Daniel J. Abadi
180
9
0
13 Apr 2024
Transferable and Principled Efficiency for Open-Vocabulary Segmentation
Jingxuan Xu
Wuyang Chen
Yao-Min Zhao
Yunchao Wei
VLM
229
1
0
11 Apr 2024
Lossless Acceleration of Large Language Model via Adaptive N-gram Parallel Decoding
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Jie Ou
Yueming Chen
Wenhong Tian
236
22
0
10 Apr 2024