1510.00149
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
1 October 2015
Song Han
Huizi Mao
W. Dally
3DGS
Papers citing
"Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding"
50 / 3,628 papers shown
EEG Emotion Copilot: Optimizing Lightweight LLMs for Emotional EEG Interpretation with Assisted Medical Record Generation
Neural Networks (NN), 2024
Hongyu Chen
Weiming Zeng
Chong Chen
Luhui Cai
Haiwei Yang
...
Wei Zhang
Yuchen Ren
Hongjie Yan
W. Siok
Nizhuan Wang
276
0
0
30 Sep 2024
MicroFlow: An Efficient Rust-Based Inference Engine for TinyML
Internet of Things (IoT), 2024
Matteo Carnelos
Francesco Pasti
Nicola Bellotto
230
5
0
28 Sep 2024
Value-Based Deep Multi-Agent Reinforcement Learning with Dynamic Sparse Training
Neural Information Processing Systems (NeurIPS), 2024
Pihe Hu
Shaolong Li
Zhuoran Li
L. Pan
Longbo Huang
154
1
0
28 Sep 2024
Efficient Noise Mitigation for Enhancing Inference Accuracy in DNNs on Mixed-Signal Accelerators
Seyedarmin Azizi
Mohammad Erfan Sadeghi
M. Kamal
Massoud Pedram
153
2
0
27 Sep 2024
Mitigating Selection Bias with Node Pruning and Auxiliary Options
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Hyeong Kyu Choi
Weijie Xu
Chi Xue
Stephanie Eckman
Chandan K. Reddy
366
10
0
27 Sep 2024
Efficient Arbitrary Precision Acceleration for Large Language Models on GPU Tensor Cores
Asia and South Pacific Design Automation Conference (ASP-DAC), 2024
Shaobo Ma
Chao Fang
Haikuo Shao
Zhongfeng Wang
277
5
0
26 Sep 2024
MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models
Neural Information Processing Systems (NeurIPS), 2024
Gongfan Fang
Hongxu Yin
Saurav Muralidharan
Greg Heinrich
Jeff Pool
Jan Kautz
Pavlo Molchanov
Xinchao Wang
145
31
0
26 Sep 2024
SPAQ-DL-SLAM: Towards Optimizing Deep Learning-based SLAM for Resource-Constrained Embedded Platforms
International Conference on Control, Automation, Robotics and Vision (ICARCV), 2024
Niraj Pudasaini
Muhammad Abdullah Hanif
Mohamed Bennai
177
4
0
22 Sep 2024
CFSP: An Efficient Structured Pruning Framework for LLMs with Coarse-to-Fine Activation Information
International Conference on Computational Linguistics (COLING), 2024
Yuxin Wang
Minghua Ma
Zekun Wang
Jingchang Chen
Huiming Fan
Liping Shan
Qing Yang
Dongliang Xu
Ming Liu
Bing Qin
151
6
0
20 Sep 2024
Green Federated Learning: A new era of Green Aware AI
ACM Computing Surveys (ACM CSUR), 2024
Dipanwita Thakur
Antonella Guzzo
Giancarlo Fortino
Francesco Piccialli
AI4CE
390
28
0
19 Sep 2024
Robust Training of Neural Networks at Arbitrary Precision and Sparsity
Chengxi Ye
Grace Chu
Yanfeng Liu
Yichi Zhang
Lukasz Lew
Li Zhang
Mark Sandler
Andrew G. Howard
MQ
154
2
0
14 Sep 2024
S-STE: Continuous Pruning Function for Efficient 2:4 Sparse Pre-training
Neural Information Processing Systems (NeurIPS), 2024
Yuezhou Hu
Jun-Jie Zhu
Jianfei Chen
355
5
0
13 Sep 2024
Expediting and Elevating Large Language Model Reasoning via Hidden Chain-of-Thought Decoding
Tianqiao Liu
Zui Chen
Zitao Liu
Mi Tian
Weiqi Luo
LRM
122
9
0
13 Sep 2024
NVRC: Neural Video Representation Compression
Neural Information Processing Systems (NeurIPS), 2024
Ho Man Kwan
Ge Gao
Fan Zhang
Andrew Gower
David Bull
170
15
0
11 Sep 2024
HESSO: Towards Automatic Efficient and User Friendly Any Neural Network Training and Pruning
Tianyi Chen
Xiaoyi Qu
David Aponte
Colby R. Banbury
Jongwoo Ko
Tianyu Ding
Yong Ma
Vladimir Lyapunov
Ilya Zharkov
Luming Liang
388
2
0
11 Sep 2024
Towards Energy-Efficiency by Navigating the Trilemma of Energy, Latency, and Accuracy
International Symposium on Mixed and Augmented Reality (ISMAR), 2024
Boyuan Tian
Yihan Pang
Muhammad Huzaifa
Shenlong Wang
Sarita Adve
205
3
0
06 Sep 2024
Panoptic Perception for Autonomous Driving: A Survey
Yunge Li
Lanyu Xu
235
3
0
27 Aug 2024
K-Sort Arena: Efficient and Reliable Benchmarking for Generative Models via K-wise Human Preferences
Computer Vision and Pattern Recognition (CVPR), 2024
Zhikai Li
Xuewen Liu
Dongrong Fu
Jianquan Li
Qingyi Gu
Kurt Keutzer
Zhen Dong
EGVM
VGen
DiffM
318
8
0
26 Aug 2024
Condensed Data Expansion Using Model Inversion for Knowledge Distillation
Kuluhan Binici
Shivam Aggarwal
C. Acar
N. Pham
K. Leman
Gim Hee Lee
Tulika Mitra
218
1
0
25 Aug 2024
MPruner: Optimizing Neural Network Size with CKA-Based Mutual Information Pruning
Seungbeom Hu
ChanJun Park
Andrew Ferraiuolo
Sang-Ki Ko
Jinwoo Kim
Haein Song
Jieung Kim
270
2
0
24 Aug 2024
A Greedy Hierarchical Approach to Whole-Network Filter-Pruning in CNNs
Kiran Purohit
Anurag Parvathgari
Sourangshu Bhattacharya
VLM
198
0
0
22 Aug 2024
Real-Time Video Generation with Pyramid Attention Broadcast
International Conference on Learning Representations (ICLR), 2024
Xuanlei Zhao
Xiaolong Jin
Kai Wang
Yang You
VGen
DiffM
454
75
0
22 Aug 2024
Domain-invariant Progressive Knowledge Distillation for UAV-based Object Detection
IEEE Geoscience and Remote Sensing Letters (GRSL), 2024
Liang Yao
Fan Liu
Chuanyi Zhang
Zhiquan Ou
Ting Wu
VLM
213
8
0
21 Aug 2024
Enhancing One-shot Pruned Pre-trained Language Models through Sparse-Dense-Sparse Mechanism
International Conference on Computational Linguistics (COLING), 2024
Guanchen Li
Xiandong Zhao
Lian Liu
Zeping Li
Dong Li
Lu Tian
Jie He
Ashish Sirasao
E. Barsoum
VLM
155
1
0
20 Aug 2024
Diffusion Model for Planning: A Systematic Literature Review
Toshihide Ubukata
Jialong Li
Kenji Tei
DiffM
MedIm
258
16
0
16 Aug 2024
An Effective Information Theoretic Framework for Channel Pruning
Yihao Chen
Zefang Wang
189
9
0
14 Aug 2024
Infra-YOLO: Efficient Neural Network Structure with Model Compression for Real-Time Infrared Small Object Detection
Zhonglin Chen
Anyu Geng
Jianan Jiang
Jiwu Lu
Di Wu
ObjD
177
2
0
14 Aug 2024
PEARL: Parallel Speculative Decoding with Adaptive Draft Length
International Conference on Learning Representations (ICLR), 2024
Tianyu Liu
Yun Li
Qitan Lv
Kai Liu
Jianchen Zhu
Winston Hu
Xingwu Sun
338
42
0
13 Aug 2024
Deeploy: Enabling Energy-Efficient Deployment of Small Language Models On Heterogeneous Microcontrollers
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 2024
Moritz Scherer
Luka Macan
Victor J. B. Jung
Philip Wiese
Luca Bompani
Francesco Conti
Luca Benini
MoE
203
19
0
08 Aug 2024
AdapMTL: Adaptive Pruning Framework for Multitask Learning Model
ACM Multimedia (MM), 2024
Mingcan Xiang
Steven Jiaxun Tang
Qizheng Yang
Hui Guan
Tongping Liu
VLM
205
3
0
07 Aug 2024
Speaker Adaptation for Quantised End-to-End ASR Models
Qiuming Zhao
Guangzhi Sun
Chao Zhang
Mingxing Xu
Thomas Fang Zheng
137
1
0
07 Aug 2024
Designing Extremely Memory-Efficient CNNs for On-device Vision Tasks
Asian Conference on Computer Vision (ACCV), 2024
Jaewook Lee
Yoel Park
Seulki Lee
VLM
128
4
0
07 Aug 2024
Compress and Compare: Interactively Evaluating Efficiency and Behavior Across ML Model Compression Experiments
IEEE Transactions on Visualization and Computer Graphics (TVCG), 2024
Angie Boggust
Venkatesh Sivaraman
Yannick Assogba
Donghao Ren
Dominik Moritz
Fred Hohman
VLM
161
9
0
06 Aug 2024
STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs
International Conference on Learning Representations (ICLR), 2024
Peijie Dong
Lujun Li
Dayou Du
Yuhan Chen
Zhenheng Tang
...
Wei Xue
Wenhan Luo
Qi-fei Liu
Yi-Ting Guo
Xiaowen Chu
MQ
156
28
0
03 Aug 2024
Reclaiming Residual Knowledge: A Novel Paradigm to Low-Bit Quantization
British Machine Vision Conference (BMVC), 2024
Róisín Luo
Alexandru Drimbarean
Walsh Simon
Colm O'Riordan
MQ
194
0
0
01 Aug 2024
Pruning Large Language Models with Semi-Structural Adaptive Sparse Training
Weiyu Huang
Yuezhou Hu
Guohao Jian
Jun Zhu
Jianfei Chen
252
16
0
30 Jul 2024
Toward Efficient Permutation for Hierarchical N:M Sparsity on GPUs
Seungmin Yu
Xiaodie Yi
Hayun Lee
Dongkun Shin
177
2
0
30 Jul 2024
MimiQ: Low-Bit Data-Free Quantization of Vision Transformers with Encouraging Inter-Head Attention Similarity
AAAI Conference on Artificial Intelligence (AAAI), 2024
Kanghyun Choi
Hyeyoon Lee
Dain Kwon
Sunjong Park
Kyuyeun Kim
Noseong Park
Jinho Lee
MQ
555
6
0
29 Jul 2024
Parameter-Efficient Fine-Tuning via Circular Convolution
Chenyi Zi
Jiashun Cheng
Zijing Liu
Ziqi Gao
Fugee Tsung
Yu-Feng Li
Jia Li
368
4
0
27 Jul 2024
Greedy Output Approximation: Towards Efficient Structured Pruning for LLMs Without Retraining
Jianwei Li
Yijun Dong
Qi Lei
299
8
0
26 Jul 2024
Efficient Inference of Vision Instruction-Following Models with Elastic Cache
Zuyan Liu
Benlin Liu
Jiahui Wang
Yuhao Dong
Guangyi Chen
Yongming Rao
Ranjay Krishna
Jiwen Lu
VLM
198
25
0
25 Jul 2024
Accurate and Efficient Fine-Tuning of Quantized Large Language Models Through Optimal Balance
Ao Shen
Qiang Wang
Zhiquan Lai
Xionglve Li
Dongsheng Li
MQ
ALM
209
1
0
24 Jul 2024
Accelerating the Low-Rank Decomposed Models
Habib Hajimolahoseini
Walid Ahmed
Austin Wen
Yang Liu
194
0
0
24 Jul 2024
MetaAug: Meta-Data Augmentation for Post-Training Quantization
Cuong Pham
Hoang Anh Dung
Cuong C. Nguyen
Trung Le
Dinh Q. Phung
Gustavo Carneiro
Thanh-Toan Do
MQ
198
1
0
20 Jul 2024
Straightforward Layer-wise Pruning for More Efficient Visual Adaptation
Ruizi Han
Jinglei Tang
235
3
0
19 Jul 2024
Reconstruct the Pruned Model without Any Retraining
Pingjie Wang
Ziqing Fan
Shengchao Hu
Zhe Chen
Yanfeng Wang
Yu Wang
192
2
0
18 Jul 2024
CCSRP: Robust Pruning of Spiking Neural Networks through Cooperative Coevolution
J. Reif
Jiakang Li
Bowen Tian
Alexander Fay
AAML
163
0
0
18 Jul 2024
INTELLECT: Adapting Cyber Threat Detection to Heterogeneous Computing Environments
Simone Magnani
Liubov Nedoshivina
Roberto Doriguzzi-Corin
Stefano Braghin
Domenico Siracusa
203
0
0
17 Jul 2024
Hybrid Dynamic Pruning: A Pathway to Efficient Transformer Inference
Ghadeer Jaradat
M. Tolba
Ghada Alsuhli
Hani Saleh
Mahmoud Al-Qutayri
Thanos Stouraitis
Baker Mohammad
126
1
0
17 Jul 2024
Enhancing Split Computing and Early Exit Applications through Predefined Sparsity
Luigi Capogrosso
Enrico Fraccaroli
Giulio Petrozziello
Francesco Setti
Samarjit Chakraborty
Franco Fummi
Marco Cristani
184
3
0
16 Jul 2024