ResearchTrend.AI
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding

1 October 2015
Song Han
Huizi Mao
W. Dally
    3DGS

Papers citing "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding"

50 / 3,628 papers shown
Title
CQIL: Inference Latency Optimization with Concurrent Computation of Quasi-Independent Layers
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Longwei Zou
Qingyang Wang
Han Zhao
Tingfeng Liu
Yi Yang
Yangdong Deng
209
1
0
10 Apr 2024
TabConv: Low-Computation CNN Inference via Table Lookups
Neelesh Gupta
Narayanan Kannan
Pengmiao Zhang
Viktor Prasanna
166
3
0
08 Apr 2024
MULTIFLOW: Shifting Towards Task-Agnostic Vision-Language Pruning
Matteo Farina
Goran Frehse
Elia Cunegatti
Gaowen Liu
Giovanni Iacca
Elisa Ricci
VLM
235
8
0
08 Apr 2024
Lightweight Deep Learning for Resource-Constrained Environments: A Survey
Hou-I Liu
Marco Galindo
Hongxia Xie
Lai-Kuan Wong
Hong-Han Shuai
Yung-Hui Li
Wen-Huang Cheng
303
143
0
08 Apr 2024
What Happens When Small Is Made Smaller? Exploring the Impact of Compression on Small Data Pretrained Language Models
Busayo Awobade
Mardiyyah Oduwole
Steven Kolawole
169
1
0
06 Apr 2024
Dynamic Switch Layers For Unsupervised Learning
Haiguang Li
Usama Pervaiz
Michal Matuszak
Robert Kamara
Gilles Roux
T. Thormundsson
Joseph Antognini
221
1
0
05 Apr 2024
Lossless and Near-Lossless Compression for Foundation Models
Moshik Hershcovitch
Leshem Choshen
Andrew Wood
Ilias Enmouri
Peter Chin
S. Sundararaman
Danny Harnik
198
12
0
05 Apr 2024
On the Surprising Efficacy of Distillation as an Alternative to Pre-Training Small Models
Sean Farhat
Deming Chen
216
0
0
04 Apr 2024
Talaria: Interactively Optimizing Machine Learning Models for Efficient Inference
International Conference on Human Factors in Computing Systems (CHI), 2024
Fred Hohman
Chaoqun Wang
Jinmook Lee
Jochen Görtler
Dominik Moritz
Jeffrey P. Bigham
Zhile Ren
Cecile Foret
Qi Shan
Xiaoyi Zhang
274
7
0
03 Apr 2024
Optimizing the Deployment of Tiny Transformers on Low-Power MCUs
IEEE Transactions on Computers (IEEE Trans. Comput.), 2024
Victor J. B. Jung
Luca Bompani
Moritz Scherer
Francesco Conti
Luca Benini
283
15
0
03 Apr 2024
Rethinking Pruning for Vision-Language Models: Strategies for Effective Sparsity and Performance Restoration
Shwai He
Ang Li
Tianlong Chen
VLM
241
3
0
03 Apr 2024
Improve Knowledge Distillation via Label Revision and Data Selection
IEEE Transactions on Cognitive and Developmental Systems (IEEE TCDS), 2024
Weichao Lan
Yiu-ming Cheung
Qing Xu
Buhua Liu
Zhikai Hu
Mengke Li
Zhenghua Chen
195
6
0
03 Apr 2024
Accelerating Transformer Pre-training with 2:4 Sparsity
International Conference on Machine Learning (ICML), 2024
Yuezhou Hu
Kang Zhao
Weiyu Huang
Jianfei Chen
Jun Zhu
242
16
0
02 Apr 2024
Condition-Aware Neural Network for Controlled Image Generation
Han Cai
Zhekai Zhang
Zhuoyang Zhang
Qinsheng Zhang
Ming-Yu Liu
Song Han
DiffM
160
15
0
01 Apr 2024
Separate, Dynamic and Differentiable (SMART) Pruner for Block/Output Channel Pruning on Computer Vision Tasks
Guanhua Ding
Zexi Ye
Zhen Zhong
Gang Li
David Shao
159
0
0
29 Mar 2024
Tiny Machine Learning: Progress and Futures
Ji Lin
Ligeng Zhu
Wei-Ming Chen
Wei-Chen Wang
Song Han
206
110
0
28 Mar 2024
Dense Vision Transformer Compression with Few Samples
Hanxiao Zhang
Yifan Zhou
Guo-Hua Wang
Jianxin Wu
ViT
VLM
189
7
0
27 Mar 2024
Block Selective Reprogramming for On-device Training of Vision Transformers
Sreetama Sarkar
Souvik Kundu
Kai Zheng
Peter A. Beerel
206
5
0
25 Mar 2024
Parametric PDE Control with Deep Reinforcement Learning and Differentiable L0-Sparse Polynomial Policies
N. Botteghi
Urban Fasel
AI4CE
249
7
0
22 Mar 2024
Hierarchical Skip Decoding for Efficient Autoregressive Text Generation
Yunqi Zhu
Xuebing Yang
Yuanyuan Wu
Wensheng Zhang
295
5
0
22 Mar 2024
FedMef: Towards Memory-efficient Federated Dynamic Pruning
Hong Huang
Weiming Zhuang
Chen Chen
Lingjuan Lyu
177
16
0
21 Mar 2024
Auto-Train-Once: Controller Network Guided Automatic Network Pruning from Scratch
Xidong Wu
Shangqian Gao
Zeyu Zhang
Zhenzhen Li
Runxue Bao
Yanfu Zhang
Xiaoqian Wang
Heng-Chiao Huang
197
25
0
21 Mar 2024
Evaluating Unsupervised Dimensionality Reduction Methods for Pretrained Sentence Embeddings
Gaifan Zhang
Yi Zhou
Danushka Bollegala
172
10
0
20 Mar 2024
Pruning for Improved ADC Efficiency in Crossbar-based Analog In-memory Accelerators
Timur Ibrayev
Isha Garg
I. Chakraborty
Kaushik Roy
107
0
0
19 Mar 2024
SEVEN: Pruning Transformer Model by Reserving Sentinels
IEEE International Joint Conference on Neural Network (IJCNN), 2024
Jinying Xiao
Ping Li
Jie Nie
Zhe Tang
164
3
0
19 Mar 2024
EffiPerception: an Efficient Framework for Various Perception Tasks
Xinhao Xiang
Simon Dräger
Jiawei Zhang
VLM
176
1
0
18 Mar 2024
Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression
International Conference on Machine Learning (ICML), 2024
Junyuan Hong
Jinhao Duan
Chenhui Zhang
Zhangheng Li
Chulin Xie
...
B. Kailkhura
Dan Hendrycks
Dawn Song
Zinan Lin
Yue Liu
276
44
0
18 Mar 2024
Federated Learning based on Pruning and Recovery
Chengjie Ma
FedML
104
1
0
16 Mar 2024
BRIEDGE: EEG-Adaptive Edge AI for Multi-Brain to Multi-Robot Interaction
Jinhui Ouyang
Mingzhu Wu
Xinglin Li
Hanhui Deng
Di Wu
138
3
0
14 Mar 2024
Physics-Inspired Deep Learning Anti-Aliasing Framework in Efficient Channel State Feedback
IEEE Transactions on Wireless Communications (IEEE TWC), 2024
Yu-Chien Lin
Yan Xin
Ta-Sung Lee
Charlie Zhang
Zhi Ding
173
2
0
12 Mar 2024
IM-Unpack: Training and Inference with Arbitrarily Low Precision Integers
International Conference on Machine Learning (ICML), 2024
Zhanpeng Zeng
Karthikeyan Sankaralingam
Vikas Singh
176
1
0
12 Mar 2024
Smart-Infinity: Fast Large Language Model Training using Near-Storage Processing on a Real System
International Symposium on High-Performance Computer Architecture (HPCA), 2024
Hongsun Jang
Jaeyong Song
Jaewon Jung
Jaeyoung Park
Youngsok Kim
Jinho Lee
143
28
0
11 Mar 2024
A Converting Autoencoder Toward Low-latency and Energy-efficient DNN Inference at the Edge
IEEE International Symposium on Parallel & Distributed Processing, Workshops and PhD Forum (IPDPSW), 2024
Hasanul Mahmud
Peng Kang
Kevin Desai
P. Lama
Sushil Prasad
151
3
0
11 Mar 2024
Enhanced Sparsification via Stimulative Training
European Conference on Computer Vision (ECCV), 2024
Shengji Tang
Weihao Lin
Hancheng Ye
Peng Ye
Chong Yu
Baopu Li
Tao Chen
139
2
0
11 Mar 2024
Exploring Hardware Friendly Bottleneck Architecture in CNN for Embedded Computing Systems
International Conference on Information Photonics (ICIP), 2019
Xing Lei
Longjun Liu
Zhiheng Zhou
Hongbin Sun
Nanning Zheng
160
1
0
11 Mar 2024
FrameQuant: Flexible Low-Bit Quantization for Transformers
International Conference on Machine Learning (ICML), 2024
Harshavardhan Adepu
Zhanpeng Zeng
Li Zhang
Vikas Singh
MQ
151
13
0
10 Mar 2024
A Survey of Lottery Ticket Hypothesis
Bohan Liu
Zijie Zhang
Peixiong He
Zhensen Wang
Yang Xiao
Ruimeng Ye
Yang Zhou
Wei-Shinn Ku
Bo Hui
UQCV
235
22
0
07 Mar 2024
LORS: Low-rank Residual Structure for Parameter-Efficient Network Stacking
Jialin Li
Qiang Nie
Weifu Fu
Yuhuan Lin
Guangpin Tao
Yong-Jin Liu
Chengjie Wang
199
5
0
07 Mar 2024
Learn to Code Sustainably: An Empirical Study on LLM-based Green Code Generation
Tina Vartziotis
Ippolyti Dellatolas
George Dasoulas
Maximilian Schmidt
Florian Schneider
Tim Hoffmann
S. Kotsopoulos
Michael Keckeisen
222
13
0
05 Mar 2024
MADTP: Multimodal Alignment-Guided Dynamic Token Pruning for Accelerating Vision-Language Transformer
Jianjian Cao
Peng Ye
Shengze Li
Chong Yu
Yansong Tang
Jiwen Lu
Tao Chen
156
43
0
05 Mar 2024
On the Compressibility of Quantized Large Language Models
Yu Mao
Weilan Wang
Hongchao Du
Nan Guan
Chun Jason Xue
MQ
205
7
0
03 Mar 2024
OSSCAR: One-Shot Structured Pruning in Vision and Language Models with Combinatorial Optimization
Xiang Meng
Shibal Ibrahim
Kayhan Behdin
Hussein Hazimeh
Natalia Ponomareva
Rahul Mazumder
VLM
318
12
0
02 Mar 2024
BasedAI: A decentralized P2P network for Zero Knowledge Large Language Models (ZK-LLMs)
Sean Wellington
68
6
0
01 Mar 2024
"Lossless" Compression of Deep Neural Networks: A High-dimensional Neural Tangent Kernel Approach
Lingyu Gu
Yongqiang Du
Yuan Zhang
Di Xie
Shiliang Pu
Robert C. Qiu
Zhenyu Liao
199
9
0
01 Mar 2024
T3DNet: Compressing Point Cloud Models for Lightweight 3D Recognition
Zhiyuan Yang
Yunjiao Zhou
Lihua Xie
Jianfei Yang
3DPC
191
2
0
29 Feb 2024
NeuroPrune: A Neuro-inspired Topological Sparse Training Algorithm for Large Language Models
Amit Dhurandhar
Tejaswini Pedapati
Ronny Luss
Soham Dan
Aurélie C. Lozano
Payel Das
Georgios Kollias
339
3
0
28 Feb 2024
SparseLLM: Towards Global Pruning for Pre-trained Language Models
Guangji Bai
Yijiang Li
Chen Ling
Kibaek Kim
Bo Pan
421
26
0
28 Feb 2024
REPrune: Channel Pruning via Kernel Representative Selection
Mincheol Park
Dongjin Kim
Cheonjun Park
Yuna Park
Gyeong Eun Gong
Won Woo Ro
Suhyun Kim
VLM
210
2
0
27 Feb 2024
GenAINet: Enabling Wireless Collective Intelligence via Knowledge Transfer and Reasoning
Han Zou
Qiyang Zhao
Lina Bariah
Yu Tian
M. Bennis
S. Lasaulce
259
23
0
26 Feb 2024
EncodingNet: A Novel Encoding-based MAC Design for Efficient Neural Network Acceleration
Bo Liu
Grace Li Zhang
Xunzhao Yin
Ulf Schlichtmann
Bing Li
MQ
AI4CE
181
0
0
25 Feb 2024