Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding

1 October 2015
Song Han
Huizi Mao
W. Dally

Papers citing "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding"

50 / 3,622 papers shown
Mosaic Pruning: A Hierarchical Framework for Generalizable Pruning of Mixture-of-Experts Models
Wentao Hu
Mingkuan Zhao
Shuangyong Song
Xiaoyan Zhu
Xin Lai
Jiayin Wang
71
1
0
25 Nov 2025
Deterministic Continuous Replacement: Fast and Stable Module Replacement in Pretrained Transformers
Rowan Bradbury
Aniket Srinivasan Ashok
Sai Ram Kasanagottu
Gunmay Jhingran
Shuai Meng
101
0
0
24 Nov 2025
Multimodal Real-Time Anomaly Detection and Industrial Applications
Aman Verma
Keshav Samdani
Mohd. Samiuddin Shafi
90
0
0
24 Nov 2025
Exploiting the Experts: Unauthorized Compression in MoE-LLMs
Pinaki Prasad Guha Neogi
Ahmad Mohammadshirazi
Dheeraj Kulshrestha
R. Ramnath
MoE
76
0
0
22 Nov 2025
Layer-wise Weight Selection for Power-Efficient Neural Network Acceleration
Jiaxun Fang
Grace Li Zhang
Shaoyi Huang
MQ
223
0
0
21 Nov 2025
Equivariant-Aware Structured Pruning for Efficient Edge Deployment: A Comprehensive Framework with Adaptive Fine-Tuning
Mohammed Alnemari
40
0
0
21 Nov 2025
Sex and age determination in European lobsters using AI-Enhanced bioacoustics
Feliciano Domingos
I. Ihianle
Omprakash Kaiwartya
Ahmad Lotfi
Nicola Khan
Nicholas Beaudreau
Amaya Albalat
Pedro Machado
122
0
0
20 Nov 2025
Weight-sparse transformers have interpretable circuits
Leo Gao
Achyuta Rajaram
Jacob Coxon
Soham V. Govande
Bowen Baker
Dan Mossing
MILM
164
2
0
17 Nov 2025
MFI-ResNet: Efficient ResNet Architecture Optimization via MeanFlow Compression and Selective Incubation
Nuolin Sun
Linyuan Wang
Haonan Wei
Lei Li
Bin Yan
101
0
0
16 Nov 2025
LILogic Net: Compact Logic Gate Networks with Learnable Connectivity for Efficient Hardware Deployment
Katarzyna Fojcik
Renaldas Zioma
Jogundas Armaitis
NAI
GNN
124
0
0
15 Nov 2025
Steering Pretrained Drafters during Speculative Decoding
Frédéric Berdoz
Peer Rheinboldt
Roger Wattenhofer
LLMSV
316
0
0
13 Nov 2025
Toward Dignity-Aware AI: Next-Generation Elderly Monitoring from Fall Detection to ADL
Xun Shao
Aoba Otani
Yuto Hirasuka
Runji Cai
Seng W. Loke
61
0
0
12 Nov 2025
CompressNAS : A Fast and Efficient Technique for Model Compression using Decomposition
Sudhakar Sah
Nikhil Chabbra
Matthieu Durnerin
60
0
0
12 Nov 2025
A Generalized Spectral Framework to Expain Neural Scaling and Compression Dynamics
Yizhou Zhang
72
2
0
11 Nov 2025
MI-to-Mid Distilled Compression (M2M-DC): An Hybrid-Information-Guided-Block Pruning with Progressive Inner Slicing Approach to Model Compression
Lionel Levine
Sajjad Ghiasvand
Haniyeh Ehsani Oskouie
Majid Sarrafzadeh
MQ
295
0
0
10 Nov 2025
EcoSpa: Efficient Transformer Training with Coupled Sparsity
Jinqi Xiao
Cheng Luo
Lingyi Huang
Cheng Yang
Yang Sui
...
Xiao Zang
Yibiao Ying
Zhexiang Tang
A. Anandkumar
Bo Yuan
36
0
0
09 Nov 2025
SymLight: Exploring Interpretable and Deployable Symbolic Policies for Traffic Signal Control
IEEE Internet of Things Journal (IEEE IoT J.), 2025
Xiao-Cheng Liao
Yi Mei
Mengjie Zhang
69
0
0
08 Nov 2025
From Hubs to Deserts: Urban Cultural Accessibility Patterns with Explainable AI
Protik Bose Pranto
Minhazul Islam
Ripon Kumar Saha
Abimelec Mercado Rivera
Namig Abbasov
52
0
0
08 Nov 2025
Quantifying the Climate Risk of Generative AI: Region-Aware Carbon Accounting with G-TRACE and the AI Sustainability Pyramid
Zahida Kausar
Seemab Latif
Raja Khurrum Shahzad
Mehwish Fatima
48
0
0
06 Nov 2025
DartQuant: Efficient Rotational Distribution Calibration for LLM Quantization
Yuantian Shao
Yuanteng Chen
Peisong Wang
Jianlin Yu
Jing Lin
Yiwu Yao
Zhihui Wei
Jian Cheng
MQ
216
0
0
06 Nov 2025
MDM: Manhattan Distance Mapping of DNN Weights for Parasitic-Resistance-Resilient Memristive Crossbars
International Conference on Learning Representations (ICLR), 2025
Matheus Farias
Wanghley Martins
H. T. Kung
64
0
0
06 Nov 2025
TwIST: Rigging the Lottery in Transformers with Independent Subnetwork Training
Michael Menezes
Barbara Su
Xinze Feng
Yehya Farhat
Hamza Shili
Anastasios Kyrillidis
136
1
0
06 Nov 2025
Efficiently Training A Flat Neural Network Before It has been Quantizated
Peng Xia
Junbiao Pang
Tianyang Cai
MQ
96
0
0
03 Nov 2025
Memory-Efficient Training with In-Place FFT Implementation
Xinyu Ding
Bangtian Liu
Siyu Liao
Zhongfeng Wang
185
0
0
03 Nov 2025
AI Progress Should Be Measured by Capability-Per-Resource, Not Scale Alone: A Framework for Gradient-Guided Resource Allocation in LLMs
David McCoy
Yulun Wu
Zachary Butzin-Dozier
64
0
0
02 Nov 2025
LeMiCa: Lexicographic Minimax Path Caching for Efficient Diffusion-Based Video Generation
Huanlin Gao
Ping Chen
Fuyuan Shi
C. Tan
Zhaoxiang Liu
Fang Zhao
Kai Wang
Shiguo Lian
DiffM
VGen
159
0
0
30 Oct 2025
Efficient Cost-and-Quality Controllable Arbitrary-scale Super-resolution with Fourier Constraints
Kazutoshi Akita
Norimichi Ukita
SupR
230
0
0
28 Oct 2025
Fast and accurate neural reflectance transformation imaging through knowledge distillation
Tinsae G. Dulecha
Leonardo Righetto
Ruggero Pintus
Enrico Gobbetti
Andrea Giachetti
3DH
224
1
0
28 Oct 2025
PAHQ: Accelerating Automated Circuit Discovery through Mixed-Precision Inference Optimization
Xinhai Wang
Shu Yang
Liangyu Wang
L. Zhang
Huanyi Xie
Lijie Hu
Di Wang
105
2
0
27 Oct 2025
Hankel Singular Value Regularization for Highly Compressible State Space Models
Paul Schwerdtner
Jules Berman
Benjamin Peherstorfer
154
1
0
27 Oct 2025
Frustratingly Easy Task-aware Pruning for Large Language Models
Yuanhe Tian
Junjie Liu
Xican Yang
Haishan Ye
Yan Song
93
0
0
26 Oct 2025
Dynamic Graph Neural Network for Data-Driven Physiologically Based Pharmacokinetic Modeling
Su Liu
Xin Hu
Shurong Wen
Jiaqi Liu
Jiexi Xu
Lanruo Wang
102
0
0
25 Oct 2025
Pruning and Quantization Impact on Graph Neural Networks
Khatoon Khedri
Reza Rawassizadeh
Qifu Wen
M. Hosseinzadeh
GNN
150
0
0
24 Oct 2025
Compress to Impress: Efficient LLM Adaptation Using a Single Gradient Step on 100 Samples
Shiva Sreeram
Alaa Maalouf
Pratyusha Sharma
Daniela Rus
92
0
0
23 Oct 2025
TernaryCLIP: Efficiently Compressing Vision-Language Models with Ternary Weights and Distilled Knowledge
Shu-Hao Zhang
Wei Tang
Chen Wu
Peng Hu
Nan Li
L. Zhang
Qi Zhang
Shao-Qun Zhang
MQ
VLM
203
0
0
23 Oct 2025
HAMLOCK: HArdware-Model LOgically Combined attacK
Sanskar Amgain
Daniel Lobo
Atri Chatterjee
Swarup Bhunia
Fnu Suya
89
0
0
22 Oct 2025
A Survey on Cache Methods in Diffusion Models: Toward Efficient Multi-Modal Generation
Jiacheng Liu
Xinyu Wang
Yuqi Lin
Zhikai Wang
P. Wang
...
Zexuan Yan
Zhengyi Shi
Chang Zou
Yue Ma
Linfeng Zhang
239
1
0
22 Oct 2025
MetaCluster: Enabling Deep Compression of Kolmogorov-Arnold Network
Matthew Raffel
Adwaith Renjith
Lizhong Chen
81
0
0
21 Oct 2025
C-SWAP: Explainability-Aware Structured Pruning for Efficient Neural Networks Compression
Baptiste Bauvin
Loïc Baret
Ola Ahmad
76
0
0
21 Oct 2025
From Local to Global: Revisiting Structured Pruning Paradigms for Large Language Models
Ziyan Wang
Enmao Diao
Qi Le
Pu Wang
Minwoo Lee
Shu-ping Yeh
Evgeny Stupachenko
Hao Feng
Li Yang
100
1
0
20 Oct 2025
EdgeNavMamba: Mamba Optimized Object Detection for Energy Efficient Edge Devices
Romina Aalishah
Mozhgan Navardi
T. Mohsenin
Mamba
132
0
0
16 Oct 2025
Convergence, design and training of continuous-time dropout as a random batch method
Antonio Álvarez-López
Martín Hernández
52
0
0
15 Oct 2025
Generalizing WiFi Gesture Recognition via Large-Model-Aware Semantic Distillation and Alignment
Feng-Qi Cui
Yu-Tong Guo
Tianyue Zheng
Jinyang Huang
60
0
0
15 Oct 2025
Compressibility Measures Complexity: Minimum Description Length Meets Singular Learning Theory
Einar Urdshals
Edmund Lau
Jesse Hoogland
Stan van Wingerden
Daniel Murfet
87
1
0
14 Oct 2025
A Comprehensive Forecasting-Based Framework for Time Series Anomaly Detection: Benchmarking on the Numenta Anomaly Benchmark (NAB)
Mohammad Karami
Mostafa Jalali
Fatemeh Ghassemi
AI4TS
79
0
0
13 Oct 2025
Long Exposure: Accelerating Parameter-Efficient Fine-Tuning for LLMs under Shadowy Sparsity
International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2024
Tuowei Wang
Kun Li
Zixu Hao
Donglin Bai
Ju Ren
Yaoxue Zhang
Ting Cao
M. Yang
116
4
0
12 Oct 2025
PermLLM: Learnable Channel Permutation for N:M Sparse Large Language Models
Lancheng Zou
Shuo Yin
Zehua Pei
Tsung-Yi Ho
Farzan Farnia
Bei Yu
60
0
0
11 Oct 2025
Automatic Speech Recognition in the Modern Era: Architectures, Training, and Evaluation
Md. Nayeem
Md Shamse Tabrej
Kabbojit Jit Deb
Shaonti Goswami
Md. Azizul Hakim
AI4TS
VLM
64
2
0
11 Oct 2025
SQS: Bayesian DNN Compression through Sparse Quantized Sub-distributions
Ziyi Wang
Nan Jiang
Guang Lin
Qifan Song
MQ
165
0
0
10 Oct 2025
Automated Evolutionary Optimization for Resource-Efficient Neural Network Training
Ilia Revin
Leon Strelkov
Vadim A. Potemkin
Ivan A Kireev
Andrey Savchenko
88
0
0
10 Oct 2025