ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1510.00149
  4. Cited By
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained
  Quantization and Huffman Coding

Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding

1 October 2015
Song Han
Huizi Mao
W. Dally
    3DGS
ArXivPDFHTML

Papers citing "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding"

50 / 3,434 papers shown
Title
Efficient Unstructured Pruning of Mamba State-Space Models for Resource-Constrained Environments
Efficient Unstructured Pruning of Mamba State-Space Models for Resource-Constrained Environments
Ibne Farabi Shihab
Sanjeda Akter
Anuj Sharma
Mamba
43
0
0
13 May 2025
Resource-Efficient Language Models: Quantization for Fast and Accessible Inference
Resource-Efficient Language Models: Quantization for Fast and Accessible Inference
Tollef Emil Jørgensen
MQ
49
0
0
13 May 2025
Sparse Training from Random Initialization: Aligning Lottery Ticket Masks using Weight Symmetry
Sparse Training from Random Initialization: Aligning Lottery Ticket Masks using Weight Symmetry
Mohammed Adnan
Rohan Jain
Ekansh Sharma
Rahul Krishnan
Yani Andrew Ioannou
54
0
0
08 May 2025
PROM: Prioritize Reduction of Multiplications Over Lower Bit-Widths for Efficient CNNs
PROM: Prioritize Reduction of Multiplications Over Lower Bit-Widths for Efficient CNNs
Lukas Meiner
Jens Mehnert
A. P. Condurache
MQ
37
0
0
06 May 2025
Efficient Continual Learning in Keyword Spotting using Binary Neural Networks
Efficient Continual Learning in Keyword Spotting using Binary Neural Networks
Quynh Nguyen Phuong Vu
Luciano S. Martinez-Rau
Yuxuan Zhang
Nho-Duc Tran
Bengt Oelmann
Michele Magno
Sebastian Bader
CLL
35
0
0
05 May 2025
Efficient Shapley Value-based Non-Uniform Pruning of Large Language Models
Efficient Shapley Value-based Non-Uniform Pruning of Large Language Models
Chuan Sun
Han Yu
Lizhen Cui
Xiaoxiao Li
66
0
0
03 May 2025
HMI: Hierarchical Knowledge Management for Efficient Multi-Tenant Inference in Pretrained Language Models
HMI: Hierarchical Knowledge Management for Efficient Multi-Tenant Inference in Pretrained Language Models
J. Zhang
J. Wang
H. Li
Lidan Shou
Ke Chen
Gang Chen
Qin Xie
Guiming Xie
Xuejian Gong
33
0
0
24 Apr 2025
BackSlash: Rate Constrained Optimized Training of Large Language Models
BackSlash: Rate Constrained Optimized Training of Large Language Models
Jun Wu
Jiangtao Wen
Yuxing Han
34
0
0
23 Apr 2025
Efficient Adaptation of Deep Neural Networks for Semantic Segmentation in Space Applications
Efficient Adaptation of Deep Neural Networks for Semantic Segmentation in Space Applications
Leonardo Olivi
Edoardo Santero Mormile
Enzo Tartaglione
SSeg
30
0
0
22 Apr 2025
Mathematical Programming Models for Exact and Interpretable Formulation of Neural Networks
Mathematical Programming Models for Exact and Interpretable Formulation of Neural Networks
Masoud Ataei
Edrin Hasaj
Jacob Gipp
Sepideh Forouzi
17
0
0
19 Apr 2025
Set You Straight: Auto-Steering Denoising Trajectories to Sidestep Unwanted Concepts
Set You Straight: Auto-Steering Denoising Trajectories to Sidestep Unwanted Concepts
Leyang Li
Shilin Lu
Yan Ren
A. Kong
DiffM
40
1
0
17 Apr 2025
Collaborative Learning of On-Device Small Model and Cloud-Based Large Model: Advances and Future Directions
Collaborative Learning of On-Device Small Model and Cloud-Based Large Model: Advances and Future Directions
Chaoyue Niu
Yucheng Ding
Junhui Lu
Zhengxiang Huang
Hang Zeng
Yutong Dai
Xuezhen Tu
Chengfei Lv
Fan Wu
Guihai Chen
27
1
0
17 Apr 2025
Mamba-Based Ensemble learning for White Blood Cell Classification
Mamba-Based Ensemble learning for White Blood Cell Classification
Lewis Clifton
X. Tian
D. Palasuwan
Phandee Watanaboonyongcharoen
Ponlapat Rojnuckarin
Nantheera Anantrasirichai
Mamba
49
0
0
15 Apr 2025
70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float
70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float
Tianyi Zhang
Yang Sui
Shaochen Zhong
V. Chaudhary
Xia Hu
Anshumali Shrivastava
MQ
32
0
0
15 Apr 2025
Efficient Reasoning Models: A Survey
Efficient Reasoning Models: A Survey
Sicheng Feng
Gongfan Fang
Xinyin Ma
Xinchao Wang
ReLM
LRM
110
0
0
15 Apr 2025
ConvShareViT: Enhancing Vision Transformers with Convolutional Attention Mechanisms for Free-Space Optical Accelerators
ConvShareViT: Enhancing Vision Transformers with Convolutional Attention Mechanisms for Free-Space Optical Accelerators
Riad Ibadulla
Thomas M. Chen
C. Reyes-Aldasoro
ViT
27
0
0
15 Apr 2025
CUT: Pruning Pre-Trained Multi-Task Models into Compact Models for Edge Devices
CUT: Pruning Pre-Trained Multi-Task Models into Compact Models for Edge Devices
Jingxuan Zhou
Weidong Bao
Ji Wang
Zhengyi Zhong
27
0
0
14 Apr 2025
Can LLMs Revolutionize the Design of Explainable and Efficient TinyML Models?
Can LLMs Revolutionize the Design of Explainable and Efficient TinyML Models?
Christophe El Zeinaty
W. Hamidouche
Glenn Herrou
D. Ménard
Merouane Debbah
41
0
0
13 Apr 2025
Tin-Tin: Towards Tiny Learning on Tiny Devices with Integer-based Neural Network Training
Tin-Tin: Towards Tiny Learning on Tiny Devices with Integer-based Neural Network Training
Yi Hu
Jinhang Zuo
Eddie Zhang
Bob Iannucci
Carlee Joe-Wong
24
0
0
13 Apr 2025
Cycle Training with Semi-Supervised Domain Adaptation: Bridging Accuracy and Efficiency for Real-Time Mobile Scene Detection
Cycle Training with Semi-Supervised Domain Adaptation: Bridging Accuracy and Efficiency for Real-Time Mobile Scene Detection
Huu-Phong Phan-Nguyen
Anh Dao
T. Nguyen
Tuan Quang
H. Tran
Tinh-Anh Nguyen-Nhu
Huy-Thach Pham
Quan Nguyen
Hoang M. Le
Quang-Vinh Dinh
30
0
0
12 Apr 2025
Optimizing Large Language Models: Metrics, Energy Efficiency, and Case Study Insights
Optimizing Large Language Models: Metrics, Energy Efficiency, and Case Study Insights
Tahniat Khan
Soroor Motie
Sedef Akinli Kocak
Shaina Raza
MQ
37
0
0
07 Apr 2025
Two is Better than One: Efficient Ensemble Defense for Robust and Compact Models
Two is Better than One: Efficient Ensemble Defense for Robust and Compact Models
Yoojin Jung
Byung Cheol Song
AAML
VLM
MQ
36
0
0
07 Apr 2025
Hyperflows: Pruning Reveals the Importance of Weights
Hyperflows: Pruning Reveals the Importance of Weights
Eugen Barbulescu
Antonio Alexoaie
21
0
0
06 Apr 2025
Towards Understanding and Improving Refusal in Compressed Models via Mechanistic Interpretability
Towards Understanding and Improving Refusal in Compressed Models via Mechanistic Interpretability
Vishnu Kabir Chhabra
Mohammad Mahdi Khalili
AI4CE
28
0
0
05 Apr 2025
Online Difficulty Filtering for Reasoning Oriented Reinforcement Learning
Online Difficulty Filtering for Reasoning Oriented Reinforcement Learning
Sanghwan Bae
Jiwoo Hong
Min Young Lee
Hanbyul Kim
Jeongyeon Nam
Donghyun Kwak
OffRL
LRM
48
3
0
04 Apr 2025
HyperRAG: Enhancing Quality-Efficiency Tradeoffs in Retrieval-Augmented Generation with Reranker KV-Cache Reuse
HyperRAG: Enhancing Quality-Efficiency Tradeoffs in Retrieval-Augmented Generation with Reranker KV-Cache Reuse
Yuwei An
Yihua Cheng
Seo Jin Park
Junchen Jiang
36
1
0
03 Apr 2025
MDP: Multidimensional Vision Model Pruning with Latency Constraint
MDP: Multidimensional Vision Model Pruning with Latency Constraint
Xinglong Sun
Barath Lakshmanan
Maying Shen
Shiyi Lan
Jingde Chen
Jose M. Alvarez
VLM
44
0
0
02 Apr 2025
FedPaI: Achieving Extreme Sparsity in Federated Learning via Pruning at Initialization
FedPaI: Achieving Extreme Sparsity in Federated Learning via Pruning at Initialization
Haonan Wang
Z. Liu
Kajimusugura Hoshino
Tuo Zhang
J. Walters
S. Crago
49
0
0
01 Apr 2025
Optimization of Layer Skipping and Frequency Scaling for Convolutional Neural Networks under Latency Constraint
Optimization of Layer Skipping and Frequency Scaling for Convolutional Neural Networks under Latency Constraint
Minh David Thao Chan
Ruoyu Zhao
Yukuan Jia
Ruiqing Mao
Sheng Zhou
40
0
0
31 Mar 2025
Machine Learning-assisted High-speed Combinatorial Optimization with Ising Machines for Dynamically Changing Problems
Machine Learning-assisted High-speed Combinatorial Optimization with Ising Machines for Dynamically Changing Problems
Yohei Hamakawa
Tomoya Kashimata
Masaya Yamasaki
Kosuke Tatsumura
AI4CE
30
0
0
31 Mar 2025
An Efficient Training Algorithm for Models with Block-wise Sparsity
An Efficient Training Algorithm for Models with Block-wise Sparsity
Ding Zhu
Zhiqun Zuo
Mohammad Mahdi Khalili
37
0
0
27 Mar 2025
Mobile-VideoGPT: Fast and Accurate Video Understanding Language Model
Mobile-VideoGPT: Fast and Accurate Video Understanding Language Model
Abdelrahman M. Shaker
Muhammad Maaz
Chenhui Gou
Hamid Rezatofighi
Salman Khan
F. Khan
103
0
0
27 Mar 2025
Optimizing Multi-DNN Inference on Mobile Devices through Heterogeneous Processor Co-Execution
Optimizing Multi-DNN Inference on Mobile Devices through Heterogeneous Processor Co-Execution
Yunquan Gao
Zhiguo Zhang
Praveen Kumar Donta
C. Dehury
X. Wang
Dusit Niyato
Qiyang Zhang
41
0
0
27 Mar 2025
Boosting Large Language Models with Mask Fine-Tuning
Boosting Large Language Models with Mask Fine-Tuning
M. Zhang
Yue Bai
Huan Wang
Yizhou Wang
Qihua Dong
Y. Fu
CLL
48
0
0
27 Mar 2025
A Low-complexity Structured Neural Network Approach to Intelligently Realize Wideband Multi-beam Beamformers
A Low-complexity Structured Neural Network Approach to Intelligently Realize Wideband Multi-beam Beamformers
Hansaka Aluvihare
Sivakumar Sivasankar
Xianqi Li
Arjuna Madanayake
Sirani M. Perera
73
0
0
26 Mar 2025
Lipschitz Constant Meets Condition Number: Learning Robust and Compact Deep Neural Networks
Lipschitz Constant Meets Condition Number: Learning Robust and Compact Deep Neural Networks
Yangqi Feng
S. J. Lin
Baoyuan Gao
Xian Wei
AAML
74
0
0
26 Mar 2025
GIViC: Generative Implicit Video Compression
GIViC: Generative Implicit Video Compression
Ge Gao
Siyue Teng
Tianhao Peng
Fan Zhang
David Bull
DiffM
VGen
41
0
0
25 Mar 2025
MoST: Efficient Monarch Sparse Tuning for 3D Representation Learning
MoST: Efficient Monarch Sparse Tuning for 3D Representation Learning
Xu Han
Yuan Tang
Jinfeng Xu
Xianzhi Li
48
0
0
24 Mar 2025
Temporal Action Detection Model Compression by Progressive Block Drop
Temporal Action Detection Model Compression by Progressive Block Drop
Xiaoyong Chen
Yong Guo
Jiaming Liang
Sitong Zhuang
Runhao Zeng
Xiping Hu
43
0
0
21 Mar 2025
Attention Pruning: Automated Fairness Repair of Language Models via Surrogate Simulated Annealing
Attention Pruning: Automated Fairness Repair of Language Models via Surrogate Simulated Annealing
Vishnu Asutosh Dasu
Md. Rafi Ur Rashid
Vipul Gupta
Saeid Tizpaz-Niari
Gang Tan
AAML
43
0
0
20 Mar 2025
PARQ: Piecewise-Affine Regularized Quantization
PARQ: Piecewise-Affine Regularized Quantization
Lisa Jin
Jianhao Ma
Zechun Liu
Andrey Gromov
Aaron Defazio
Lin Xiao
MQ
38
0
0
19 Mar 2025
Decision Tree Induction Through LLMs via Semantically-Aware Evolution
Decision Tree Induction Through LLMs via Semantically-Aware Evolution
Tennison Liu
Nicolas Huynh
M. Schaar
63
0
0
18 Mar 2025
Knowledge Distillation: Enhancing Neural Network Compression with Integrated Gradients
Knowledge Distillation: Enhancing Neural Network Compression with Integrated Gradients
David E. Hernandez
J. Chang
Torbjörn E. M. Nordling
56
0
0
17 Mar 2025
Changing Base Without Losing Pace: A GPU-Efficient Alternative to MatMul in DNNs
Changing Base Without Losing Pace: A GPU-Efficient Alternative to MatMul in DNNs
Nir Ailon
Akhiad Bercovich
Omri Weinstein
52
0
0
15 Mar 2025
Stabilizing Quantization-Aware Training by Implicit-Regularization on Hessian Matrix
Junbiao Pang
Tianyang Cai
39
1
0
14 Mar 2025
Safe Vision-Language Models via Unsafe Weights Manipulation
Safe Vision-Language Models via Unsafe Weights Manipulation
Moreno DÍncà
E. Peruzzo
Xingqian Xu
Humphrey Shi
N. Sebe
Massimiliano Mancini
MU
55
0
0
14 Mar 2025
Towards Extreme Pruning of LLMs with Plug-and-Play Mixed Sparsity
Chi Xu
Gefei Zhang
Yantong Zhu
Luca Benini
Guosheng Hu
Yawei Li
Zhihong Zhang
29
0
0
14 Mar 2025
ViM-VQ: Efficient Post-Training Vector Quantization for Visual Mamba
Juncan Deng
Shuaiting Li
Zeyu Wang
Kedong Xu
Hong Gu
Kejie Huang
MQ
60
0
0
12 Mar 2025
Residual Learning and Filtering Networks for End-to-End Lossless Video Compression
Md Baharul Islam
Afsana Ahsan Jeny
53
0
0
11 Mar 2025
SSVQ: Unleashing the Potential of Vector Quantization with Sign-Splitting
Shuaiting Li
Juncan Deng
Chenxuan Wang
Kedong Xu
Rongtao Deng
Hong Gu
Haibin Shen
Kejie Huang
MQ
53
0
0
11 Mar 2025
1234...676869
Next