1510.00149
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
1 October 2015
Song Han
Huizi Mao
W. Dally
3DGS
Papers citing
"Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding"
50 / 3,434 papers shown
Efficient Unstructured Pruning of Mamba State-Space Models for Resource-Constrained Environments
Ibne Farabi Shihab
Sanjeda Akter
Anuj Sharma
Mamba
43
0
0
13 May 2025
Resource-Efficient Language Models: Quantization for Fast and Accessible Inference
Tollef Emil Jørgensen
MQ
49
0
0
13 May 2025
Sparse Training from Random Initialization: Aligning Lottery Ticket Masks using Weight Symmetry
Mohammed Adnan
Rohan Jain
Ekansh Sharma
Rahul Krishnan
Yani Andrew Ioannou
54
0
0
08 May 2025
PROM: Prioritize Reduction of Multiplications Over Lower Bit-Widths for Efficient CNNs
Lukas Meiner
Jens Mehnert
A. P. Condurache
MQ
37
0
0
06 May 2025
Efficient Continual Learning in Keyword Spotting using Binary Neural Networks
Quynh Nguyen Phuong Vu
Luciano S. Martinez-Rau
Yuxuan Zhang
Nho-Duc Tran
Bengt Oelmann
Michele Magno
Sebastian Bader
CLL
35
0
0
05 May 2025
Efficient Shapley Value-based Non-Uniform Pruning of Large Language Models
Chuan Sun
Han Yu
Lizhen Cui
Xiaoxiao Li
66
0
0
03 May 2025
HMI: Hierarchical Knowledge Management for Efficient Multi-Tenant Inference in Pretrained Language Models
J. Zhang
J. Wang
H. Li
Lidan Shou
Ke Chen
Gang Chen
Qin Xie
Guiming Xie
Xuejian Gong
33
0
0
24 Apr 2025
BackSlash: Rate Constrained Optimized Training of Large Language Models
Jun Wu
Jiangtao Wen
Yuxing Han
34
0
0
23 Apr 2025
Efficient Adaptation of Deep Neural Networks for Semantic Segmentation in Space Applications
Leonardo Olivi
Edoardo Santero Mormile
Enzo Tartaglione
SSeg
30
0
0
22 Apr 2025
Mathematical Programming Models for Exact and Interpretable Formulation of Neural Networks
Masoud Ataei
Edrin Hasaj
Jacob Gipp
Sepideh Forouzi
17
0
0
19 Apr 2025
Set You Straight: Auto-Steering Denoising Trajectories to Sidestep Unwanted Concepts
Leyang Li
Shilin Lu
Yan Ren
A. Kong
DiffM
40
1
0
17 Apr 2025
Collaborative Learning of On-Device Small Model and Cloud-Based Large Model: Advances and Future Directions
Chaoyue Niu
Yucheng Ding
Junhui Lu
Zhengxiang Huang
Hang Zeng
Yutong Dai
Xuezhen Tu
Chengfei Lv
Fan Wu
Guihai Chen
27
1
0
17 Apr 2025
Mamba-Based Ensemble learning for White Blood Cell Classification
Lewis Clifton
X. Tian
D. Palasuwan
Phandee Watanaboonyongcharoen
Ponlapat Rojnuckarin
Nantheera Anantrasirichai
Mamba
49
0
0
15 Apr 2025
70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float
Tianyi Zhang
Yang Sui
Shaochen Zhong
V. Chaudhary
Xia Hu
Anshumali Shrivastava
MQ
32
0
0
15 Apr 2025
Efficient Reasoning Models: A Survey
Sicheng Feng
Gongfan Fang
Xinyin Ma
Xinchao Wang
ReLM
LRM
110
0
0
15 Apr 2025
ConvShareViT: Enhancing Vision Transformers with Convolutional Attention Mechanisms for Free-Space Optical Accelerators
Riad Ibadulla
Thomas M. Chen
C. Reyes-Aldasoro
ViT
27
0
0
15 Apr 2025
CUT: Pruning Pre-Trained Multi-Task Models into Compact Models for Edge Devices
Jingxuan Zhou
Weidong Bao
Ji Wang
Zhengyi Zhong
27
0
0
14 Apr 2025
Can LLMs Revolutionize the Design of Explainable and Efficient TinyML Models?
Christophe El Zeinaty
W. Hamidouche
Glenn Herrou
D. Ménard
Merouane Debbah
41
0
0
13 Apr 2025
Tin-Tin: Towards Tiny Learning on Tiny Devices with Integer-based Neural Network Training
Yi Hu
Jinhang Zuo
Eddie Zhang
Bob Iannucci
Carlee Joe-Wong
24
0
0
13 Apr 2025
Cycle Training with Semi-Supervised Domain Adaptation: Bridging Accuracy and Efficiency for Real-Time Mobile Scene Detection
Huu-Phong Phan-Nguyen
Anh Dao
T. Nguyen
Tuan Quang
H. Tran
Tinh-Anh Nguyen-Nhu
Huy-Thach Pham
Quan Nguyen
Hoang M. Le
Quang-Vinh Dinh
30
0
0
12 Apr 2025
Optimizing Large Language Models: Metrics, Energy Efficiency, and Case Study Insights
Tahniat Khan
Soroor Motie
Sedef Akinli Kocak
Shaina Raza
MQ
37
0
0
07 Apr 2025
Two is Better than One: Efficient Ensemble Defense for Robust and Compact Models
Yoojin Jung
Byung Cheol Song
AAML
VLM
MQ
36
0
0
07 Apr 2025
Hyperflows: Pruning Reveals the Importance of Weights
Eugen Barbulescu
Antonio Alexoaie
21
0
0
06 Apr 2025
Towards Understanding and Improving Refusal in Compressed Models via Mechanistic Interpretability
Vishnu Kabir Chhabra
Mohammad Mahdi Khalili
AI4CE
28
0
0
05 Apr 2025
Online Difficulty Filtering for Reasoning Oriented Reinforcement Learning
Sanghwan Bae
Jiwoo Hong
Min Young Lee
Hanbyul Kim
Jeongyeon Nam
Donghyun Kwak
OffRL
LRM
48
3
0
04 Apr 2025
HyperRAG: Enhancing Quality-Efficiency Tradeoffs in Retrieval-Augmented Generation with Reranker KV-Cache Reuse
Yuwei An
Yihua Cheng
Seo Jin Park
Junchen Jiang
36
1
0
03 Apr 2025
MDP: Multidimensional Vision Model Pruning with Latency Constraint
Xinglong Sun
Barath Lakshmanan
Maying Shen
Shiyi Lan
Jingde Chen
Jose M. Alvarez
VLM
44
0
0
02 Apr 2025
FedPaI: Achieving Extreme Sparsity in Federated Learning via Pruning at Initialization
Haonan Wang
Z. Liu
Kajimusugura Hoshino
Tuo Zhang
J. Walters
S. Crago
49
0
0
01 Apr 2025
Optimization of Layer Skipping and Frequency Scaling for Convolutional Neural Networks under Latency Constraint
Minh David Thao Chan
Ruoyu Zhao
Yukuan Jia
Ruiqing Mao
Sheng Zhou
40
0
0
31 Mar 2025
Machine Learning-assisted High-speed Combinatorial Optimization with Ising Machines for Dynamically Changing Problems
Yohei Hamakawa
Tomoya Kashimata
Masaya Yamasaki
Kosuke Tatsumura
AI4CE
30
0
0
31 Mar 2025
An Efficient Training Algorithm for Models with Block-wise Sparsity
Ding Zhu
Zhiqun Zuo
Mohammad Mahdi Khalili
37
0
0
27 Mar 2025
Mobile-VideoGPT: Fast and Accurate Video Understanding Language Model
Abdelrahman M. Shaker
Muhammad Maaz
Chenhui Gou
Hamid Rezatofighi
Salman Khan
F. Khan
103
0
0
27 Mar 2025
Optimizing Multi-DNN Inference on Mobile Devices through Heterogeneous Processor Co-Execution
Yunquan Gao
Zhiguo Zhang
Praveen Kumar Donta
C. Dehury
X. Wang
Dusit Niyato
Qiyang Zhang
41
0
0
27 Mar 2025
Boosting Large Language Models with Mask Fine-Tuning
M. Zhang
Yue Bai
Huan Wang
Yizhou Wang
Qihua Dong
Y. Fu
CLL
48
0
0
27 Mar 2025
A Low-complexity Structured Neural Network Approach to Intelligently Realize Wideband Multi-beam Beamformers
Hansaka Aluvihare
Sivakumar Sivasankar
Xianqi Li
Arjuna Madanayake
Sirani M. Perera
73
0
0
26 Mar 2025
Lipschitz Constant Meets Condition Number: Learning Robust and Compact Deep Neural Networks
Yangqi Feng
S. J. Lin
Baoyuan Gao
Xian Wei
AAML
74
0
0
26 Mar 2025
GIViC: Generative Implicit Video Compression
Ge Gao
Siyue Teng
Tianhao Peng
Fan Zhang
David Bull
DiffM
VGen
41
0
0
25 Mar 2025
MoST: Efficient Monarch Sparse Tuning for 3D Representation Learning
Xu Han
Yuan Tang
Jinfeng Xu
Xianzhi Li
48
0
0
24 Mar 2025
Temporal Action Detection Model Compression by Progressive Block Drop
Xiaoyong Chen
Yong Guo
Jiaming Liang
Sitong Zhuang
Runhao Zeng
Xiping Hu
43
0
0
21 Mar 2025
Attention Pruning: Automated Fairness Repair of Language Models via Surrogate Simulated Annealing
Vishnu Asutosh Dasu
Md. Rafi Ur Rashid
Vipul Gupta
Saeid Tizpaz-Niari
Gang Tan
AAML
43
0
0
20 Mar 2025
PARQ: Piecewise-Affine Regularized Quantization
Lisa Jin
Jianhao Ma
Zechun Liu
Andrey Gromov
Aaron Defazio
Lin Xiao
MQ
38
0
0
19 Mar 2025
Decision Tree Induction Through LLMs via Semantically-Aware Evolution
Tennison Liu
Nicolas Huynh
M. Schaar
63
0
0
18 Mar 2025
Knowledge Distillation: Enhancing Neural Network Compression with Integrated Gradients
David E. Hernandez
J. Chang
Torbjörn E. M. Nordling
56
0
0
17 Mar 2025
Changing Base Without Losing Pace: A GPU-Efficient Alternative to MatMul in DNNs
Nir Ailon
Akhiad Bercovich
Omri Weinstein
52
0
0
15 Mar 2025
Stabilizing Quantization-Aware Training by Implicit-Regularization on Hessian Matrix
Junbiao Pang
Tianyang Cai
39
1
0
14 Mar 2025
Safe Vision-Language Models via Unsafe Weights Manipulation
Moreno D'Incà
E. Peruzzo
Xingqian Xu
Humphrey Shi
N. Sebe
Massimiliano Mancini
MU
55
0
0
14 Mar 2025
Towards Extreme Pruning of LLMs with Plug-and-Play Mixed Sparsity
Chi Xu
Gefei Zhang
Yantong Zhu
Luca Benini
Guosheng Hu
Yawei Li
Zhihong Zhang
29
0
0
14 Mar 2025
ViM-VQ: Efficient Post-Training Vector Quantization for Visual Mamba
Juncan Deng
Shuaiting Li
Zeyu Wang
Kedong Xu
Hong Gu
Kejie Huang
MQ
60
0
0
12 Mar 2025
Residual Learning and Filtering Networks for End-to-End Lossless Video Compression
Md Baharul Islam
Afsana Ahsan Jeny
53
0
0
11 Mar 2025
SSVQ: Unleashing the Potential of Vector Quantization with Sign-Splitting
Shuaiting Li
Juncan Deng
Chenxuan Wang
Kedong Xu
Rongtao Deng
Hong Gu
Haibin Shen
Kejie Huang
MQ
53
0
0
11 Mar 2025