1510.00149
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
1 October 2015
Song Han
Huizi Mao
W. Dally
3DGS
Papers citing "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding"
50 / 3,625 papers shown
Title
PermLLM: Learnable Channel Permutation for N:M Sparse Large Language Models
Lancheng Zou
Shuo Yin
Zehua Pei
Tsung-Yi Ho
Farzan Farnia
Bei Yu
64
0
0
11 Oct 2025
SQS: Bayesian DNN Compression through Sparse Quantized Sub-distributions
Ziyi Wang
Nan Jiang
Guang Lin
Qifan Song
MQ
165
0
0
10 Oct 2025
Automated Evolutionary Optimization for Resource-Efficient Neural Network Training
Ilia Revin
Leon Strelkov
Vadim A. Potemkin
Ivan A Kireev
Andrey Savchenko
88
0
0
10 Oct 2025
xRouter: Training Cost-Aware LLMs Orchestration System via Reinforcement Learning
Cheng Qian
Zuxin Liu
Shirley Kokane
Akshara Prabhakar
Jielin Qiu
...
Weiran Yao
Shelby Heinecke
Silvio Savarese
Caiming Xiong
Huan Wang
132
0
0
09 Oct 2025
Vanishing Contributions: A Unified Approach to Smoothly Transition Neural Models into Compressed Form
Lorenzo Nikiforos
Charalampos Antoniadis
Luciano Prono
F. Pareschi
R. Rovatti
Gianluca Setti
96
0
0
09 Oct 2025
Test-Time Reasoners Are Strategic Multiple-Choice Test-Takers
Nishant Balepur
Atrey Desai
Rachel Rudinger
LRM
76
0
0
09 Oct 2025
OBS-Diff: Accurate Pruning For Diffusion Models in One-Shot
Junhan Zhu
Hesong Wang
Mingluo Su
Zefang Wang
Huan Wang
DiffM
VLM
139
0
0
08 Oct 2025
Where to Begin: Efficient Pretraining via Subnetwork Selection and Distillation
Arjun Krishnakumar
R. Sukthanker
Hannan Javed Mahadik
Gabriela Kadlecová
Vladyslav Moroshan
Timur Carstensen
Frank Hutter
Aaron Klein
97
0
0
08 Oct 2025
Downsized and Compromised?: Assessing the Faithfulness of Model Compression
Moumita Kamal
Douglas A. Talbert
92
0
0
07 Oct 2025
ActiveMark: on watermarking of visual foundation models via massive activations
Anna Chistyakova
Mikhail Pautov
WIGM
157
0
0
06 Oct 2025
ERDE: Entropy-Regularized Distillation for Early-exit
Martial Guidez
S. Duffner
Yannick Alpou
Oscar Röth
Christophe Garcia
66
0
0
06 Oct 2025
Quantization Range Estimation for Convolutional Neural Networks
Bingtao Yang
Yujia Wang
Mengzhi Jiao
Hongwei Huo
MQ
115
0
0
05 Oct 2025
From Filters to VLMs: Benchmarking Defogging Methods through Object Detection and Segmentation Performance
Ardalan Aryashad
Parsa Razmara
Amin Mahjoub
Seyedarmin Azizi
Mahdi Salmani
Arad Firouzkouhi
VLM
93
0
0
04 Oct 2025
ReTiDe: Real-Time Denoising for Energy-Efficient Motion Picture Processing with FPGAs
Changhong Li
Clément Bled
Rosa Fernandez
Shreejith Shanker
64
0
0
04 Oct 2025
The Curious Case of In-Training Compression of State Space Models
Makram Chahine
Philipp Nazari
Daniela Rus
T. Konstantin Rusch
135
0
0
03 Oct 2025
FlexiQ: Adaptive Mixed-Precision Quantization for Latency/Accuracy Trade-Offs in Deep Neural Networks
Jaemin Kim
Hongjun Um
Sungkyun Kim
Yongjun Park
Jiwon Seo
MQ
105
0
0
03 Oct 2025
SAGE: Streaming Agreement-Driven Gradient Sketches for Representative Subset Selection
Ashish Jha
S. Ahmadi-Asl
137
0
0
02 Oct 2025
Nav-EE: Navigation-Guided Early Exiting for Efficient Vision-Language Models in Autonomous Driving
Haibo Hu
Lianming Huang
X. Wang
Yufei Cui
Shangyu Wu
Nan Guan
Chun Jason Xue
VLM
159
0
0
02 Oct 2025
A universal compression theory: Lottery ticket hypothesis and superpolynomial scaling laws
Hong-Yi Wang
Di Luo
T. Poggio
Isaac Chuang
Liu Ziyin
54
1
0
01 Oct 2025
The silence of the weights: an investigation of structural pruning strategies for attention-based audio signal architectures
Andrea Diecidue
C. Barbano
Piero Fraternali
Mathieu Fontaine
Enzo Tartaglione
52
0
0
30 Sep 2025
Post-Training Quantization via Residual Truncation and Zero Suppression for Diffusion Models
Donghoon Kim
Dongyoung Lee
Ik Joon Chang
Sung-Ho Bae
MQ
92
0
0
30 Sep 2025
CAST: Continuous and Differentiable Semi-Structured Sparsity-Aware Training for Large Language Models
Weiyu Huang
Yuezhou Hu
Jun Zhu
Jianfei Chen
CLL
88
0
0
30 Sep 2025
Enhancing Certifiable Semantic Robustness via Robust Pruning of Deep Neural Networks
Hanjiang Hu
Bowei Li
Ziwei Wang
Tianhao Wei
Casidhe Hutchison
Eric Sample
Changliu Liu
AAML
102
0
0
30 Sep 2025
Norm-Q: Effective Compression Method for Hidden Markov Models in Neuro-Symbolic Applications
Hanyuan Gao
Xiaoxuan Yang
MQ
84
0
0
29 Sep 2025
Budgeted Broadcast: An Activity-Dependent Pruning Rule for Neural Network Efficiency
Yaron Meirovitch
Fuming Yang
J. Lichtman
Nir Shavit
80
1
0
26 Sep 2025
Bridging Kolmogorov Complexity and Deep Learning: Asymptotically Optimal Description Length Objectives for Transformers
Peter Shaw
James Cohan
Jacob Eisenstein
Kristina Toutanova
166
0
0
26 Sep 2025
MonoCon: A general framework for learning ultra-compact high-fidelity representations using monotonicity constraints
Shreyas Gokhale
84
0
0
26 Sep 2025
Progressive Weight Loading: Accelerating Initial Inference and Gradually Boosting Performance on Resource-Constrained Environments
Hyunwoo Kim
Junha Lee
M. Choi
J. Lee
Jaeshin Cho
VLM
98
0
0
26 Sep 2025
Smaller is Better: Enhancing Transparency in Vehicle AI Systems via Pruning
Sanish Suwal
Shaurya Garg
Dipkamal Bhusal
Michael Clifford
Nidhi Rastogi
AAML
98
1
0
24 Sep 2025
Embodied AI: From LLMs to World Models
Tongtong Feng
Xin Wang
Yu Jiang
Wenwu Zhu
LM&Ro
289
7
0
24 Sep 2025
Rule Encoding and Compliance in Large Language Models: An Information-Theoretic Analysis
Joachim Diederich
116
0
0
23 Sep 2025
Optimizing Inference in Transformer-Based Models: A Multi-Method Benchmark
Siu Hang Ho
Prasad Ganesan
Nguyen Duong
Daniel Schlabig
MQ
104
0
0
22 Sep 2025
TinyBEV: Cross Modal Knowledge Distillation for Efficient Multi Task Bird's Eye View Perception and Planning
Reeshad Khan
John Gauch
125
0
0
22 Sep 2025
Deep Hierarchical Learning with Nested Subspace Networks
Paulius Rauba
M. Schaar
72
0
0
22 Sep 2025
MEC-Quant: Maximum Entropy Coding for Extremely Low Bit Quantization-Aware Training
Junbiao Pang
Tianyang Cai
Baochang Zhang
MQ
100
0
0
19 Sep 2025
Detail Across Scales: Multi-Scale Enhancement for Full Spectrum Neural Representations
Yuan Ni
Zhantao Chen
Cheng Peng
Rajan Plumley
Chun Hong Yoon
Jana Thayer
J. Turner
80
0
0
19 Sep 2025
RMT-KD: Random Matrix Theoretic Causal Knowledge Distillation
Davide Ettori
Nastaran Darabi
Sureshkumar Senthilkumar
A. R. Trivedi
105
1
0
19 Sep 2025
GhostNetV3-Small: A Tailored Architecture and Comparative Study of Distillation Strategies for Tiny Images
Florian Zager
Hamza A. A. Gardi
180
0
0
15 Sep 2025
Modality Alignment with Multi-scale Bilateral Attention for Multimodal Recommendation
Kelin Ren
Chan-Yang Ju
Dong-Ho Lee
52
0
0
11 Sep 2025
Explaining How Quantization Disparately Skews a Model
Abhimanyu Bellam
Jung-Eun Kim
MQ
108
0
0
08 Sep 2025
1 bit is all we need: binary normalized neural networks
Eduardo Lobo Lustoda Cabral
Paulo Pirozelli
Larissa Driemeier
MQ
144
0
0
07 Sep 2025
AI-Driven Fronthaul Link Compression in Wireless Communication Systems: Review and Method Design
Keqin Zhang
36
0
0
05 Sep 2025
MambaLite-Micro: Memory-Optimized Mamba Inference on MCUs
Hongjun Xu
Junxi Xia
Weisi Yang
Yueyuan Sui
Stephen Xia
Mamba
148
0
0
05 Sep 2025
E-ARMOR: Edge case Assessment and Review of Multilingual Optical Character Recognition
Aryan Gupta
Anupam Purwar
VLM
72
1
0
03 Sep 2025
NeurStore: Efficient In-database Deep Learning Model Management System
Siqi Xiang
Sheng Wang
Xiaokui Xiao
Cong Yue
Zhanhao Zhao
Beng Chin Ooi
124
0
0
03 Sep 2025
QuantV2X: A Fully Quantized Multi-Agent System for Cooperative Perception
Seth Z. Zhao
Huizhi Zhang
Zhaowei Li
Juntong Peng
Anthony Chui
...
Fujia Wang
Ran Tian
Chenfeng Xu
Bolei Zhou
Jiaqi Ma
88
1
0
03 Sep 2025
UrbanInsight: A Distributed Edge Computing Framework with LLM-Powered Data Filtering for Smart City Digital Twins
Kishor Datta Gupta
Md Manjurul Ahsan
Mohd Ariful Haque
Roy George
Azmine Toushik Wasi
AI4CE
62
0
0
31 Aug 2025
PDTrim: Targeted Pruning for Prefill-Decode Disaggregation in Inference
Hao Zhang
Mengsi Lyu
Zhuo Chen
Xingrun Xing
Yulong Ao
Yonghua Lin
359
1
0
29 Aug 2025
Dual-Model Weight Selection and Self-Knowledge Distillation for Medical Image Classification
Ayaka Tsutsumi
Guang Li
Ren Togo
Takahiro Ogawa
Satoshi Kondo
Miki Haseyama
60
0
0
28 Aug 2025
SKGE-SWIN: End-To-End Autonomous Vehicle Waypoint Prediction and Navigation Using Skip Stage Swin Transformer
Fachri Najm Noer Kartiman
Rasim
Yaya Wihardi
Nurul Hasanah
Oskar Natan
Bambang Wahono
Taufik Ibnu Salim
ViT
40
0
0
28 Aug 2025