Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding

1 October 2015

Song Han

Papers citing "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding"

50 / 3,625 papers shown

PermLLM: Learnable Channel Permutation for N:M Sparse Large Language Models Lancheng Zou Shuo Yin Zehua Pei Tsung-Yi Ho Farzan Farnia Bei Yu 64 0 0 11 Oct 2025
SQS: Bayesian DNN Compression through Sparse Quantized Sub-distributions Ziyi Wang Nan Jiang Guang Lin Qifan Song MQ 165 0 0 10 Oct 2025
Automated Evolutionary Optimization for Resource-Efficient Neural Network Training Ilia Revin Leon Strelkov Vadim A. Potemkin Ivan A Kireev Andrey Savchenko 88 0 0 10 Oct 2025
xRouter: Training Cost-Aware LLMs Orchestration System via Reinforcement Learning Cheng Qian Zuxin Liu Shirley Kokane Akshara Prabhakar Jielin Qiu ... Weiran Yao Shelby Heinecke Silvio Savarese Caiming Xiong Huan Wang 132 0 0 09 Oct 2025
Vanishing Contributions: A Unified Approach to Smoothly Transition Neural Models into Compressed Form Lorenzo Nikiforos Charalampos Antoniadis Luciano Prono F. Pareschi R. Rovatti Gianluca Setti 96 0 0 09 Oct 2025
Test-Time Reasoners Are Strategic Multiple-Choice Test-Takers Nishant Balepur Atrey Desai Rachel Rudinger LRM 76 0 0 09 Oct 2025
OBS-Diff: Accurate Pruning For Diffusion Models in One-Shot Junhan Zhu Hesong Wang Mingluo Su Zefang Wang Huan Wang DiffM VLM 139 0 0 08 Oct 2025
Where to Begin: Efficient Pretraining via Subnetwork Selection and Distillation Arjun Krishnakumar R. Sukthanker Hannan Javed Mahadik Gabriela Kadlecová Vladyslav Moroshan Timur Carstensen Frank Hutter Aaron Klein 97 0 0 08 Oct 2025
Downsized and Compromised?: Assessing the Faithfulness of Model Compression Moumita Kamal Douglas A. Talbert 92 0 0 07 Oct 2025
ActiveMark: on watermarking of visual foundation models via massive activations Anna Chistyakova Mikhail Pautov WIGM 157 0 0 06 Oct 2025
ERDE: Entropy-Regularized Distillation for Early-exit Martial Guidez S. Duffner Yannick Alpou Oscar Röth Christophe Garcia 66 0 0 06 Oct 2025
Quantization Range Estimation for Convolutional Neural Networks Bingtao Yang Yujia Wang Mengzhi Jiao Hongwei Huo MQ 115 0 0 05 Oct 2025
From Filters to VLMs: Benchmarking Defogging Methods through Object Detection and Segmentation Performance Ardalan Aryashad Parsa Razmara Amin Mahjoub Seyedarmin Azizi Mahdi Salmani Arad Firouzkouhi VLM 93 0 0 04 Oct 2025
ReTiDe: Real-Time Denoising for Energy-Efficient Motion Picture Processing with FPGAs Changhong Li Clément Bled Rosa Fernandez Shreejith Shanker 64 0 0 04 Oct 2025
The Curious Case of In-Training Compression of State Space Models Makram Chahine Philipp Nazari Daniela Rus T. Konstantin Rusch 135 0 0 03 Oct 2025
FlexiQ: Adaptive Mixed-Precision Quantization for Latency/Accuracy Trade-Offs in Deep Neural Networks Jaemin Kim Hongjun Um Sungkyun Kim Yongjun Park Jiwon Seo MQ 105 0 0 03 Oct 2025
SAGE: Streaming Agreement-Driven Gradient Sketches for Representative Subset Selection Ashish Jha S. Ahmadi-Asl 137 0 0 02 Oct 2025
Nav-EE: Navigation-Guided Early Exiting for Efficient Vision-Language Models in Autonomous Driving Haibo Hu Lianming Huang X. Wang Yufei Cui Shangyu Wu Nan Guan Chun Jason Xue VLM 159 0 0 02 Oct 2025
A universal compression theory: Lottery ticket hypothesis and superpolynomial scaling laws Hong-Yi Wang Di Luo T. Poggio Isaac Chuang Liu Ziyin 54 1 0 01 Oct 2025
The silence of the weights: an investigation of structural pruning strategies for attention-based audio signal architectures Andrea Diecidue C. Barbano Piero Fraternali Mathieu Fontaine Enzo Tartaglione 52 0 0 30 Sep 2025
Post-Training Quantization via Residual Truncation and Zero Suppression for Diffusion Models Donghoon Kim Dongyoung Lee Ik Joon Chang Sung-Ho Bae MQ 92 0 0 30 Sep 2025
CAST: Continuous and Differentiable Semi-Structured Sparsity-Aware Training for Large Language Models Weiyu Huang Yuezhou Hu Jun Zhu Jianfei Chen CLL 88 0 0 30 Sep 2025
Enhancing Certifiable Semantic Robustness via Robust Pruning of Deep Neural Networks Hanjiang Hu Bowei Li Ziwei Wang Tianhao Wei Casidhe Hutchison Eric Sample Changliu Liu AAML 102 0 0 30 Sep 2025
Norm-Q: Effective Compression Method for Hidden Markov Models in Neuro-Symbolic Applications Hanyuan Gao Xiaoxuan Yang MQ 84 0 0 29 Sep 2025
Budgeted Broadcast: An Activity-Dependent Pruning Rule for Neural Network Efficiency Yaron Meirovitch Fuming Yang J. Lichtman Nir Shavit 80 1 0 26 Sep 2025
Bridging Kolmogorov Complexity and Deep Learning: Asymptotically Optimal Description Length Objectives for Transformers Peter Shaw James Cohan Jacob Eisenstein Kristina Toutanova 166 0 0 26 Sep 2025
MonoCon: A general framework for learning ultra-compact high-fidelity representations using monotonicity constraints Shreyas Gokhale 84 0 0 26 Sep 2025
Progressive Weight Loading: Accelerating Initial Inference and Gradually Boosting Performance on Resource-Constrained Environments Hyunwoo Kim Junha Lee M. Choi J. Lee Jaeshin Cho VLM 98 0 0 26 Sep 2025
Smaller is Better: Enhancing Transparency in Vehicle AI Systems via Pruning Sanish Suwal Shaurya Garg Dipkamal Bhusal Michael Clifford Nidhi Rastogi AAML 98 1 0 24 Sep 2025
Embodied AI: From LLMs to World Models Tongtong Feng Xin Wang Yu Jiang Wenwu Zhu LM&Ro 289 7 0 24 Sep 2025
Rule Encoding and Compliance in Large Language Models: An Information-Theoretic Analysis Joachim Diederich 116 0 0 23 Sep 2025
Optimizing Inference in Transformer-Based Models: A Multi-Method Benchmark Siu Hang Ho Prasad Ganesan Nguyen Duong Daniel Schlabig MQ 104 0 0 22 Sep 2025
TinyBEV: Cross Modal Knowledge Distillation for Efficient Multi Task Bird's Eye View Perception and Planning Reeshad Khan John Gauch 125 0 0 22 Sep 2025
Deep Hierarchical Learning with Nested Subspace Networks Paulius Rauba M. Schaar 72 0 0 22 Sep 2025
MEC-Quant: Maximum Entropy Coding for Extremely Low Bit Quantization-Aware Training Junbiao Pang Tianyang Cai Baochang Zhang MQ 100 0 0 19 Sep 2025
Detail Across Scales: Multi-Scale Enhancement for Full Spectrum Neural Representations Yuan Ni Zhantao Chen Cheng Peng Rajan Plumley Chun Hong Yoon Jana Thayer J. Turner 80 0 0 19 Sep 2025
RMT-KD: Random Matrix Theoretic Causal Knowledge Distillation Davide Ettori Nastaran Darabi Sureshkumar Senthilkumar A. R. Trivedi 105 1 0 19 Sep 2025
GhostNetV3-Small: A Tailored Architecture and Comparative Study of Distillation Strategies for Tiny Images Florian Zager Hamza A. A. Gardi 180 0 0 15 Sep 2025
Modality Alignment with Multi-scale Bilateral Attention for Multimodal Recommendation Kelin Ren Chan-Yang Ju Dong-Ho Lee 52 0 0 11 Sep 2025
Explaining How Quantization Disparately Skews a Model Abhimanyu Bellam Jung-Eun Kim MQ 108 0 0 08 Sep 2025
1 bit is all we need: binary normalized neural networks Eduardo Lobo Lustoda Cabral Paulo Pirozelli Larissa Driemeier MQ 144 0 0 07 Sep 2025
AI-Driven Fronthaul Link Compression in Wireless Communication Systems: Review and Method Design Keqin Zhang 36 0 0 05 Sep 2025
MambaLite-Micro: Memory-Optimized Mamba Inference on MCUs Hongjun Xu Junxi Xia Weisi Yang Yueyuan Sui Stephen Xia Mamba 148 0 0 05 Sep 2025
E-ARMOR: Edge case Assessment and Review of Multilingual Optical Character Recognition Aryan Gupta Anupam Purwar VLM 72 1 0 03 Sep 2025
NeurStore: Efficient In-database Deep Learning Model Management System Siqi Xiang Sheng Wang Xiaokui Xiao Cong Yue Zhanhao Zhao Beng Chin Ooi 124 0 0 03 Sep 2025
QuantV2X: A Fully Quantized Multi-Agent System for Cooperative Perception Seth Z. Zhao Huizhi Zhang Zhaowei Li Juntong Peng Anthony Chui ... Fujia Wang Ran Tian Chenfeng Xu Bolei Zhou Jiaqi Ma 88 1 0 03 Sep 2025
UrbanInsight: A Distributed Edge Computing Framework with LLM-Powered Data Filtering for Smart City Digital Twins Kishor Datta Gupta Md Manjurul Ahsan Mohd Ariful Haque Roy George Azmine Toushik Wasi AI4CE 62 0 0 31 Aug 2025
PDTrim: Targeted Pruning for Prefill-Decode Disaggregation in Inference Hao Zhang Mengsi Lyu Zhuo Chen Xingrun Xing Yulong Ao Yonghua Lin 359 1 0 29 Aug 2025
Dual-Model Weight Selection and Self-Knowledge Distillation for Medical Image Classification Ayaka Tsutsumi Guang Li Ren Togo Takahiro Ogawa Satoshi Kondo Miki Haseyama 60 0 0 28 Aug 2025
SKGE-SWIN: End-To-End Autonomous Vehicle Waypoint Prediction and Navigation Using Skip Stage Swin Transformer Fachri Najm Noer Kartiman Rasim Yaya Wihardi Nurul Hasanah Oskar Natan Bambang Wahono Taufik Ibnu Salim ViT 40 0 0 28 Aug 2025