Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding

1 October 2015

Song Han

Papers citing "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding"

50 / 3,628 papers shown

Title
Deep Coherence Learning: An Unsupervised Deep Beamformer for High Quality Single Plane Wave Imaging in Medical Ultrasound Hyunwoo Cho Seongjun Park Jinbum Kang Yangmo Yoo OOD 54 14 0 18 Nov 2023
ECLM: Efficient Edge-Cloud Collaborative Learning with Continuous Environment Adaptation Zhuang Yan Zhenzhe Zheng Yunfeng Shao Bingshuai Li Fan Wu Guihai Chen 149 6 0 18 Nov 2023
Improved TokenPose with Sparsity Anning Li ViT 176 0 0 16 Nov 2023
Do Localization Methods Actually Localize Memorized Data in LLMs? A Tale of Two Benchmarks North American Chapter of the Association for Computational Linguistics (NAACL), 2023 Ting-Yun Chang Jesse Thomason Robin Jia 263 24 0 15 Nov 2023
FedCode: Communication-Efficient Federated Learning via Transferring Codebooks International Conference on Edge Computing [Services Society] (EDGE), 2023 Saeed Khalilian Gourtani Vasileios Tsouvalas T. Ozcelebi N. Meratnia FedML 274 7 0 15 Nov 2023
Boolean Variation and Boolean Logic BackPropagation Van Minh Nguyen 185 2 0 13 Nov 2023
Training A Multi-stage Deep Classifier with Feedback Signals Chao Xu Yu Yang Rong Wang Guan Wang Bojia Lin 112 0 0 12 Nov 2023
5G Positioning Advancements with AI/ML Mohammad Alawieh Georgios Kontes 104 6 0 10 Nov 2023
Quantized Distillation: Optimizing Driver Activity Recognition Models for Resource-Constrained Environments IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023 Calvin Tanama Kunyu Peng Zdravko Marinov Rainer Stiefelhagen Alina Roitberg 182 2 0 10 Nov 2023
FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores International Conference on Learning Representations (ICLR), 2023 Daniel Y. Fu Hermann Kumbong Eric N. D. Nguyen Christopher Ré VLM 236 38 0 10 Nov 2023
Compressed and Sparse Models for Non-Convex Decentralized Learning Andrew Campbell Hang Liu Leah Woldemariam Anna Scaglione 174 0 0 09 Nov 2023
Adaptive Compression-Aware Split Learning and Inference for Enhanced Network Efficiency Akrit Mudvari Antero Vainio Iason Ofeidis Sasu Tarkoma Leandros Tassiulas 301 11 0 09 Nov 2023
Game Theory Solutions in Sensor-Based Human Activity Recognition: A Review M. Shayesteh Behrooz Sharokhzadeh B. Masoumi 98 3 0 09 Nov 2023
Exploiting Neural-Network Statistics for Low-Power DNN Inference IEEE Open Journal of Circuits and Systems (JOCS), 2023 Lennart Bamberg Ardalan Najafi Alberto García-Ortiz 64 1 0 09 Nov 2023
Beyond Size: How Gradients Shape Pruning Decisions in Large Language Models Rocktim Jyoti Das Mingjie Sun Liqun Ma Zhiqiang Shen VLM 154 23 0 08 Nov 2023
Mini but Mighty: Finetuning ViTs with Mini Adapters IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023 Imad Eddine Marouf Enzo Tartaglione Stéphane Lathuilière 145 11 0 07 Nov 2023
Machine learning's own Industrial Revolution Yuan Luo Song Han Jingjing Liu AI4CE 204 0 0 04 Nov 2023
AFPQ: Asymmetric Floating Point Quantization for LLMs Annual Meeting of the Association for Computational Linguistics (ACL), 2023 Yijia Zhang Sicheng Zhang Shijie Cao Dayou Du Jianyu Wei Ting Cao Ningyi Xu MQ 127 7 0 03 Nov 2023
Flow-Based Feature Fusion for Vehicle-Infrastructure Cooperative 3D Object Detection Neural Information Processing Systems (NeurIPS), 2023 Haibao Yu Yingjuan Tang Enze Xie Jilei Mao Ping Luo Zaiqing Nie 3DPC 215 48 0 03 Nov 2023
Ultra-Efficient On-Device Object Detection on AI-Integrated Smart Glasses with TinyissimoYOLO Julian Moosmann Pietro Bonazzi Yawei Li Sizhen Bian Philipp Mayer Luca Benini Michele Magno 363 20 0 02 Nov 2023
Efficient LLM Inference on CPUs Haihao Shen Hanwen Chang Bo Dong Yu Luo Hengyu Meng MQ 197 30 0 01 Nov 2023
Federated Topic Model and Model Pruning Based on Variational Autoencoder Chengjie Ma Yawen Li M. Liang Ang Li FedML 58 1 0 01 Nov 2023
Importance Estimation with Random Gradient for Neural Network Pruning Suman Sapkota Binod Bhattarai 178 2 0 31 Oct 2023
PriPrune: Quantifying and Preserving Privacy in Pruned Federated Learning ACM Transactions on Modeling and Performance Evaluation of Computing Systems (TOMPECS), 2023 Tianyue Chu Mengwei Yang Nikolaos Laoutaris A. Markopoulou 234 9 0 30 Oct 2023
SparseByteNN: A Novel Mobile Inference Acceleration Framework Based on Fine-Grained Group Sparsity Haitao Xu Songwei Liu Yuyang Xu Shuai Wang Jiashi Li Chenqian Yan Liangqiang Li Xing Mei Xin Pan Fangmin Chen MQ 105 3 0 30 Oct 2023
Efficient IoT Inference via Context-Awareness Mohammad Mehdi Rastikerdar Jin Huang Shiwei Fang Hui Guan Deepak Ganesan 236 0 0 29 Oct 2023
Atom: Low-bit Quantization for Efficient and Accurate LLM Serving Conference on Machine Learning and Systems (MLSys), 2023 Yilong Zhao Chien-Yu Lin Kan Zhu Zihao Ye Lequn Chen Wenlei Bao Luis Ceze Arvind Krishnamurthy Tianqi Chen Baris Kasikci MQ 323 228 0 29 Oct 2023
FedPEAT: Convergence of Federated Learning, Parameter-Efficient Fine Tuning, and Emulator Assisted Tuning for Artificial Intelligence Foundation Models with Mobile Edge Computing Terence Jie Chua Wen-li Yu Junfeng Zhao Kwok-Yan Lam FedML 210 6 0 26 Oct 2023
Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time International Conference on Machine Learning (ICML), 2023 Zichang Liu Jue Wang Tri Dao Wanrong Zhu Binhang Yuan ... Anshumali Shrivastava Ce Zhang Yuandong Tian Christopher Ré Beidi Chen BDL 284 271 0 26 Oct 2023
How Robust is Federated Learning to Communication Error? A Comparison Study Between Uplink and Downlink Channels IEEE Wireless Communications and Networking Conference (WCNC), 2023 Linping Qu Shenghui Song Chi-Ying Tsui Yuyi Mao 126 3 0 25 Oct 2023
E-Sparse: Boosting the Large Language Model Inference through Entropy-based N:M Sparsity Yun Li Lin Niu Xipeng Zhang Kai Liu Jianchen Zhu Zhanhui Kang MoE 169 17 0 24 Oct 2023
LoRAShear: Efficient Large Language Model Structured Pruning and Knowledge Recovery Tianyi Chen Tianyu Ding Badal Yadav Ilya Zharkov Luming Liang 245 37 0 24 Oct 2023
Federated learning compression designed for lightweight communications International Conference on Electronics, Circuits, and Systems (ICECS), 2023 Lucas Grativol Ribeiro Mathieu Léonardon Guillaume Muller Virginie Fresse Matthieu Arzel FedML 169 5 0 23 Oct 2023
Large Search Model: Redefining Search Stack in the Era of LLMs Liang Wang Nan Yang Xiaolong Huang Linjun Yang Rangan Majumder Furu Wei LRM KELM 216 23 0 23 Oct 2023
One is More: Diverse Perspectives within a Single Network for Efficient DRL Yiqin Tan Ling Pan Longbo Huang OffRL 261 0 0 21 Oct 2023
Breaking through Deterministic Barriers: Randomized Pruning Mask Generation and Selection Jianwei Li Weizhi Gao Qi Lei Dongkuan Xu 312 3 0 19 Oct 2023
SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation International Conference on Learning Representations (ICLR), 2023 Chongyu Fan Jiancheng Liu Yihua Zhang Eric Wong Dennis Wei Sijia Liu MU 465 251 0 19 Oct 2023
Sparse-DySta: Sparsity-Aware Dynamic and Static Scheduling for Sparse Multi-DNN Workloads Micro (MICRO), 2023 Hongxiang Fan Stylianos I. Venieris Alexandros Kouris Nicholas D. Lane 182 10 0 17 Oct 2023
RefConv: Re-parameterized Refocusing Convolution for Powerful ConvNets IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2023 Zhicheng Cai Xiaohan Ding Qiu Shen Xun Cao 210 29 0 16 Oct 2023
The Road to On-board Change Detection: A Lightweight Patch-Level Change Detection Network via Exploring the Potential of Pruning and Pooling Lihui Xue Zhihao Wang Xueqian Wang Gang Li 229 2 0 16 Oct 2023
Chameleon: a Heterogeneous and Disaggregated Accelerator System for Retrieval-Augmented Language Models Wenqi Jiang Marco Zeller R. Waleffe Torsten Hoefler Gustavo Alonso 365 38 0 15 Oct 2023
Edge-InversionNet: Enabling Efficient Inference of InversionNet on Edge Devices Zhepeng Wang Isaacshubhanand Putla Weiwen Jiang Youzuo Lin 183 3 0 14 Oct 2023
Prompt Backdoors in Visual Prompt Learning Hai Huang Subrat Kishore Dutta Michael Backes Yun Shen Yang Zhang VLM VPVLM AAML SILM 171 3 0 11 Oct 2023
Efficient machine-learning surrogates for large-scale geological carbon and energy storage T. Kadeethum Stephen J Verzi Hongkyu Yoon AI4CE 146 2 0 11 Oct 2023
Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning International Conference on Learning Representations (ICLR), 2023 Mengzhou Xia Tianyu Gao Zhiyuan Zeng Danqi Chen 377 402 0 10 Oct 2023
Progressive Neural Compression for Adaptive Image Offloading under Timing Constraints IEEE Real-Time Systems Symposium (RTSS), 2023 Ruiqi Wang Hanyang Liu Jiaming Qiu Moran Xu Roch Guérin Chenyang Lu 170 8 0 08 Oct 2023
Dual Grained Quantization: Efficient Fine-Grained Quantization for LLM Luoming Zhang Wen Fei Weijia Wu Yefei He Zhenyu Lou Hong Zhou MQ 182 5 0 07 Oct 2023
Extract-Transform-Load for Video Streams Proceedings of the VLDB Endowment (PVLDB), 2023 Ferdinand Kossmann Ziniu Wu Eugenie Lai Nesime Tatbul Lei Cao Tim Kraska Samuel Madden 162 18 0 07 Oct 2023
The Cost of Down-Scaling Language Models: Fact Recall Deteriorates before In-Context Learning Tian Jin Nolan Clement Xin Dong Vaishnavh Nagarajan Michael Carbin Jonathan Ragan-Kelley Gintare Karolina Dziugaite LRM 275 5 0 07 Oct 2023
Model Compression in Practice: Lessons Learned from Practitioners Creating On-device Machine Learning Experiences International Conference on Human Factors in Computing Systems (CHI), 2023 Fred Hohman Mary Beth Kery Donghao Ren Dominik Moritz 325 24 0 06 Oct 2023