Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding

1 October 2015
Song Han
Huizi Mao
W. Dally
    3DGS
arXiv:1510.00149

Papers citing "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding"

50 / 3,628 papers shown
Deep Coherence Learning: An Unsupervised Deep Beamformer for High Quality Single Plane Wave Imaging in Medical Ultrasound
Hyunwoo Cho
Seongjun Park
Jinbum Kang
Yangmo Yoo
OOD
54
14
0
18 Nov 2023
ECLM: Efficient Edge-Cloud Collaborative Learning with Continuous Environment Adaptation
Zhuang Yan
Zhenzhe Zheng
Yunfeng Shao
Bingshuai Li
Fan Wu
Guihai Chen
149
6
0
18 Nov 2023
Improved TokenPose with Sparsity
Anning Li
ViT
176
0
0
16 Nov 2023
Do Localization Methods Actually Localize Memorized Data in LLMs? A Tale of Two Benchmarks
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Ting-Yun Chang
Jesse Thomason
Robin Jia
263
24
0
15 Nov 2023
FedCode: Communication-Efficient Federated Learning via Transferring Codebooks
International Conference on Edge Computing [Services Society] (EDGE), 2023
Saeed Khalilian Gourtani
Vasileios Tsouvalas
T. Ozcelebi
N. Meratnia
FedML
274
7
0
15 Nov 2023
Boolean Variation and Boolean Logic BackPropagation
Van Minh Nguyen
185
2
0
13 Nov 2023
Training A Multi-stage Deep Classifier with Feedback Signals
Chao Xu
Yu Yang
Rong Wang
Guan Wang
Bojia Lin
112
0
0
12 Nov 2023
5G Positioning Advancements with AI/ML
Mohammad Alawieh
Georgios Kontes
104
6
0
10 Nov 2023
Quantized Distillation: Optimizing Driver Activity Recognition Models for Resource-Constrained Environments
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023
Calvin Tanama
Kunyu Peng
Zdravko Marinov
Rainer Stiefelhagen
Alina Roitberg
182
2
0
10 Nov 2023
FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores
International Conference on Learning Representations (ICLR), 2023
Daniel Y. Fu
Hermann Kumbong
Eric N. D. Nguyen
Christopher Ré
VLM
236
38
0
10 Nov 2023
Compressed and Sparse Models for Non-Convex Decentralized Learning
Andrew Campbell
Hang Liu
Leah Woldemariam
Anna Scaglione
174
0
0
09 Nov 2023
Adaptive Compression-Aware Split Learning and Inference for Enhanced Network Efficiency
Akrit Mudvari
Antero Vainio
Iason Ofeidis
Sasu Tarkoma
Leandros Tassiulas
301
11
0
09 Nov 2023
Game Theory Solutions in Sensor-Based Human Activity Recognition: A Review
M. Shayesteh
Behrooz Sharokhzadeh
B. Masoumi
98
3
0
09 Nov 2023
Exploiting Neural-Network Statistics for Low-Power DNN Inference
IEEE Open Journal of Circuits and Systems (JOCS), 2023
Lennart Bamberg
Ardalan Najafi
Alberto García-Ortiz
64
1
0
09 Nov 2023
Beyond Size: How Gradients Shape Pruning Decisions in Large Language Models
Rocktim Jyoti Das
Mingjie Sun
Liqun Ma
Zhiqiang Shen
VLM
154
23
0
08 Nov 2023
Mini but Mighty: Finetuning ViTs with Mini Adapters
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Imad Eddine Marouf
Enzo Tartaglione
Stéphane Lathuilière
145
11
0
07 Nov 2023
Machine learning's own Industrial Revolution
Yuan Luo
Song Han
Jingjing Liu
AI4CE
204
0
0
04 Nov 2023
AFPQ: Asymmetric Floating Point Quantization for LLMs
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Yijia Zhang
Sicheng Zhang
Shijie Cao
Dayou Du
Jianyu Wei
Ting Cao
Ningyi Xu
MQ
127
7
0
03 Nov 2023
Flow-Based Feature Fusion for Vehicle-Infrastructure Cooperative 3D Object Detection
Neural Information Processing Systems (NeurIPS), 2023
Haibao Yu
Yingjuan Tang
Enze Xie
Jilei Mao
Ping Luo
Zaiqing Nie
3DPC
215
48
0
03 Nov 2023
Ultra-Efficient On-Device Object Detection on AI-Integrated Smart Glasses with TinyissimoYOLO
Julian Moosmann
Pietro Bonazzi
Yawei Li
Sizhen Bian
Philipp Mayer
Luca Benini
Michele Magno
363
20
0
02 Nov 2023
Efficient LLM Inference on CPUs
Haihao Shen
Hanwen Chang
Bo Dong
Yu Luo
Hengyu Meng
MQ
197
30
0
01 Nov 2023
Federated Topic Model and Model Pruning Based on Variational Autoencoder
Chengjie Ma
Yawen Li
M. Liang
Ang Li
FedML
58
1
0
01 Nov 2023
Importance Estimation with Random Gradient for Neural Network Pruning
Suman Sapkota
Binod Bhattarai
178
2
0
31 Oct 2023
PriPrune: Quantifying and Preserving Privacy in Pruned Federated Learning
ACM Transactions on Modeling and Performance Evaluation of Computing Systems (TOMPECS), 2023
Tianyue Chu
Mengwei Yang
Nikolaos Laoutaris
A. Markopoulou
234
9
0
30 Oct 2023
SparseByteNN: A Novel Mobile Inference Acceleration Framework Based on Fine-Grained Group Sparsity
Haitao Xu
Songwei Liu
Yuyang Xu
Shuai Wang
Jiashi Li
Chenqian Yan
Liangqiang Li
Xing Mei
Xin Pan
Fangmin Chen
MQ
105
3
0
30 Oct 2023
Efficient IoT Inference via Context-Awareness
Mohammad Mehdi Rastikerdar
Jin Huang
Shiwei Fang
Hui Guan
Deepak Ganesan
236
0
0
29 Oct 2023
Atom: Low-bit Quantization for Efficient and Accurate LLM Serving
Conference on Machine Learning and Systems (MLSys), 2023
Yilong Zhao
Chien-Yu Lin
Kan Zhu
Zihao Ye
Lequn Chen
Wenlei Bao
Luis Ceze
Arvind Krishnamurthy
Tianqi Chen
Baris Kasikci
MQ
323
228
0
29 Oct 2023
FedPEAT: Convergence of Federated Learning, Parameter-Efficient Fine Tuning, and Emulator Assisted Tuning for Artificial Intelligence Foundation Models with Mobile Edge Computing
Terence Jie Chua
Wen-li Yu
Junfeng Zhao
Kwok-Yan Lam
FedML
210
6
0
26 Oct 2023
Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
International Conference on Machine Learning (ICML), 2023
Zichang Liu
Jue Wang
Tri Dao
Wanrong Zhu
Binhang Yuan
...
Anshumali Shrivastava
Ce Zhang
Yuandong Tian
Christopher Ré
Beidi Chen
BDL
284
271
0
26 Oct 2023
How Robust is Federated Learning to Communication Error? A Comparison Study Between Uplink and Downlink Channels
IEEE Wireless Communications and Networking Conference (WCNC), 2023
Linping Qu
Shenghui Song
Chi-Ying Tsui
Yuyi Mao
126
3
0
25 Oct 2023
E-Sparse: Boosting the Large Language Model Inference through Entropy-based N:M Sparsity
Yun Li
Lin Niu
Xipeng Zhang
Kai Liu
Jianchen Zhu
Zhanhui Kang
MoE
169
17
0
24 Oct 2023
LoRAShear: Efficient Large Language Model Structured Pruning and Knowledge Recovery
Tianyi Chen
Tianyu Ding
Badal Yadav
Ilya Zharkov
Luming Liang
245
37
0
24 Oct 2023
Federated learning compression designed for lightweight communications
International Conference on Electronics, Circuits, and Systems (ICECS), 2023
Lucas Grativol Ribeiro
Mathieu Léonardon
Guillaume Muller
Virginie Fresse
Matthieu Arzel
FedML
169
5
0
23 Oct 2023
Large Search Model: Redefining Search Stack in the Era of LLMs
Liang Wang
Nan Yang
Xiaolong Huang
Linjun Yang
Rangan Majumder
Furu Wei
LRM KELM
216
23
0
23 Oct 2023
One is More: Diverse Perspectives within a Single Network for Efficient DRL
Yiqin Tan
Ling Pan
Longbo Huang
OffRL
261
0
0
21 Oct 2023
Breaking through Deterministic Barriers: Randomized Pruning Mask Generation and Selection
Jianwei Li
Weizhi Gao
Qi Lei
Dongkuan Xu
312
3
0
19 Oct 2023
SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation
International Conference on Learning Representations (ICLR), 2023
Chongyu Fan
Jiancheng Liu
Yihua Zhang
Eric Wong
Dennis Wei
Sijia Liu
MU
465
251
0
19 Oct 2023
Sparse-DySta: Sparsity-Aware Dynamic and Static Scheduling for Sparse Multi-DNN Workloads
Micro (MICRO), 2023
Hongxiang Fan
Stylianos I. Venieris
Alexandros Kouris
Nicholas D. Lane
182
10
0
17 Oct 2023
RefConv: Re-parameterized Refocusing Convolution for Powerful ConvNets
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2023
Zhicheng Cai
Xiaohan Ding
Qiu Shen
Xun Cao
210
29
0
16 Oct 2023
The Road to On-board Change Detection: A Lightweight Patch-Level Change Detection Network via Exploring the Potential of Pruning and Pooling
Lihui Xue
Zhihao Wang
Xueqian Wang
Gang Li
229
2
0
16 Oct 2023
Chameleon: a Heterogeneous and Disaggregated Accelerator System for Retrieval-Augmented Language Models
Wenqi Jiang
Marco Zeller
R. Waleffe
Torsten Hoefler
Gustavo Alonso
365
38
0
15 Oct 2023
Edge-InversionNet: Enabling Efficient Inference of InversionNet on Edge Devices
Zhepeng Wang
Isaacshubhanand Putla
Weiwen Jiang
Youzuo Lin
183
3
0
14 Oct 2023
Prompt Backdoors in Visual Prompt Learning
Hai Huang
Subrat Kishore Dutta
Michael Backes
Yun Shen
Yang Zhang
VLM VPVLM AAML SILM
171
3
0
11 Oct 2023
Efficient machine-learning surrogates for large-scale geological carbon and energy storage
T. Kadeethum
Stephen J Verzi
Hongkyu Yoon
AI4CE
146
2
0
11 Oct 2023
Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
International Conference on Learning Representations (ICLR), 2023
Mengzhou Xia
Tianyu Gao
Zhiyuan Zeng
Danqi Chen
377
402
0
10 Oct 2023
Progressive Neural Compression for Adaptive Image Offloading under Timing Constraints
IEEE Real-Time Systems Symposium (RTSS), 2023
Ruiqi Wang
Hanyang Liu
Jiaming Qiu
Moran Xu
Roch Guérin
Chenyang Lu
170
8
0
08 Oct 2023
Dual Grained Quantization: Efficient Fine-Grained Quantization for LLM
Luoming Zhang
Wen Fei
Weijia Wu
Yefei He
Zhenyu Lou
Hong Zhou
MQ
182
5
0
07 Oct 2023
Extract-Transform-Load for Video Streams
Proceedings of the VLDB Endowment (PVLDB), 2023
Ferdinand Kossmann
Ziniu Wu
Eugenie Lai
Nesime Tatbul
Lei Cao
Tim Kraska
Samuel Madden
162
18
0
07 Oct 2023
The Cost of Down-Scaling Language Models: Fact Recall Deteriorates before In-Context Learning
Tian Jin
Nolan Clement
Xin Dong
Vaishnavh Nagarajan
Michael Carbin
Jonathan Ragan-Kelley
Gintare Karolina Dziugaite
LRM
275
5
0
07 Oct 2023
Model Compression in Practice: Lessons Learned from Practitioners Creating On-device Machine Learning Experiences
International Conference on Human Factors in Computing Systems (CHI), 2023
Fred Hohman
Mary Beth Kery
Donghao Ren
Dominik Moritz
325
24
0
06 Oct 2023