1510.00149
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
1 October 2015
Song Han
Huizi Mao
W. Dally
3DGS
Papers citing "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding"
50 / 3,628 papers shown
Deep Coherence Learning: An Unsupervised Deep Beamformer for High Quality Single Plane Wave Imaging in Medical Ultrasound
Hyunwoo Cho
Seongjun Park
Jinbum Kang
Yangmo Yoo
OOD
54
14
0
18 Nov 2023
ECLM: Efficient Edge-Cloud Collaborative Learning with Continuous Environment Adaptation
Zhuang Yan
Zhenzhe Zheng
Yunfeng Shao
Bingshuai Li
Fan Wu
Guihai Chen
149
6
0
18 Nov 2023
Improved TokenPose with Sparsity
Anning Li
ViT
176
0
0
16 Nov 2023
Do Localization Methods Actually Localize Memorized Data in LLMs? A Tale of Two Benchmarks
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Ting-Yun Chang
Jesse Thomason
Robin Jia
263
24
0
15 Nov 2023
FedCode: Communication-Efficient Federated Learning via Transferring Codebooks
International Conference on Edge Computing [Services Society] (EDGE), 2023
Saeed Khalilian Gourtani
Vasileios Tsouvalas
T. Ozcelebi
N. Meratnia
FedML
274
7
0
15 Nov 2023
Boolean Variation and Boolean Logic BackPropagation
Van Minh Nguyen
185
2
0
13 Nov 2023
Training A Multi-stage Deep Classifier with Feedback Signals
Chao Xu
Yu Yang
Rong Wang
Guan Wang
Bojia Lin
112
0
0
12 Nov 2023
5G Positioning Advancements with AI/ML
Mohammad Alawieh
Georgios Kontes
104
6
0
10 Nov 2023
Quantized Distillation: Optimizing Driver Activity Recognition Models for Resource-Constrained Environments
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023
Calvin Tanama
Kunyu Peng
Zdravko Marinov
Rainer Stiefelhagen
Alina Roitberg
182
2
0
10 Nov 2023
FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores
International Conference on Learning Representations (ICLR), 2023
Daniel Y. Fu
Hermann Kumbong
Eric N. D. Nguyen
Christopher Ré
VLM
236
38
0
10 Nov 2023
Compressed and Sparse Models for Non-Convex Decentralized Learning
Andrew Campbell
Hang Liu
Leah Woldemariam
Anna Scaglione
174
0
0
09 Nov 2023
Adaptive Compression-Aware Split Learning and Inference for Enhanced Network Efficiency
Akrit Mudvari
Antero Vainio
Iason Ofeidis
Sasu Tarkoma
Leandros Tassiulas
301
11
0
09 Nov 2023
Game Theory Solutions in Sensor-Based Human Activity Recognition: A Review
M. Shayesteh
Behrooz Sharokhzadeh
B. Masoumi
98
3
0
09 Nov 2023
Exploiting Neural-Network Statistics for Low-Power DNN Inference
IEEE Open Journal of Circuits and Systems (JOCS), 2023
Lennart Bamberg
Ardalan Najafi
Alberto García-Ortiz
64
1
0
09 Nov 2023
Beyond Size: How Gradients Shape Pruning Decisions in Large Language Models
Rocktim Jyoti Das
Mingjie Sun
Liqun Ma
Zhiqiang Shen
VLM
154
23
0
08 Nov 2023
Mini but Mighty: Finetuning ViTs with Mini Adapters
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Imad Eddine Marouf
Enzo Tartaglione
Stéphane Lathuilière
145
11
0
07 Nov 2023
Machine learning's own Industrial Revolution
Yuan Luo
Song Han
Jingjing Liu
AI4CE
204
0
0
04 Nov 2023
AFPQ: Asymmetric Floating Point Quantization for LLMs
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Yijia Zhang
Sicheng Zhang
Shijie Cao
Dayou Du
Jianyu Wei
Ting Cao
Ningyi Xu
MQ
127
7
0
03 Nov 2023
Flow-Based Feature Fusion for Vehicle-Infrastructure Cooperative 3D Object Detection
Neural Information Processing Systems (NeurIPS), 2023
Haibao Yu
Yingjuan Tang
Enze Xie
Jilei Mao
Ping Luo
Zaiqing Nie
3DPC
215
48
0
03 Nov 2023
Ultra-Efficient On-Device Object Detection on AI-Integrated Smart Glasses with TinyissimoYOLO
Julian Moosmann
Pietro Bonazzi
Yawei Li
Sizhen Bian
Philipp Mayer
Luca Benini
Michele Magno
363
20
0
02 Nov 2023
Efficient LLM Inference on CPUs
Haihao Shen
Hanwen Chang
Bo Dong
Yu Luo
Hengyu Meng
MQ
197
30
0
01 Nov 2023
Federated Topic Model and Model Pruning Based on Variational Autoencoder
Chengjie Ma
Yawen Li
M. Liang
Ang Li
FedML
58
1
0
01 Nov 2023
Importance Estimation with Random Gradient for Neural Network Pruning
Suman Sapkota
Binod Bhattarai
178
2
0
31 Oct 2023
PriPrune: Quantifying and Preserving Privacy in Pruned Federated Learning
ACM Transactions on Modeling and Performance Evaluation of Computing Systems (TOMPECS), 2023
Tianyue Chu
Mengwei Yang
Nikolaos Laoutaris
A. Markopoulou
234
9
0
30 Oct 2023
SparseByteNN: A Novel Mobile Inference Acceleration Framework Based on Fine-Grained Group Sparsity
Haitao Xu
Songwei Liu
Yuyang Xu
Shuai Wang
Jiashi Li
Chenqian Yan
Liangqiang Li
Xing Mei
Xin Pan
Fangmin Chen
MQ
105
3
0
30 Oct 2023
Efficient IoT Inference via Context-Awareness
Mohammad Mehdi Rastikerdar
Jin Huang
Shiwei Fang
Hui Guan
Deepak Ganesan
236
0
0
29 Oct 2023
Atom: Low-bit Quantization for Efficient and Accurate LLM Serving
Conference on Machine Learning and Systems (MLSys), 2023
Yilong Zhao
Chien-Yu Lin
Kan Zhu
Zihao Ye
Lequn Chen
Wenlei Bao
Luis Ceze
Arvind Krishnamurthy
Tianqi Chen
Baris Kasikci
MQ
323
228
0
29 Oct 2023
FedPEAT: Convergence of Federated Learning, Parameter-Efficient Fine Tuning, and Emulator Assisted Tuning for Artificial Intelligence Foundation Models with Mobile Edge Computing
Terence Jie Chua
Wen-li Yu
Junfeng Zhao
Kwok-Yan Lam
FedML
210
6
0
26 Oct 2023
Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
International Conference on Machine Learning (ICML), 2023
Zichang Liu
Jue Wang
Tri Dao
Wanrong Zhu
Binhang Yuan
...
Anshumali Shrivastava
Ce Zhang
Yuandong Tian
Christopher Ré
Beidi Chen
BDL
284
271
0
26 Oct 2023
How Robust is Federated Learning to Communication Error? A Comparison Study Between Uplink and Downlink Channels
IEEE Wireless Communications and Networking Conference (WCNC), 2023
Linping Qu
Shenghui Song
Chi-Ying Tsui
Yuyi Mao
126
3
0
25 Oct 2023
E-Sparse: Boosting the Large Language Model Inference through Entropy-based N:M Sparsity
Yun Li
Lin Niu
Xipeng Zhang
Kai Liu
Jianchen Zhu
Zhanhui Kang
MoE
169
17
0
24 Oct 2023
LoRAShear: Efficient Large Language Model Structured Pruning and Knowledge Recovery
Tianyi Chen
Tianyu Ding
Badal Yadav
Ilya Zharkov
Luming Liang
245
37
0
24 Oct 2023
Federated learning compression designed for lightweight communications
International Conference on Electronics, Circuits, and Systems (ICECS), 2023
Lucas Grativol Ribeiro
Mathieu Léonardon
Guillaume Muller
Virginie Fresse
Matthieu Arzel
FedML
169
5
0
23 Oct 2023
Large Search Model: Redefining Search Stack in the Era of LLMs
Liang Wang
Nan Yang
Xiaolong Huang
Linjun Yang
Rangan Majumder
Furu Wei
LRM
KELM
216
23
0
23 Oct 2023
One is More: Diverse Perspectives within a Single Network for Efficient DRL
Yiqin Tan
Ling Pan
Longbo Huang
OffRL
261
0
0
21 Oct 2023
Breaking through Deterministic Barriers: Randomized Pruning Mask Generation and Selection
Jianwei Li
Weizhi Gao
Qi Lei
Dongkuan Xu
312
3
0
19 Oct 2023
SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation
International Conference on Learning Representations (ICLR), 2023
Chongyu Fan
Jiancheng Liu
Yihua Zhang
Eric Wong
Dennis Wei
Sijia Liu
MU
465
251
0
19 Oct 2023
Sparse-DySta: Sparsity-Aware Dynamic and Static Scheduling for Sparse Multi-DNN Workloads
IEEE/ACM International Symposium on Microarchitecture (MICRO), 2023
Hongxiang Fan
Stylianos I. Venieris
Alexandros Kouris
Nicholas D. Lane
182
10
0
17 Oct 2023
RefConv: Re-parameterized Refocusing Convolution for Powerful ConvNets
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2023
Zhicheng Cai
Xiaohan Ding
Qiu Shen
Xun Cao
210
29
0
16 Oct 2023
The Road to On-board Change Detection: A Lightweight Patch-Level Change Detection Network via Exploring the Potential of Pruning and Pooling
Lihui Xue
Zhihao Wang
Xueqian Wang
Gang Li
229
2
0
16 Oct 2023
Chameleon: a Heterogeneous and Disaggregated Accelerator System for Retrieval-Augmented Language Models
Wenqi Jiang
Marco Zeller
R. Waleffe
Torsten Hoefler
Gustavo Alonso
365
38
0
15 Oct 2023
Edge-InversionNet: Enabling Efficient Inference of InversionNet on Edge Devices
Zhepeng Wang
Isaacshubhanand Putla
Weiwen Jiang
Youzuo Lin
183
3
0
14 Oct 2023
Prompt Backdoors in Visual Prompt Learning
Hai Huang
Subrat Kishore Dutta
Michael Backes
Yun Shen
Yang Zhang
VLM
VPVLM
AAML
SILM
171
3
0
11 Oct 2023
Efficient machine-learning surrogates for large-scale geological carbon and energy storage
T. Kadeethum
Stephen J Verzi
Hongkyu Yoon
AI4CE
146
2
0
11 Oct 2023
Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
International Conference on Learning Representations (ICLR), 2023
Mengzhou Xia
Tianyu Gao
Zhiyuan Zeng
Danqi Chen
377
402
0
10 Oct 2023
Progressive Neural Compression for Adaptive Image Offloading under Timing Constraints
IEEE Real-Time Systems Symposium (RTSS), 2023
Ruiqi Wang
Hanyang Liu
Jiaming Qiu
Moran Xu
Roch Guérin
Chenyang Lu
170
8
0
08 Oct 2023
Dual Grained Quantization: Efficient Fine-Grained Quantization for LLM
Luoming Zhang
Wen Fei
Weijia Wu
Yefei He
Zhenyu Lou
Hong Zhou
MQ
182
5
0
07 Oct 2023
Extract-Transform-Load for Video Streams
Proceedings of the VLDB Endowment (PVLDB), 2023
Ferdinand Kossmann
Ziniu Wu
Eugenie Lai
Nesime Tatbul
Lei Cao
Tim Kraska
Samuel Madden
162
18
0
07 Oct 2023
The Cost of Down-Scaling Language Models: Fact Recall Deteriorates before In-Context Learning
Tian Jin
Nolan Clement
Xin Dong
Vaishnavh Nagarajan
Michael Carbin
Jonathan Ragan-Kelley
Gintare Karolina Dziugaite
LRM
275
5
0
07 Oct 2023
Model Compression in Practice: Lessons Learned from Practitioners Creating On-device Machine Learning Experiences
International Conference on Human Factors in Computing Systems (CHI), 2023
Fred Hohman
Mary Beth Kery
Donghao Ren
Dominik Moritz
325
24
0
06 Oct 2023