2001.00138
PatDNN: Achieving Real-Time DNN Execution on Mobile Devices with Pattern-based Weight Pruning
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2020
1 January 2020
Wei Niu
Xiaolong Ma
Sheng Lin
Shihao Wang
Xuehai Qian
Xinyu Lin
Yanzhi Wang
Bin Ren
MQ
Papers citing
"PatDNN: Achieving Real-Time DNN Execution on Mobile Devices with Pattern-based Weight Pruning"
50 / 84 papers shown
Optimizing 3D Gaussian Splattering for Mobile GPUs
Md. Musfiqur Rahman Sanim
Zhihao Shu
Bahram Afsharmanesh
AmirAli Mirian
Jiexiong Guan
Wei Niu
Bin Ren
G. Agrawal
3DGS
180
0
0
20 Nov 2025
Optimizing Storage Overhead of User Behavior Log for ML-embedded Mobile Apps
Chen Gong
Yan Zhuang
Zhenzhe Zheng
Yiliu Chen
S. Wang
Fan Wu
Guihai Chen
172
1
0
15 Oct 2025
Dynamic Gradient Sparse Update for Edge Training
International Symposium on Circuits and Systems (ISCAS), 2024
I-Hsuan Li
Tian-Sheuan Chang
271
1
0
23 Mar 2025
Empowering Edge Intelligence: A Comprehensive Survey on On-Device AI Models
ACM Computing Surveys (ACM Comput. Surv.), 2025
Xubin Wang
Zhiqing Tang
Jianxiong Guo
Tianhui Meng
Chenhao Wang
Tian-sheng Wang
Weijia Jia
472
115
0
08 Mar 2025
Low-Rank Compression for IMC Arrays
Design, Automation and Test in Europe (DATE), 2025
Kang Eun Jeon
Johnny Rhe
J. Ko
219
2
0
10 Feb 2025
UPAQ: A Framework for Real-Time and Energy-Efficient 3D Object Detection in Autonomous Vehicles
Design, Automation and Test in Europe (DATE), 2025
Abhishek Balasubramaniam
Febin P. Sunny
S. Pasricha
3DPC
318
1
0
08 Jan 2025
AutoSculpt: A Pattern-based Model Auto-pruning Framework Using Reinforcement Learning and Graph Learning
Lixian Jing
Haobing Liu
Junyu Dong
Yanwei Yu
3DPC
AI4CE
343
2
0
24 Dec 2024
BlabberSeg: Real-Time Embedded Open-Vocabulary Aerial Segmentation
Haechan Mark Bong
Ricardo de Azambuja
Giovanni Beltrame
VLM
215
1
0
16 Oct 2024
AdapMTL: Adaptive Pruning Framework for Multitask Learning Model
ACM Multimedia (MM), 2024
Mingcan Xiang
Steven Jiaxun Tang
Qizheng Yang
Hui Guan
Tongping Liu
VLM
294
4
0
07 Aug 2024
Realizing Unaligned Block-wise Pruning for DNN Acceleration on Mobile Devices
Hayun Lee
Dongkun Shin
MQ
260
0
0
29 Jul 2024
AyE-Edge: Automated Deployment Space Search Empowering Accuracy yet Efficient Real-Time Object Detection on the Edge
Chao Wu
Yifan Gong
Liangkai Liu
Mengquan Li
Yushu Wu
Xuan Shen
Zhimin Li
Geng Yuan
Weisong Shi
Yanzhi Wang
283
2
0
25 Jul 2024
SoD²: Statically Optimizing Dynamic Deep Neural Network
Wei Niu
Gagan Agrawal
Bin Ren
389
11
0
29 Feb 2024
REPrune: Channel Pruning via Kernel Representative Selection
Mincheol Park
Dongjin Kim
Cheonjun Park
Yuna Park
Gyeong Eun Gong
Won Woo Ro
Suhyun Kim
VLM
331
4
0
27 Feb 2024
Enhance DNN Adversarial Robustness and Efficiency via Injecting Noise to Non-Essential Neurons
Zhenyu Liu
Garrett Gagnon
Swagath Venkataramani
Liu Liu
AAML
289
2
0
06 Feb 2024
SmartFRZ: An Efficient Training Framework using Attention-Based Layer Freezing
Sheng Li
Geng Yuan
Yuezhen Dai
Youtao Zhang
Yanzhi Wang
Xulong Tang
425
27
0
30 Jan 2024
DTMM: Deploying TinyML Models on Extremely Weak IoT Devices with Pruning
IEEE Conference on Computer Communications (INFOCOM), 2024
Lixiang Han
Zhen Xiao
Zhenjiang Li
371
17
0
17 Jan 2024
Hardware-Aware DNN Compression via Diverse Pruning and Mixed-Precision Quantization
K. Balaskas
Andreas Karatzas
Christos Sad
K. Siozios
Iraklis Anagnostopoulos
Georgios Zervakis
Jörg Henkel
MQ
248
29
0
23 Dec 2023
Real-time Neural Network Inference on Extremely Weak Devices: Agile Offloading with Explainable AI
Kai Huang
Wei Gao
263
49
0
21 Dec 2023
EdgeFM: Leveraging Foundation Model for Open-set Learning on the Edge
Bufang Yang
Lixing He
Neiwen Ling
Zhenyu Yan
Guoliang Xing
Xian Shuai
Xiaozhe Ren
Xin Jiang
522
36
0
18 Nov 2023
SparseByteNN: A Novel Mobile Inference Acceleration Framework Based on Fine-Grained Group Sparsity
Haitao Xu
Songwei Liu
Yuyang Xu
Shuai Wang
Jiashi Li
Chenqian Yan
Liangqiang Li
Xing Mei
Xin Pan
Fangmin Chen
MQ
163
3
0
30 Oct 2023
Edge-InversionNet: Enabling Efficient Inference of InversionNet on Edge Devices
Zhepeng Wang
Isaacshubhanand Putla
Weiwen Jiang
Youzuo Lin
249
3
0
14 Oct 2023
Enabling Resource-efficient AIoT System with Cross-level Optimization: A survey
IEEE Communications Surveys and Tutorials (COMST), 2023
Sicong Liu
Bin Guo
Cheng Fang
Ziqi Wang
Shiyan Luo
Zimu Zhou
Zhiwen Yu
AI4CE
355
40
0
27 Sep 2023
Efficient N:M Sparse DNN Training Using Algorithm, Architecture, and Dataflow Co-Design
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (IEEE TCAD), 2023
Chao Fang
Wei Sun
Aojun Zhou
Zhongfeng Wang
230
20
0
22 Sep 2023
Towards Artificial General Intelligence (AGI) in the Internet of Things (IoT): Opportunities and Challenges
Fei Dou
Jin Ye
Geng Yuan
Qin Lu
Wei Niu
...
Hongyue Sun
Yunli Shao
Changying Li
Tianming Liu
Wenzhan Song
AI4CE
252
40
0
14 Sep 2023
LLMCad: Fast and Scalable On-device Large Language Model Inference
Daliang Xu
Wangsong Yin
Xin Jin
Yanzhe Zhang
Shiyun Wei
Mengwei Xu
Xuanzhe Liu
263
72
0
08 Sep 2023
EdgeMoE: Empowering Sparse Large Language Models on Mobile Devices
IEEE Transactions on Mobile Computing (IEEE TMC), 2023
Rongjie Yi
Liwei Guo
Shiyun Wei
Ao Zhou
Shangguang Wang
Mengwei Xu
MoE
241
24
0
28 Aug 2023
FPGA Resource-aware Structured Pruning for Real-Time Neural Networks
International Conference on Field-Programmable Technology (ICFPT), 2023
Benjamin Ramhorst
Vladimir Loncar
George A. Constantinides
254
13
0
09 Aug 2023
Towards Machine Learning and Inference for Resource-constrained MCUs
ACM SIGMOBILE International Conference on Mobile Systems, Applications, and Services (MobiSys), 2023
Yu-Shan Huang
Hamed Haddadi
237
2
0
30 May 2023
Revisiting Data Augmentation in Model Compression: An Empirical and Comprehensive Study
IEEE International Joint Conference on Neural Network (IJCNN), 2023
Muzhou Yu
Linfeng Zhang
Kaisheng Ma
330
2
0
22 May 2023
HighLight: Efficient and Flexible DNN Acceleration with Hierarchical Structured Sparsity
Yannan Nellie Wu
Po-An Tsai
Saurav Muralidharan
A. Parashar
Vivienne Sze
J. Emer
312
46
0
22 May 2023
Surrogate Lagrangian Relaxation: A Path To Retrain-free Deep Neural Network Pruning
Shangli Zhou
Mikhail A. Bragin
Lynn Pepin
Deniz Gurevin
Fei Miao
Caiwen Ding
190
3
0
08 Apr 2023
Mobiprox: Supporting Dynamic Approximate Computing on Mobiles
IEEE Internet of Things Journal (IEEE IoT J.), 2023
Matevz Fabjancic
O. Machidon
Hashim Sharif
Yifan Zhao
Sasa Misailovic
V. Pejović
343
2
0
16 Mar 2023
R-TOSS: A Framework for Real-Time Object Detection using Semi-Structured Pruning
Design Automation Conference (DAC), 2023
Abhishek Balasubramaniam
Febin P. Sunny
S. Pasricha
VLM
212
15
0
03 Mar 2023
When Layers Play the Lottery, all Tickets Win at Initialization
Artur Jordão
George Correa de Araujo
H. Maia
Hélio Pedrini
356
4
0
25 Jan 2023
SGCN: Exploiting Compressed-Sparse Features in Deep Graph Convolutional Network Accelerators
International Symposium on High-Performance Computer Architecture (HPCA), 2023
Mingi Yoo
Jaeyong Song
Jounghoo Lee
Namhyung Kim
Youngsok Kim
Jinho Lee
GNN
320
33
0
25 Jan 2023
Slice-and-Forge: Making Better Use of Caches for Graph Convolutional Network Accelerators
International Conference on Parallel Architectures and Compilation Techniques (PACT), 2022
Min-hee Yoo
Jaeyong Song
Hyeyoon Lee
Jounghoo Lee
Namhyung Kim
Youngsok Kim
Jinho Lee
GNN
318
5
0
24 Jan 2023
Reaching the Edge of the Edge: Image Analysis in Space
R. Bayer
Julian Priest
Pınar Tözün
406
9
0
12 Jan 2023
All-in-One: A Highly Representative DNN Pruning Framework for Edge Devices with Dynamic Power Management
Yifan Gong
Zheng Zhan
Pu Zhao
Yushu Wu
Chaoan Wu
Caiwen Ding
Weiwen Jiang
Minghai Qin
Yanzhi Wang
213
9
0
09 Dec 2022
Data-Model-Circuit Tri-Design for Ultra-Light Video Intelligence on Edge Devices
Asia and South Pacific Design Automation Conference (ASP-DAC), 2022
Yimeng Zhang
A. Kamath
Qiucheng Wu
Zhiwen Fan
Wuyang Chen
Zinan Lin
Shiyu Chang
Sijia Liu
Cong Hao
312
6
0
16 Oct 2022
Advancing Model Pruning via Bi-level Optimization
Neural Information Processing Systems (NeurIPS), 2022
Yihua Zhang
Yuguang Yao
Parikshit Ram
Pu Zhao
Tianlong Chen
Min-Fong Hong
Yanzhi Wang
Sijia Liu
489
89
0
08 Oct 2022
Layer Freezing & Data Sieving: Missing Pieces of a Generic Framework for Sparse Training
Neural Information Processing Systems (NeurIPS), 2022
Geng Yuan
Yanyu Li
Sheng Li
Zhenglun Kong
Sergey Tulyakov
Xulong Tang
Yanzhi Wang
Jian Ren
329
21
0
22 Sep 2022
SparCL: Sparse Continual Learning on the Edge
Neural Information Processing Systems (NeurIPS), 2022
Zifeng Wang
Zheng Zhan
Yifan Gong
Geng Yuan
Wei Niu
T. Jian
Bin Ren
Stratis Ioannidis
Yanzhi Wang
Jennifer Dy
CLL
363
85
0
20 Sep 2022
Compiler-Aware Neural Architecture Search for On-Mobile Real-time Super-Resolution
European Conference on Computer Vision (ECCV), 2022
Yushu Wu
Yifan Gong
Pu Zhao
Yanyu Li
Zheng Zhan
Wei Niu
Hao Tang
Minghai Qin
Bin Ren
Yanzhi Wang
SupR
MQ
293
35
0
25 Jul 2022
EVE: Environmental Adaptive Neural Network Models for Low-power Energy Harvesting System
Sahidul Islam
Shangli Zhou
Ran Ran
Yufang Jin
Wu-Shao Wen
Caiwen Ding
Mimi Xie
209
10
0
14 Jul 2022
Sparse Periodic Systolic Dataflow for Lowering Latency and Power Dissipation of Convolutional Neural Network Accelerators
International Symposium on Low Power Electronics and Design (ISLPED), 2022
J. Heo
A. Fayyazi
Amirhossein Esmaili
Massoud Pedram
259
4
0
30 Jun 2022
Compressing Pre-trained Transformers via Low-Bit NxM Sparsity for Natural Language Understanding
Connor Holmes
Minjia Zhang
Yuxiong He
Bo Wu
190
3
0
30 Jun 2022
CoCoPIE XGen: A Full-Stack AI-Oriented Optimizing Framework
Xiaofeng Li
Bin Ren
Xipeng Shen
Yanzhi Wang
GNN
161
0
0
21 Jun 2022
Boosting DNN Cold Inference on Edge Devices
ACM SIGMOBILE International Conference on Mobile Systems, Applications, and Services (MobiSys), 2022
Rongjie Yi
Ting Cao
Ao Zhou
Xiao Ma
Shangguang Wang
Mengwei Xu
828
16
0
15 Jun 2022
Slim-neck by GSConv: A lightweight-design for real-time detector architectures
Journal of Real-Time Image Processing (JRTIP), 2022
Hulin Li
Jun Li
Hanbing Wei
Zheng Liu
Zhenfei Zhan
Qiliang Ren
308
493
0
06 Jun 2022
Compilation and Optimizations for Efficient Machine Learning on Embedded Systems
Xiaofan Zhang
Yao Chen
Cong Hao
Sitao Huang
Yuhong Li
Deming Chen
358
2
0
06 Jun 2022