PatDNN: Achieving Real-Time DNN Execution on Mobile Devices with Pattern-based Weight Pruning

International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2020

1 January 2020

Papers citing "PatDNN: Achieving Real-Time DNN Execution on Mobile Devices with Pattern-based Weight Pruning"

50 / 84 papers shown

Optimizing 3D Gaussian Splattering for Mobile GPUs

Md. Musfiqur Rahman Sanim

180

20 Nov 2025

Optimizing Storage Overhead of User Behavior Log for ML-embedded Mobile Apps

172

15 Oct 2025

Dynamic Gradient Sparse Update for Edge Training

International Symposium on Circuits and Systems (ISCAS), 2024

I-Hsuan Li

Tian-Sheuan Chang

271

23 Mar 2025

Empowering Edge Intelligence: A Comprehensive Survey on On-Device AI Models

ACM Computing Surveys (ACM Comput. Surv.), 2025

472

115

08 Mar 2025

Low-Rank Compression for IMC Arrays

Design, Automation and Test in Europe (DATE), 2025

Kang Eun Jeon

Johnny Rhe

J. Ko

219

10 Feb 2025

UPAQ: A Framework for Real-Time and Energy-Efficient 3D Object Detection in Autonomous Vehicles

Design, Automation and Test in Europe (DATE), 2025

Abhishek Balasubramaniam

Febin P. Sunny

S. Pasricha

3DPC

318

08 Jan 2025

AutoSculpt: A Pattern-based Model Auto-pruning Framework Using Reinforcement Learning and Graph Learning

343

24 Dec 2024

BlabberSeg: Real-Time Embedded Open-Vocabulary Aerial Segmentation

215

16 Oct 2024

AdapMTL: Adaptive Pruning Framework for Multitask Learning Model

ACM Multimedia (MM), 2024

294

07 Aug 2024

Realizing Unaligned Block-wise Pruning for DNN Acceleration on Mobile Devices

Hayun Lee

Dongkun Shin

260

29 Jul 2024

AyE-Edge: Automated Deployment Space Search Empowering Accuracy yet Efficient Real-Time Object Detection on the Edge

283

25 Jul 2024

SoD$^2$: Statically Optimizing Dynamic Deep Neural Network

Wei Niu

Gagan Agrawal

Bin Ren

389

29 Feb 2024

REPrune: Channel Pruning via Kernel Representative Selection

331

27 Feb 2024

Enhance DNN Adversarial Robustness and Efficiency via Injecting Noise to Non-Essential Neurons

Zhenyu Liu

Garrett Gagnon

Swagath Venkataramani

Liu Liu

AAML

289

06 Feb 2024

SmartFRZ: An Efficient Training Framework using Attention-Based Layer Freezing

425

30 Jan 2024

DTMM: Deploying TinyML Models on Extremely Weak IoT Devices with Pruning

IEEE Conference on Computer Communications (INFOCOM), 2024

Lixiang Han

Zhen Xiao

Zhenjiang Li

371

17 Jan 2024

Hardware-Aware DNN Compression via Diverse Pruning and Mixed-Precision Quantization

Iraklis Anagnostopoulos

Georgios Zervakis

Jörg Henkel

248

23 Dec 2023

Real-time Neural Network Inference on Extremely Weak Devices: Agile Offloading with Explainable AI

Kai Huang

Wei Gao

263

21 Dec 2023

EdgeFM: Leveraging Foundation Model for Open-set Learning on the Edge

522

18 Nov 2023

SparseByteNN: A Novel Mobile Inference Acceleration Framework Based on Fine-Grained Group Sparsity

163

30 Oct 2023

Edge-InversionNet: Enabling Efficient Inference of InversionNet on Edge Devices

Zhepeng Wang

Isaacshubhanand Putla

Weiwen Jiang

Youzuo Lin

249

14 Oct 2023

Enabling Resource-efficient AIoT System with Cross-level Optimization: A survey

IEEE Communications Surveys and Tutorials (COMST), 2023

355

27 Sep 2023

Efficient N:M Sparse DNN Training Using Algorithm, Architecture, and Dataflow Co-Design

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (IEEE TCAD), 2023

230

22 Sep 2023

Towards Artificial General Intelligence (AGI) in the Internet of Things (IoT): Opportunities and Challenges

...

Changying Li

Tianming Liu

Wenzhan Song

AI4CE

252

14 Sep 2023

LLMCad: Fast and Scalable On-device Large Language Model Inference

Daliang Xu

Wangsong Yin

Xin Jin

Yanzhe Zhang

Shiyun Wei

Mengwei Xu

Xuanzhe Liu

263

08 Sep 2023

EdgeMoE: Empowering Sparse Large Language Models on Mobile Devices

IEEE Transactions on Mobile Computing (IEEE TMC), 2023

Mengwei Xu

241

28 Aug 2023

FPGA Resource-aware Structured Pruning for Real-Time Neural Networks

International Conference on Field-Programmable Technology (ICFPT), 2023

Benjamin Ramhorst

Vladimir Loncar

George A. Constantinides

254

09 Aug 2023

Towards Machine Learning and Inference for Resource-constrained MCUs

ACM SIGMOBILE International Conference on Mobile Systems, Applications, and Services (MobiSys), 2023

Yu-Shan Huang

Hamed Haddadi

237

30 May 2023

Revisiting Data Augmentation in Model Compression: An Empirical and Comprehensive Study

IEEE International Joint Conference on Neural Network (IJCNN), 2023

Muzhou Yu

Linfeng Zhang

Kaisheng Ma

330

22 May 2023

HighLight: Efficient and Flexible DNN Acceleration with Hierarchical Structured Sparsity

312

22 May 2023

Surrogate Lagrangian Relaxation: A Path To Retrain-free Deep Neural Network Pruning

Caiwen Ding

190

08 Apr 2023

Mobiprox: Supporting Dynamic Approximate Computing on Mobiles

IEEE Internet of Things Journal (IEEE IoT J.), 2023

343

16 Mar 2023

R-TOSS: A Framework for Real-Time Object Detection using Semi-Structured Pruning

Design Automation Conference (DAC), 2023

Abhishek Balasubramaniam

Febin P. Sunny

S. Pasricha

VLM

212

03 Mar 2023

When Layers Play the Lottery, all Tickets Win at Initialization

Artur Jordão

George Correa de Araujo

H. Maia

Hélio Pedrini

356

25 Jan 2023

SGCN: Exploiting Compressed-Sparse Features in Deep Graph Convolutional Network Accelerators

International Symposium on High-Performance Computer Architecture (HPCA), 2023

320

25 Jan 2023

Slice-and-Forge: Making Better Use of Caches for Graph Convolutional Network Accelerators

International Conference on Parallel Architectures and Compilation Techniques (PACT), 2022

318

24 Jan 2023

Reaching the Edge of the Edge: Image Analysis in Space

R. Bayer

Julian Priest

Pınar Tözün

406

12 Jan 2023

All-in-One: A Highly Representative DNN Pruning Framework for Edge Devices with Dynamic Power Management

Caiwen Ding

213

09 Dec 2022

Data-Model-Circuit Tri-Design for Ultra-Light Video Intelligence on Edge Devices

Asia and South Pacific Design Automation Conference (ASP-DAC), 2022

312

16 Oct 2022

Advancing Model Pruning via Bi-level Optimization

Neural Information Processing Systems (NeurIPS), 2022

489

08 Oct 2022

Layer Freezing & Data Sieving: Missing Pieces of a Generic Framework for Sparse Training

Neural Information Processing Systems (NeurIPS), 2022

329

22 Sep 2022

SparCL: Sparse Continual Learning on the Edge

Neural Information Processing Systems (NeurIPS), 2022

Jennifer Dy

363

20 Sep 2022

Compiler-Aware Neural Architecture Search for On-Mobile Real-time Super-Resolution

European Conference on Computer Vision (ECCV), 2022

293

25 Jul 2022

EVE: Environmental Adaptive Neural Network Models for Low-power Energy Harvesting System

Caiwen Ding

209

14 Jul 2022

Sparse Periodic Systolic Dataflow for Lowering Latency and Power Dissipation of Convolutional Neural Network Accelerators

International Symposium on Low Power Electronics and Design (ISLPED), 2022

259

30 Jun 2022

Compressing Pre-trained Transformers via Low-Bit NxM Sparsity for Natural Language Understanding

Connor Holmes

Minjia Zhang

Yuxiong He

Bo Wu

190

30 Jun 2022

CoCoPIE XGen: A Full-Stack AI-Oriented Optimizing Framework

161

21 Jun 2022

Boosting DNN Cold Inference on Edge Devices

ACM SIGMOBILE International Conference on Mobile Systems, Applications, and Services (MobiSys), 2022

Mengwei Xu

828

15 Jun 2022

Slim-neck by GSConv: A lightweight-design for real-time detector architectures

Journal of Real-Time Image Processing (JRTIP), 2022

308

493

06 Jun 2022

Compilation and Optimizations for Efficient Machine Learning on Embedded Systems

358

06 Jun 2022