SpArch: Efficient Architecture for Sparse Matrix Multiplication

International Symposium on High-Performance Computer Architecture (HPCA), 2020

20 February 2020

Song Han

Papers citing "SpArch: Efficient Architecture for Sparse Matrix Multiplication"

From Principles to Practice: A Systematic Study of LLM Serving on Multi-core NPUs

181

07 Oct 2025

SparseMap: A Sparse Tensor Accelerator Framework Based on Evolution Strategy

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 2025

18 Aug 2025

The Ubiquitous Sparse Matrix-Matrix Products

Aydın Buluç

204

06 Aug 2025

Oaken: Fast and Efficient LLM Serving with Online-Offline Hybrid KV Cache Quantization

International Symposium on Computer Architecture (ISCA), 2025

380

24 Mar 2025

An Efficient Sparse Fine-Tuning with Low Quantization Error via Neural Network Pruning

Cen-Jhih Li

Aditya Bhaskara

480

17 Feb 2025

EXION: Exploiting Inter- and Intra-Iteration Output Sparsity for Diffusion Models

International Symposium on High-Performance Computer Architecture (HPCA), 2025

384

10 Jan 2025

HC-SpMM: Accelerating Sparse Matrix-Matrix Multiplication for Graphs with Hybrid GPU Cores

IEEE International Conference on Data Engineering (ICDE), 2024

398

12 Dec 2024

SHyPar: A Spectral Coarsening Approach to Hypergraph Partitioning

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 2024

Hamed Sajadinia

Ali Aghdaei

Zhuo Feng

408

09 Oct 2024

Sparse Refinement for Efficient High-Resolution Semantic Segmentation

Zhijian Liu

Chenfeng Xu

Song Han

380

26 Jul 2024

SCATTER: Algorithm-Circuit Co-Sparse Photonic Accelerator with Thermal-Tolerant, Power-Efficient In-situ Light Redistribution

Rena Huang

Jiaqi Gu

366

07 Jul 2024

Misam: Using ML in Dataflow Selection of Sparse-Sparse Matrix Multiplication

Sanjali Yadav

Bahar Asgari

121

14 Jun 2024

Secure and Efficient General Matrix Multiplication On Cloud Using Homomorphic Encryption

Journal of Supercomputing (J. Supercomput.), 2024

388

03 May 2024

Privacy-aware Berrut Approximated Coded Computing for Federated Learning

Xavier Martínez Luana

Rebeca P. Díaz Redondo

Manuel Fernández-Veiga

531

02 May 2024

FLAASH: Flexible Accelerator Architecture for Sparse High-Order Tensor Contraction

Gabriel Kulp

Andrew Ensinger

Lizhong Chen

224

25 Apr 2024

NeuraChip: Accelerating GNN Computations with a Hash-based Decoupled Spatial Accelerator

Kaustubh Shivdikar

Nicolas Bohm Agostini

427

23 Apr 2024

Random Search as a Baseline for Sparse Neural Network Architecture Search

Rezsa Farahani

338

13 Mar 2024

No Free Prune: Information-Theoretic Barriers to Pruning at Initialization

Tanishq Kumar

Kevin Luo

Mark Sellke

355

02 Feb 2024

Transformer-QEC: Quantum Error Correction Code Decoding with Transferable Transformers

Song Han

309

27 Nov 2023

TorchSparse++: Efficient Training and Inference Framework for Sparse Convolution on GPUs

Micro (MICRO), 2023

Zhijian Liu

Song Han

329

25 Oct 2023

SpikingNeRF: Making Bio-inspired Neural Networks See through the Real World

416

20 Sep 2023

Rosko: Row Skipping Outer Products for Sparse Matrix Multiplication Kernels

220

08 Jul 2023

Reparo: Loss-Resilient Generative Codec for Video Conferencing

Tianhong Li

Vibhaalakshmi Sivaraman

315

23 May 2023

HighLight: Efficient and Flexible DNN Acceleration with Hierarchical Structured Sparsity

313

22 May 2023

SPADE: Sparse Pillar-based 3D Object Detection Accelerator for Autonomous Driving

International Symposium on High-Performance Computer Architecture (HPCA), 2023

Mingu Kang

320

12 May 2023

VEGETA: Vertically-Integrated Extensions for Sparse/Dense GEMM Tile Acceleration on CPUs

International Symposium on High-Performance Computer Architecture (HPCA), 2023

425

17 Feb 2023

SGCN: Exploiting Compressed-Sparse Features in Deep Graph Convolutional Network Accelerators

International Symposium on High-Performance Computer Architecture (HPCA), 2023

320

25 Jan 2023

Slice-and-Forge: Making Better Use of Caches for Graph Convolutional Network Accelerators

International Conference on Parallel Architectures and Compilation Techniques (PACT), 2022

322

24 Jan 2023

LearningGroup: A Real-Time Sparse Training on FPGA via Learnable Weight Grouping for Multi-Agent Reinforcement Learning

International Conference on Field-Programmable Technology (ICFPT), 2022

Jenny Yang

Jaeuk Kim

Joo-Young Kim

228

29 Oct 2022

ViTCoD: Vision Transformer Acceleration via Dedicated Algorithm and Accelerator Co-Design

International Symposium on High-Performance Computer Architecture (HPCA), 2022

423

135

18 Oct 2022

Chiplets and the Codelet Model

D. Fox

J. M. Diaz

Xiaoming Li

106

13 Sep 2022

DiVa: An Accelerator for Differentially Private Machine Learning

Micro (MICRO), 2022

313

26 Aug 2022

OpSparse: a Highly Optimized Framework for Sparse General Matrix Multiplication on GPUs

IEEE Access (IEEE Access), 2022

314

15 Jun 2022

Accelerating CPU-Based Sparse General Matrix Multiplication With Binary Row Merging

IEEE Access (IEEE Access), 2022

327

14 Jun 2022

Sparseloop: An Analytical Approach To Sparse Tensor Accelerator Modeling

Micro (MICRO), 2022

291

12 May 2022

Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications

Zhijian Liu

Song Han

290

138

25 Apr 2022

Boosting Pruned Networks with Linear Over-parameterization

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

262

25 Apr 2022

GROW: A Row-Stationary Sparse-Dense GEMM Accelerator for Memory-Efficient Graph Convolutional Neural Networks

International Symposium on High-Performance Computer Architecture (HPCA), 2022

647

01 Mar 2022

QOC: Quantum On-Chip Training with Parameter Shift and Gradient Pruning

Design Automation Conference (DAC), 2022

Song Han

607

26 Feb 2022

SQuant: On-the-Fly Data-Free Quantization via Diagonal Hessian Approximation

International Conference on Learning Representations (ICLR), 2022

Cong Guo

Jingwen Leng

Fan Yang

Yuhao Zhu

Minyi Guo

289

14 Feb 2022

Blocking Techniques for Sparse Matrix Multiplication on Tensor Accelerators

149

11 Feb 2022

SparseP: Towards Efficient Sparse Matrix Vector Multiplication on Real Processing-In-Memory Systems

448

13 Jan 2022

Phantom: A High-Performance Computational Core for Sparse Convolutional Neural Networks

Mahmood Azhar Qureshi

Arslan Munir

272

09 Nov 2021

Sextans: A Streaming Accelerator for General-Purpose Sparse-Matrix Dense-Matrix Multiplication

Symposium on Field Programmable Gate Arrays (FPGA), 2021

363

22 Sep 2021

Towards Memory-Efficient Neural Networks via Multi-Level in situ Generation

IEEE International Conference on Computer Vision (ICCV), 2021

249

25 Aug 2021

QuantumNAS: Noise-Adaptive Search for Robust Quantum Circuits

International Symposium on High-Performance Computer Architecture (HPCA), 2021

Frederic T. Chong

Song Han

690

272

22 Jul 2021

S2TA: Exploiting Structured Sparsity for Energy-Efficient Mobile CNN Acceleration

International Symposium on High-Performance Computer Architecture (HPCA), 2021

Zhi-Gang Liu

P. Whatmough

Yuhao Zhu

Matthew Mattina

289

110

16 Jul 2021

GPTPU: Accelerating Applications using Edge Tensor Processing Units

International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2021

Kuan-Chieh Hsu

Hung-Wei Tseng

261

22 Jun 2021

SMASH: Sparse Matrix Atomic Scratchpad Hashing

Kaustubh Shivdikar

272

29 May 2021

Dual-side Sparse Tensor Core

International Symposium on Computer Architecture (ISCA), 2021

Cong Guo

Jingwen Leng

288

20 May 2021

GPU Semiring Primitives for Sparse Neighborhood Methods

Conference on Machine Learning and Systems (MLSys), 2021

245

13 Apr 2021