2006.10901
Sparse GPU Kernels for Deep Learning
18 June 2020
Trevor Gale
Matei A. Zaharia
C. Young
Erich Elsen
Papers citing
"Sparse GPU Kernels for Deep Learning"
50 / 120 papers shown
Title
Efficient Mixed Precision Quantization in Graph Neural Networks
Samir Moustafa
Nils M. Kriege
Wilfried Gansterer
GNN
MQ
35
0
0
14 May 2025
Fused3S: Fast Sparse Attention on Tensor Cores
Zitong Li
Aparna Chandramowlishwaran
GNN
45
0
0
12 May 2025
Samoyeds: Accelerating MoE Models with Structured Sparsity Leveraging Sparse Tensor Cores
Chenpeng Wu
Qiqi Gu
Heng Shi
Jianguo Yao
Haibing Guan
MoE
48
0
0
13 Mar 2025
Exploiting Unstructured Sparsity in Fully Homomorphic Encrypted DNNs
Aidan Ferguson
Perry Gibson
Lara D'Agata
Parker McLeod
Ferhat Yaman
Amitabh Das
Ian Colbert
José Cano
58
0
0
12 Mar 2025
An Efficient Row-Based Sparse Fine-Tuning
Cen-Jhih Li
Aditya Bhaskara
56
0
0
17 Feb 2025
FlashSparse: Minimizing Computation Redundancy for Fast Sparse Matrix Multiplications on Tensor Cores
Jinliang Shi
Shigang Li
Youxuan Xu
Rongtian Fu
Xueying Wang
Tong Wu
75
3
0
15 Dec 2024
HC-SpMM: Accelerating Sparse Matrix-Matrix Multiplication for Graphs with Hybrid GPU Cores
Zhonggen Li
Xiangyu Ke
Yifan Zhu
Yunjun Gao
Yaofeng Tu
69
0
0
12 Dec 2024
SuperGCN: General and Scalable Framework for GCN Training on CPU-powered Supercomputers
Chen Zhuang
Peng Chen
Xin Liu
Rio Yokota
Nikoli Dryden
Toshio Endo
Satoshi Matsuoka
M. Wahib
GNN
67
0
0
25 Nov 2024
Navigating Extremes: Dynamic Sparsity in Large Output Spaces
Nasib Ullah
Erik Schultheis
Mike Lasby
Yani Andrew Ioannou
Rohit Babbar
35
0
0
05 Nov 2024
Ripple: Accelerating LLM Inference on Smartphones with Correlation-Aware Neuron Management
Tuowei Wang
Ruwen Fan
Minxing Huang
Zixu Hao
Kun Li
Ting Cao
Youyou Lu
Yaoxue Zhang
Ju Ren
45
2
0
25 Oct 2024
AlphaPruning: Using Heavy-Tailed Self Regularization Theory for Improved Layer-wise Pruning of Large Language Models
Haiquan Lu
Yefan Zhou
Shiwei Liu
Zhangyang Wang
Michael W. Mahoney
Yaoqing Yang
29
0
0
14 Oct 2024
Input-Dependent Power Usage in GPUs
Theo Gregersen
Pratyush Patel
Esha Choukse
30
2
0
26 Sep 2024
High Performance Unstructured SpMM Computation Using Tensor Cores
Patrik Okanovic
Grzegorz Kwaśniewski
P. S. Labini
Maciej Besta
Flavio Vella
Torsten Hoefler
28
4
0
21 Aug 2024
Nerva: a Truly Sparse Implementation of Neural Networks
Wieger Wesselink
Bram Grooten
Qiao Xiao
Cássio Machado de Campos
Mykola Pechenizkiy
30
0
0
24 Jul 2024
Scorch: A Library for Sparse Deep Learning
Bobby Yan
Alexander J. Root
Trevor Gale
David Broman
Fredrik Kjolstad
30
0
0
27 May 2024
Enhancing Fast Feed Forward Networks with Load Balancing and a Master Leaf Node
Andreas Charalampopoulos
Nikolas Chatzis
Foivos Ntoulas-Panagiotopoulos
Charilaos Papaioannou
Alexandros Potamianos
33
0
0
27 May 2024
Random Masking Finds Winning Tickets for Parameter Efficient Fine-tuning
Jing Xu
Jingzhao Zhang
39
7
0
04 May 2024
MULTIFLOW: Shifting Towards Task-Agnostic Vision-Language Pruning
Matteo Farina
Massimiliano Mancini
Elia Cunegatti
Gaowen Liu
Giovanni Iacca
Elisa Ricci
VLM
42
2
0
08 Apr 2024
GNNBENCH: Fair and Productive Benchmarking for Single-GPU GNN System
Yidong Gong
Pradeep Kumar
GNN
35
3
0
05 Apr 2024
GeoT: Tensor Centric Library for Graph Neural Network via Efficient Segment Reduction on GPU
Zhongming Yu
Genghan Zhang
Hanxian Huang
Xin Chen
Jishen Zhao
GNN
29
0
0
03 Apr 2024
LSK3DNet: Towards Effective and Efficient 3D Perception with Large Sparse Kernels
Tuo Feng
Wenguan Wang
Fan Ma
Yi Yang
3DV
34
6
0
22 Mar 2024
Maxwell's Demon at Work: Efficient Pruning by Leveraging Saturation of Neurons
Simon Dufort-Labbé
P. D'Oro
Evgenii Nikishin
Razvan Pascanu
Pierre-Luc Bacon
A. Baratin
34
1
0
12 Mar 2024
HiRE: High Recall Approximate Top-k Estimation for Efficient LLM Inference
Yashas Samaga
Varun Yerram
Chong You
Srinadh Bhojanapalli
Sanjiv Kumar
Prateek Jain
Praneeth Netrapalli
51
4
0
14 Feb 2024
A2Q+: Improving Accumulator-Aware Weight Quantization
Ian Colbert
Alessandro Pappalardo
Jakoba Petri-Koenig
Yaman Umuroglu
MQ
21
4
0
19 Jan 2024
GNNShap: Scalable and Accurate GNN Explanation using Shapley Values
Selahattin Akkas
Ariful Azad
FAtt
37
3
0
09 Jan 2024
RoSA: Accurate Parameter-Efficient Fine-Tuning via Robust Adaptation
Mahdi Nikdan
Soroush Tabesh
Elvir Crnčević
Dan Alistarh
8
27
0
09 Jan 2024
LLM in a flash: Efficient Large Language Model Inference with Limited Memory
Keivan Alizadeh-Vahid
Iman Mirzadeh
Dmitry Belenko
Karen Khatamifard
Minsik Cho
C. C. D. Mundo
Mohammad Rastegari
Mehrdad Farajtabar
72
111
0
12 Dec 2023
Dimension Mixer: A Generalized Method for Structured Sparsity in Deep Neural Networks
Suman Sapkota
Binod Bhattarai
34
0
0
30 Nov 2023
A Survey on Design Methodologies for Accelerating Deep Learning on Heterogeneous Architectures
Fabrizio Ferrandi
S. Curzel
Leandro Fiorin
Daniele Ielmini
Cristina Silvano
...
Salvatore Filippone
F. L. Presti
Francesco Silvestri
P. Palazzari
Stefania Perri
19
4
0
29 Nov 2023
Harnessing Manycore Processors with Distributed Memory for Accelerated Training of Sparse and Recurrent Models
Jan Finkbeiner
Thomas Gmeinder
M. Pupilli
A. Titterton
Emre Neftci
19
3
0
07 Nov 2023
Performance Optimization of Deep Learning Sparse Matrix Kernels on Intel Max Series GPU
Mohammad Zubair
Christoph Bauinger
14
0
0
01 Nov 2023
Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLMs
Yu-xin Zhang
Lirui Zhao
Mingbao Lin
Yunyun Sun
Yiwu Yao
Xingjia Han
Jared Tanner
Shiwei Liu
Rongrong Ji
SyDa
37
40
0
13 Oct 2023
Sparse Fine-tuning for Inference Acceleration of Large Language Models
Eldar Kurtic
Denis Kuznedelev
Elias Frantar
Michael Goin
Dan Alistarh
27
8
0
10 Oct 2023
Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity
Lu Yin
You Wu
Zhenyu (Allen) Zhang
Cheng-Yu Hsieh
Yaqing Wang
...
Mykola Pechenizkiy
Yi Liang
Michael Bendersky
Zhangyang Wang
Shiwei Liu
28
78
0
08 Oct 2023
VENOM: A Vectorized N:M Format for Unleashing the Power of Sparse Tensor Cores
Roberto L. Castro
Andrei Ivanov
Diego Andrade
Tal Ben-Nun
B. Fraguela
Torsten Hoefler
19
15
0
03 Oct 2023
The Sparsity Roofline: Understanding the Hardware Limits of Sparse Neural Networks
Cameron Shinn
Collin McCarthy
Saurav Muralidharan
Muhammad Osama
John Douglas Owens
16
2
0
30 Sep 2023
Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity
Haojun Xia
Zhen Zheng
Yuchao Li
Donglin Zhuang
Zhongzhu Zhou
Xiafei Qiu
Yong Li
Wei Lin
S. Song
59
11
0
19 Sep 2023
A Generalization of Continuous Relaxation in Structured Pruning
Brad Larson
Bishal Upadhyaya
Luke McDermott
Siddha Ganju
16
0
0
28 Aug 2023
A2Q: Accumulator-Aware Quantization with Guaranteed Overflow Avoidance
Ian Colbert
Alessandro Pappalardo
Jakoba Petri-Koenig
MQ
16
9
0
25 Aug 2023
Cached Operator Reordering: A Unified View for Fast GNN Training
Julia Bazinska
Andrei Ivanov
Tal Ben-Nun
Nikoli Dryden
Maciej Besta
Siyuan Shen
Torsten Hoefler
GNN
22
3
0
23 Aug 2023
Mitigating Memory Wall Effects in CNN Engines with On-the-Fly Weights Generation
Stylianos I. Venieris
Javier Fernandez-Marques
Nicholas D. Lane
MQ
16
3
0
25 Jul 2023
Rosko: Row Skipping Outer Products for Sparse Matrix Multiplication Kernels
Vikas Natesh
Andrew Sabot
H. T. Kung
Mark Ting
22
0
0
08 Jul 2023
SparseOptimizer: Sparsify Language Models through Moreau-Yosida Regularization and Accelerate via Compiler Co-design
Fu-Ming Guo
MoE
13
0
0
27 Jun 2023
Sparse Modular Activation for Efficient Sequence Modeling
Liliang Ren
Yang Liu
Shuohang Wang
Yichong Xu
Chenguang Zhu
Chengxiang Zhai
43
13
0
19 Jun 2023
Breaking On-device Training Memory Wall: A Systematic Survey
Shitian Li
Chunlin Tian
Kahou Tam
Ruirui Ma
Li Li
21
2
0
17 Jun 2023
Dynamic Sparsity Is Channel-Level Sparsity Learner
Lu Yin
Gen Li
Meng Fang
Lijuan Shen
Tianjin Huang
Zhangyang Wang
Vlado Menkovski
Xiaolong Ma
Mykola Pechenizkiy
Shiwei Liu
25
20
0
30 May 2023
Reparo: Loss-Resilient Generative Codec for Video Conferencing
Tianhong Li
Vibhaalakshmi Sivaraman
Pantea Karimi
Lijie Fan
M. Alizadeh
Dina Katabi
19
7
0
23 May 2023
Dynamic Sparse Training with Structured Sparsity
Mike Lasby
A. Golubeva
Utku Evci
Mihai Nica
Yani Andrew Ioannou
29
19
0
03 May 2023
JaxPruner: A concise library for sparsity research
Jooyoung Lee
Wonpyo Park
Nicole Mitchell
Jonathan Pilault
J. Obando-Ceron
...
Hong-Seok Kim
Yann N. Dauphin
Karolina Dziugaite
P. S. Castro
Utku Evci
36
14
0
27 Apr 2023
STen: Productive and Efficient Sparsity in PyTorch
Andrei Ivanov
Nikoli Dryden
Tal Ben-Nun
Saleh Ashkboos
Torsten Hoefler
32
4
0
15 Apr 2023