2011.06391
FusedMM: A Unified SDDMM-SpMM Kernel for Graph Embedding and Graph Neural Networks
7 November 2020
Md. Khaledur Rahman
Majedul Haque Sujon
A. Azad
FedML
GNN
Papers citing
"FusedMM: A Unified SDDMM-SpMM Kernel for Graph Embedding and Graph Neural Networks"
26 / 26 papers shown
AutoSAGE: Input-Aware CUDA Scheduling for Sparse GNN Aggregation (SpMM/SDDMM) and CSR Attention
Aleksandar Stankovic
169
0
0
17 Nov 2025
FuseFlow: A Fusion-Centric Compilation Framework for Sparse Deep Learning on Streaming Dataflow
Rubens Lacouture
Nathan Zhang
Ritvik Sharma
Marco Siracusa
Fredrik Kjolstad
K. Olukotun
Olivia Hsu
196
3
0
06 Nov 2025
Fused3S: Fast Sparse Attention on Tensor Cores
International Conference on Supercomputing (ICS), 2025
Zitong Li
Aparna Chandramowlishwaran
GNN
231
0
0
12 May 2025
Ember: A Compiler for Efficient Embedding Operations on Decoupled Access-Execute Architectures
Marco Siracusa
Olivia Hsu
Victor Soria-Pardos
Joshua Randall
Arnaud Grasset
...
Doug Joseph
Randy Allen
Fredrik Kjolstad
Miquel Moretó Planas
Adrià Armejach
341
0
0
14 Apr 2025
Edge Graph Intelligence: Reciprocally Empowering Edge Networks with Graph Intelligence
Liekang Zeng
Shengyuan Ye
Xu Chen
Xiaoxi Zhang
Ju Ren
Jian Tang
Yang Yang
Xuemin Shen
546
13
0
08 Jan 2025
FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving
Zihao Ye
Lequn Chen
Ruihang Lai
Wuwei Lin
Yineng Zhang
...
Tianqi Chen
Baris Kasikci
Vinod Grover
Arvind Krishnamurthy
Luis Ceze
686
177
0
02 Jan 2025
DF-GNN: Dynamic Fusion Framework for Attention Graph Neural Networks on GPUs
Learning on Graphs Conference (LoG), 2024
Jiahui Liu
Zhenkun Cai
Zhiyong Chen
Minjie Wang
GNN
307
2
0
25 Nov 2024
Distributed-Memory Parallel Algorithms for Sparse Matrix and Sparse Tall-and-Skinny Matrix Multiplication
International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2024
Isuru Ranawaka
Md Taufique Hussain
Charles Block
Gerasimos Gerogiannis
Josep Torrellas
Ariful Azad
242
5
0
21 Aug 2024
GeoT: Tensor Centric Library for Graph Neural Network via Efficient Segment Reduction on GPU
Zhongming Yu
Genghan Zhang
Hanxian Huang
Xin Chen
Jishen Zhao
GNN
442
0
0
03 Apr 2024
iSpLib: A Library for Accelerating Graph Neural Networks using Auto-tuned Sparse Operations
Md Saidul Hoque Anik
Pranav Badhe
Rohit Gampa
Ariful Azad
AI4CE
237
5
0
21 Mar 2024
JITSPMM: Just-in-Time Instruction Generation for Accelerated Sparse Matrix-Matrix Multiplication
IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 2023
Qiang Fu
Thomas B. Rolinger
H. H. Huang
297
6
0
09 Dec 2023
Performance Optimization of Deep Learning Sparse Matrix Kernels on Intel Max Series GPU
Mohammad Zubair
Christoph Bauinger
286
0
0
01 Nov 2023
SENSEi: Input-Sensitive Compilation for Accelerating GNNs
Damitha Sandeepa Lenadora
Vimarsh Sathia
Gerasimos Gerogiannis
Serif Yesil
Josep Torrellas
Charith Mendis
GNN
218
1
0
27 Jun 2023
A Survey on Graph Neural Network Acceleration: Algorithms, Systems, and Customized Hardware
Shichang Zhang
Atefeh Sohrabizadeh
Cheng Wan
Zijie Huang
Ziniu Hu
Yewen Wang
Yingyan Lin
Jason Cong
GNN
OOD
314
33
0
24 Jun 2023
BitGNN: Unleashing the Performance Potential of Binary Graph Neural Networks on GPUs
International Conference on Supercomputing (ICS), 2023
Jou-An Chen
Hsin-Hsuan Sung
Xipeng Shen
Sutanay Choudhury
Ang Li
GNN
MQ
376
11
0
04 May 2023
PhysGraph: Physics-Based Integration Using Graph Neural Networks
Oshri Halimi
E. T. Larionov
Zohar Barzelay
Philipp Herholz
Tuur Stuyck
PINN
OOD
AI4CE
247
6
0
27 Jan 2023
Hector: An Efficient Programming and Compilation Framework for Implementing Relational Graph Neural Networks in GPU Architectures
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2023
Kun Wu
Mert Hidayetoğlu
Xiang Song
Sitao Huang
Da Zheng
Israt Nisa
Wen-mei W. Hwu
GNN
300
6
0
16 Jan 2023
Scalable Graph Convolutional Network Training on Distributed-Memory Systems
Proceedings of the VLDB Endowment (PVLDB), 2022
G. Demirci
Aparajita Haldar
Hakan Ferhatosmanoglu
GNN
393
17
0
09 Dec 2022
Architectural Implications of Embedding Dimension during GCN on CPU and GPU
M. Adiletta
David Brooks
Gu-Yeon Wei
GNN
110
1
0
01 Dec 2022
Distributed Graph Neural Network Training: A Survey
ACM Computing Surveys (ACM CSUR), 2022
Yingxia Shao
Hongzheng Li
Xizhi Gu
Hongbo Yin
Yawen Li
Xupeng Miao
Wentao Zhang
Tengjiao Wang
Lei Chen
GNN
AI4CE
465
99
0
01 Nov 2022
RSC: Accelerating Graph Neural Networks Training via Randomized Sparse Computations
International Conference on Machine Learning (ICML), 2022
Zirui Liu
Sheng-Wei Chen
Kaixiong Zhou
Daochen Zha
Xiao Huang
Helen Zhou
353
24
0
19 Oct 2022
SparseTIR: Composable Abstractions for Sparse Compilation in Deep Learning
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2022
Zihao Ye
Ruihang Lai
Junru Shao
Tianqi Chen
Luis Ceze
545
128
0
11 Jul 2022
Parallel and Distributed Graph Neural Networks: An In-Depth Concurrency Analysis
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Maciej Besta
Torsten Hoefler
GNN
580
81
0
19 May 2022
Distributed-Memory Sparse Kernels for Machine Learning
IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2022
V. Bharadwaj
A. Buluç
J. Demmel
FedML
203
15
0
15 Mar 2022
A Comprehensive Analytical Survey on Unsupervised and Semi-Supervised Graph Representation Learning Methods
Md. Khaledur Rahman
A. Azad
AI4TS
200
4
0
20 Dec 2021
Parallel Minimum Spanning Forest Computation using Sparse Matrix Kernels
Tim Baer
Raghavendra Kanakagiri
Edgar Solomonik
268
3
0
10 Oct 2021