Load-Balanced Sparse MTTKRP on GPUs


6 April 2019
Israt Nisa
Jiajia Li
Aravind Sukumaran-Rajam
R. Vuduc
P. Sadayappan

Papers citing "Load-Balanced Sparse MTTKRP on GPUs"

11 / 11 papers shown
A Sparse Tensor Generator with Efficient Feature Extraction
Tugba Torun
Eren Yenigul
Ameer Taweel
59
0
0
08 May 2024
A Programming Model for GPU Load Balancing
Muhammad Osama
Serban D. Porumbescu
John Douglas Owens
51
8
0
12 Jan 2023
Sgap: Towards Efficient Sparse Tensor Algebra Compilation for GPU
Genghan Zhang
Yuetong Zhao
Yanting Tao
Zhongming Yu
Guohao Dai
Sitao Huang
Yuanyuan Wen
Pavlos Petoumenos
Yu Wang
90
4
0
07 Sep 2022
Towards Programmable Memory Controller for Tensor Decomposition
Sasindu Wijeratne
Ta-Yang Wang
Rajgopal Kannan
Viktor Prasanna
19
2
0
17 Jul 2022
Efficient, Out-of-Memory Sparse MTTKRP on Massively Parallel Architectures
A. Nguyen
Ahmed E. Helal
Fabio Checconi
Jan Laukemann
Jesmin Jahan Tithi
Yongseok Soh
Teresa M. Ranadive
Fabrizio Petrini
Jee W. Choi
20
8
0
29 Jan 2022
Reconfigurable Low-latency Memory System for Sparse Matricized Tensor Times Khatri-Rao Product on FPGA
Sasindu Wijeratne
Rajgopal Kannan
Viktor Prasanna
60
6
0
18 Sep 2021
Fast and Accurate Randomized Algorithms for Low-rank Tensor Decompositions
Linjian Ma
Edgar Solomonik
86
26
0
02 Apr 2021
ALTO: Adaptive Linearized Storage of Sparse Tensors
Ahmed E. Helal
Jan Laukemann
Fabio Checconi
Jesmin Jahan Tithi
Teresa M. Ranadive
Fabrizio Petrini
Jeewhan Choi
37
21
0
20 Feb 2021
Efficient parallel CP decomposition with pairwise perturbation and multi-sweep dimension tree
Linjian Ma
Edgar Solomonik
65
8
0
22 Oct 2020
a-Tucker: Input-Adaptive and Matricization-Free Tucker Decomposition for Dense Tensors on CPUs and GPUs
Min Li
Chuanfu Xiao
Chao Yang
17
3
0
20 Oct 2020
PASTA: A Parallel Sparse Tensor Algorithm Benchmark Suite
Jiajia Li
Yuchen Ma
Xiaolong Wu
Ang Li
Kevin J. Barker
38
18
0
08 Feb 2019