Input-Aware Auto-Tuning of Compute-Bound HPC Kernels

15 February 2018

Papers citing "Input-Aware Auto-Tuning of Compute-Bound HPC Kernels"


Title | Authors | Counts | Date
Stream-K: Work-centric Parallel Decomposition for Dense Matrix-Matrix Multiplication on the GPU | Muhammad Osama, D. Merrill, C. Cecka, M. Garland, John Douglas Owens | 61 28 0 | 09 Jan 2023
Using hardware performance counters to speed up autotuning convergence on GPUs | Jiri Filipovic, Jana Hozzová, A. Nezarat, Jaroslav Olha, Filip Petrovic | 36 12 0 | 10 Feb 2021
SparseRT: Accelerating Unstructured Sparsity on GPUs for Deep Learning Inference | Ziheng Wang | 86 68 0 | 26 Aug 2020
A model-driven approach for a new generation of adaptive libraries | Marco Cianfriglia, Damiano Perri, C. Nugteren, Anton Lokhmotov, G. Fursin | 74 14 0 | 19 Jun 2018