1904.01691
Cited By
DeLTA: GPU Performance Model for Deep Learning Applications with In-depth Memory System Traffic Analysis
2 April 2019
Sangkug Lym
Donghyuk Lee
Mike O'Connor
Niladrish Chatterjee
M. Erez
Papers citing
"DeLTA: GPU Performance Model for Deep Learning Applications with In-depth Memory System Traffic Analysis"
7 / 7 papers shown
Title
ALCOP: Automatic Load-Compute Pipelining in Deep Learning Compiler for AI-GPUs
Guyue Huang
Yang Bai
Liu Liu
Yuke Wang
Bei Yu
Yufei Ding
Yuan Xie
88
18
0
29 Oct 2022
Inference Latency Prediction at the Edge
Zhuojin Li
Marco Paolieri
L. Golubchik
56
3
0
06 Oct 2022
Building a Performance Model for Deep Learning Recommendation Model Training on GPUs
Zhongyi Lin
Louis Feng
E. K. Ardestani
Jaewon Lee
J. Lundell
Changkyu Kim
A. Kejariwal
John Douglas Owens
47
19
0
19 Jan 2022
Characterizing and Demystifying the Implicit Convolution Algorithm on Commercial Matrix-Multiplication Accelerators
Yangjie Zhou
Mengtian Yang
Cong Guo
Jingwen Leng
Yun Liang
Quan Chen
Minyi Guo
Yuhao Zhu
63
35
0
08 Oct 2021
Training Energy-Efficient Deep Spiking Neural Networks with Single-Spike Hybrid Input Encoding
Gourav Datta
Souvik Kundu
Peter A. Beerel
131
29
0
26 Jul 2021
FusionStitching: Boosting Memory Intensive Computations for Deep Learning Workloads
Zhen Zheng
Pengzhan Zhao
Guoping Long
Feiwen Zhu
Kai Zhu
Wenyi Zhao
Lansong Diao
Jun Yang
Wei Lin
70
31
0
23 Sep 2020
FlexSA: Flexible Systolic Array Architecture for Efficient Pruned DNN Model Training
Sangkug Lym
M. Erez
28
26
0
27 Apr 2020