arXiv:1810.00952
Relay: A New IR for Machine Learning Frameworks
26 September 2018
Jared Roesch
Steven Lyubomirsky
Logan Weber
Josh Pollock
Marisa Kirisame
Tianqi Chen
Zachary Tatlock
Papers citing
"Relay: A New IR for Machine Learning Frameworks"
13 / 13 papers shown
Title
A Survey on Design Methodologies for Accelerating Deep Learning on Heterogeneous Architectures
Fabrizio Ferrandi
S. Curzel
Leandro Fiorin
Daniele Ielmini
Cristina Silvano
...
Cristian Zambelli
V. Cardellini
G. Ascia
Enrico Russo
N. Petra
36
4
0
29 Nov 2023
Relax: Composable Abstractions for End-to-End Dynamic Machine Learning
Ruihang Lai
Junru Shao
Siyuan Feng
Steven Lyubomirsky
Bohan Hou
...
Sunghyun Park
Prakalp Srivastava
Jared Roesch
T. Mowry
Tianqi Chen
47
9
0
01 Nov 2023
Kernel-as-a-Service: A Serverless Interface to GPUs
Nathan Pemberton
Anton Zabreyko
Zhoujie Ding
R. Katz
Joseph E. Gonzalez
29
8
0
15 Dec 2022
AGO: Boosting Mobile AI Inference Performance by Removing Constraints on Graph Optimization
Zhiying Xu
H. Peng
Wei Wang
GNN
26
3
0
02 Dec 2022
ALT: Boosting Deep Learning Performance by Breaking the Wall between Graph and Operator Level Optimizations
Zhiying Xu
Jiafan Xu
H. Peng
Wei Wang
Xiaoliang Wang
...
Haipeng Dai
Yixu Xu
Hao Cheng
Kun Wang
Guihai Chen
35
0
0
22 Oct 2022
Bifrost: End-to-End Evaluation and Optimization of Reconfigurable DNN Accelerators
Axel Stjerngren
Perry Gibson
José Cano
34
4
0
26 Apr 2022
Memory Planning for Deep Neural Networks
Maksim Levental
33
4
0
23 Feb 2022
GSPMD: General and Scalable Parallelization for ML Computation Graphs
Yuanzhong Xu
HyoukJoong Lee
Dehao Chen
Blake A. Hechtman
Yanping Huang
...
Noam M. Shazeer
Shibo Wang
Tao Wang
Yonghui Wu
Zhifeng Chen
MoE
28
128
0
10 May 2021
SplitSR: An End-to-End Approach to Super-Resolution on Mobile Devices
Xin Liu
Yuang Li
Josh Fromm
Yuntao Wang
Ziheng Jiang
Alex Mariakakis
Shwetak N. Patel
SupR
42
10
0
20 Jan 2021
GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding
Dmitry Lepikhin
HyoukJoong Lee
Yuanzhong Xu
Dehao Chen
Orhan Firat
Yanping Huang
M. Krikun
Noam M. Shazeer
Zhehuai Chen
MoE
43
1,116
0
30 Jun 2020
Polystore++: Accelerated Polystore System for Heterogeneous Workloads
Rekha Singhal
Nathan Zhang
Luigi Nardi
M. Shahbaz
K. Olukotun
11
8
0
24 May 2019
A Hardware-Software Blueprint for Flexible Deep Learning Specialization
T. Moreau
Tianqi Chen
Luis Vega
Jared Roesch
Eddie Q. Yan
...
Josh Fromm
Ziheng Jiang
Luis Ceze
Carlos Guestrin
Arvind Krishnamurthy
26
70
0
11 Jul 2018
Echo: Compiler-based GPU Memory Footprint Reduction for LSTM RNN Training
Bojian Zheng
Abhishek Tiwari
Nandita Vijaykumar
Gennady Pekhimenko
27
44
0
22 May 2018