1612.03079
Clipper: A Low-Latency Online Prediction Serving System
9 December 2016
D. Crankshaw
Xin Wang
Giulio Zhou
Michael Franklin
Joseph E. Gonzalez
Ion Stoica
Papers citing
"Clipper: A Low-Latency Online Prediction Serving System"
28 / 78 papers shown
Title
ModelPS: An Interactive and Collaborative Platform for Editing Pre-trained Models at Scale
Yuanming Li
Huaizheng Zhang
Shanshan Jiang
Fan Yang
Yonggang Wen
Yong Luo
21
2
0
18 May 2021
Software Engineering for AI-Based Systems: A Survey
Silverio Martínez-Fernández
Justus Bogner
Xavier Franch
Marc Oriol
Julien Siebert
Adam Trendowicz
Anna Maria Vollmer
Stefan Wagner
27
211
0
05 May 2021
DeepRT: A Soft Real Time Scheduler for Computer Vision Applications on the Edge
Zhe Yang
Klara Nahrstedt
Hongpeng Guo
Qian Zhou
19
21
0
05 May 2021
Accelerating Deep Learning Inference via Learned Caches
Arjun Balasubramanian
Adarsh Kumar
Yuhan Liu
Han Cao
Shivaram Venkataraman
Aditya Akella
28
18
0
18 Jan 2021
PACSET (Packed Serialized Trees): Reducing Inference Latency for Tree Ensemble Deployment
Meghana Madhyastha
Kunal Lillaney
J. Browne
Joshua T. Vogelstein
Randal C. Burns
30
1
0
10 Nov 2020
A Tensor Compiler for Unified Machine Learning Prediction Serving
Supun Nakandala
Karla Saur
Gyeong-In Yu
Konstantinos Karanasos
Carlo Curino
Markus Weimer
Matteo Interlandi
37
53
0
09 Oct 2020
HOLMES: Health OnLine Model Ensemble Serving for Deep Learning Models in Intensive Care Units
Linda Qiao
Yanbo Xu
Alind Khare
Satria Priambada
K. Maher
Alaa Aljiffry
Jimeng Sun
Alexey Tumanov
OOD
31
84
0
10 Aug 2020
Hippo: Taming Hyper-parameter Optimization of Deep Learning with Stage Trees
Ahnjae Shin
Do Yoon Kim
Joo Seong Jeong
Byung-Gon Chun
28
4
0
22 Jun 2020
DCAF: A Dynamic Computation Allocation Framework for Online Serving System
Biye Jiang
Pengye Zhang
Rihan Chen
Binding Dai
Xinchen Luo
Yifan Yang
Guan Wang
Guorui Zhou
Xiaoqiang Zhu
Kun Gai
14
15
0
17 Jun 2020
Real-Time Video Inference on Edge Devices via Adaptive Model Streaming
Mehrdad Khani Shirkoohi
Pouya Hamadanian
Arash Nasr-Esfahany
Mohammad Alizadeh
28
44
0
11 Jun 2020
Hoplite: Efficient and Fault-Tolerant Collective Communication for Task-Based Distributed Systems
Siyuan Zhuang
Zhuohan Li
Danyang Zhuo
Stephanie Wang
Eric Liang
Robert Nishihara
Philipp Moritz
Ion Stoica
27
23
0
13 Feb 2020
Accelerating Deep Learning Inference via Freezing
Adarsh Kumar
Arjun Balasubramanian
Shivaram Venkataraman
Aditya Akella
AI4CE
31
22
0
07 Feb 2020
The Design and Implementation of a Scalable DL Benchmarking Platform
Cheng-rong Li
Abdul Dakkak
Jinjun Xiong
Wen-mei W. Hwu
ALM
ELM
21
4
0
19 Nov 2019
Extending Relational Query Processing with ML Inference
Konstantinos Karanasos
Matteo Interlandi
Doris Xin
Fotis Psallidas
Rathijit Sen
...
Subru Krishnan
Markus Weimer
Yuan Yu
R. Ramakrishnan
Carlo Curino
11
61
0
01 Nov 2019
ALERT: Accurate Learning for Energy and Timeliness
Chengcheng Wan
M. Santriaji
E. Rogers
H. Hoffmann
Michael Maire
Shan Lu
AI4CE
48
40
0
31 Oct 2019
Collage Inference: Using Coded Redundancy for Low Variance Distributed Image Classification
Krishnagiri Narra
Zhifeng Lin
Ganesh Ananthanarayanan
A. Avestimehr
M. Annavaram
VLM
28
6
0
27 Apr 2019
Stratum: A Serverless Framework for Lifecycle Management of Machine Learning based Data Analytics Tasks
Anirban Bhattacharjee
Yogesh D. Barve
S. Khare
Shunxing Bao
A. Gokhale
Thomas Damiano
32
28
0
03 Apr 2019
BARISTA: Efficient and Scalable Serverless Serving System for Deep Learning Prediction Services
Anirban Bhattacharjee
A. Chhokra
Zhuangwei Kang
Hongyang Sun
A. Gokhale
G. Karsai
24
63
0
02 Apr 2019
Dynamic Space-Time Scheduling for GPU Inference
Paras Jain
Xiangxi Mo
Ajay Jain
Harikaran Subbaraj
Rehana Durrani
Alexey Tumanov
Joseph E. Gonzalez
Ion Stoica
35
64
0
31 Dec 2018
Serverless Computing: One Step Forward, Two Steps Back
J. M. Hellerstein
Jose M. Faleiro
Joseph E. Gonzalez
Johann Schleier-Smith
Vikram Sreekanti
Alexey Tumanov
Chenggang Wu
11
387
0
10 Dec 2018
TrIMS: Transparent and Isolated Model Sharing for Low Latency Deep Learning Inference in Function as a Service Environments
Abdul Dakkak
Cheng-rong Li
Simon Garcia De Gonzalo
Jinjun Xiong
Wen-mei W. Hwu
21
19
0
24 Nov 2018
MMLSpark: Unifying Machine Learning Ecosystems at Massive Scales
Mark Hamilton
S. Raghunathan
Ilya Matiach
Andrew Schonhoffer
Anand Raman
...
Minsoo Thigpen
J. Mahajan
Courtney Cochrane
Abhiram Eswaran
Ari Green
19
2
0
20 Oct 2018
Latency and Throughput Characterization of Convolutional Neural Networks for Mobile Computer Vision
Jussi Hanhirova
Teemu Kämäräinen
S. Seppälä
M. Siekkinen
V. Hirvisalo
Antti Ylä-Jääski
26
90
0
26 Mar 2018
The Three Pillars of Machine Programming
Justin Emile Gottschlich
Armando Solar-Lezama
Nesime Tatbul
Michael Carbin
Martin Rinard
Regina Barzilay
Saman P. Amarasinghe
J. Tenenbaum
Tim Mattson
27
62
0
20 Mar 2018
A Berkeley View of Systems Challenges for AI
Ion Stoica
D. Song
Raluca A. Popa
D. Patterson
Michael W. Mahoney
...
Joseph E. Gonzalez
Ken Goldberg
A. Ghodsi
David Culler
Pieter Abbeel
35
199
0
15 Dec 2017
Serving deep learning models in a serverless platform
Vatche Isahagian
Vinod Muthusamy
Aleksander Slominski
21
169
0
23 Oct 2017
IDK Cascades: Fast Deep Learning by Learning not to Overthink
Xin Wang
Yujia Luo
D. Crankshaw
Alexey Tumanov
Fisher Yu
Joseph E. Gonzalez
35
107
0
03 Jun 2017
Net2Vec: Deep Learning for the Network
Roberto Gonzalez
Filipe Manco
Alberto García-Durán
Jose Mendes
Felipe Huici
S. Niccolini
Mathias Niepert
3DH
GNN
25
23
0
10 May 2017