Cited By: "Caffe con Troll: Shallow Ideas to Speed Up Deep Learning" (arXiv:1504.04343)
Stefan Hadjis, Firas Abuzaid, Ce Zhang, Christopher Ré
16 April 2015
Papers citing "Caffe con Troll: Shallow Ideas to Speed Up Deep Learning" (20 of 20 papers shown):
1. SAFFIRA: a Framework for Assessing the Reliability of Systolic-Array-Based DNN Accelerators
   Mahdi Taheri, Masoud Daneshtalab, J. Raik, M. Jenihhin, Salvatore Pappalardo, Paul Jiménez, Bastien Deveautour, A. Bosio. 05 Mar 2024.

2. Fluid Batching: Exit-Aware Preemptive Serving of Early-Exit Neural Networks on Edge NPUs
   Alexandros Kouris, Stylianos I. Venieris, Stefanos Laskaridis, Nicholas D. Lane. 27 Sep 2022.

3. High Performance Convolution Using Sparsity and Patterns for Inference in Deep Convolutional Neural Networks
   Hossam Amer, Ahmed H. Salamah, A. Sajedi, En-Hui Yang. 16 Apr 2021.

4. Dense for the Price of Sparse: Improved Performance of Sparsely Initialized Networks via a Subspace Offset
   Ilan Price, Jared Tanner. 12 Feb 2021.

5. LCP: A Low-Communication Parallelization Method for Fast Neural Network Inference in Image Recognition
   Ramyad Hadidi, Bahar Asgari, Jiashen Cao, Younmin Bae, Da Eun Shim, Hyojong Kim, Sung-Kyu Lim, Michael S. Ryoo, Hyesoon Kim. 13 Mar 2020.

6. Cluster Pruning: An Efficient Filter Pruning Method for Edge AI Vision Applications
   Chinthaka Gamanayake, Lahiru Jayasinghe, Benny Kai Kiat Ng, Chau Yuen. 05 Mar 2020.

7. Optimization of hybrid parallel application execution in heterogeneous high performance computing systems considering execution time and power consumption
   P. Rosciszewski. 20 Sep 2018.

8. Deep Learning Approximation: Zero-Shot Neural Network Speedup
   Michele Pratusevich. 15 Jun 2018.

9. Escoin: Efficient Sparse Convolutional Neural Network Inference on GPUs
   Xuhao Chen. 28 Feb 2018.

10. Cuttlefish: A Lightweight Primitive for Adaptive Query Processing
    Tomer Kaftan, Magdalena Balazinska, Alvin Cheung, J. Gehrke. 26 Feb 2018.

11. Stochastic Gradient Descent on Highly-Parallel Architectures
    Yujing Ma, Florin Rusu, Martin Torres. 24 Feb 2018.

12. Distributed Training Large-Scale Deep Architectures
    Shang-Xuan Zou, Chun-Yen Chen, Jui-Lin Wu, Chun-Nan Chou, Chia-Chin Tsao, Kuan-Chieh Tung, Ting-Wei Lin, Cheng-Lung Sung, Edward Y. Chang. 10 Aug 2017.

13. MEC: Memory-efficient Convolution for Deep Neural Network
    Minsik Cho, D. Brand. 21 Jun 2017.

14. Using Convolutional Neural Networks in Robots with Limited Computational Resources: Detecting NAO Robots while Playing Soccer
    Nicolás Cruz, Kenzo Lobos-Tsunekawa, Javier Ruiz-del-Solar. 20 Jun 2017.

15. On-the-fly Operation Batching in Dynamic Computation Graphs
    Graham Neubig, Yoav Goldberg, Chris Dyer. 22 May 2017.

16. Faster CNNs with Direct Sparse Convolutions and Guided Pruning
    Jongsoo Park, Sheng Li, W. Wen, P. T. P. Tang, Hai Helen Li, Yiran Chen, Pradeep Dubey. 04 Aug 2016.

17. Omnivore: An Optimizer for Multi-device Deep Learning on CPUs and GPUs
    Stefan Hadjis, Ce Zhang, Ioannis Mitliagkas, Dan Iter, Christopher Ré. 14 Jun 2016.

18. A Systematic Approach to Blocking Convolutional Neural Networks
    Xuan S. Yang, Jing Pu, Blaine Rister, Nikhil Bhagdikar, Stephen Richardson, Shahar Kvatinsky, Jonathan Ragan-Kelley, A. Pedram, M. Horowitz. 14 Jun 2016.

19. SparkNet: Training Deep Networks in Spark
    Philipp Moritz, Robert Nishihara, Ion Stoica, Michael I. Jordan. 19 Nov 2015.

20. Automatic differentiation in machine learning: a survey
    A. G. Baydin, Barak A. Pearlmutter, Alexey Radul, J. Siskind. 20 Feb 2015.