1801.08058
Intel nGraph: An Intermediate Representation, Compiler, and Executor for Deep Learning
24 January 2018
D. S. Cyphers
Arjun K. Bansal
Anahita Bhiwandiwalla
J. Bobba
M. Brookhart
Avijit Chakraborty
William Constable
Christian Convey
Leona Cook
Omar Kanawi
R. Kimball
Jason Knight
Nikolay Korovaiko
V. Vijay
Yixing Lao
C. Lishka
J. Menon
Jennifer Myers
Sandeep Aswath Narayana
A. Procter
T. Webb
GNN
Papers citing
"Intel nGraph: An Intermediate Representation, Compiler, and Executor for Deep Learning"
48 / 48 papers shown
vMCU: Coordinated Memory Management and Kernel Optimization for DNN Inference on MCUs
Wenlei Bao
Renze Chen
Meng Li
Zihao Ye
Luis Ceze
Yun Liang
151
4
0
01 May 2024
Proteus: Preserving Model Confidentiality during Graph Optimizations
Yubo Gao
Maryam Haghifam
Christina Giannoula
Renbo Tu
Gennady Pekhimenko
Nandita Vijaykumar
AAML
278
1
0
18 Apr 2024
CIM-MLC: A Multi-level Compilation Stack for Computing-In-Memory Accelerators
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2024
Songyun Qu
Shixin Zhao
Bing Li
Yintao He
Xuyi Cai
Lei Zhang
Ying Wang
161
9
0
23 Jan 2024
PolyTOPS: Reconfigurable and Flexible Polyhedral Scheduler
IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 2024
Gianpietro Consolaro
Zhen Zhang
Harenome Razanajato
Nelson Lossing
Nassim Tchoulak
...
Artur Cesar Araujo Alves
Renwei Zhang
Denis Barthou
Corinne Ancourt
Cédric Bastoul
112
3
0
12 Jan 2024
GEVO-ML: Optimizing Machine Learning Code with Evolutionary Computation
Jhe-Yu Liou
Stephanie Forrest
Carole-Jean Wu
VLM
271
0
0
16 Oct 2023
LoopTune: Optimizing Tensor Computations with Reinforcement Learning
Dejan Grubisic
Bram Wasti
Chris Cummins
John Mellor-Crummey
A. Zlateski
302
3
0
04 Sep 2023
DyCL: Dynamic Neural Network Compilation Via Program Rewriting and Graph Optimization
International Symposium on Software Testing and Analysis (ISSTA), 2023
Simin Chen
Shiyi Wei
Cong Liu
Wei Yang
203
12
0
11 Jul 2023
CMLCompiler: A Unified Compiler for Classical Machine Learning
International Conference on Supercomputing (ICS), 2023
Xu Wen
Wanling Gao
An-Dong Li
Lei Wang
Zihan Jiang
Jianfeng Zhan
289
1
0
31 Jan 2023
TLP: A Deep Learning-based Cost Model for Tensor Program Tuning
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2022
Yiqiang Zhai
Yu Zhang
Shuo Liu
Xiaomeng Chu
Jie Peng
Jianmin Ji
Yanyong Zhang
197
52
0
07 Nov 2022
ALT: Boosting Deep Learning Performance by Breaking the Wall between Graph and Operator Level Optimizations
Zhiying Xu
Jiafan Xu
H. Peng
Wei Wang
Xiaoliang Wang
...
Haipeng Dai
Yixu Xu
Hao Cheng
Kun Wang
Guihai Chen
338
1
0
22 Oct 2022
Optimizing DNN Compilation for Distributed Training with Joint OP and Tensor Fusion
IEEE Transactions on Parallel and Distributed Systems (TPDS), 2022
Xiaodong Yi
Shiwei Zhang
Lansong Diao
Chuan Wu
Zhen Zheng
Shiqing Fan
Siyu Wang
Jun Yang
W. Lin
161
6
0
26 Sep 2022
Dropbear: Machine Learning Marketplaces made Trustworthy with Byzantine Model Agreement
A. Shamis
Peter R. Pietzuch
Antoine Delignat-Lavaud
Andrew Paverd
Manuel Costa
OOD
121
0
0
31 May 2022
GraphQ IR: Unifying the Semantic Parsing of Graph Query Languages with One Intermediate Representation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
L. Nie
S. Cao
Jiaxin Shi
Jiu Sun
Qingwen Tian
Lei Hou
Juanzi Li
Jidong Zhai
GNN
202
27
0
24 May 2022
Learning to Reverse DNNs from AI Programs Automatically
Simin Chen
Hamed Khanpour
Cong Liu
Wei Yang
201
19
0
20 May 2022
Benchmarking the Linear Algebra Awareness of TensorFlow and PyTorch
IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW), 2022
Aravind Sankaran
Navid Akbari Alashti
C. Psarras
Paolo Bientinesi
132
7
0
20 Feb 2022
Quantune: Post-training Quantization of Convolutional Neural Networks using Extreme Gradient Boosting for Fast Deployment
Future Generation Computer Systems (FGCS), 2022
Jemin Lee
Misun Yu
Yongin Kwon
Teaho Kim
MQ
210
22
0
10 Feb 2022
FamilySeer: Towards Optimized Tensor Codes by Exploiting Computation Subgraph Similarity
Shanjun Zhang
Mingzhen Li
Hailong Yang
Yi Liu
Zhongzhi Luan
D. Qian
206
1
0
01 Jan 2022
Collage: Seamless Integration of Deep Learning Backends with Automatic Placement
International Conference on Parallel Architectures and Compilation Techniques (PACT), 2021
Byungsoo Jeon
Sunghyun Park
Peiyuan Liao
Sheng Xu
Tianqi Chen
Zhihao Jia
VLM
291
7
0
01 Nov 2021
A Data-Centric Optimization Framework for Machine Learning
International Conference on Supercomputing (ICS), 2021
Oliver Rausch
Tal Ben-Nun
Nikoli Dryden
Andrei Ivanov
Shigang Li
Torsten Hoefler
AI4CE
312
19
0
20 Oct 2021
Joint Program and Layout Transformations to enable Convolutional Operators on Specialized Hardware based on Constraint Programming
ACM Transactions on Architecture and Code Optimization (TACO), 2021
D. Rieber
Axel Acosta
Holger Fröning
254
0
0
10 Apr 2021
Enabling Homomorphically Encrypted Inference for Large DNN Models
IEEE Transactions on Computers (IEEE Trans. Comput.), 2021
Guillermo Lloret-Talavera
Marc Jordà
Harald Servat
Fabian Boemer
C. Chauhan
S. Tomishima
Nilesh N. Shah
Antonio J. Peña
AI4CE
FedML
264
34
0
30 Mar 2021
Compiler Toolchains for Deep Learning Workloads on Embedded Platforms
Max Sponner
Bernd Waschneck
Akash Kumar
MQ
183
7
0
08 Mar 2021
Neural Architecture Search as Program Transformation Exploration
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2021
Jack Turner
Elliot J. Crowley
Michael F. P. O'Boyle
213
23
0
12 Feb 2021
SoK: Fully Homomorphic Encryption Compilers
IEEE Symposium on Security and Privacy (IEEE S&P), 2021
Alexander Viand
Patrick Jattke
Anwar Hithnawi
174
113
0
18 Jan 2021
NNStreamer: Efficient and Agile Development of On-Device AI Systems
MyungJoo Ham
Jijoong Moon
Geunsik Lim
Jaeyun Jung
Hyoungjoo Ahn
...
Parichay Kapoor
Dongju Chae
Gichan Jang
Y. Ahn
Jihoon Lee
124
6
0
16 Jan 2021
Nimble: Lightweight and Parallel GPU Task Scheduling for Deep Learning
Neural Information Processing Systems (NeurIPS), 2020
Woosuk Kwon
Gyeong-In Yu
Eunji Jeong
Byung-Gon Chun
188
86
0
04 Dec 2020
Fusion-Catalyzed Pruning for Optimizing Deep Learning on Intelligent Edge Devices
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 2020
Guangli Li
Xiu Ma
Xueying Wang
Lei Liu
Jingling Xue
Xiaobing Feng
226
37
0
30 Oct 2020
Optimising AI Training Deployments using Graph Compilers and Containers
IEEE Conference on High Performance Extreme Computing (HPEC), 2020
Nina Mujkanovic
K. Sivalingam
A. Lazzaro
GNN
118
2
0
26 Aug 2020
Woodpecker-DL: Accelerating Deep Neural Networks via Hardware-Aware Multifaceted Optimizations
Yongchao Liu
Yue Jin
Yongqi Chen
Teng Teng
Hang Ou
Rui Zhao
Yao Zhang
258
1
0
11 Aug 2020
Optimizing Memory Placement using Evolutionary Graph Reinforcement Learning
International Conference on Learning Representations (ICLR), 2020
Shauharda Khadka
Estelle Aflalo
Mattias Marder
Avrech Ben-David
Santiago Miret
Shie Mannor
Tamir Hazan
Hanlin Tang
Somdeb Majumdar
GNN
195
13
0
14 Jul 2020
Data Movement Is All You Need: A Case Study on Optimizing Transformers
A. Ivanov
Nikoli Dryden
Tal Ben-Nun
Shigang Li
Torsten Hoefler
447
175
0
30 Jun 2020
Efficient Execution of Quantized Deep Learning Models: A Compiler Approach
Animesh Jain
Shoubhik Bhattacharya
Masahiro Masuda
Vin Sharma
Yida Wang
MQ
287
38
0
18 Jun 2020
Vyasa: A High-Performance Vectorizing Compiler for Tensor Convolutions on the Xilinx AI Engine
IEEE Conference on High Performance Extreme Computing (HPEC), 2020
Prasanth Chatarasi
S. Neuendorffer
Samuel Bayliss
K. Vissers
Vivek Sarkar
100
22
0
02 Jun 2020
FlexSA: Flexible Systolic Array Architecture for Efficient Pruned DNN Model Training
Sangkug Lym
M. Erez
158
27
0
27 Apr 2020
SOL: Effortless Device Support for AI Frameworks without Source Code Changes
IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGrid), 2020
Nicolas Weber
Felipe Huici
136
3
0
24 Mar 2020
Ordering Chaos: Memory-Aware Scheduling of Irregularly Wired Neural Networks for Edge Devices
Conference on Machine Learning and Systems (MLSys), 2020
Byung Hoon Ahn
Jinwon Lee
J. Lin
Hsin-Pai Cheng
Jilei Hou
H. Esmaeilzadeh
204
58
0
04 Mar 2020
A C Code Generator for Fast Inference and Simple Deployment of Convolutional Neural Networks on Resource Constrained Systems
Oliver Urbann
Simon Camphausen
Arne Moos
Ingmar Schwarz
Sören Kerner
M. Otten
110
5
0
14 Jan 2020
MIOpen: An Open Source Library For Deep Learning Primitives
Jehandad Khan
Paul Fultz
Artem Tamazov
Daniel Lowell
Chao-Jung Liu
...
Vasilii Filippov
Jing Zhang
Jing Zhou
Bragadeesh Natarajan
Mayank Daga
VLM
MoE
242
48
0
30 Sep 2019
Benchmarking the Performance and Energy Efficiency of AI Accelerators for AI Training
Yuxin Wang
Qiang-qiang Wang
Shaoshuai Shi
Xin He
Zhenheng Tang
Kaiyong Zhao
Xiaowen Chu
427
4
0
15 Sep 2019
TapirXLA: Embedding Fork-Join Parallelism into the XLA Compiler in TensorFlow Using Tapir
IEEE Conference on High Performance Extreme Computing (HPEC), 2019
S. Samsi
Michael Houle
127
5
0
29 Aug 2019
nGraph-HE2: A High-Throughput Framework for Neural Network Inference on Encrypted Data
IACR Cryptology ePrint Archive (IACR ePrint), 2019
Fabian Boemer
Anamaria Costache
Rosario Cammarota
Casimir Wierzynski
GNN
353
192
0
12 Aug 2019
swTVM: Towards Optimized Tensor Code Generation for Deep Learning on Sunway Many-Core Processor
Mingzhen Li
Changxi Liu
Jian-He Liao
Xuegui Zheng
Hailong Yang
...
Jun Xu
L. Gan
Guangwen Yang
Zhongzhi Luan
D. Qian
219
3
0
16 Apr 2019
Stripe: Tensor Compilation via the Nested Polyhedral Model
Tim Zerrell
J. Bruestle
154
34
0
14 Mar 2019
DNNVM : End-to-End Compiler Leveraging Heterogeneous Optimizations on FPGA-based CNN Accelerators
Yu Xing
Shuang Liang
Lingzhi Sui
Xijie Jia
Jiantao Qiu
Xin Liu
Yushun Wang
Yu Wang
Yi Shan
185
79
0
20 Feb 2019
nGraph-HE: A Graph Compiler for Deep Learning on Homomorphically Encrypted Data
Fabian Boemer
Yixing Lao
Rosario Cammarota
Casimir Wierzynski
FedML
390
195
0
23 Oct 2018
Optimizing CNN Model Inference on CPUs
Yizhi Liu
Yao Wang
Ruofei Yu
Mu Li
Vin Sharma
Yida Wang
288
169
0
07 Sep 2018
A Hardware-Software Blueprint for Flexible Deep Learning Specialization
T. Moreau
Tianqi Chen
Luis Vega
Jared Roesch
Eddie Q. Yan
...
Josh Fromm
Ziheng Jiang
Luis Ceze
Carlos Guestrin
Arvind Krishnamurthy
173
81
0
11 Jul 2018
Echo: Compiler-based GPU Memory Footprint Reduction for LSTM RNN Training
Bojian Zheng
Abhishek Tiwari
Nandita Vijaykumar
Gennady Pekhimenko
220
48
0
22 May 2018