
Intel nGraph: An Intermediate Representation, Compiler, and Executor for Deep Learning

24 January 2018
D. S. Cyphers
Arjun K. Bansal
Anahita Bhiwandiwalla
J. Bobba
M. Brookhart
Avijit Chakraborty
William Constable
Christian Convey
Leona Cook
Omar Kanawi
R. Kimball
Jason Knight
Nikolay Korovaiko
V. Vijay
Yixing Lao
C. Lishka
J. Menon
Jennifer Myers
Sandeep Aswath Narayana
A. Procter
T. Webb
    GNN

Papers citing "Intel nGraph: An Intermediate Representation, Compiler, and Executor for Deep Learning"

48 / 48 papers shown
vMCU: Coordinated Memory Management and Kernel Optimization for DNN Inference on MCUs
Wenlei Bao
Renze Chen
Meng Li
Zihao Ye
Luis Ceze
Yun Liang
151
4
0
01 May 2024
Proteus: Preserving Model Confidentiality during Graph Optimizations
Yubo Gao
Maryam Haghifam
Christina Giannoula
Renbo Tu
Gennady Pekhimenko
Nandita Vijaykumar
AAML
278
1
0
18 Apr 2024
CIM-MLC: A Multi-level Compilation Stack for Computing-In-Memory Accelerators
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2024
Songyun Qu
Shixin Zhao
Bing Li
Yintao He
Xuyi Cai
Lei Zhang
Ying Wang
161
9
0
23 Jan 2024
PolyTOPS: Reconfigurable and Flexible Polyhedral Scheduler
IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 2024
Gianpietro Consolaro
Zhen Zhang
Harenome Razanajato
Nelson Lossing
Nassim Tchoulak
...
Artur Cesar Araujo Alves
Renwei Zhang
Denis Barthou
Corinne Ancourt
Cédric Bastoul
112
3
0
12 Jan 2024
GEVO-ML: Optimizing Machine Learning Code with Evolutionary Computation
Jhe-Yu Liou
Stephanie Forrest
Carole-Jean Wu
VLM
271
0
0
16 Oct 2023
LoopTune: Optimizing Tensor Computations with Reinforcement Learning
Dejan Grubisic
Bram Wasti
Chris Cummins
John Mellor-Crummey
A. Zlateski
302
3
0
04 Sep 2023
DyCL: Dynamic Neural Network Compilation Via Program Rewriting and Graph Optimization
International Symposium on Software Testing and Analysis (ISSTA), 2023
Simin Chen
Shiyi Wei
Cong Liu
Wei Yang
203
12
0
11 Jul 2023
CMLCompiler: A Unified Compiler for Classical Machine Learning
International Conference on Supercomputing (ICS), 2023
Xu Wen
Wanling Gao
An-Dong Li
Lei Wang
Zihan Jiang
Jianfeng Zhan
289
1
0
31 Jan 2023
TLP: A Deep Learning-based Cost Model for Tensor Program Tuning
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2022
Yiqiang Zhai
Yu Zhang
Shuo Liu
Xiaomeng Chu
Jie Peng
Jianmin Ji
Yanyong Zhang
197
52
0
07 Nov 2022
ALT: Boosting Deep Learning Performance by Breaking the Wall between Graph and Operator Level Optimizations
Zhiying Xu
Jiafan Xu
H. Peng
Wei Wang
Xiaoliang Wang
...
Haipeng Dai
Yixu Xu
Hao Cheng
Kun Wang
Guihai Chen
338
1
0
22 Oct 2022
Optimizing DNN Compilation for Distributed Training with Joint OP and Tensor Fusion
IEEE Transactions on Parallel and Distributed Systems (TPDS), 2022
Xiaodong Yi
Shiwei Zhang
Lansong Diao
Chuan Wu
Zhen Zheng
Shiqing Fan
Siyu Wang
Jun Yang
W. Lin
161
6
0
26 Sep 2022
Dropbear: Machine Learning Marketplaces made Trustworthy with Byzantine Model Agreement
A. Shamis
Peter R. Pietzuch
Antoine Delignat-Lavaud
Andrew Paverd
Manuel Costa
OOD
121
0
0
31 May 2022
GraphQ IR: Unifying the Semantic Parsing of Graph Query Languages with One Intermediate Representation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
L. Nie
S. Cao
Jiaxin Shi
Jiu Sun
Qingwen Tian
Lei Hou
Juanzi Li
Jidong Zhai
GNN
202
27
0
24 May 2022
Learning to Reverse DNNs from AI Programs Automatically
Simin Chen
Hamed Khanpour
Cong Liu
Wei Yang
201
19
0
20 May 2022
Benchmarking the Linear Algebra Awareness of TensorFlow and PyTorch
IEEE International Symposium on Parallel & Distributed Processing, Workshops and PhD Forum (IPDPSW), 2022
Aravind Sankaran
Navid Akbari Alashti
C. Psarras
Paolo Bientinesi
132
7
0
20 Feb 2022
Quantune: Post-training Quantization of Convolutional Neural Networks using Extreme Gradient Boosting for Fast Deployment
Future Generation Computer Systems (FGCS), 2022
Jemin Lee
Misun Yu
Yongin Kwon
Teaho Kim
MQ
210
22
0
10 Feb 2022
FamilySeer: Towards Optimized Tensor Codes by Exploiting Computation Subgraph Similarity
Shanjun Zhang
Mingzhen Li
Hailong Yang
Yi Liu
Zhongzhi Luan
D. Qian
206
1
0
01 Jan 2022
Collage: Seamless Integration of Deep Learning Backends with Automatic Placement
International Conference on Parallel Architectures and Compilation Techniques (PACT), 2021
Byungsoo Jeon
Sunghyun Park
Peiyuan Liao
Sheng Xu
Tianqi Chen
Zhihao Jia
VLM
291
7
0
01 Nov 2021
A Data-Centric Optimization Framework for Machine Learning
International Conference on Supercomputing (ICS), 2021
Oliver Rausch
Tal Ben-Nun
Nikoli Dryden
Andrei Ivanov
Shigang Li
Torsten Hoefler
AI4CE
312
19
0
20 Oct 2021
Joint Program and Layout Transformations to enable Convolutional Operators on Specialized Hardware based on Constraint Programming
ACM Transactions on Architecture and Code Optimization (TACO), 2021
D. Rieber
Axel Acosta
Holger Fröning
254
0
0
10 Apr 2021
Enabling Homomorphically Encrypted Inference for Large DNN Models
IEEE Transactions on Computers (IEEE Trans. Comput.), 2021
Guillermo Lloret-Talavera
Marc Jordà
Harald Servat
Fabian Boemer
C. Chauhan
S. Tomishima
Nilesh N. Shah
Antonio J. Peña
AI4CE FedML
264
34
0
30 Mar 2021
Compiler Toolchains for Deep Learning Workloads on Embedded Platforms
Max Sponner
Bernd Waschneck
Akash Kumar
MQ
183
7
0
08 Mar 2021
Neural Architecture Search as Program Transformation Exploration
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2021
Jack Turner
Elliot J. Crowley
Michael F. P. O'Boyle
213
23
0
12 Feb 2021
SoK: Fully Homomorphic Encryption Compilers
IEEE Symposium on Security and Privacy (IEEE S&P), 2021
Alexander Viand
Patrick Jattke
Anwar Hithnawi
174
113
0
18 Jan 2021
NNStreamer: Efficient and Agile Development of On-Device AI Systems
MyungJoo Ham
Jijoong Moon
Geunsik Lim
Jaeyun Jung
Hyoungjoo Ahn
...
Parichay Kapoor
Dongju Chae
Gichan Jang
Y. Ahn
Jihoon Lee
124
6
0
16 Jan 2021
Nimble: Lightweight and Parallel GPU Task Scheduling for Deep Learning
Neural Information Processing Systems (NeurIPS), 2020
Woosuk Kwon
Gyeong-In Yu
Eunji Jeong
Byung-Gon Chun
188
86
0
04 Dec 2020
Fusion-Catalyzed Pruning for Optimizing Deep Learning on Intelligent Edge Devices
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 2020
Guangli Li
Xiu Ma
Xueying Wang
Lei Liu
Jingling Xue
Xiaobing Feng
226
37
0
30 Oct 2020
Optimising AI Training Deployments using Graph Compilers and Containers
IEEE Conference on High Performance Extreme Computing (HPEC), 2020
Nina Mujkanovic
K. Sivalingam
A. Lazzaro
GNN
118
2
0
26 Aug 2020
Woodpecker-DL: Accelerating Deep Neural Networks via Hardware-Aware Multifaceted Optimizations
Yongchao Liu
Yue Jin
Yongqi Chen
Teng Teng
Hang Ou
Rui Zhao
Yao Zhang
258
1
0
11 Aug 2020
Optimizing Memory Placement using Evolutionary Graph Reinforcement Learning
International Conference on Learning Representations (ICLR), 2020
Shauharda Khadka
Estelle Aflalo
Mattias Marder
Avrech Ben-David
Santiago Miret
Shie Mannor
Tamir Hazan
Hanlin Tang
Somdeb Majumdar
GNN
195
13
0
14 Jul 2020
Data Movement Is All You Need: A Case Study on Optimizing Transformers
A. Ivanov
Nikoli Dryden
Tal Ben-Nun
Shigang Li
Torsten Hoefler
447
175
0
30 Jun 2020
Efficient Execution of Quantized Deep Learning Models: A Compiler Approach
Animesh Jain
Shoubhik Bhattacharya
Masahiro Masuda
Vin Sharma
Yida Wang
MQ
287
38
0
18 Jun 2020
Vyasa: A High-Performance Vectorizing Compiler for Tensor Convolutions on the Xilinx AI Engine
IEEE Conference on High Performance Extreme Computing (HPEC), 2020
Prasanth Chatarasi
S. Neuendorffer
Samuel Bayliss
K. Vissers
Vivek Sarkar
100
22
0
02 Jun 2020
FlexSA: Flexible Systolic Array Architecture for Efficient Pruned DNN Model Training
Sangkug Lym
M. Erez
158
27
0
27 Apr 2020
SOL: Effortless Device Support for AI Frameworks without Source Code Changes
IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGrid), 2020
Nicolas Weber
Felipe Huici
136
3
0
24 Mar 2020
Ordering Chaos: Memory-Aware Scheduling of Irregularly Wired Neural Networks for Edge Devices
Conference on Machine Learning and Systems (MLSys), 2020
Byung Hoon Ahn
Jinwon Lee
J. Lin
Hsin-Pai Cheng
Jilei Hou
H. Esmaeilzadeh
204
58
0
04 Mar 2020
A C Code Generator for Fast Inference and Simple Deployment of Convolutional Neural Networks on Resource Constrained Systems
Oliver Urbann
Simon Camphausen
Arne Moos
Ingmar Schwarz
Sören Kerner
M. Otten
110
5
0
14 Jan 2020
MIOpen: An Open Source Library For Deep Learning Primitives
Jehandad Khan
Paul Fultz
Artem Tamazov
Daniel Lowell
Chao-Jung Liu
...
Vasilii Filippov
Jing Zhang
Jing Zhou
Bragadeesh Natarajan
Mayank Daga
VLM MoE
242
48
0
30 Sep 2019
Benchmarking the Performance and Energy Efficiency of AI Accelerators for AI Training
Yuxin Wang
Qiang-qiang Wang
Shaoshuai Shi
Xin He
Zhenheng Tang
Kaiyong Zhao
Xiaowen Chu
427
4
0
15 Sep 2019
TapirXLA: Embedding Fork-Join Parallelism into the XLA Compiler in TensorFlow Using Tapir
IEEE Conference on High Performance Extreme Computing (HPEC), 2019
S. Samsi
Michael Houle
127
5
0
29 Aug 2019
nGraph-HE2: A High-Throughput Framework for Neural Network Inference on Encrypted Data
IACR Cryptology ePrint Archive (IACR ePrint), 2019
Fabian Boemer
Anamaria Costache
Rosario Cammarota
Casimir Wierzynski
GNN
353
192
0
12 Aug 2019
swTVM: Towards Optimized Tensor Code Generation for Deep Learning on Sunway Many-Core Processor
Mingzhen Li
Changxi Liu
Jian-He Liao
Xuegui Zheng
Hailong Yang
...
Jun Xu
L. Gan
Guangwen Yang
Zhongzhi Luan
D. Qian
219
3
0
16 Apr 2019
Stripe: Tensor Compilation via the Nested Polyhedral Model
Tim Zerrell
J. Bruestle
154
34
0
14 Mar 2019
DNNVM : End-to-End Compiler Leveraging Heterogeneous Optimizations on FPGA-based CNN Accelerators
Yu Xing
Shuang Liang
Lingzhi Sui
Xijie Jia
Jiantao Qiu
Xin Liu
Yushun Wang
Yu Wang
Yi Shan
185
79
0
20 Feb 2019
nGraph-HE: A Graph Compiler for Deep Learning on Homomorphically Encrypted Data
Fabian Boemer
Yixing Lao
Rosario Cammarota
Casimir Wierzynski
FedML
390
195
0
23 Oct 2018
Optimizing CNN Model Inference on CPUs
Yizhi Liu
Yao Wang
Ruofei Yu
Mu Li
Vin Sharma
Yida Wang
288
169
0
07 Sep 2018
A Hardware-Software Blueprint for Flexible Deep Learning Specialization
T. Moreau
Tianqi Chen
Luis Vega
Jared Roesch
Eddie Q. Yan
...
Josh Fromm
Ziheng Jiang
Luis Ceze
Carlos Guestrin
Arvind Krishnamurthy
173
81
0
11 Jul 2018
Echo: Compiler-based GPU Memory Footprint Reduction for LSTM RNN Training
Bojian Zheng
Abhishek Tiwari
Nandita Vijaykumar
Gennady Pekhimenko
220
48
0
22 May 2018