Intel nGraph: An Intermediate Representation, Compiler, and Executor for Deep Learning

24 January 2018

Arjun K. Bansal

Anahita Bhiwandiwalla

Avijit Chakraborty

William Constable

Christian Convey

Nikolay Korovaiko

Sandeep Aswath Narayana

Papers citing "Intel nGraph: An Intermediate Representation, Compiler, and Executor for Deep Learning"

48 / 48 papers shown

vMCU: Coordinated Memory Management and Kernel Optimization for DNN Inference on MCUs

151

4

0

01 May 2024

Proteus: Preserving Model Confidentiality during Graph Optimizations

Maryam Haghifam

Christina Giannoula

Gennady Pekhimenko

Nandita Vijaykumar

278

1

0

18 Apr 2024

CIM-MLC: A Multi-level Compilation Stack for Computing-In-Memory Accelerators

International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2024

161

9

0

23 Jan 2024

PolyTOPS: Reconfigurable and Flexible Polyhedral Scheduler

IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 2024

Gianpietro Consolaro

Harenome Razanajato

Nassim Tchoulak

...

Artur Cesar Araujo Alves

Corinne Ancourt

Cédric Bastoul

112

3

0

12 Jan 2024

GEVO-ML: Optimizing Machine Learning Code with Evolutionary Computation

Stephanie Forrest

271

0

0

16 Oct 2023

LoopTune: Optimizing Tensor Computations with Reinforcement Learning

John Mellor-Crummey

302

3

0

04 Sep 2023

DyCL: Dynamic Neural Network Compilation Via Program Rewriting and Graph Optimization

International Symposium on Software Testing and Analysis (ISSTA), 2023

203

12

0

11 Jul 2023

CMLCompiler: A Unified Compiler for Classical Machine Learning

International Conference on Supercomputing (ICS), 2023

289

1

0

31 Jan 2023

TLP: A Deep Learning-based Cost Model for Tensor Program Tuning

International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2022

197

52

0

07 Nov 2022

ALT: Boosting Deep Learning Performance by Breaking the Wall between Graph and Operator Level Optimizations

...

338

1

0

22 Oct 2022

Optimizing DNN Compilation for Distributed Training with Joint OP and Tensor Fusion

IEEE Transactions on Parallel and Distributed Systems (TPDS), 2022

Zhen Zheng

161

6

0

26 Sep 2022

Dropbear: Machine Learning Marketplaces made Trustworthy with Byzantine Model Agreement

Peter R. Pietzuch

Antoine Delignat-Lavaud

121

0

0

31 May 2022

GraphQ IR: Unifying the Semantic Parsing of Graph Query Languages with One Intermediate Representation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

Juanzi Li

202

27

0

24 May 2022

Learning to Reverse DNNs from AI Programs Automatically

201

19

0

20 May 2022

Benchmarking the Linear Algebra Awareness of TensorFlow and PyTorch

IEEE International Symposium on Parallel & Distributed Processing, Workshops and PhD Forum (IPDPSW), 2022

Aravind Sankaran

Navid Akbari Alashti

Paolo Bientinesi

132

7

0

20 Feb 2022

Quantune: Post-training Quantization of Convolutional Neural Networks using Extreme Gradient Boosting for Fast Deployment

Future Generation Computer Systems (FGCS), 2022

210

22

0

10 Feb 2022

FamilySeer: Towards Optimized Tensor Codes by Exploiting Computation Subgraph Similarity

206

1

0

01 Jan 2022

Collage: Seamless Integration of Deep Learning Backends with Automatic Placement

International Conference on Parallel Architectures and Compilation Techniques (PACT), 2021

291

7

0

01 Nov 2021

A Data-Centric Optimization Framework for Machine Learning

International Conference on Supercomputing (ICS), 2021

Torsten Hoefler

312

19

0

20 Oct 2021

Joint Program and Layout Transformations to enable Convolutional Operators on Specialized Hardware based on Constraint Programming

ACM Transactions on Architecture and Code Optimization (TACO), 2021

Holger Fröning

254

0

0

10 Apr 2021

Enabling Homomorphically Encrypted Inference for Large DNN Models

IEEE Transactions on Computers (IEEE Trans. Comput.), 2021

Guillermo Lloret-Talavera

Antonio J. Peña

264

34

0

30 Mar 2021

Compiler Toolchains for Deep Learning Workloads on Embedded Platforms

Bernd Waschneck

183

7

0

08 Mar 2021

Neural Architecture Search as Program Transformation Exploration

International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2021

Elliot J. Crowley

Michael F. P. O'Boyle

213

23

0

12 Feb 2021

SoK: Fully Homomorphic Encryption Compilers

IEEE Symposium on Security and Privacy (IEEE S&P), 2021

Alexander Viand

174

113

0

18 Jan 2021

NNStreamer: Efficient and Agile Development of On-Device AI Systems

...

Parichay Kapoor

124

6

0

16 Jan 2021

Nimble: Lightweight and Parallel GPU Task Scheduling for Deep Learning

Neural Information Processing Systems (NeurIPS), 2020

188

86

0

04 Dec 2020

Fusion-Catalyzed Pruning for Optimizing Deep Learning on Intelligent Edge Devices

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 2020

226

37

0

30 Oct 2020

Optimising AI Training Deployments using Graph Compilers and Containers

IEEE Conference on High Performance Extreme Computing (HPEC), 2020

Nina Mujkanovic

118

2

0

26 Aug 2020

Woodpecker-DL: Accelerating Deep Neural Networks via Hardware-Aware Multifaceted Optimizations

258

1

0

11 Aug 2020

Optimizing Memory Placement using Evolutionary Graph Reinforcement Learning

International Conference on Learning Representations (ICLR), 2020

Shauharda Khadka

Avrech Ben-David

Somdeb Majumdar

195

13

0

14 Jul 2020

Data Movement Is All You Need: A Case Study on Optimizing Transformers

Torsten Hoefler

447

175

0

30 Jun 2020

Efficient Execution of Quantized Deep Learning Models: A Compiler Approach

Shoubhik Bhattacharya

Masahiro Masuda

287

38

0

18 Jun 2020

Vyasa: A High-Performance Vectorizing Compiler for Tensor Convolutions on the Xilinx AI Engine

IEEE Conference on High Performance Extreme Computing (HPEC), 2020

Prasanth Chatarasi

S. Neuendorffer

100

22

0

02 Jun 2020

FlexSA: Flexible Systolic Array Architecture for Efficient Pruned DNN Model Training

158

27

0

27 Apr 2020

SOL: Effortless Device Support for AI Frameworks without Source Code Changes

IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGrid), 2020

136

3

0

24 Mar 2020

Ordering Chaos: Memory-Aware Scheduling of Irregularly Wired Neural Networks for Edge Devices

Conference on Machine Learning and Systems (MLSys), 2020

H. Esmaeilzadeh

204

58

0

04 Mar 2020

A C Code Generator for Fast Inference and Simple Deployment of Convolutional Neural Networks on Resource Constrained Systems

Simon Camphausen

110

5

0

14 Jan 2020

MIOpen: An Open Source Library For Deep Learning Primitives

...

Vasilii Filippov

Bragadeesh Natarajan

242

48

0

30 Sep 2019

Benchmarking the Performance and Energy Efficiency of AI Accelerators for AI Training

Qiang-qiang Wang

427

4

0

15 Sep 2019

TapirXLA: Embedding Fork-Join Parallelism into the XLA Compiler in TensorFlow Using Tapir

IEEE Conference on High Performance Extreme Computing (HPEC), 2019

127

5

0

29 Aug 2019

nGraph-HE2: A High-Throughput Framework for Neural Network Inference on Encrypted Data

IACR Cryptology ePrint Archive (IACR ePrint), 2019

Anamaria Costache

Rosario Cammarota

Casimir Wierzynski

353

192

0

12 Aug 2019

swTVM: Towards Optimized Tensor Code Generation for Deep Learning on Sunway Many-Core Processor

...

219

3

0

16 Apr 2019

Stripe: Tensor Compilation via the Nested Polyhedral Model

154

34

0

14 Mar 2019

DNNVM : End-to-End Compiler Leveraging Heterogeneous Optimizations on FPGA-based CNN Accelerators

185

79

0

20 Feb 2019

nGraph-HE: A Graph Compiler for Deep Learning on Homomorphically Encrypted Data

Rosario Cammarota

Casimir Wierzynski

390

195

0

23 Oct 2018

Optimizing CNN Model Inference on CPUs

288

169

0

07 Sep 2018

A Hardware-Software Blueprint for Flexible Deep Learning Specialization

...

Carlos Guestrin

Arvind Krishnamurthy

173

81

0

11 Jul 2018

Echo: Compiler-based GPU Memory Footprint Reduction for LSTM RNN Training

Abhishek Tiwari

Nandita Vijaykumar

Gennady Pekhimenko

220

48

0

22 May 2018
