Learning to Optimize Tensor Programs

21 May 2018
Tianqi Chen
Lianmin Zheng
Eddie Q. Yan
Ziheng Jiang
T. Moreau
Luis Ceze
Carlos Guestrin
Arvind Krishnamurthy
arXiv:1805.08166

Papers citing "Learning to Optimize Tensor Programs"

50 / 147 papers shown
AGO: Boosting Mobile AI Inference Performance by Removing Constraints on Graph Optimization
Zhiying Xu
H. Peng
Wei Wang
GNN
78
3
0
02 Dec 2022
HARL: Hierarchical Adaptive Reinforcement Learning Based Auto Scheduler for Neural Networks
Zining Zhang
Bingsheng He
Zhenjie Zhang
52
5
0
21 Nov 2022
TLP: A Deep Learning-based Cost Model for Tensor Program Tuning
Yiqiang Zhai
Yu Zhang
Shuo Liu
Xiaomeng Chu
Jie Peng
Jianmin Ji
Yanyong Zhang
47
33
0
07 Nov 2022
Rethinking Storage Management for Data Processing Pipelines in Cloud Data Centers
Ubaid Ullah Hafeez
Martin Maas
Mustafa Uysal
Richard McDougall
27
0
0
04 Nov 2022
Exploring Effects of Computational Parameter Changes to Image Recognition Systems
Nikolaos Louloudakis
Perry Gibson
José Cano
A. Rajan
54
6
0
01 Nov 2022
ALCOP: Automatic Load-Compute Pipelining in Deep Learning Compiler for AI-GPUs
Guyue Huang
Yang Bai
Liu Liu
Yuke Wang
Bei Yu
Yufei Ding
Yuan Xie
88
18
0
29 Oct 2022
ALT: Boosting Deep Learning Performance by Breaking the Wall between Graph and Operator Level Optimizations
Zhiying Xu
Jiafan Xu
H. Peng
Wei Wang
Xiaoliang Wang
...
Haipeng Dai
Yixu Xu
Hao Cheng
Kun Wang
Guihai Chen
83
0
0
22 Oct 2022
Hidet: Task-Mapping Programming Paradigm for Deep Learning Tensor Programs
Yaoyao Ding
Cody Hao Yu
Bojian Zheng
Yizhi Liu
Yida Wang
Gennady Pekhimenko
82
32
0
18 Oct 2022
Demystifying Map Space Exploration for NPUs
Sheng-Chun Kao
A. Parashar
Po-An Tsai
T. Krishna
98
10
0
07 Oct 2022
Decompiling x86 Deep Neural Network Executables
Zhibo Liu
Yuanyuan Yuan
Shuai Wang
Xiaofei Xie
Lei Ma
AAML
76
15
0
03 Oct 2022
SONAR: Joint Architecture and System Optimization Search
Elias Jääsaari
Michelle Ma
Ameet Talwalkar
Tianqi Chen
66
1
0
25 Aug 2022
OLLIE: Derivation-based Tensor Program Optimizer
Liyan Zheng
Haojie Wang
Jidong Zhai
Muyan Hu
Zixuan Ma
Tuowei Wang
Shizhi Tang
Lei Xie
Kezhao Huang
Zhihao Jia
73
3
0
02 Aug 2022
NNSmith: Generating Diverse and Valid Test Cases for Deep Learning Compilers
Jiawei Liu
Jinkun Lin
Fabian Ruffy
Cheng Tan
Jinyang Li
Aurojit Panda
Lingming Zhang
158
74
0
26 Jul 2022
SparseTIR: Composable Abstractions for Sparse Compilation in Deep Learning
Zihao Ye
Ruihang Lai
Junru Shao
Tianqi Chen
Luis Ceze
125
98
0
11 Jul 2022
TensorIR: An Abstraction for Automatic Tensorized Program Optimization
Siyuan Feng
Bohan Hou
Hongyi Jin
Wuwei Lin
Junru Shao
...
Zihao Ye
Lianmin Zheng
Cody Hao Yu
Yong Yu
Tianqi Chen
57
68
0
09 Jul 2022
CPrune: Compiler-Informed Model Pruning for Efficient Target-Aware DNN Execution
Taeho Kim
Yongin Kwon
Jemin Lee
Taeho Kim
Sangtae Ha
37
2
0
04 Jul 2022
Productive Reproducible Workflows for DNNs: A Case Study for Industrial Defect Detection
Perry Gibson
José Cano
AI4CE
65
1
0
19 Jun 2022
HW-Aware Initialization of DNN Auto-Tuning to Improve Exploration Time and Robustness
D. Rieber
Moritz Reiber
Oliver Bringmann
Holger Fröning
69
5
0
31 May 2022
Tensor Program Optimization with Probabilistic Programs
Junru Shao
Xiyou Zhou
Siyuan Feng
Bohan Hou
Ruihang Lai
Hongyi Jin
Wuwei Lin
Masahiro Masuda
Cody Hao Yu
Tianqi Chen
88
31
0
26 May 2022
LoopStack: a Lightweight Tensor Algebra Compiler Stack
Bram Wasti
J. Cambronero
Benoit Steiner
Hugh Leather
A. Zlateski
26
3
0
02 May 2022
Bifrost: End-to-End Evaluation and Optimization of Reconfigurable DNN Accelerators
Axel Stjerngren
Perry Gibson
José Cano
83
4
0
26 Apr 2022
Learning from distinctive candidates to optimize reduced-precision convolution program on tensor cores
Junkyeong Choi
Hyucksung Kwon
W. Lee
Jungwook Choi
Jieun Lim
66
1
0
11 Feb 2022
Flashlight: Enabling Innovation in Tools for Machine Learning
Jacob Kahn
Vineel Pratap
Tatiana Likhomanenko
Qiantong Xu
Awni Y. Hannun
...
Gilad Avidov
Benoit Steiner
Vitaliy Liptchinsky
Gabriel Synnaeve
R. Collobert
132
33
0
29 Jan 2022
DNNFuser: Generative Pre-Trained Transformer as a Generalized Mapper for Layer Fusion in DNN Accelerators
Sheng-Chun Kao
Xiaoyu Huang
T. Krishna
AI4CE
92
9
0
26 Jan 2022
VELTAIR: Towards High-Performance Multi-tenant Deep Learning Services via Adaptive Compilation and Scheduling
Zihan Liu
Jingwen Leng
Zhihui Zhang
Quan Chen
Chao Li
Minyi Guo
61
47
0
17 Jan 2022
Moses: Efficient Exploitation of Cross-device Transferable Features for Tensor Program Optimization
Zhihe Zhao
Xian Shuai
Yang Bai
Neiwen Ling
Nan Guan
Zhenyu Yan
Guoliang Xing
98
6
0
15 Jan 2022
Transfer-Tuning: Reusing Auto-Schedules for Efficient Tensor Program Code Generation
Perry Gibson
José Cano
65
12
0
14 Jan 2022
BoGraph: Structured Bayesian Optimization From Logs for Expensive Systems with Many Parameters
Sami Alabed
Eiko Yoneki
64
7
0
16 Dec 2021
A Highly Configurable Hardware/Software Stack for DNN Inference Acceleration
Suvadeep Banerjee
Steve Burns
P. Cocchini
A. Davare
Shweta Jain
D. Kirkpatrick
A. Sorokin
Jin Yang
Zhenkun Yang
79
9
0
29 Nov 2021
Collage: Seamless Integration of Deep Learning Backends with Automatic Placement
Byungsoo Jeon
Sunghyun Park
Peiyuan Liao
Sheng Xu
Tianqi Chen
Zhihao Jia
VLM
69
5
0
01 Nov 2021
Characterizing and Taming Resolution in Convolutional Neural Networks
Eddie Q. Yan
Liang Luo
Luis Ceze
55
0
0
28 Oct 2021
Bolt: Bridging the Gap between Auto-tuners and Hardware-native Performance
Jiarong Xing
Leyuan Wang
Shang Zhang
Jack H Chen
Ang Chen
Yibo Zhu
69
44
0
25 Oct 2021
The CoRa Tensor Compiler: Compilation for Ragged Tensors with Minimal Padding
Pratik Fegade
Tianqi Chen
Phillip B. Gibbons
T. Mowry
76
29
0
19 Oct 2021
SoftNeuro: Fast Deep Inference using Multi-platform Optimization
Masaki Hilaga
Yasuhiro Kuroda
Hitoshi Matsuo
Tatsuya Kawaguchi
Gabriel Ogawa
Hiroshi Miyake
Yusuke Kozawa
43
1
0
12 Oct 2021
SECDA: Efficient Hardware/Software Co-Design of FPGA-based DNN Accelerators for Edge Inference
Jude Haris
Perry Gibson
José Cano
Nicolas Bohm Agostini
David Kaeli
91
19
0
01 Oct 2021
Learning to Superoptimize Real-world Programs
Alex Shypula
Pengcheng Yin
Jeremy Lacomis
Claire Le Goues
Edward N. Schwartz
Graham Neubig
NAI
145
10
0
28 Sep 2021
DNNFusion: Accelerating Deep Neural Networks Execution with Advanced Operator Fusion
Wei Niu
Jiexiong Guan
Yanzhi Wang
G. Agrawal
Bin Ren
AI4CE
76
152
0
30 Aug 2021
Using Graph Neural Networks to model the performance of Deep Neural Networks
Shikhar Singh
Benoit Steiner
James Hegarty
Hugh Leather
GNN
38
3
0
27 Aug 2021
AIRCHITECT: Learning Custom Architecture Design and Mapping Space
A. Samajdar
J. Joseph
Matthew Denton
T. Krishna
125
7
0
16 Aug 2021
NeurObfuscator: A Full-stack Obfuscation Tool to Mitigate Neural Architecture Stealing
Jingtao Li
Zhezhi He
Adnan Siraj Rakin
Deliang Fan
C. Chakrabarti
55
26
0
20 Jul 2021
FLAT: An Optimized Dataflow for Mitigating Attention Bottlenecks
Sheng-Chun Kao
Suvinay Subramanian
Gaurav Agrawal
Amir Yazdanbakhsh
T. Krishna
128
64
0
13 Jul 2021
RHNAS: Realizable Hardware and Neural Architecture Search
Yash Akhauri
Adithya Niranjan
J. P. Muñoz
Suvadeep Banerjee
A. Davare
P. Cocchini
A. Sorokin
R. Iyer
Nilesh Jain
49
3
0
17 Jun 2021
NAAS: Neural Accelerator Architecture Search
Yujun Lin
Mengtian Yang
Song Han
84
60
0
27 May 2021
Customized Monte Carlo Tree Search for LLVM/Polly's Composable Loop Optimization Transformations
Jaehoon Koo
Prasanna Balaprakash
Michael Kruse
Xingfu Wu
P. Hovland
Mary W. Hall
68
7
0
10 May 2021
HASCO: Towards Agile HArdware and Software CO-design for Tensor Computation
Qingcheng Xiao
Wenlei Bao
Bingzhe Wu
Pengcheng Xu
Xuehai Qian
Yun Liang
122
69
0
04 May 2021
Bring Your Own Codegen to Deep Learning Compiler
Zhi Chen
Cody Hao Yu
Trevor Morris
Jorn Tuyls
Yi-Hsiang Lai
Jared Roesch
Elliott Delaye
Vin Sharma
Yida Wang
45
15
0
03 May 2021
Tuna: A Static Analysis Approach to Optimizing Deep Neural Networks
Yao Wang
Xingyu Zhou
Yanming Wang
Rui Li
Yong Wu
Vin Sharma
74
8
0
29 Apr 2021
A Deep Learning Based Cost Model for Automatic Code Optimization
Riyadh Baghdadi
Massinissa Merouani
Mohamed-Hicham Leghettas
K. Abdous
T. Arbaoui
K. Benatchba
Saman P. Amarasinghe
81
71
0
11 Apr 2021
Joint Program and Layout Transformations to enable Convolutional Operators on Specialized Hardware based on Constraint Programming
D. Rieber
Axel Acosta
Holger Fröning
25
0
0
10 Apr 2021
Automated Backend-Aware Post-Training Quantization
Ziheng Jiang
Animesh Jain
An Liu
Josh Fromm
Chengqian Ma
Tianqi Chen
Luis Ceze
MQ
72
2
0
27 Mar 2021