arXiv:1909.13639
NeuroVectorizer: End-to-End Vectorization with Deep Reinforcement Learning
IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 2019
20 September 2019
Ameer Haj-Ali
Nesreen Ahmed
Theodore L. Willke
Sophia Shao
Krste Asanović
Ion Stoica
Papers citing "NeuroVectorizer: End-to-End Vectorization with Deep Reinforcement Learning"
42 / 42 papers shown
A Deep Learning Model for Predicting Transformation Legality
Avani Tiwari
Yacine Hakimi
Riyadh Baghdadi
155
1
0
08 Nov 2025
VibeCodeHPC: An Agent-Based Iterative Prompting Auto-Tuner for HPC Code Generation Using LLMs
Shun-ichiro Hayashi
Koki Morita
Daichi Mukunoki
Tetsuya Hoshino
Takahiro Katagiri
161
1
0
26 Sep 2025
VecTrans: Enhancing Compiler Auto-Vectorization through LLM-Assisted Code Transformations
Zhongchun Zheng
Long Cheng
Lu Li
Rodrigo C. O. Rocha
Tianyi Liu
Wei Wei
Jianjiang Zeng
Xianwei Zhang
Yaoqing Gao
242
0
0
25 Mar 2025
Enhancing Deployment-Time Predictive Model Robustness for Code Analysis and Optimization
IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 2024
Huanting Wang
Patrick Lenihan
Zheng Wang
204
1
0
03 Jan 2025
Scheduling Languages: A Past, Present, and Future Taxonomy
Mary Hall
Cosmin Oancea
Anne C. Elster
Ari Rasch
Sameeran Joshi
Amir Mohammad Tavakkoli
Richard Schulze
268
1
0
25 Oct 2024
Improving Parallel Program Performance with LLM Optimizers via Agent-System Interfaces
Anjiang Wei
Allen Nie
Diyi Yang
Rohan Yadav
Wonchan Lee
Ke Wang
Alex Aiken
432
0
0
21 Oct 2024
A Study of Performance Programming of CPU, GPU accelerated Computers and SIMD Architecture
Xinyao Yi
152
5
0
16 Sep 2024
FTuner: A Fast Dynamic Shape Tensors Program Auto-Tuner for Deep Learning Compilers
Pengyu Mu
Linquan Wei
Yi Liu
Rui Wang
254
2
0
31 Jul 2024
MIREncoder: Multi-modal IR-based Pretrained Embeddings for Performance Optimizations
Akash Dutta
Ali Jannesari
298
3
0
02 Jul 2024
LLM-Vectorizer: LLM-based Verified Loop Vectorizer
IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 2024
Jubi Taneja
Avery Laird
Cong Yan
Madan Musuvathi
Shuvendu K. Lahiri
199
31
0
07 Jun 2024
Compiler generated feedback for Large Language Models
Dejan Grubisic
Chris Cummins
Volker Seeker
Hugh Leather
203
14
0
18 Mar 2024
The Next 700 ML-Enabled Compiler Optimizations
S. VenkataKeerthy
Siddharth Jain
Umesh Kalvakuntla
Pranav Sai Gorantla
R. Chitale
E. Brevdo
Albert Cohen
Mircea Trofin
Ramakrishna Upadrasta
210
6
0
17 Nov 2023
YFlows: Systematic Dataflow Exploration and Code Generation for Efficient Neural Network Inference using SIMD Architectures on CPUs
International Conference on Compiler Construction (CC), 2023
Cyrus Zhou
Zack Hassman
Ruize Xu
Dhirpal Shah
Vaughn Richard
Yanjing Li
587
5
0
01 Oct 2023
Large Language Models for Compiler Optimization
Chris Cummins
Volker Seeker
Dejan Grubisic
Mostafa Elhoushi
Youwei Liang
...
Jonas Gehring
Fabian Gloeckle
Kim M. Hazelwood
Gabriel Synnaeve
Hugh Leather
248
87
0
11 Sep 2023
LoopTune: Optimizing Tensor Computations with Reinforcement Learning
Dejan Grubisic
Bram Wasti
Chris Cummins
John Mellor-Crummey
A. Zlateski
331
3
0
04 Sep 2023
Target-independent XLA optimization using Reinforcement Learning
Milan Ganai
Haichen Li
Theodore Enns
Yida Wang
Randy Huang
225
1
0
28 Aug 2023
ACC Saturator: Automatic Kernel Optimization for Directive-Based GPU Code
Kazuaki Matsumura
Simon Garcia De Gonzalo
Antonio J. Peña
246
1
0
22 Jun 2023
Performance Optimization using Multimodal Modeling and Heterogeneous GNN
IEEE International Symposium on High-Performance Parallel Distributed Computing (HPDC), 2023
Akashnil Dutta
J. Alcaraz
Ali TehraniJamsaz
E. César
A. Sikora
Ali Jannesari
258
14
0
25 Apr 2023
NPS: A Framework for Accurate Program Sampling Using Graph Neural Network
Yuanwei Fang
Zihao Liu
YanHeng Lu
Jiawei Liu
Jiajie Li
Yingqi Jin
Jing Chen
Yen-kuang Chen
Hongzhong Zheng
Yuan Xie
MLAU
123
5
0
18 Apr 2023
Power Constrained Autotuning using Graph Neural Networks
IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2023
Akashnil Dutta
JeeWhan Choi
Ali Jannesari
229
6
0
22 Feb 2023
ML-driven Hardware Cost Model for MLIR
Dibyendu Das
Sandya Mannarswamy
272
0
0
14 Feb 2023
Learning Compiler Pass Orders using Coreset and Normalized Value Prediction
International Conference on Machine Learning (ICML), 2023
Youwei Liang
Kevin R. Stone
A. Shameli
Chris Cummins
Mostafa Elhoushi
...
Benoit Steiner
Xiaomeng Yang
P. Xie
Hugh Leather
Yuandong Tian
318
26
0
09 Jan 2023
Compiler Optimization for Quantum Computing Using Reinforcement Learning
Design Automation Conference (DAC), 2022
Nils Quetschlich
Lukas Burgholzer
Robert Wille
192
35
0
08 Dec 2022
POSET-RL: Phase ordering for Optimizing Size and Execution Time using Reinforcement Learning
IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), 2022
Shalini Jain
Yashas Andaluri
S. VenkataKeerthy
Ramakrishna Upadrasta
144
15
0
27 Jul 2022
MLGOPerf: An ML Guided Inliner to Optimize Performance
Amir H. Ashouri
Mostafa Elhoushi
Yu-Wei Hua
Xiang Wang
Muhammad Asif Manzoor
Bryan Chan
Yaoqing Gao
270
19
0
18 Jul 2022
End-to-end Mapping in Heterogeneous Systems Using Graph Representation Learning
Yao Xiao
Guixiang Ma
Nesreen Ahmed
Mihai Capota
Ted Willke
Shahin Nazarian
P. Bogdan
237
2
0
25 Apr 2022
RL4ReAl: Reinforcement Learning for Register Allocation
International Conference on Compiler Construction (CC), 2022
S. VenkataKeerthy
Siddhartha Jain
Anilava Kundu
Rohit Aggarwal
Albert Cohen
Ramakrishna Upadrasta
OffRL
395
13
0
05 Apr 2022
Learning to Combine Instructions in LLVM Compiler
Sandya Mannarswamy
Dibyendu Das
233
6
0
22 Feb 2022
Profile Guided Optimization without Profiles: A Machine Learning Approach
Nadav Rotem
Chris Cummins
OffRL
279
11
0
24 Dec 2021
A Highly Configurable Hardware/Software Stack for DNN Inference Acceleration
Suvadeep Banerjee
Steve Burns
P. Cocchini
A. Davare
Shweta Jain
D. Kirkpatrick
A. Sorokin
Jin Yang
Zhenkun Yang
242
10
0
29 Nov 2021
Generating GPU Compiler Heuristics using Reinforcement Learning
Ian Colbert
Jake Daly
Norman Rubin
225
3
0
23 Nov 2021
CompilerGym: Robust, Performant Compiler Optimization Environments for AI Research
Chris Cummins
Bram Wasti
Jiadong Guo
Brandon Cui
Jason Ansel
...
Jia-Wei Liu
O. Teytaud
Benoit Steiner
Yuandong Tian
Hugh Leather
287
111
0
17 Sep 2021
Sonic: A Sampling-based Online Controller for Streaming Applications
Yan Pei
K. Pingali
187
0
0
15 Aug 2021
Customized Monte Carlo Tree Search for LLVM/Polly's Composable Loop Optimization Transformations
International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS), 2021
Jaehoon Koo
Dali Wang
Michael Kruse
Xingfu Wu
P. Hovland
Mary W. Hall
234
9
0
10 May 2021
A Reinforcement Learning Environment for Polyhedral Optimizations
Alexander Brauckmann
Andrés Goens
J. Castrillón
218
10
0
28 Apr 2021
MLGO: a Machine Learning Guided Compiler Optimizations Framework
Mircea Trofin
Yundi Qian
E. Brevdo
Zinan Lin
K. Choromanski
Didong Li
286
85
0
13 Jan 2021
Deep Data Flow Analysis
Chris Cummins
Hugh Leather
Zacharias V. Fisches
Tal Ben-Nun
Torsten Hoefler
Michael F. P. O'Boyle
212
7
0
21 Nov 2020
Ansor: Generating High-Performance Tensor Programs for Deep Learning
USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2020
Lianmin Zheng
Chengfan Jia
Minmin Sun
Zhao Wu
Cody Hao Yu
...
Jun Yang
Danyang Zhuo
Koushik Sen
Joseph E. Gonzalez
Ion Stoica
723
529
0
11 Jun 2020
ProTuner: Tuning Programs with Monte Carlo Tree Search
Ameer Haj-Ali
Hasan Genç
Qijing Huang
William S. Moses
J. Wawrzynek
Krste Asanović
Ion Stoica
221
30
0
27 May 2020
ProGraML: Graph-based Deep Learning for Program Optimization and Analysis
Chris Cummins
Zacharias V. Fisches
Tal Ben-Nun
Torsten Hoefler
Hugh Leather
371
67
0
23 Mar 2020
AutoPhase: Juggling HLS Phase Orderings in Random Forests with Deep Reinforcement Learning
Conference on Machine Learning and Systems (MLSys), 2020
Qijing Huang
Ameer Haj-Ali
William S. Moses
J. Xiang
Ion Stoica
Krste Asanović
J. Wawrzynek
309
84
0
02 Mar 2020
Benanza: Automatic μBenchmark Generation to Compute "Lower-bound" Latency and Inform Optimizations of Deep Learning Models on GPUs
IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2019
Cheng-rong Li
Abdul Dakkak
Jinjun Xiong
Wen-mei W. Hwu
320
11
0
16 Nov 2019