2002.11054
Cited By
MLIR: A Compiler Infrastructure for the End of Moore's Law
25 February 2020
Chris Lattner
M. Amini
Uday Bondhugula
Albert Cohen
Andy Davis
J. Pienaar
River Riddle
T. Shpeisman
Nicolas Vasilache
O. Zinenko
VLM
Papers citing
"MLIR: A Compiler Infrastructure for the End of Moore's Law"
50 / 75 papers shown
SkyEgg: Joint Implementation Selection and Scheduling for Hardware Synthesis using E-graphs
International Conference on Information Photonics (ICIP), 2024
Youwei Xiao
Yuyang Zou
Yun Liang
78
0
0
19 Nov 2025
TurkEmbed4Retrieval: Turkish Embedding Model for Retrieval Task
Özay Ezerceli
Gizem Gümüşçekiçci
Tuğba Erkoç
Berke Özenç
RALM
135
0
0
10 Nov 2025
Resource Estimation of CGGI and CKKS scheme workloads on FracTLcore Computing Fabric
Denis Ovichinnikov
Hemant Kavadia
Satya Keerti Chand Kudupudi
Ilya Rempel
Vineet Chadha
...
Paul Master
Craig Gentry
Darlene Kindler
Alberto Reyes
Muthu Annamalai
89
0
0
15 Oct 2025
LightCode: Compiling LLM Inference for Photonic-Electronic Systems
Ryan Tomich
Zhizhen Zhong
Dirk Englund
115
0
0
19 Sep 2025
GraphMend: Code Transformations for Fixing Graph Breaks in PyTorch 2
Savini Kashmira
Jayanaka L. Dantanarayana
Thamirawaran Sathiyalogeswaran
Yichao Yuan
Nishil Talati
Krisztian Flautner
Lingjia Tang
Jason Mars
201
1
0
17 Sep 2025
Astra: A Multi-Agent System for GPU Kernel Performance Optimization
Anjiang Wei
Tianran Sun
Yogesh Seenichamy
Hang Song
Anne Ouyang
Azalia Mirhoseini
Ke Wang
Alex Aiken
240
28
0
09 Sep 2025
Accelerating GenAI Workloads by Enabling RISC-V Microkernel Support in IREE
Adeel Ahmad
Ahmad Tameem Kamal
Nouman Amir
Bilal Zafar
Saad Bin Nasir
LRM
75
0
0
07 Jul 2025
DiTOX: Fault Detection and Localization in the ONNX Optimizer
Nikolaos Louloudakis
Ajitha Rajan
598
2
0
03 May 2025
Rulebook: bringing co-routines to reinforcement learning environments
Massimo Fioravanti
Samuele Pasini
Giovanni Agosta
218
1
0
28 Apr 2025
Morphing-based Compression for Data-centric ML Pipelines
Sebastian Baunsgaard
Matthias Boehm
258
0
0
15 Apr 2025
DSP-MLIR: A MLIR Dialect for Digital Signal Processing
ACM SIGPLAN Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES), 2024
Abhinav Kumar
Atharva Khedkar
Aviral Shrivastava
104
3
0
20 Aug 2024
vMCU: Coordinated Memory Management and Kernel Optimization for DNN Inference on MCUs
Wenlei Bao
Renze Chen
Meng Li
Zihao Ye
Luis Ceze
Yun Liang
161
5
0
01 May 2024
UniSparse: An Intermediate Language for General Sparse Format Customization
Jie Liu
Zhongyuan Zhao
Zijian Ding
Benjamin Brock
Hongbo Rong
Zhiru Zhang
188
10
0
09 Mar 2024
Architectural Neural Backdoors from First Principles
IEEE Symposium on Security and Privacy (S&P), 2024
Harry Langford
Ilia Shumailov
Yiren Zhao
Robert D. Mullins
Nicolas Papernot
AAML
268
11
0
10 Feb 2024
PolyTOPS: Reconfigurable and Flexible Polyhedral Scheduler
IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 2024
Gianpietro Consolaro
Zhen Zhang
Harenome Razanajato
Nelson Lossing
Nassim Tchoulak
...
Artur Cesar Araujo Alves
Renwei Zhang
Denis Barthou
Corinne Ancourt
Cédric Bastoul
119
3
0
12 Jan 2024
HElium: A Language and Compiler for Fully Homomorphic Encryption with Support for Proxy Re-Encryption
Mirko Günther
L. Schütze
Kilian Becher
Thorsten Strufe
J. Castrillón
101
5
0
21 Dec 2023
Zero Bubble Pipeline Parallelism
Penghui Qi
Xinyi Wan
Guangxing Huang
Jialin Li
283
48
0
30 Nov 2023
XLB: A differentiable massively parallel lattice Boltzmann library in Python
Computer Physics Communications (CPC), 2023
Mohammadmehdi Ataei
H. Salehipour
AI4CE
457
33
0
27 Nov 2023
Neuromorphic Intermediate Representation: A Unified Instruction Set for Interoperable Brain-Inspired Computing
Nature Communications (Nat. Commun.), 2023
Jens Egholm Pedersen
Steven Abreu
Matthias Jobst
Gregor Lenz
Vittorio Fra
...
Gianvito Urgese
Sadasivan Shankar
Terrence C. Stewart
Nhan Duy Truong
Sadique Sheik
342
57
0
24 Nov 2023
CDMPP: A Device-Model Agnostic Framework for Latency Prediction of Tensor Programs
Hanpeng Hu
Junwei Su
Juntao Zhao
Size Zheng
Yibo Zhu
Yanghua Peng
Chuan Wu
407
7
0
16 Nov 2023
Automatic Generators for a Family of Matrix Multiplication Routines with Apache TVM
ACM Transactions on Mathematical Software (TOMS), 2023
Guillermo Alaejos
Adrián Castelló
P. Alonso-Jordá
Francisco D. Igual
Héctor J. Martínez
Enrique S. Quintana-Ortí
224
7
0
31 Oct 2023
Tackling the Matrix Multiplication Micro-kernel Generation with Exo
IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 2023
Adrián Castelló
Julian Bellavita
Grace Dinh
Yuka Ikarashi
Héctor J. Martínez
249
9
0
26 Oct 2023
GEVO-ML: Optimizing Machine Learning Code with Evolutionary Computation
Jhe-Yu Liou
Stephanie Forrest
Carole-Jean Wu
VLM
273
0
0
16 Oct 2023
SimplePIM: A Software Framework for Productive and Efficient Processing-in-Memory
International Conference on Parallel Architectures and Compilation Techniques (PACT), 2023
Jinfan Chen
Juan Gómez Luna
I. E. Hajj
Yu-Yin Guo
Onur Mutlu
207
33
0
03 Oct 2023
A Portable Framework for Accelerating Stencil Computations on Modern Node Architectures
R. Sai
John Mellor-Crummey
Jinfan Xu
Mauricio Araya-Polo
117
1
0
09 Sep 2023
On the Tool Manipulation Capability of Open-source Large Language Models
Qiantong Xu
Fenglu Hong
Yangqiu Song
Changran Hu
Zheng Chen
Jian Zhang
LLMAG
371
110
0
25 May 2023
ACRoBat: Optimizing Auto-batching of Dynamic Deep Learning at Compile Time
Conference on Machine Learning and Systems (MLSys), 2023
Pratik Fegade
Tianqi Chen
Phillip B. Gibbons
T. Mowry
235
3
0
17 May 2023
Experiences in Building a Composable and Functional API for Runtime SPIR-V Code Generation
J. Fumero
György Réthy
Athanasios Stratikopoulos
N. Foutris
Christos Kotselidis
117
0
0
16 May 2023
Ada-Grouper: Accelerating Pipeline Parallelism in Preempted Network by Adaptive Group-Scheduling for Micro-Batches
Siyu Wang
Zongyan Cao
Chang Si
Lansong Diao
Jiamang Wang
W. Lin
153
0
0
03 Mar 2023
Auto-Parallelizing Large Models with Rhino: A Systematic Approach on Production AI Platform
Shiwei Zhang
Lansong Diao
Siyu Wang
Zongyan Cao
Yiliang Gu
Chang Si
Ziji Shi
Zhen Zheng
Chuan Wu
W. Lin
AI4CE
211
4
0
16 Feb 2023
OpenHLS: High-Level Synthesis for Low-Latency Deep Neural Networks for Experimental Science
Maksim Levental
A. Khan
Kyle Chard
Kazutomo Yoshi
Ryan Chard
Ian Foster
354
5
0
13 Feb 2023
CMLCompiler: A Unified Compiler for Classical Machine Learning
International Conference on Supercomputing (ICS), 2023
Xu Wen
Wanling Gao
An-Dong Li
Lei Wang
Zihan Jiang
Jianfeng Zhan
311
1
0
31 Jan 2023
oneDNN Graph Compiler: A Hybrid Approach for High-Performance Deep Learning Compilation
IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 2023
Jianhui Li
Zhennan Qin
Yijie Mei
Jingze Cui
Yunfei Song
...
Baihui Jin
Yan Zhang
Jason Ye
Eric Lin
Daniel M. Lavery
GNN
276
20
0
03 Jan 2023
Python FPGA Programming with Data-Centric Multi-Level Design
Johannes de Fine Licht
T. De Matteis
Tal Ben-Nun
Andreas Kuster
Oliver Rausch
Manuel Burger
Carl-Johannes Johnsen
Torsten Hoefler
278
1
0
28 Dec 2022
On Physics-Informed Neural Networks for Quantum Computers
Frontiers in Applied Mathematics and Statistics (FAMS), 2022
Stefano Markidis
PINN
296
35
0
28 Sep 2022
Optimizing DNN Compilation for Distributed Training with Joint OP and Tensor Fusion
IEEE Transactions on Parallel and Distributed Systems (TPDS), 2022
Xiaodong Yi
Shiwei Zhang
Lansong Diao
Chuan Wu
Zhen Zheng
Shiqing Fan
Siyu Wang
Jun Yang
W. Lin
170
6
0
26 Sep 2022
Programming Autonomous Machines
International Conference on Embedded Software (EMSOFT), 2022
Shaoshan Liu
Xiaoming Li
Tongsheng Geng
Stéphane Zuckerman
J. Gaudiot
131
2
0
06 Sep 2022
GraphQ IR: Unifying the Semantic Parsing of Graph Query Languages with One Intermediate Representation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
L. Nie
S. Cao
Jiaxin Shi
Jiu Sun
Qingwen Tian
Lei Hou
Juanzi Li
Jidong Zhai
GNN
215
27
0
24 May 2022
Special Session: Towards an Agile Design Methodology for Efficient, Reliable, and Secure ML Systems
IEEE VLSI Test Symposium (VTS), 2022
Shail Dave
Alberto Marchisio
Muhammad Abdullah Hanif
Amira Guesmi
Aviral Shrivastava
Ihsen Alouani
Mohamed Bennai
288
14
0
18 Apr 2022
Query Processing on Tensor Computation Runtimes
Proceedings of the VLDB Endowment (PVLDB), 2022
Dong He
Supun Nakandala
Dalitso Banda
Rathijit Sen
Karla Saur
Kwanghyun Park
Carlo Curino
Jesús Camacho-Rodríguez
Konstantinos Karanasos
Matteo Interlandi
475
51
0
03 Mar 2022
Memory Planning for Deep Neural Networks
Maksim Levental
210
4
0
23 Feb 2022
Coverage-Guided Tensor Compiler Fuzzing with Joint IR-Pass Mutation
Jiawei Liu
Yuxiang Wei
Sen Yang
Yinlin Deng
Lingming Zhang
164
59
0
21 Feb 2022
Implementing Spiking Neural Networks on Neuromorphic Architectures: A Review
Phu Khanh Huynh
M. L. Varshika
A. Paul
Murat Isik
Adarsha Balaji
Anup Das
202
47
0
17 Feb 2022
HECO: Fully Homomorphic Encryption Compiler
USENIX Security Symposium (USENIX Security), 2022
Alexander Viand
Patrick Jattke
Miro Haller
Anwar Hithnawi
424
29
0
03 Feb 2022
Compiler-Driven Simulation of Reconfigurable Hardware Accelerators
International Symposium on High-Performance Computer Architecture (HPCA), 2022
Zhijing Li
Yuwei Ye
S. Neuendorffer
Adrian Sampson
191
5
0
01 Feb 2022
Lifting C Semantics for Dataflow Optimization
International Conference on Supercomputing (ICS), 2021
A. Calotoiu
Tal Ben-Nun
Grzegorz Kwaśniewski
Johannes de Fine Licht
Timo Schneider
Philipp Schaad
Torsten Hoefler
342
6
0
22 Dec 2021
Torch.fx: Practical Program Capture and Transformation for Deep Learning in Python
James K. Reed
Zach DeVito
Horace He
Ansley Ussery
Jason Ansel
CLIP
235
71
0
15 Dec 2021
A Highly Configurable Hardware/Software Stack for DNN Inference Acceleration
Suvadeep Banerjee
Steve Burns
P. Cocchini
A. Davare
Shweta Jain
D. Kirkpatrick
A. Sorokin
Jin Yang
Zhenkun Yang
250
10
0
29 Nov 2021
A Data-Centric Optimization Framework for Machine Learning
International Conference on Supercomputing (ICS), 2021
Oliver Rausch
Tal Ben-Nun
Nikoli Dryden
Andrei Ivanov
Shigang Li
Torsten Hoefler
AI4CE
329
19
0
20 Oct 2021
DNNFusion: Accelerating Deep Neural Networks Execution with Advanced Operator Fusion
ACM Transactions on Architecture and Code Optimization (TACO), 2020
Wei Niu
Jiexiong Guan
Yanzhi Wang
G. Agrawal
Bin Ren
AI4CE
327
208
0
30 Aug 2021