1802.04730
Tensor Comprehensions: Framework-Agnostic High-Performance Machine Learning Abstractions
13 February 2018
Nicolas Vasilache
O. Zinenko
Theodoros Theodoridis
Priya Goyal
Zach DeVito
William S. Moses
Sven Verdoolaege
Andrew Adams
Albert Cohen
Papers citing
"Tensor Comprehensions: Framework-Agnostic High-Performance Machine Learning Abstractions"
50 / 150 papers shown
Morphling: Fast, Fused, and Flexible GNN Training at Scale
Anubhab
Rupesh Nasre
GNN
AI4CE
LRM
449
0
0
27 Mar 2026
STAGE: A Symbolic Tensor grAph GEnerator for distributed AI system co-design
Changhai Man
Joongun Park
Hanjiang Wu
Huan Xu
Srinivas Sridharan
Tushar Krishna
356
0
0
13 Nov 2025
Agentic Auto-Scheduling: An Experimental Study of LLM-Guided Loop Optimization
Massinissa Merouani
Islem Kara Bernou
Riyadh Baghdadi
173
2
0
01 Nov 2025
VibeCodeHPC: An Agent-Based Iterative Prompting Auto-Tuner for HPC Code Generation Using LLMs
Shun-ichiro Hayashi
Koki Morita
Daichi Mukunoki
Tetsuya Hoshino
Takahiro Katagiri
163
1
0
26 Sep 2025
The Syntax and Semantics of einsum
Maurice Wenig
Paul G. Rump
Mark Blacher
Joachim Giesen
206
0
0
24 Sep 2025
REASONING COMPILER: LLM-Guided Optimizations for Efficient Model Serving
Sujun Tang
Christopher Priebe
R. Mahapatra
Lianhui Qin
H. Esmaeilzadeh
LRM
346
1
0
02 Jun 2025
Tilus: A Tile-Level GPGPU Programming Language for Low-Precision Computation
Yaoyao Ding
Bohan Hou
Xinyu Zhang
Allan Lin
Tianqi Chen
Cody Yu Hao
Yida Wang
Gennady Pekhimenko
381
0
0
17 Apr 2025
Scheduling Languages: A Past, Present, and Future Taxonomy
Mary Hall
Cosmin Oancea
Anne C. Elster
Ari Rasch
Sameeran Joshi
Amir Mohammad Tavakkoli
Richard Schulze
272
1
0
25 Oct 2024
Tadashi: Enabling AI-Based Automated Code Generation With Guaranteed Correctness
Emil Vatai
Aleksandr Drozd
Ivan R. Ivanov
João Eduardo Batista
Yinghao Ren
Mohamed Wahib
315
2
0
04 Oct 2024
A Reinforcement Learning Environment for Automatic Code Optimization in the MLIR Compiler
Nazim Bendib
Iheb Nassim Aouadj
Riyadh Baghdadi
Bouchama Djad
Rafik Bouloudene
291
7
0
17 Sep 2024
CoolerSpace: A Language for Physically Correct and Computationally Efficient Color Programming
Ethan Chen
Jiwon Chang
Yuhao Zhu
101
1
0
04 Sep 2024
Stream-K++: Adaptive GPU GEMM Kernel Scheduling and Selection using Bloom Filters
Harisankar Sadasivan
Muhammad Osama
Maksim Podkorytov
Carlus Huang
Jun Liu
198
0
0
21 Aug 2024
Scaling Deep Learning Computation over the Inter-Core Connected Intelligence Processor with T10
Symposium on Operating Systems Principles (SOSP), 2024
Yiqi Liu
Yuqi Xue
Yu Cheng
Lingxiao Ma
Ziming Miao
Jilong Xue
Jian Huang
GNN
303
9
0
09 Aug 2024
Dynamic Co-Optimization Compiler: Leveraging Multi-Agent Reinforcement Learning for Enhanced DNN Accelerator Performance
Arya Fayyazi
M. Kamal
Massoud Pedram
380
0
0
11 Jul 2024
Composing Distributed Computations Through Task and Kernel Fusion
Rohan Yadav
S. Sundram
Wonchan Lee
Michael Garland
Michael Bauer
Alex Aiken
Fredrik Kjolstad
189
5
0
26 Jun 2024
Scorch: A Library for Sparse Deep Learning
Bobby Yan
Alexander J. Root
Trevor Gale
David Broman
Fredrik Kjolstad
302
3
0
27 May 2024
Graph neural networks with configuration cross-attention for tensor compilers
Dmitrii Khizbullin
Eduardo Rocha de Andrade
Thanh Hau Nguyen
Matheus Pedroza Ferreira
David R. Pugh
GNN
214
0
0
26 May 2024
Allo: A Programming Model for Composable Accelerator Design
Hongzheng Chen
Niansong Zhang
Shaojie Xiang
Zhichen Zeng
Mengjia Dai
Zhiru Zhang
254
48
0
07 Apr 2024
LOOPer: A Learned Automatic Code Optimizer For Polyhedral Compilers
Massinissa Merouani
Khaled Afif Boudaoud
Iheb Nassim Aouadj
Nassim Tchoulak
Islam Kara Bernou
Hamza Benyamina
F. B. Tayeb
K. Benatchba
Hugh Leather
Riyadh Baghdadi
498
13
0
18 Mar 2024
SoD²: Statically Optimizing Dynamic Deep Neural Network
Wei Niu
Gagan Agrawal
Bin Ren
389
11
0
29 Feb 2024
Unraveling the Key of Machine Learning Solutions for Android Malware Detection
Jiahao Liu
Jun Zeng
Fabio Pierazzi
Lorenzo Cavallaro
Zhenkai Liang
AAML
221
14
0
05 Feb 2024
CIM-MLC: A Multi-level Compilation Stack for Computing-In-Memory Accelerators
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2024
Songyun Qu
Shixin Zhao
Bing Li
Yintao He
Xuyi Cai
Lei Zhang
Ying Wang
181
9
0
23 Jan 2024
Fast Kronecker Matrix-Matrix Multiplication on GPUs
Abhinav Jangda
Mohit Yadav
413
4
0
18 Jan 2024
PolyTOPS: Reconfigurable and Flexible Polyhedral Scheduler
IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 2024
Gianpietro Consolaro
Zhen Zhang
Harenome Razanajato
Nelson Lossing
Nassim Tchoulak
...
Artur Cesar Araujo Alves
Renwei Zhang
Denis Barthou
Corinne Ancourt
Cédric Bastoul
120
3
0
12 Jan 2024
conv_einsum: A Framework for Representation and Fast Evaluation of Multilinear Operations in Convolutional Tensorial Neural Networks
Tahseen Rabbani
Jiahao Su
Xiaoyu Liu
David Chan
Geoffrey Sangston
Furong Huang
266
1
0
07 Jan 2024
GraphRARE: Reinforcement Learning Enhanced Graph Neural Network with Relative Entropy
IEEE International Conference on Data Engineering (ICDE), 2023
Tianhao Peng
Wenjun Wu
Haitao Yuan
Zhifeng Bao
Pengrui Zhao
Xin Yu
Xuetao Lin
Yu Liang
Yanjun Pu
434
22
0
15 Dec 2023
Packrat: Automatic Reconfiguration for Latency Minimization in CPU-based DNN Serving
Ankit Bhardwaj
Amar Phanishayee
Deepak Narayanan
Mihail Tarta
Ryan Stutsman
201
2
0
30 Nov 2023
A Compiler from Array Programs to Vectorized Homomorphic Encryption
Rolph Recto
Andrew C. Myers
221
5
0
10 Nov 2023
Restoring the Broken Covenant Between Compilers and Deep Learning Accelerators
Sean Kinzer
Soroush Ghodrati
R. Mahapatra
Byung Hoon Ahn
Edwin Mascarenhas
Xiaolong Li
J. Matai
Liang Zhang
H. Esmaeilzadeh
127
2
0
27 Oct 2023
Serving Deep Learning Model in Relational Databases
International Conference on Extending Database Technology (EDBT), 2023
Alexandre Eichenberger
Qi Lin
Saif Masood
Hong Min
Alexander Sim
...
Yida Wang
Kesheng Wu
Binhang Yuan
Lixi Zhou
Jia Zou
233
0
0
07 Oct 2023
YFlows: Systematic Dataflow Exploration and Code Generation for Efficient Neural Network Inference using SIMD Architectures on CPUs
International Conference on Compiler Construction (CC), 2023
Cyrus Zhou
Zack Hassman
Ruize Xu
Dhirpal Shah
Vaughn Richard
Yanjing Li
605
5
0
01 Oct 2023
LoopTune: Optimizing Tensor Computations with Reinforcement Learning
Dejan Grubisic
Bram Wasti
Chris Cummins
John Mellor-Crummey
A. Zlateski
336
3
0
04 Sep 2023
Saturn: An Optimized Data System for Large Model Deep Learning Workloads
Proceedings of the VLDB Endowment (PVLDB), 2023
Kabir Nagrecha
Arun Kumar
406
8
0
03 Sep 2023
Target-independent XLA optimization using Reinforcement Learning
Milan Ganai
Haichen Li
Theodore Enns
Yida Wang
Randy Huang
226
1
0
28 Aug 2023
TpuGraphs: A Performance Prediction Dataset on Large Tensor Computational Graphs
Neural Information Processing Systems (NeurIPS), 2023
P. Phothilimthana
Sami Abu-El-Haija
Kaidi Cao
Bahare Fatemi
Mike Burrows
Charith Mendis
Bryan Perozzi
GNN
AI4TS
482
29
0
25 Aug 2023
MARS: Exploiting Multi-Level Parallelism for DNN Workloads on Adaptive Multi-Accelerator Systems
Design Automation Conference (DAC), 2023
Guan Shen
Jieru Zhao
Zeke Wang
Zhehan Lin
Wenchao Ding
Chentao Wu
Quan Chen
Minyi Guo
139
8
0
23 Jul 2023
Maximum Flows in Parametric Graph Templates
International/Italian Conference on Algorithms and Complexity (CIAC), 2023
Tal Ben-Nun
Lukas Gianinazzi
Torsten Hoefler
Yishai Oltchik
136
0
0
17 Jul 2023
Bridging Control-Centric and Data-Centric Optimization
IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 2023
Tal Ben-Nun
Berke Ates
A. Calotoiu
Torsten Hoefler
351
11
0
01 Jun 2023
AMULET: Adaptive Matrix-Multiplication-Like Tasks
International Workshop on Data Management on New Hardware (DaMoN), 2023
Junyoung Kim
Kenneth Ross
Eric Sedlar
Lukas Stadler
255
1
0
12 May 2023
Harnessing Deep Learning and HPC Kernels via High-Level Loop and Tensor Abstractions on CPU Architectures
IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2023
E. Georganas
Dhiraj D. Kalamkar
K. Voronin
Abhisek Kundu
Antonio Noack
Hans Pabst
Alexander Breuer
A. Heinecke
266
6
0
25 Apr 2023
Full Stack Optimization of Transformer Inference: a Survey
Sehoon Kim
Coleman Hooper
Thanakul Wattanawong
Minwoo Kang
Ruohan Yan
...
Qijing Huang
Kurt Keutzer
Michael W. Mahoney
Y. Shao
A. Gholami
MQ
338
162
0
27 Feb 2023
Operator Fusion in XLA: Analysis and Evaluation
Danielle Snider
Ruofan Liang
217
10
0
30 Jan 2023
oneDNN Graph Compiler: A Hybrid Approach for High-Performance Deep Learning Compilation
IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 2023
Jianhui Li
Zhennan Qin
Yijie Mei
Jingze Cui
Yunfei Song
...
Baihui Jin
Yan Zhang
Jason Ye
Eric Lin
Daniel M. Lavery
GNN
279
20
0
03 Jan 2023
AGO: Boosting Mobile AI Inference Performance by Removing Constraints on Graph Optimization
IEEE Conference on Computer Communications (INFOCOM), 2022
Zhiying Xu
H. Peng
Wei Wang
GNN
337
3
0
02 Dec 2022
AlphaSparse: Generating High Performance SpMV Codes Directly from Sparse Matrices
International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2022
Zhen Du
Jiajia Li
Yinshan Wang
Xueqi Li
Guangming Tan
N. Sun
224
33
0
07 Nov 2022
TLP: A Deep Learning-based Cost Model for Tensor Program Tuning
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2022
Yiqiang Zhai
Yu Zhang
Shuo Liu
Xiaomeng Chu
Jie Peng
Jianmin Ji
Yanyong Zhang
214
54
0
07 Nov 2022
Legal-Tech Open Diaries: Lesson learned on how to develop and deploy light-weight models in the era of humongous Language Models
Stelios Maroudas
Sotiris Legkas
Prodromos Malakasiotis
Ilias Chalkidis
VLM
AILaw
ALM
ELM
341
5
0
24 Oct 2022
ALT: Boosting Deep Learning Performance by Breaking the Wall between Graph and Operator Level Optimizations
Zhiying Xu
Jiafan Xu
H. Peng
Wei Wang
Xiaoliang Wang
...
Haipeng Dai
Yixu Xu
Hao Cheng
Kun Wang
Guihai Chen
352
1
0
22 Oct 2022
Hidet: Task-Mapping Programming Paradigm for Deep Learning Tensor Programs
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2022
Yaoyao Ding
Cody Hao Yu
Bojian Zheng
Yizhi Liu
Yida Wang
Gennady Pekhimenko
312
47
0
18 Oct 2022
Demystifying Map Space Exploration for NPUs
IEEE International Symposium on Workload Characterization (IISWC), 2022
Sheng-Chun Kao
A. Parashar
Po-An Tsai
T. Krishna
395
12
0
07 Oct 2022