Tensor Comprehensions: Framework-Agnostic High-Performance Machine Learning Abstractions

13 February 2018

Nicolas Vasilache

O. Zinenko

Theodoros Theodoridis

Papers citing "Tensor Comprehensions: Framework-Agnostic High-Performance Machine Learning Abstractions"

Morphling: Fast, Fused, and Flexible GNN Training at Scale

Anubhab

Rupesh Nasre

GNN AI4CE LRM

449

27 Mar 2026

STAGE: A Symbolic Tensor grAph GEnerator for distributed AI system co-design

356

13 Nov 2025

Agentic Auto-Scheduling: An Experimental Study of LLM-Guided Loop Optimization

Massinissa Merouani

Islem Kara Bernou

Riyadh Baghdadi

173

01 Nov 2025

VibeCodeHPC: An Agent-Based Iterative Prompting Auto-Tuner for HPC Code Generation Using LLMs

163

26 Sep 2025

The Syntax and Semantics of einsum

206

24 Sep 2025

REASONING COMPILER: LLM-Guided Optimizations for Efficient Model Serving

346

02 Jun 2025

Tilus: A Tile-Level GPGPU Programming Language for Low-Precision Computation

381

17 Apr 2025

Scheduling Languages: A Past, Present, and Future Taxonomy

Amir Mohammad Tavakkoli

Richard Schulze

272

25 Oct 2024

Tadashi: Enabling AI-Based Automated Code Generation With Guaranteed Correctness

315

04 Oct 2024

A Reinforcement Learning Environment for Automatic Code Optimization in the MLIR Compiler

Nazim Bendib

Iheb Nassim Aouadj

Riyadh Baghdadi

Bouchama Djad

Rafik Bouloudene

291

17 Sep 2024

CoolerSpace: A Language for Physically Correct and Computationally Efficient Color Programming

Ethan Chen

Jiwon Chang

Yuhao Zhu

101

04 Sep 2024

Stream-K++: Adaptive GPU GEMM Kernel Scheduling and Selection using Bloom Filters

198

21 Aug 2024

Scaling Deep Learning Computation over the Inter-Core Connected Intelligence Processor with T10

Symposium on Operating Systems Principles (SOSP), 2024

Yuqi Xue

303

09 Aug 2024

Dynamic Co-Optimization Compiler: Leveraging Multi-Agent Reinforcement Learning for Enhanced DNN Accelerator Performance

Arya Fayyazi

M. Kamal

Massoud Pedram

380

11 Jul 2024

Composing Distributed Computations Through Task and Kernel Fusion

Alex Aiken

189

26 Jun 2024

Scorch: A Library for Sparse Deep Learning

302

27 May 2024

Graph neural networks with configuration cross-attention for tensor compilers

Dmitrii Khizbullin

Eduardo Rocha de Andrade

Thanh Hau Nguyen

Matheus Pedroza Ferreira

David R. Pugh

GNN

214

26 May 2024

Allo: A Programming Model for Composable Accelerator Design

254

07 Apr 2024

LOOPer: A Learned Automatic Code Optimizer For Polyhedral Compilers

498

18 Mar 2024

SoD$^2$: Statically Optimizing Dynamic Deep Neural Network

Wei Niu

Gagan Agrawal

Bin Ren

389

29 Feb 2024

Unraveling the Key of Machine Learning Solutions for Android Malware Detection

221

05 Feb 2024

CIM-MLC: A Multi-level Compilation Stack for Computing-In-Memory Accelerators

International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2024

181

23 Jan 2024

Fast Kronecker Matrix-Matrix Multiplication on GPUs

Abhinav Jangda

Mohit Yadav

413

18 Jan 2024

PolyTOPS: Reconfigurable and Flexible Polyhedral Scheduler

IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 2024

...

Artur Cesar Araujo Alves

120

12 Jan 2024

conv_einsum: A Framework for Representation and Fast Evaluation of Multilinear Operations in Convolutional Tensorial Neural Networks

Xiaoyu Liu

Furong Huang

266

07 Jan 2024

GraphRARE: Reinforcement Learning Enhanced Graph Neural Network with Relative Entropy

IEEE International Conference on Data Engineering (ICDE), 2023

434

15 Dec 2023

Packrat: Automatic Reconfiguration for Latency Minimization in CPU-based DNN Serving

Deepak Narayanan

201

30 Nov 2023

A Compiler from Array Programs to Vectorized Homomorphic Encryption

Rolph Recto

Andrew C. Myers

221

10 Nov 2023

Restoring the Broken Covenant Between Compilers and Deep Learning Accelerators

127

27 Oct 2023

Serving Deep Learning Model in Relational Databases

International Conference on Extending Database Technology (EDBT), 2023

Alexandre Eichenberger

...

233

07 Oct 2023

YFlows: Systematic Dataflow Exploration and Code Generation for Efficient Neural Network Inference using SIMD Architectures on CPUs

International Conference on Compiler Construction (CC), 2023

605

01 Oct 2023

LoopTune: Optimizing Tensor Computations with Reinforcement Learning

336

04 Sep 2023

Saturn: An Optimized Data System for Large Model Deep Learning Workloads

Proceedings of the VLDB Endowment (PVLDB), 2023

Kabir Nagrecha

Arun Kumar

406

03 Sep 2023

Target-independent XLA optimization using Reinforcement Learning

226

28 Aug 2023

TpuGraphs: A Performance Prediction Dataset on Large Tensor Computational Graphs

Neural Information Processing Systems (NeurIPS), 2023

482

25 Aug 2023

MARS: Exploiting Multi-Level Parallelism for DNN Workloads on Adaptive Multi-Accelerator Systems

Design Automation Conference (DAC), 2023

Jieru Zhao

Wenchao Ding

139

23 Jul 2023

Maximum Flows in Parametric Graph Templates

International/Italian Conference on Algorithms and Complexity (CIAC), 2023

136

17 Jul 2023

Bridging Control-Centric and Data-Centric Optimization

IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 2023

351

01 Jun 2023

AMULET: Adaptive Matrix-Multiplication-Like Tasks

International Workshop on Data Management on New Hardware (DaMoN), 2023

255

12 May 2023

Harnessing Deep Learning and HPC Kernels via High-Level Loop and Tensor Abstractions on CPU Architectures

IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2023

266

25 Apr 2023

Full Stack Optimization of Transformer Inference: a Survey

Sehoon Kim

Coleman Hooper

...

338

162

27 Feb 2023

Operator Fusion in XLA: Analysis and Evaluation

Danielle Snider

Ruofan Liang

217

30 Jan 2023

oneDNN Graph Compiler: A Hybrid Approach for High-Performance Deep Learning Compilation

IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 2023

Zhennan Qin

...

279

03 Jan 2023

AGO: Boosting Mobile AI Inference Performance by Removing Constraints on Graph Optimization

IEEE Conference on Computer Communications (INFOCOM), 2022

337

02 Dec 2022

AlphaSparse: Generating High Performance SpMV Codes Directly from Sparse Matrices

International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2022

224

07 Nov 2022

TLP: A Deep Learning-based Cost Model for Tensor Program Tuning

International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2022

214

07 Nov 2022

Legal-Tech Open Diaries: Lesson learned on how to develop and deploy light-weight models in the era of humongous Language Models

Stelios Maroudas

Sotiris Legkas

Prodromos Malakasiotis

Ilias Chalkidis

VLM AILaw ALM ELM

341

24 Oct 2022

ALT: Boosting Deep Learning Performance by Breaking the Wall between Graph and Operator Level Optimizations

...

352

22 Oct 2022

Hidet: Task-Mapping Programming Paradigm for Deep Learning Tensor Programs

International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2022

312

18 Oct 2022

Demystifying Map Space Exploration for NPUs

IEEE International Symposium on Workload Characterization (IISWC), 2022

395

07 Oct 2022