Morello: Compiling Fast Neural Networks with Dynamic Programming and Spatial Compression

3 May 2025
Samuel J. Kaufman
René Just
Rastislav Bodik
Abstract

High-throughput neural network inference requires coordinating many optimization decisions, including parallel tiling, microkernel selection, and data layout. The product of these decisions forms a search space of programs which is typically intractably large. Existing approaches (e.g., auto-schedulers) often address this problem by sampling this space heuristically. In contrast, we introduce a dynamic-programming-based approach to explore more of the search space by iteratively decomposing large program specifications into smaller specifications reachable from a set of rewrites, then composing a final program from each rewrite that minimizes an affine cost model. To reduce memory requirements, we employ a novel memoization table representation, which indexes specifications by coordinates in $Z_{\geq 0}$ and compresses identical, adjacent solutions. This approach can visit a much larger set of programs than prior work. To evaluate the approach, we developed Morello, a compiler which lowers specifications roughly equivalent to a few-node XLA computation graph to x86. Notably, we found that an affine cost model is sufficient to surface high-throughput programs. For example, Morello synthesized a collection of matrix multiplication benchmarks targeting a Zen 1 CPU, including a 1x2048x16384, bfloat16-to-float32 vector-matrix multiply, which was integrated into Google's this http URL.
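
The general idea of the abstract (decompose a specification through rewrites, memoize the cheapest sub-solutions, then compress identical adjacent table entries) can be illustrated with a toy sketch. This is not Morello's implementation. The `KERNELS` table, the `LOOP_OVERHEAD` constant, the two tiling rewrites, and the one-dimensional run-length compression are all assumptions made for illustration; the paper's specifications, rewrites, cost model, and table representation are richer.

```python
from functools import lru_cache

# Hypothetical leaf kernels, keyed by (rows, cols), with made-up costs.
KERNELS = {(1, 1): 4, (1, 4): 6, (4, 4): 10}
# Hypothetical constant term of the toy affine cost of a tiling loop.
LOOP_OVERHEAD = 1


@lru_cache(maxsize=None)  # the memoization table, indexed by non-negative integers
def best(rows, cols):
    """Return (cost, action) for the cheapest toy program for a rows x cols spec."""
    candidates = []
    if (rows, cols) in KERNELS:
        candidates.append((KERNELS[(rows, cols)], "kernel"))
    # Rewrite 1: tile the row dimension into equal chunks of t rows.
    for t in range(1, rows):
        if rows % t == 0:
            sub_cost, _ = best(t, cols)
            cost = (rows // t) * sub_cost + LOOP_OVERHEAD  # affine in sub_cost
            candidates.append((cost, f"tile_rows({t})"))
    # Rewrite 2: tile the column dimension into equal chunks of t columns.
    for t in range(1, cols):
        if cols % t == 0:
            sub_cost, _ = best(rows, t)
            cost = (cols // t) * sub_cost + LOOP_OVERHEAD
            candidates.append((cost, f"tile_cols({t})"))
    # Ties are broken by the action name, which is arbitrary but deterministic.
    return min(candidates)


def compress_row(rows, max_cols):
    """Run-length encode identical adjacent actions along one table row.

    Returns a list of (first_col, last_col, action) runs. This is a 1-D
    stand-in for compressing identical, adjacent solutions.
    """
    runs = []
    for cols in range(1, max_cols + 1):
        action = best(rows, cols)[1]
        if runs and runs[-1][2] == action:
            runs[-1] = (runs[-1][0], cols, action)  # extend the current run
        else:
            runs.append((cols, cols, action))
    return runs


if __name__ == "__main__":
    print(best(8, 8))
    print(compress_row(1, 8))
```

In this toy, eight entries of the `rows = 1` table row collapse into five runs, because neighbouring specifications often share the same best decision. Larger tables with more regular solutions would be expected to compress further, which is the intuition behind the memory savings the abstract describes.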

@article{kaufman2025_2505.01637,
  title={Morello: Compiling Fast Neural Networks with Dynamic Programming and Spatial Compression},
  author={Samuel J. Kaufman and René Just and Rastislav Bodik},
  journal={arXiv preprint arXiv:2505.01637},
  year={2025}
}