Optimizing Data Distribution and Kernel Performance for Efficient Training of Chemistry Foundation Models: A Case Study with MACE

14 April 2025
Jesun Firoz, Franco Pellegrini, Mario Geiger, Darren J. Hsu, Jenna A. Bilbrey, Han-Yi Chou, Maximilian Stadler, Markus Hoehnerbach, Tingyu Wang, Dejun Lin, Emine Küçükbenli, Henry W. Sprueill, Ilyes Batatia, Sotiris S. Xantheas, MalSoon Lee, Chris Mundy, Gábor Csányi, Justin S. Smith, Ponnuswamy Sadayappan, Sutanay Choudhury
Abstract

Chemistry Foundation Models (CFMs) that leverage Graph Neural Networks (GNNs) operating on 3D molecular graph structures are becoming indispensable tools for computational chemists and materials scientists. These models facilitate the understanding of matter and the discovery of new molecules and materials. In contrast to GNNs that operate on a single large homogeneous graph, the GNNs used by CFMs process a large number of geometric graphs of varying sizes, requiring optimization strategies different from those developed for large homogeneous GNNs. This paper presents optimizations for two critical phases of CFM training, data distribution and model training, targeting MACE, a state-of-the-art CFM. We address the challenge of load balancing in data distribution by formulating it as a multi-objective bin packing problem, and we propose an iterative algorithm that provides a highly effective, fast, and practical solution, ensuring efficient data distribution. For the training phase, we identify symmetric tensor contraction as the key computational kernel in MACE and optimize it to improve overall performance. Our combined approach of balanced data distribution and kernel optimization significantly accelerates MACE training: experimental results demonstrate a substantial speedup, reducing per-epoch training time from 12 minutes to 2 minutes on 740 GPUs with a 2.6M-sample dataset.
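
The abstract only names the two techniques; the paper contains the actual algorithms. The sketches below are generic illustrations, not the authors' implementations. First, the data-distribution side: a minimal greedy bin-packing sketch in Python that scalarizes two per-graph objectives (assumed here to be atom and edge counts) into one cost and always places the next-largest graph on the least-loaded rank. The paper's iterative multi-objective algorithm is not reproduced; this longest-processing-time heuristic only conveys the flavor of balancing variably sized graphs across GPUs.

import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Bin:
    load: float                                   # heap key: current total cost
    rank: int = field(compare=False)              # GPU rank this bin feeds
    graphs: list = field(default_factory=list, compare=False)

def balance_graphs(graphs, n_ranks, weights=(1.0, 1.0)):
    """Assign graphs to ranks, largest first, onto the least-loaded rank.

    Each graph is an (n_atoms, n_edges) pair and `weights` scalarizes the
    two objectives into a single cost. Both choices are assumptions made
    for this illustration, not the paper's formulation.
    """
    w_atoms, w_edges = weights

    def cost(g):
        return w_atoms * g[0] + w_edges * g[1]

    bins = [Bin(0.0, r) for r in range(n_ranks)]
    heapq.heapify(bins)
    for g in sorted(graphs, key=cost, reverse=True):
        b = heapq.heappop(bins)                   # least-loaded rank so far
        b.graphs.append(g)
        b.load += cost(g)
        heapq.heappush(bins, b)
    return bins

For example, balance_graphs([(12, 48), (200, 1900), (64, 512)], n_ranks=2) keeps the one large graph on its own rank while the two smaller graphs share the other.

Second, the training-side kernel. The sketch below is a hedged stand-in for a symmetric tensor contraction using torch.einsum: per-node products x_i x_j are contracted against a weight tensor that is symmetric in (i, j). Shapes and names are assumptions for illustration only; the actual MACE kernel operates on irrep-structured equivariant features and fuses these steps for GPU efficiency.

import torch

# Illustrative symmetric tensor contraction (assumed shapes, not MACE's).
x = torch.randn(32, 16)                           # (nodes, channels)
W = torch.randn(16, 16, 8)
W = 0.5 * (W + W.transpose(0, 1))                 # symmetrize in (i, j);
                                                  # x_i x_j is symmetric, so only
                                                  # the symmetric part of W matters
out = torch.einsum('ni,nj,ijk->nk', x, x, W)      # -> (nodes, 8)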

@article{firoz2025_2504.10700,
  title={Optimizing Data Distribution and Kernel Performance for Efficient Training of Chemistry Foundation Models: A Case Study with MACE},
  author={Jesun Firoz and Franco Pellegrini and Mario Geiger and Darren Hsu and Jenna A. Bilbrey and Han-Yi Chou and Maximilian Stadler and Markus Hoehnerbach and Tingyu Wang and Dejun Lin and Emine Kucukbenli and Henry W. Sprueill and Ilyes Batatia and Sotiris S. Xantheas and MalSoon Lee and Chris Mundy and Gabor Csanyi and Justin S. Smith and Ponnuswamy Sadayappan and Sutanay Choudhury},
  journal={arXiv preprint arXiv:2504.10700},
  year={2025}
}