The ML.ENERGY Benchmark: Toward Automated Inference Energy Measurement and Optimization

9 May 2025
Jae-Won Chung
Jiachen Liu
Jeff J. Ma
Ruofan Wu
Oh Jun Kweon
Yuxuan Xia
Zhiyu Wu
Mosharaf Chowdhury
Abstract

As the adoption of Generative AI in real-world services grows explosively, energy has emerged as a critical bottleneck resource. However, energy remains a metric that is often overlooked, under-explored, or poorly understood in the context of building ML systems. We present the ML.ENERGY Benchmark, a benchmark suite and tool for measuring inference energy consumption under realistic service environments, and the corresponding ML.ENERGY Leaderboard, which have served as a valuable resource for those hoping to understand and optimize the energy consumption of their generative AI services. In this paper, we explain four key design principles for benchmarking ML energy that we have acquired over time, and then describe how they are implemented in the ML.ENERGY Benchmark. We then highlight results from the latest iteration of the benchmark, including energy measurements of 40 widely used model architectures across 6 different tasks, case studies of how ML design choices impact energy consumption, and how automated optimization recommendations can lead to significant (sometimes more than 40%) energy savings without changing what is being computed by the model. The ML.ENERGY Benchmark is open-source and can be easily extended to various customized models and application scenarios.
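The abstract describes measuring inference energy under realistic service environments. As a rough illustration of the underlying idea only, and not the benchmark's actual implementation, the sketch below wraps an arbitrary inference call with NVML's cumulative GPU energy counter and reports the difference. The names `measure_inference_energy` and `run_inference` are hypothetical; the snippet assumes an NVIDIA GPU (Volta or newer) and the pynvml (nvidia-ml-py) package.

```python
# Minimal sketch: read the GPU's cumulative energy counter before and after
# an inference call and report the difference. Illustration only; not the
# ML.ENERGY Benchmark's implementation.
import pynvml


def measure_inference_energy(run_inference, gpu_index: int = 0) -> float:
    """Return GPU energy (joules) consumed while run_inference() executes."""
    pynvml.nvmlInit()
    try:
        handle = pynvml.nvmlDeviceGetHandleByIndex(gpu_index)
        # NVML reports cumulative energy since driver load, in millijoules.
        start_mj = pynvml.nvmlDeviceGetTotalEnergyConsumption(handle)
        run_inference()  # hypothetical stand-in for any model call
        end_mj = pynvml.nvmlDeviceGetTotalEnergyConsumption(handle)
        return (end_mj - start_mj) / 1000.0  # millijoules -> joules
    finally:
        pynvml.nvmlShutdown()
```

A full benchmark would additionally control batching, request arrival patterns, and serving-system configuration, since these strongly affect the energy per request; the counter-difference pattern above is just the measurement primitive.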

@article{chung2025_2505.06371,
  title={The ML.ENERGY Benchmark: Toward Automated Inference Energy Measurement and Optimization},
  author={Jae-Won Chung and Jiachen Liu and Jeff J. Ma and Ruofan Wu and Oh Jun Kweon and Yuxuan Xia and Zhiyu Wu and Mosharaf Chowdhury},
  journal={arXiv preprint arXiv:2505.06371},
  year={2025}
}