Cited By

Hybrid Heterogeneous Clusters Can Lower the Energy Consumption of LLM Inference Workloads
arXiv:2407.00010 · 25 April 2024
Grant Wilkins, Srinivasan Keshav, Richard Mortier
Papers citing "Hybrid Heterogeneous Clusters Can Lower the Energy Consumption of LLM Inference Workloads" (9 papers):

Intelligence per Watt: Measuring Intelligence Efficiency of Local AI
Jon Saad-Falcon, A. Narayan, Hakki Orhun Akengin, J. Wes Griffin, Herumb Shandilya, ..., Shang Zhu, Ben Athiwaratkun, John Hennessy, Azalia Mirhoseini, Christopher Ré
11 Nov 2025

An Evaluation of LLMs Inference on Popular Single-board Computers
Tung Nguyen, T. Nguyen
20 Oct 2025

Perturbative Gradient Training: A novel training paradigm for bridging the gap between deep neural networks and physical reservoir computing
Cliff B. Abbott, Mark Elo, Dmytro A. Bozhko
05 Jun 2025

Can Compressed LLMs Truly Act? An Empirical Evaluation of Agentic Capabilities in LLM Compression
Peijie Dong, Zhenheng Tang, Xiang Liu, Lujun Li, Xiaowen Chu, Bo Li
26 May 2025

A Survey of Large Language Model Empowered Agents for Recommendation and Search: Towards Next-Generation Information Retrieval
Yu Zhang, Shutong Qiao, Jiaqi Zhang, Tzu-Heng Lin, Chen Gao, Yongqian Li
07 Mar 2025

The Lottery LLM Hypothesis, Rethinking What Abilities Should LLM Compression Preserve?
Zhenheng Tang, Xiang Liu, Qian Wang, Peijie Dong, Bingsheng He, Xiaowen Chu, Bo Li
24 Feb 2025

TAPAS: Thermal- and Power-Aware Scheduling for LLM Inference in Cloud Platforms
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2025
Jovan Stojkovic, Chaojie Zhang, Íñigo Goiri, Esha Choukse, Haoran Qiu, Rodrigo Fonseca, Josep Torrellas, Ricardo Bianchini
05 Jan 2025

Harnessing Your DRAM and SSD for Sustainable and Accessible LLM Inference with Mixed-Precision and Multi-level Caching
Jie Peng, Zhang Cao, Huaizhi Qu, Zhengyu Zhang, Chang Guo, Yanyong Zhang, Zhichao Cao, Tianlong Chen
17 Oct 2024

Offline Energy-Optimal LLM Serving: Workload-Based Energy Models for LLM Inference on Heterogeneous Systems
Grant Wilkins, Srinivasan Keshav, Richard Mortier
04 Jul 2024