Compute and Energy Consumption Trends in Deep Learning Inference
arXiv:2109.05472 · 12 September 2021
Radosvet Desislavov, Fernando Martínez-Plumed, José Hernández-Orallo

Papers citing "Compute and Energy Consumption Trends in Deep Learning Inference"

Showing 50 of 52 citing papers.

Optimizing LLM Inference: Fluid-Guided Online Scheduling with Memory Constraints
  Ruicheng Ao, Gan Luo, D. Simchi-Levi, Xinshang Wang (15 Apr 2025)
OscNet: Machine Learning on CMOS Oscillator Networks
  Wenxiao Cai, Thomas H. Lee (11 Feb 2025)
Towards Sustainable NLP: Insights from Benchmarking Inference Energy in Large Language Models
  S. Poddar, Paramita Koley, Janardan Misra, Niloy Ganguly, Saptarshi Ghosh (08 Feb 2025)
Inducing Semi-Structured Sparsity by Masking for Efficient Model Inference in Convolutional Networks
  David A. Danhofer (01 Nov 2024)
MLPerf Power: Benchmarking the Energy Efficiency of Machine Learning Systems from Microwatts to Megawatts for Sustainable AI
  Arya Tschand, Arun Tejusve Raghunath Rajan, S. Idgunji, Anirban Ghosh, J. Holleman, ..., Rowan Taubitz, Sean Zhan, Scott Wasson, David Kanter, Vijay Janapa Reddi (15 Oct 2024)
A framework for measuring the training efficiency of a neural architecture
  Eduardo Cueto-Mendoza, John D. Kelleher (12 Sep 2024)
A Cost-Aware Approach to Adversarial Robustness in Neural Networks
  Charles Meyers, Mohammad Reza Saleh Sedghpour, Tommy Löfstedt, Erik Elmroth (11 Sep 2024) [OOD, AAML]
Verification methods for international AI agreements
  Akash R. Wasil, Tom Reed, Jack William Miller, Peter Barnett (28 Aug 2024)
SFPrompt: Communication-Efficient Split Federated Fine-Tuning for Large Pre-Trained Models over Resource-Limited Devices
  Linxiao Cao, Yifei Zhu, Wei Gong (24 Jul 2024) [FedML]
Offline Energy-Optimal LLM Serving: Workload-Based Energy Models for LLM Inference on Heterogeneous Systems
  Grant Wilkins, Srinivasan Keshav, Richard Mortier (04 Jul 2024)
HW-GPT-Bench: Hardware-Aware Architecture Benchmark for Language Models
  R. Sukthanker, Arber Zela, B. Staffler, Aaron Klein, Lennart Purucker, Jörg K. H. Franke, Frank Hutter (16 May 2024) [ELM]
Hybrid Heterogeneous Clusters Can Lower the Energy Consumption of LLM Inference Workloads
  Grant Wilkins, Srinivasan Keshav, Richard Mortier (25 Apr 2024)
AI Coders Are Among Us: Rethinking Programming Language Grammar Towards Efficient Code Generation
  Zhensu Sun, Xiaoning Du, Zhou Yang, Li Li, David Lo (25 Apr 2024)
Measuring the Energy Consumption and Efficiency of Deep Neural Networks: An Empirical Analysis and Design Recommendations
  Charles Edison Tripp, J. Perr-Sauer, Jamil Gafur, Ambarish Nag, Avi Purkayastha, S. Zisman, Erik A. Bensen (13 Mar 2024)
Green AI: A Preliminary Empirical Study on Energy Consumption in DL Models Across Different Runtime Infrastructures
  Negar Alizadeh, Fernando Castor (21 Feb 2024)
Identifying architectural design decisions for achieving green ML serving
  Francisco Durán, Silverio Martínez-Fernández, Matias Martinez, Patricia Lago (12 Feb 2024)
TeenyTinyLlama: open-source tiny language models trained in Brazilian Portuguese
  N. Corrêa, Sophia Falk, Shiza Fatimah, Aniket Sen, N. D. Oliveira (30 Jan 2024)
Dynamic Semantic Compression for CNN Inference in Multi-access Edge Computing: A Graph Reinforcement Learning-based Autoencoder
  Nana Li, Alexandros Iosifidis, Qi Zhang (19 Jan 2024)
Power Hungry Processing: Watts Driving the Cost of AI Deployment?
  Sasha Luccioni, Yacine Jernite, Emma Strubell (28 Nov 2023)
Bitformer: An efficient Transformer with bitwise operation-based attention for Big Data Analytics at low-cost low-precision devices
  Gaoxiang Duan, Junkai Zhang, Xiaoying Zheng, Yongxin Zhu (22 Nov 2023)
MACP: Efficient Model Adaptation for Cooperative Perception
  Yunsheng Ma, Juanwu Lu, Can Cui, Sicheng Zhao, Xu Cao, Wenqian Ye, Ziran Wang (25 Oct 2023)
Unveiling Energy Efficiency in Deep Learning: Measurement, Prediction, and Scoring across Edge Devices
  Xiaolong Tu, Anik Mallik, Dawei Chen, Kyungtae Han, Onur Altintas, Haoxin Wang, Jiang Xie (19 Oct 2023)
Watt For What: Rethinking Deep Learning's Energy-Performance Relationship
  Shreyank N. Gowda, Xinyue Hao, Gen Li, Laura Sevilla-Lara, Shashank Narayana Gowda (10 Oct 2023) [HAI]
Federated Fine-Tuning of LLMs on the Very Edge: The Good, the Bad, the Ugly
  Herbert Woisetschläger, Alexander Erben, Shiqiang Wang, R. Mayer, Hans-Arno Jacobsen (04 Oct 2023) [FedML]
Efficiency is Not Enough: A Critical Perspective of Environmentally Sustainable AI
  Dustin Wright, Christian Igel, Gabrielle Samuel, Raghavendra Selvan (05 Sep 2023)
Neural Architecture Transfer 2: A Paradigm for Improving Efficiency in Multi-Objective Neural Architecture Search
  Simone Sarti, Eugenio Lomurno, Matteo Matteucci (03 Jul 2023)
Feature Imitating Networks Enhance The Performance, Reliability And Speed Of Deep Learning On Biomedical Image Processing Tasks
  Shangyang Min, M. Ghassemi, Tuka Alhanai, Mohammad Mahdi Ghassemi (26 Jun 2023) [FedML, MedIm, AI4CE]
Sustainable Edge Intelligence Through Energy-Aware Early Exiting
  Marcello Bullo, Seifallah Jardak, P. Carnelli, Deniz Gündüz (23 May 2023)
Brain-inspired learning in artificial neural networks: a review
  Samuel Schmidgall, Jascha Achterberg, Thomas Miconi, Louis Kirsch, Rojin Ziaei, S. P. Hajiseyedrazi, Jason Eshraghian (18 May 2023)
ISimDL: Importance Sampling-Driven Acceleration of Fault Injection Simulations for Evaluating the Robustness of Deep Learning
  Alessio Colucci, A. Steininger, Muhammad Shafique (14 Mar 2023) [AAML]
Chat2Map: Efficient Scene Mapping from Multi-Ego Conversations
  Sagnik Majumder, Hao Jiang, Pierre Moulon, E. Henderson, P. Calamia, Kristen Grauman, V. Ithapu (04 Jan 2023) [EgoV]
The European AI Liability Directives -- Critique of a Half-Hearted Approach and Lessons for the Future
  P. Hacker (25 Nov 2022) [AILaw]
Neural Networks with Quantization Constraints
  Ignacio Hounie, Juan Elenter, Alejandro Ribeiro (27 Oct 2022) [MQ]
Real-time Speech Interruption Analysis: From Cloud to Client Deployment
  Quchen Fu, Szu-Wei Fu, Yaran Fan, Yu-Huan Wu, Zhuo Chen, J. Gupchup, Ross Cutler (24 Oct 2022)
The Future of Consumer Edge-AI Computing
  Stefanos Laskaridis, Stylianos I. Venieris, Alexandros Kouris, Rui Li, Nicholas D. Lane (19 Oct 2022)
AnalogVNN: A fully modular framework for modeling and optimizing photonic neural networks
  Vivswan Shah, Nathan Youngblood (14 Oct 2022)
Improving the Performance of DNN-based Software Services using Automated Layer Caching
  M. Abedi, Yani Ioannou, Pooyan Jamshidi, Hadi Hemmati (18 Sep 2022)
Don't Complete It! Preventing Unhelpful Code Completion for Productive and Sustainable Neural Code Completion Systems
  Zhensu Sun, Xiaoning Du, Fu Song, Shangwen Wang, Mingze Ni, Li Li (13 Sep 2022)
A Transistor Operations Model for Deep Learning Energy Consumption Scaling Law
  Chen Li, Antonios Tsourdos, Weisi Guo (30 May 2022) [AI4CE]
Mokey: Enabling Narrow Fixed-Point Inference for Out-of-the-Box Floating-Point Transformer Models
  Ali Hadi Zadeh, Mostafa Mahmoud, Ameer Abdelhadi, Andreas Moshovos (23 Mar 2022) [MQ]
Compute Trends Across Three Eras of Machine Learning
  J. Sevilla, Lennart Heim, A. Ho, T. Besiroglu, Marius Hobbhahn, Pablo Villalobos (11 Feb 2022)
High-Performance Large-Scale Image Recognition Without Normalization
  Andrew Brock, Soham De, Samuel L. Smith, Karen Simonyan (11 Feb 2021) [VLM]
Bottleneck Transformers for Visual Recognition
  A. Srinivas, Tsung-Yi Lin, Niki Parmar, Jonathon Shlens, Pieter Abbeel, Ashish Vaswani (27 Jan 2021) [SLR]
I-BERT: Integer-only BERT Quantization
  Sehoon Kim, A. Gholami, Z. Yao, Michael W. Mahoney, Kurt Keutzer (05 Jan 2021) [MQ]
Measuring the Algorithmic Efficiency of Neural Networks
  Danny Hernandez, Tom B. Brown (08 May 2020)
Fixing the train-test resolution discrepancy: FixEfficientNet
  Hugo Touvron, Andrea Vedaldi, Matthijs Douze, Hervé Jégou (18 Mar 2020) [AAML]
BERT-of-Theseus: Compressing BERT by Progressive Module Replacing
  Canwen Xu, Wangchunshu Zhou, Tao Ge, Furu Wei, Ming Zhou (07 Feb 2020)
Scaling Laws for Neural Language Models
  Jared Kaplan, Sam McCandlish, T. Henighan, Tom B. Brown, B. Chess, R. Child, Scott Gray, Alec Radford, Jeff Wu, Dario Amodei (23 Jan 2020)
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
  M. Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro (17 Sep 2019) [MoE]
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
  Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman (20 Apr 2018) [ELM]