MLPerf Training Benchmark

2 October 2019
Peter Mattson
Christine Cheng
Cody Coleman
Greg Diamos
Paulius Micikevicius
David Patterson
Hanlin Tang
Gu-Yeon Wei
Peter Bailis
Victor Bittorf
David Brooks
Dehao Chen
Debojyoti Dutta
Udit Gupta
K. Hazelwood
Andrew Hock
Xinyuan Huang
Atsushi Ike
Bill Jia
Daniel Kang
David Kanter
Naveen Kumar
Jeffery Liao
Guokai Ma
Deepak Narayanan
Tayo Oguntebi
Gennady Pekhimenko
Lillian Pentecost
Vijay Janapa Reddi
Taylor Robie
T. S. John
Tsuguchika Tabaru
Carole-Jean Wu
Lingjie Xu
Masafumi Yamazaki
C. Young
Matei A. Zaharia

Papers citing "MLPerf Training Benchmark"

50 / 128 papers shown
Enabling Reproducibility and Meta-learning Through a Lifelong Database of Experiments (LDE)
Jason Tsay
A. Bartezzaghi
Aleke Nolte
C. Malossi
17
0
0
22 Feb 2022
Benchmarking of DL Libraries and Models on Mobile Devices
Qiyang Zhang
Xiang Li
Xiangying Che
Xiao Ma
Ao Zhou
Mengwei Xu
Shangguang Wang
Yun Ma
Xuanzhe Liu
25
48
0
14 Feb 2022
RecShard: Statistical Feature-Based Memory Optimization for Industry-Scale Neural Recommendation
Geet Sethi
Bilge Acun
Niket Agarwal
Christos Kozyrakis
Caroline Trippel
Carole-Jean Wu
47
66
0
25 Jan 2022
Building a Performance Model for Deep Learning Recommendation Model Training on GPUs
Zhongyi Lin
Louis Feng
E. K. Ardestani
Jaewon Lee
J. Lundell
Changkyu Kim
A. Kejariwal
John Douglas Owens
22
19
0
19 Jan 2022
On Sampling Collaborative Filtering Datasets
Noveen Sachdeva
Carole-Jean Wu
Julian McAuley
13
14
0
13 Jan 2022
Gridiron: A Technique for Augmenting Cloud Workloads with Network Bandwidth Requirements
N. Kodirov
Shane Bergsma
Syed M. Iqbal
Alan J. Hu
Ivan Beschastnikh
Margo Seltzer
10
0
0
12 Jan 2022
MLHarness: A Scalable Benchmarking System for MLCommons
Y. Chang
Jianhao Pu
Wen-mei W. Hwu
Jinjun Xiong
9
6
0
09 Nov 2021
A Survey and Empirical Evaluation of Parallel Deep Learning Frameworks
Daniel Nichols
Siddharth Singh
Shuqing Lin
A. Bhatele
OOD
16
9
0
09 Nov 2021
ML-EXray: Visibility into ML Deployment on the Edge
Hang Qiu
Ioanna Vavelidou
Jian Li
Evgenya Pergament
Pete Warden
Sandeep P. Chinchali
Zain Asgar
Sachin Katti
11
8
0
08 Nov 2021
Plumber: Diagnosing and Removing Performance Bottlenecks in Machine Learning Data Pipelines
Michael Kuchnik
Ana Klimovic
Jiří Šimša
Virginia Smith
George Amvrosiadis
50
30
0
07 Nov 2021
Sustainable AI: Environmental Implications, Challenges and Opportunities
Carole-Jean Wu
Ramya Raghavendra
Udit Gupta
Bilge Acun
Newsha Ardalani
...
Maximilian Balandat
Joe Spisak
R. Jain
Michael G. Rabbat
K. Hazelwood
37
379
0
30 Oct 2021
OneFlow: Redesign the Distributed Deep Learning Framework from Scratch
Jinhui Yuan
Xinqi Li
Cheng Cheng
Juncheng Liu
Ran Guo
...
Fei Yang
Xiaodong Yi
Chuan Wu
Haoran Zhang
Jie Zhao
27
36
0
28 Oct 2021
The Efficiency Misnomer
Mostafa Dehghani
Anurag Arnab
Lucas Beyer
Ashish Vaswani
Yi Tay
32
98
0
25 Oct 2021
MLPerf HPC: A Holistic Benchmark Suite for Scientific Machine Learning on HPC Systems
S. Farrell
M. Emani
J. Balma
L. Drescher
Aleksandr Drozd
...
Akihiro Tabuchi
V. Vishwanath
M. Wahib
Masafumi Yamazaki
Junqi Yin
VLM
32
35
0
21 Oct 2021
MedPerf: Open Benchmarking Platform for Medical Artificial Intelligence using Federated Evaluation
Alexandros Karargyris
Renato Umeton
Micah J. Sheller
Alejandro Aristizabal
Johnu George
...
Poonam Yadav
Michael Rosenthal
M. Loda
Jason M. Johnson
Peter Mattson
FedML
46
71
0
29 Sep 2021
Understanding Data Storage and Ingestion for Large-Scale Deep Recommendation Model Training
Mark Zhao
Niket Agarwal
Aarti Basant
B. Gedik
Satadru Pan
...
Kevin Wilfong
Harsha Rastogi
Carole-Jean Wu
Christos Kozyrakis
Parikshit Pol
GNN
15
70
0
20 Aug 2021
AutoFL: Enabling Heterogeneity-Aware Energy Efficient Federated Learning
Young Geun Kim
Carole-Jean Wu
6
83
0
16 Jul 2021
Prediction Surface Uncertainty Quantification in Object Detection Models for Autonomous Driving
Ferhat Ozgur Catak
T. Yue
Shaukat Ali
17
21
0
11 Jul 2021
Randomness In Neural Network Training: Characterizing The Impact of Tooling
Donglin Zhuang
Xingyao Zhang
S. Song
Sara Hooker
25
75
0
22 Jun 2021
NG+ : A Multi-Step Matrix-Product Natural Gradient Method for Deep Learning
Minghan Yang
Dong Xu
Qiwen Cui
Zaiwen Wen
Pengxiang Xu
13
4
0
14 Jun 2021
A Generalizable Approach to Learning Optimizers
Diogo Almeida
Clemens Winter
Jie Tang
Wojciech Zaremba
AI4CE
19
29
0
02 Jun 2021
Concurrent Adversarial Learning for Large-Batch Training
Yong Liu
Xiangning Chen
Minhao Cheng
Cho-Jui Hsieh
Yang You
ODL
28
13
0
01 Jun 2021
Low-Precision Hardware Architectures Meet Recommendation Model Inference at Scale
Zhaoxia Deng
Jongsoo Park
P. T. P. Tang
Haixin Liu
...
S. Nadathur
Changkyu Kim
Maxim Naumov
S. Naghshineh
M. Smelyanskiy
13
11
0
26 May 2021
FedScale: Benchmarking Model and System Performance of Federated Learning at Scale
Fan Lai
Yinwei Dai
Sanjay Sri Vallabh Singapuram
Jiachen Liu
Xiangfeng Zhu
H. Madhyastha
Mosharaf Chowdhury
FedML
35
194
0
24 May 2021
Demystifying BERT: Implications for Accelerator Design
Suchita Pati
Shaizeen Aga
Nuwan Jayasena
Matthew D. Sinclair
LLMAG
30
17
0
14 Apr 2021
Software-Hardware Co-design for Fast and Scalable Training of Deep Learning Recommendation Models
Dheevatsa Mudigere
Y. Hao
Jianyu Huang
Zhihao Jia
Andrew Tulloch
...
Ajit Mathews
Lin Qiao
M. Smelyanskiy
Bill Jia
Vijay Rao
21
149
0
12 Apr 2021
Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM
Deepak Narayanan
M. Shoeybi
Jared Casper
P. LeGresley
M. Patwary
...
Prethvi Kashinkunti
J. Bernauer
Bryan Catanzaro
Amar Phanishayee
Matei A. Zaharia
MoE
11
643
0
09 Apr 2021
GPU Domain Specialization via Composable On-Package Architecture
Yaosheng Fu
Evgeny Bolotin
Niladrish Chatterjee
D. Nellans
S. Keckler
12
12
0
05 Apr 2021
Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices
Max Ryabinin
Eduard A. Gorbunov
Vsevolod Plokhotnyuk
Gennady Pekhimenko
27
31
0
04 Mar 2021
Improving Computational Efficiency in Visual Reinforcement Learning via Stored Embeddings
Lili Chen
Kimin Lee
A. Srinivas
Pieter Abbeel
OffRL
11
11
0
04 Mar 2021
On the Utility of Gradient Compression in Distributed Training Systems
Saurabh Agarwal
Hongyi Wang
Shivaram Venkataraman
Dimitris Papailiopoulos
23
46
0
28 Feb 2021
Swift for TensorFlow: A portable, flexible platform for deep learning
Brennan Saeta
Denys Shabalin
M. Rasi
Brad Larson
Xihui Wu
...
Saleem Abdulrasool
A. Efremov
Dave Abrahams
Chris Lattner
Richard Wei
HAI
19
11
0
26 Feb 2021
A Large Batch Optimizer Reality Check: Traditional, Generic Optimizers Suffice Across Batch Sizes
Zachary Nado
Justin M. Gilmer
Christopher J. Shallue
Rohan Anil
George E. Dahl
ODL
17
27
0
12 Feb 2021
RL-Scope: Cross-Stack Profiling for Deep Reinforcement Learning Workloads
James Gleeson
Srivatsan Krishnan
Moshe Gabel
Vijay Janapa Reddi
Eyal de Lara
Gennady Pekhimenko
OffRL
12
11
0
08 Feb 2021
Horizontally Fused Training Array: An Effective Hardware Utilization Squeezer for Training Novel Deep Learning Models
Shang Wang
Peiming Yang
Yuxuan Zheng
X. Li
Gennady Pekhimenko
14
22
0
03 Feb 2021
A Runtime-Based Computational Performance Predictor for Deep Neural Network Training
Geoffrey X. Yu
Yubo Gao
P. Golikov
Gennady Pekhimenko
3DH
17
67
0
31 Jan 2021
tf.data: A Machine Learning Data Processing Framework
D. Murray
Jiří Šimša
Ana Klimovic
Ihor Indyk
PINN
AI4CE
LMTD
39
87
0
28 Jan 2021
TT-Rec: Tensor Train Compression for Deep Learning Recommendation Models
Chunxing Yin
Bilge Acun
Xing Liu
Carole-Jean Wu
28
102
0
25 Jan 2021
User Response Prediction in Online Advertising
Zhabiz Gharibshah
Xingquan Zhu
OffRL
57
46
0
07 Jan 2021
Understanding Training Efficiency of Deep Learning Recommendation Models at Scale
Bilge Acun
Matthew Murphy
Xiaodong Wang
Jade Nie
Carole-Jean Wu
K. Hazelwood
10
109
0
11 Nov 2020
Exploring the limits of Concurrency in ML Training on Google TPUs
Sameer Kumar
James Bradbury
C. Young
Yu Emma Wang
Anselm Levskaya
...
Tao Wang
Tayo Oguntebi
Yazhou Zu
Yuanzhong Xu
Andy Swing
BDL
AIMat
MoE
LRM
17
27
0
07 Nov 2020
CPR: Understanding and Improving Failure Tolerant Training for Deep Learning Recommendation with Partial Recovery
Kiwan Maeng
Shivam Bharuka
Isabel Gao
M. C. Jeffrey
V. Saraph
...
Caroline Trippel
Jiyan Yang
Michael G. Rabbat
Brandon Lucia
Carole-Jean Wu
OffRL
11
31
0
05 Nov 2020
Understanding Capacity-Driven Scale-Out Neural Recommendation Inference
Michael Lui
Yavuz Yetim
Özgür Özkan
Zhuoran Zhao
Shin-Yeh Tsai
Carole-Jean Wu
Mark Hempstead
GNN
BDL
LRM
19
51
0
04 Nov 2020
Accordion: Adaptive Gradient Communication via Critical Learning Regime Identification
Saurabh Agarwal
Hongyi Wang
Kangwook Lee
Shivaram Venkataraman
Dimitris Papailiopoulos
34
25
0
29 Oct 2020
MicroRec: Efficient Recommendation Inference by Hardware and Data Structure Solutions
Wenqi Jiang
Zhen He
Shuai Zhang
Thomas B. Preußer
Kai Zeng
...
Tongxuan Liu
Yong Li
Jingren Zhou
Ce Zhang
Gustavo Alonso
26
7
0
12 Oct 2020
Neural Model-based Optimization with Right-Censored Observations
Katharina Eggensperger
Kai Haase
Philip Muller
Marius Lindauer
Frank Hutter
21
9
0
29 Sep 2020
Review: Deep Learning in Electron Microscopy
Jeffrey M. Ede
26
79
0
17 Sep 2020
CLEANN: Accelerated Trojan Shield for Embedded Neural Networks
Mojan Javaheripi
Mohammad Samragh
Gregory Fields
T. Javidi
F. Koushanfar
AAML
FedML
6
42
0
04 Sep 2020
Bosch Deep Learning Hardware Benchmark
Armin Runge
Thomas Wenzel
Dimitrios Bariamis
B. Staffler
Lucas Drumond
Michael Pfeiffer
6
0
0
24 Aug 2020
FLBench: A Benchmark Suite for Federated Learning
Yuan Liang
Yange Guo
Yanxia Gong
Chunjie Luo
Jianfeng Zhan
Yunyou Huang
FedML
14
10
0
17 Aug 2020