Understanding Training Efficiency of Deep Learning Recommendation Models at Scale

11 November 2020
Bilge Acun
Matthew Murphy
Xiaodong Wang
Jade Nie
Carole-Jean Wu
K. Hazelwood

Papers citing "Understanding Training Efficiency of Deep Learning Recommendation Models at Scale"

50 / 53 papers shown
Ember: A Compiler for Efficient Embedding Operations on Decoupled Access-Execute Architectures
Marco Siracusa
Olivia Hsu
Victor Soria-Pardos
Joshua Randall
Arnaud Grasset
...
Doug Joseph
Randy Allen
Fredrik Kjolstad
Miquel Moretó Planas
Adrià Armejach
31
0
0
14 Apr 2025
The Sunk Carbon Fallacy: Rethinking Carbon Footprint Metrics for Effective Carbon-Aware Scheduling
Noman Bashir
Varun Gohil
Anagha Belavadi
Mohammad Shahrad
David E. Irwin
Elsa Olivetti
Christina Delimitrou
24
2
0
19 Oct 2024
Data Deletion for Linear Regression with Noisy SGD
Zhangjie Xia
Chi-Hua Wang
Guang Cheng
30
2
0
12 Oct 2024
CTMBIDS: Convolutional Tsetlin Machine Based Intrusion Detection System for DDoS attacks in an SDN environment
Rasoul Jafari Gohari
Laya Aliahmadipour
M. Rafsanjani
AAML
23
1
0
05 Sep 2024
HopGNN: Boosting Distributed GNN Training Efficiency via Feature-Centric Model Migration
Weijian Chen
Shuibing He
Haoyang Qu
Xuechen Zhang
GNN
29
0
0
01 Sep 2024
CADC: Encoding User-Item Interactions for Compressing Recommendation Model Training Data
Hossein Entezari Zarch
Abdulla Alshabanah
Chaoyi Jiang
Murali Annavaram
25
1
0
11 Jul 2024
PreSto: An In-Storage Data Preprocessing System for Training Recommendation Models
Yunjae Lee
Hyeseong Kim
Minsoo Rhu
34
3
0
11 Jun 2024
ElasticRec: A Microservice-based Model Serving Architecture Enabling Elastic Resource Scaling for Recommendation Models
Yujeong Choi
Jiin Kim
Minsoo Rhu
37
1
0
11 Jun 2024
Accelerating Recommender Model Training by Dynamically Skipping Stale Embeddings
Yassaman Ebrahimzadeh Maboud
Muhammad Adnan
Divyat Mahajan
Prashant J. Nair
AI4TS
38
0
0
22 Mar 2024
Blox: A Modular Toolkit for Deep Learning Schedulers
Saurabh Agarwal
Amar Phanishayee
Shivaram Venkataraman
OffRL
24
4
0
19 Dec 2023
MAD Max Beyond Single-Node: Enabling Large Machine Learning Model Acceleration on Distributed Systems
Samuel Hsia
Alicia Golden
Bilge Acun
Newsha Ardalani
Zach DeVito
Gu-Yeon Wei
David Brooks
Carole-Jean Wu
MoE
43
9
0
04 Oct 2023
Saturn: An Optimized Data System for Large Model Deep Learning Workloads
Kabir Nagrecha
Arun Kumar
16
6
0
03 Sep 2023
Dynamic Embedding Size Search with Minimum Regret for Streaming Recommender System
Bowei He
Xu He
Renrui Zhang
Yingxue Zhang
Ruiming Tang
Chen-li Ma
AI4TS
30
12
0
15 Aug 2023
InTune: Reinforcement Learning-based Data Pipeline Optimization for Deep Recommendation Models
Kabir Nagrecha
Lingyi Liu
P. Delgado
Prasanna Padmanabhan
OffRL
AI4CE
25
5
0
13 Aug 2023
Evaluating and Enhancing Robustness of Deep Recommendation Systems Against Hardware Errors
Dongning Ma
Xun Jiao
Fred Lin
Mengshi Zhang
Alban Desmaison
Thomas Sellinger
Daniel Moore
Sriram Sankar
24
2
0
17 Jul 2023
Mem-Rec: Memory Efficient Recommendation System using Alternative Representation
Gopu Krishna Jha
Anthony Thomas
Nilesh Jain
Sameh Gobriel
Tajana Rosing
Ravi Iyer
45
2
0
12 May 2023
Pre-train and Search: Efficient Embedding Table Sharding with Pre-trained Neural Cost Models
Daochen Zha
Louis Feng
Liangchen Luo
Bhargav Bhushanam
Zirui Liu
...
J. McMahon
Yuzhen Huang
Bryan Clarke
A. Kejariwal
Xia Hu
50
7
0
03 May 2023
MP-Rec: Hardware-Software Co-Design to Enable Multi-Path Recommendation
Samuel Hsia
Udit Gupta
Bilge Acun
Newsha Ardalani
Pan Zhong
Gu-Yeon Wei
David Brooks
Carole-Jean Wu
41
17
0
21 Feb 2023
FlexShard: Flexible Sharding for Industry-Scale Sequence Recommendation Models
Geet Sethi
Pallab Bhattacharya
Dhruv Choudhary
Carole-Jean Wu
Christos Kozyrakis
16
5
0
08 Jan 2023
Systems for Parallel and Distributed Large-Model Deep Learning Training
Kabir Nagrecha
GNN
VLM
MoE
26
7
0
06 Jan 2023
A Survey on Federated Recommendation Systems
Zehua Sun
Yonghui Xu
Lingjuan Lyu
Weiliang He
Lanju Kong
Fangzhao Wu
Y. Jiang
Li-zhen Cui
FedML
24
60
0
27 Dec 2022
Data Leakage via Access Patterns of Sparse Features in Deep Learning-based Recommendation Systems
H. Hashemi
Wenjie Xiong
Liu Ke
Kiwan Maeng
M. Annavaram
G. E. Suh
Hsien-Hsin S. Lee
32
6
0
12 Dec 2022
COMET: A Comprehensive Cluster Design Methodology for Distributed Deep Learning Training
D. Kadiyala
Saeed Rashidi
Taekyung Heo
A. Bambhaniya
T. Krishna
Alexandros Daglis
VLM
24
9
0
30 Nov 2022
RecD: Deduplication for End-to-End Deep Learning Recommendation Model Training Infrastructure
Mark Zhao
Dhruv Choudhary
Devashish Tyagi
A. Somani
Max Kaplan
...
Jongsoo Park
Aarti Basant
Niket Agarwal
Carole-Jean Wu
Christos Kozyrakis
VLM
23
6
0
09 Nov 2022
DreamShard: Generalizable Embedding Table Placement for Recommender Systems
Daochen Zha
Louis Feng
Qiaoyu Tan
Zirui Liu
Kwei-Herng Lai
Bhargav Bhushanam
Yuandong Tian
A. Kejariwal
Xia Hu
LMTD
OffRL
20
28
0
05 Oct 2022
Understanding Scaling Laws for Recommendation Models
Newsha Ardalani
Carole-Jean Wu
Zeliang Chen
Bhargav Bhushanam
Adnan Aziz
34
28
0
17 Aug 2022
AutoShard: Automated Embedding Table Sharding for Recommender Systems
Daochen Zha
Louis Feng
Bhargav Bhushanam
Dhruv Choudhary
Jade Nie
Yuandong Tian
Jay Chae
Yi-An Ma
A. Kejariwal
Xia Hu
35
30
0
12 Aug 2022
FEL: High Capacity Learning for Recommendation and Ranking via Federated Ensemble Learning
Meisam Hejazinia
Dzmitry Huba
Ilias Leontiadis
Kiwan Maeng
Mani Malek
Luca Melis
Ilya Mironov
Milad Nasr
Kaikai Wang
Carole-Jean Wu
FedML
9
5
0
07 Jun 2022
Towards Fair Federated Recommendation Learning: Characterizing the Inter-Dependence of System and Data Heterogeneity
Kiwan Maeng
Haiyu Lu
Luca Melis
John Nguyen
Michael G. Rabbat
Carole-Jean Wu
FedML
29
31
0
30 May 2022
GBA: A Tuning-free Approach to Switch between Synchronous and Asynchronous Training for Recommendation Model
Wenbo Su
Yuanxing Zhang
Yufeng Cai
Kaixu Ren
Pengjie Wang
...
Jing Chen
Hongbo Deng
Jian Xu
Lin Qu
Bo Zheng
20
4
0
23 May 2022
Heterogeneous Acceleration Pipeline for Recommendation System Training
Muhammad Adnan
Yassaman Ebrahimzadeh Maboud
Divyat Mahajan
Prashant J. Nair
26
18
0
11 Apr 2022
ORCA: A Network and Architecture Co-design for Offloading us-scale Datacenter Applications
Yifan Yuan
Jing-yu Huang
Yan Sun
Tianchen Wang
Jacob Nelson
Dan R. K. Ports
Yipeng Wang
Ren Wang
Charlie Tai
N. Kim
21
2
0
16 Mar 2022
GPU-Initiated On-Demand High-Throughput Storage Access in the BaM System Architecture
Zaid Qureshi
Vikram Sharma Mailthody
Isaac Gelado
S. Min
Amna Masood
...
Dmitri Vainbrand
I-Hsin Chung
M. Garland
W. Dally
Wen-mei W. Hwu
GNN
25
21
0
09 Mar 2022
BagPipe: Accelerating Deep Recommendation Model Training
Saurabh Agarwal
Chengpo Yan
Ziyi Zhang
Shivaram Venkataraman
29
17
0
24 Feb 2022
TopoOpt: Co-optimizing Network Topology and Parallelization Strategy for Distributed Training Jobs
Weiyang Wang
Moein Khazraee
Zhizhen Zhong
M. Ghobadi
Zhihao Jia
Dheevatsa Mudigere
Ying Zhang
A. Kewitsch
26
81
0
01 Feb 2022
RecShard: Statistical Feature-Based Memory Optimization for Industry-Scale Neural Recommendation
Geet Sethi
Bilge Acun
Niket Agarwal
Christos Kozyrakis
Caroline Trippel
Carole-Jean Wu
47
66
0
25 Jan 2022
Persia: An Open, Hybrid System Scaling Deep Learning-based Recommenders up to 100 Trillion Parameters
Xiangru Lian
Binhang Yuan
Xuefeng Zhu
Yulong Wang
Yongjun He
...
Lei Yuan
Hai-bo Yu
Sen Yang
Ce Zhang
Ji Liu
VLM
25
34
0
10 Nov 2021
Modeling Techniques for Machine Learning Fairness: A Survey
Mingyang Wan
Daochen Zha
Ninghao Liu
Na Zou
SyDa
FaML
30
36
0
04 Nov 2021
Sustainable AI: Environmental Implications, Challenges and Opportunities
Carole-Jean Wu
Ramya Raghavendra
Udit Gupta
Bilge Acun
Newsha Ardalani
...
Maximilian Balandat
Joe Spisak
R. Jain
Michael G. Rabbat
K. Hazelwood
40
380
0
30 Oct 2021
Boost-RS: Boosted Embeddings for Recommender Systems and its Application to Enzyme-Substrate Interaction Prediction
Xinmeng Li
Liping Liu
S. Hassoun
26
0
0
28 Sep 2021
Understanding Data Storage and Ingestion for Large-Scale Deep Recommendation Model Training
Mark Zhao
Niket Agarwal
Aarti Basant
B. Gedik
Satadru Pan
...
Kevin Wilfong
Harsha Rastogi
Carole-Jean Wu
Christos Kozyrakis
Parikshit Pol
GNN
26
70
0
20 Aug 2021
AutoFL: Enabling Heterogeneity-Aware Energy Efficient Federated Learning
Young Geun Kim
Carole-Jean Wu
13
83
0
16 Jul 2021
RecPipe: Co-designing Models and Hardware to Jointly Optimize Recommendation Quality and Performance
Udit Gupta
Samuel Hsia
J. Zhang
Mark Wilkening
Javin Pombra
Hsien-Hsin S. Lee
Gu-Yeon Wei
Carole-Jean Wu
David Brooks
41
32
0
18 May 2021
Alternate Model Growth and Pruning for Efficient Training of Recommendation Systems
Xiaocong Du
Bhargav Bhushanam
Jiecao Yu
Dhruv Choudhary
Tianxiang Gao
Sherman Wong
Louis Feng
Jongsoo Park
Yu Cao
A. Kejariwal
26
5
0
04 May 2021
Software-Hardware Co-design for Fast and Scalable Training of Deep Learning Recommendation Models
Dheevatsa Mudigere
Y. Hao
Jianyu Huang
Zhihao Jia
Andrew Tulloch
...
Ajit Mathews
Lin Qiao
M. Smelyanskiy
Bill Jia
Vijay Rao
32
149
0
12 Apr 2021
ECRM: Efficient Fault Tolerance for Recommendation Model Training via Erasure Coding
Kaige Liu
J. Kosaian
K. V. Rashmi
22
4
0
05 Apr 2021
Accelerating Recommendation System Training by Leveraging Popular Choices
Muhammad Adnan
Yassaman Ebrahimzadeh Maboud
Divyat Mahajan
Prashant J. Nair
17
55
0
01 Mar 2021
TT-Rec: Tensor Train Compression for Deep Learning Recommendation Models
Chunxing Yin
Bilge Acun
Xing Liu
Carole-Jean Wu
40
102
0
25 Jan 2021
CPR: Understanding and Improving Failure Tolerant Training for Deep Learning Recommendation with Partial Recovery
Kiwan Maeng
Shivam Bharuka
Isabel Gao
M. C. Jeffrey
V. Saraph
...
Caroline Trippel
Jiyan Yang
Michael G. Rabbat
Brandon Lucia
Carole-Jean Wu
OffRL
11
31
0
05 Nov 2020
Hardware Acceleration of Sparse and Irregular Tensor Computations of ML Models: A Survey and Insights
Shail Dave
Riyadh Baghdadi
Tony Nowatzki
Sasikanth Avancha
Aviral Shrivastava
Baoxin Li
53
81
0
02 Jul 2020