2003.05622
Cited By
Distributed Hierarchical GPU Parameter Server for Massive Scale Deep Learning Ads Systems
Conference on Machine Learning and Systems (MLSys), 2020
12 March 2020
Weijie Zhao
Deping Xie
Ronglai Jia
Yulei Qian
Rui Ding
Mingming Sun
P. Li
MoE
Papers citing "Distributed Hierarchical GPU Parameter Server for Massive Scale Deep Learning Ads Systems"
50 / 72 papers shown
Two-dimensional Sparse Parallelism for Large Scale Deep Learning Recommendation Model Training
Xin Zhang
Quanyu Zhu
Liangbei Xu
Zain Huda
Wang Zhou
...
Dennis van der Staay
Yuxi Hu
J. Nie
Jiyan Yang
Chunzhi Yang
180
1
0
05 Aug 2025
Rec-AD: An Efficient Computation Framework for FDIA Detection Based on Tensor Train Decomposition and Deep Learning Recommendation Model
Yunfeng Li
Junhong Liu
Zhaohui Yang
Guofu Liao
Chuyun Zhang
358
1
0
19 Jul 2025
SCRec: A Scalable Computational Storage System with Statistical Sharding and Tensor-train Decomposition for Recommendation Models
Jinho Yang
Ji-Hoon Kim
Joo-Young Kim
290
2
0
01 Apr 2025
Stochastic Communication Avoidance for Recommendation Systems
Lutfi Eren Erdogan
Vijay Anand Raghava Kanakagiri
Kurt Keutzer
Zhen Dong
250
1
0
03 Nov 2024
EPS-MoE: Expert Pipeline Scheduler for Cost-Efficient MoE Inference
Yulei Qian
Fengcun Li
Xiangyang Ji
Xiaoyu Zhao
Jianchao Tan
Xunliang Cai
MoE
434
8
0
16 Oct 2024
ERCache: An Efficient and Reliable Caching Framework for Large-Scale User Representations in Meta's Ads System
F. I. S. Kevin Zhou
Yaning Huang
Dong Liang
Dai Li
Zhongke Zhang
...
Emanuele Maccherani
Taha Hayat
John Guo
Varna Puvvada
Uladzimir Pashkevich
198
0
0
09 Oct 2024
CADC: Encoding User-Item Interactions for Compressing Recommendation Model Training Data
Hossein Entezari Zarch
Abdulla Alshabanah
Chaoyi Jiang
Murali Annavaram
347
1
0
11 Jul 2024
Multi-Epoch learning with Data Augmentation for Deep Click-Through Rate Prediction
Zhongxiang Fan
Zhaocheng Liu
Jian Liang
Dongying Kong
Han Li
Peng Jiang
Shuang Li
Kun Gai
291
1
0
27 Jun 2024
Accelerating Recommender Model Training by Dynamically Skipping Stale Embeddings
Yassaman Ebrahimzadeh Maboud
Muhammad Adnan
Divyat Mahajan
Shiyang Chen
AI4TS
327
0
0
22 Mar 2024
Fine-Grained Embedding Dimension Optimization During Training for Recommender Systems
Qinyi Luo
Penghan Wang
Wei Zhang
Fan Lai
Jiachen Mao
...
Jun Song
Wei-Yu Tsai
Shuai Yang
Yuxi Hu
Xuehai Qian
237
3
0
09 Jan 2024
Ravnest: Decentralized Asynchronous Training on Heterogeneous Devices
A. Menon
Unnikrishnan Menon
Kailash Ahirwar
436
3
0
03 Jan 2024
Distributed Quantum Learning with co-Management in a Multi-tenant Quantum System
BigData Congress [Services Society] (BSS), 2023
Anthony D'Onofrio
Amir Hossain
Lesther Santana
Naseem Machlovi
S. Stein
Jinwei Liu
Ang Li
Y. Mao
241
7
0
13 Dec 2023
CAFE: Towards Compact, Adaptive, and Fast Embedding for Large-scale Recommendation Models
Hailin Zhang
Zirui Liu
Boxuan Chen
Yikai Zhao
Tong Zhao
Tong Yang
Tengjiao Wang
295
16
0
06 Dec 2023
Experimental Analysis of Large-scale Learnable Vector Storage Compression
Proceedings of the VLDB Endowment (PVLDB), 2023
Hailin Zhang
Penghao Zhao
Xupeng Miao
Yingxia Shao
Zirui Liu
Tong Yang
Tengjiao Wang
352
19
0
27 Nov 2023
Sparsity-Preserving Differentially Private Training of Large Embedding Models
Neural Information Processing Systems (NeurIPS), 2023
Badih Ghazi
Yangsibo Huang
Pritish Kamath
Ravi Kumar
Pasin Manurangsi
Amer Sinha
Chiyuan Zhang
345
6
0
14 Nov 2023
Evaluating and Enhancing Robustness of Deep Recommendation Systems Against Hardware Errors
Dongning Ma
Xun Jiao
Fred Lin
Mengshi Zhang
Alban Desmaison
Thomas Sellinger
Daniel Moore
Sriram Sankar
178
2
0
17 Jul 2023
Differentially Private One Permutation Hashing and Bin-wise Consistent Weighted Sampling
Xiaoyun Li
Ping Li
220
9
0
13 Jun 2023
Multi-Epoch Learning for Deep Click-Through Rate Prediction Models
Zhaocheng Liu
Zhongxiang Fan
Jian Liang
Dongying Kong
Han Li
273
5
0
31 May 2023
MTrainS: Improving DLRM training efficiency using heterogeneous memories
H. Kassa
Paul Johnson
Jason B. Akers
Mrinmoy Ghosh
Andrew Tulloch
Dheevatsa Mudigere
Jongsoo Park
Xing Liu
R. Dreslinski
E. K. Ardestani
226
4
0
19 Apr 2023
Hera: A Heterogeneity-Aware Multi-Tenant Inference Server for Personalized Recommendations
Yujeong Choi
John Kim
Minsoo Rhu
252
1
0
23 Feb 2023
MP-Rec: Hardware-Software Co-Design to Enable Multi-Path Recommendation
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2023
Samuel Hsia
Udit Gupta
Bilge Acun
Newsha Ardalani
Pan Zhong
Gu-Yeon Wei
David Brooks
Carole-Jean Wu
245
20
0
21 Feb 2023
GPU-based Private Information Retrieval for On-Device Machine Learning Inference
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2023
Maximilian Lam
Jeff Johnson
Wenjie Xiong
Kiwan Maeng
Udit Gupta
...
Hsien-Hsin S. Lee
Vijay Janapa Reddi
Gu-Yeon Wei
David Brooks
Edward Suh
412
21
0
26 Jan 2023
FlexShard: Flexible Sharding for Industry-Scale Sequence Recommendation Models
Geet Sethi
Pallab Bhattacharya
Dhruv Choudhary
Carole-Jean Wu
Christos Kozyrakis
288
5
0
08 Jan 2023
A GPU-specialized Inference Parameter Server for Large-Scale Deep Recommendation Models
ACM Conference on Recommender Systems (RecSys), 2022
Yingcan Wei
Matthias Langer
F. Yu
Minseok Lee
Kingsley Liu
Ji Shi
Zehuan Wang
BDL
135
29
0
17 Oct 2022
Merlin HugeCTR: GPU-accelerated Recommender System Training and Inference
ACM Conference on Recommender Systems (RecSys), 2022
Zehuan Wang
Yingcan Wei
Minseok Lee
Matthias Langer
F. Yu
...
Daniel G. Abel
Xu Guo
Jianbing Dong
Ji Shi
Kunlun Li
GNN
LRM
194
44
0
17 Oct 2022
DreamShard: Generalizable Embedding Table Placement for Recommender Systems
Neural Information Processing Systems (NeurIPS), 2022
Daochen Zha
Louis Feng
Qiaoyu Tan
Zirui Liu
Kwei-Herng Lai
Bhargav Bhushanam
Yuandong Tian
A. Kejariwal
Helen Zhou
LMTD
OffRL
353
36
0
05 Oct 2022
FeatureBox: Feature Engineering on GPUs for Massive-Scale Ads Systems
Weijie Zhao
Xuewu Jiao
Xinsheng Luo
Jingxue Li
Belhal Karimi
Ping Li
202
2
0
26 Sep 2022
A Comprehensive Survey on Trustworthy Recommender Systems
Wenqi Fan
Xiangyu Zhao
Xiao Chen
Jingran Su
Jingtong Gao
...
Qidong Liu
Yiqi Wang
Hanfeng Xu
Lei Chen
Qing Li
FaML
298
69
0
21 Sep 2022
Understanding Scaling Laws for Recommendation Models
Newsha Ardalani
Carole-Jean Wu
Zeliang Chen
Bhargav Bhushanam
Adnan Aziz
197
57
0
17 Aug 2022
AutoShard: Automated Embedding Table Sharding for Recommender Systems
Knowledge Discovery and Data Mining (KDD), 2022
Daochen Zha
Louis Feng
Bhargav Bhushanam
Dhruv Choudhary
Jade Nie
Yuandong Tian
Jay Chae
Yi-An Ma
A. Kejariwal
Helen Zhou
208
36
0
12 Aug 2022
Package for Fast ABC-Boost
Ping Li
Weijie Zhao
309
7
0
18 Jul 2022
Nimble GNN Embedding with Tensor-Train Decomposition
Knowledge Discovery and Data Mining (KDD), 2022
Chunxing Yin
Da Zheng
Israt Nisa
Christos Faloutsos
George Karypis
R. Vuduc
GNN
262
18
0
21 Jun 2022
FEL: High Capacity Learning for Recommendation and Ranking via Federated Ensemble Learning
Meisam Hejazinia
Dzmitry Huba
Ilias Leontiadis
Kiwan Maeng
Mani Malek
Luca Melis
Ilya Mironov
Milad Nasr
Kaikai Wang
Carole-Jean Wu
FedML
275
9
0
07 Jun 2022
Good Intentions: Adaptive Parameter Management via Intent Signaling
International Conference on Information and Knowledge Management (CIKM), 2022
Alexander Renz-Wieland
Andreas Kieslinger
R. Gericke
Rainer Gemulla
Zoi Kaoudi
Volker Markl
388
1
0
01 Jun 2022
Training Personalized Recommendation Systems from (GPU) Scratch: Look Forward not Backwards
International Symposium on Computer Architecture (ISCA), 2022
Youngeun Kwon
Minsoo Rhu
188
32
0
10 May 2022
CowClip: Reducing CTR Prediction Model Training Time from 12 hours to 10 minutes on 1 GPU
AAAI Conference on Artificial Intelligence (AAAI), 2022
Zangwei Zheng
Peng Xu
Xuan Zou
Da Tang
Zhen Li
...
Xiangzhuo Ding
Fuzhao Xue
Ziheng Qing
Youlong Cheng
Yang You
VLM
398
9
0
13 Apr 2022
Heterogeneous Acceleration Pipeline for Recommendation System Training
International Symposium on Computer Architecture (ISCA), 2022
Muhammad Adnan
Yassaman Ebrahimzadeh Maboud
Divyat Mahajan
Shiyang Chen
371
24
0
11 Apr 2022
PICASSO: Unleashing the Potential of GPU-centric Training for Wide-and-deep Recommender Systems
IEEE International Conference on Data Engineering (ICDE), 2022
Yuanxing Zhang
Langshi Chen
Siran Yang
Man Yuan
Hui-juan Yi
...
Yong Li
Dingyang Zhang
Jialin Li
Lin Qu
Bo Zheng
219
38
0
11 Apr 2022
ORCA: A Network and Architecture Co-design for Offloading us-scale Datacenter Applications
Yifan Yuan
Jing-yu Huang
Yan Sun
Tianchen Wang
Jacob Nelson
Dan R. K. Ports
Yipeng Wang
Ren Wang
Charlie Tai
Nam Sung Kim
212
2
0
16 Mar 2022
GPU-Initiated On-Demand High-Throughput Storage Access in the BaM System Architecture
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2022
Zaid Qureshi
Vikram Sharma Mailthody
Isaac Gelado
S. Min
Amna Masood
...
Dmitri Vainbrand
I-Hsin Chung
M. Garland
W. Dally
Wen-mei W. Hwu
GNN
214
58
0
09 Mar 2022
RecShard: Statistical Feature-Based Memory Optimization for Industry-Scale Neural Recommendation
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2022
Geet Sethi
Bilge Acun
Niket Agarwal
Christos Kozyrakis
Caroline Trippel
Carole-Jean Wu
335
80
0
25 Jan 2022
Building a Performance Model for Deep Learning Recommendation Model Training on GPUs
International Conference on High Performance Computing (HiPC), 2022
Zhongyi Lin
Louis Feng
E. K. Ardestani
Jaewon Lee
J. Lundell
Changkyu Kim
A. Kejariwal
John Douglas Owens
190
20
0
19 Jan 2022
GCWSNet: Generalized Consistent Weighted Sampling for Scalable and Accurate Training of Neural Networks
International Conference on Information and Knowledge Management (CIKM), 2022
Ping Li
Weijie Zhao
BDL
231
12
0
07 Jan 2022
Communication-Efficient TeraByte-Scale Model Training Framework for Online Advertising
Weijie Zhao
Xuewu Jiao
Mingqing Hu
Xiaoyun Li
Xinming Zhang
Ping Li
3DV
200
8
0
05 Jan 2022
HET: Scaling out Huge Embedding Model Training via Cache-enabled Distributed Framework
Xupeng Miao
Hailin Zhang
Yining Shi
Xiaonan Nie
Zhi-Xin Yang
Yangyu Tao
Tengjiao Wang
186
70
0
14 Dec 2021
End-to-end Adaptive Distributed Training on PaddlePaddle
Yulong Ao
Zhihua Wu
Dianhai Yu
Weibao Gong
Zhiqing Kui
...
Yanjun Ma
Tian Wu
Haifeng Wang
Wei Zeng
Chao Yang
306
15
0
06 Dec 2021
HeterPS: Distributed Deep Learning With Reinforcement Learning Based Scheduling in Heterogeneous Environments
Future Generation Computer Systems (FGCS), 2021
Ji Liu
Zhihua Wu
Dianhai Yu
Yanjun Ma
Danlei Feng
Minxu Zhang
Xinxuan Wu
Xuefeng Yao
Dejing Dou
279
64
0
20 Nov 2021
Persia: An Open, Hybrid System Scaling Deep Learning-based Recommenders up to 100 Trillion Parameters
Knowledge Discovery and Data Mining (KDD), 2021
Xiangru Lian
Binhang Yuan
Xuefeng Zhu
Yulong Wang
Yongjun He
...
Lei Yuan
Hai-bo Yu
Sen Yang
Ce Zhang
Ji Liu
VLM
316
42
0
10 Nov 2021
Sustainable AI: Environmental Implications, Challenges and Opportunities
Conference on Machine Learning and Systems (MLSys), 2021
Carole-Jean Wu
Ramya Raghavendra
Udit Gupta
Bilge Acun
Newsha Ardalani
...
Maximilian Balandat
Joe Spisak
R. Jain
Michael G. Rabbat
K. Hazelwood
512
600
0
30 Oct 2021
Supporting Massive DLRM Inference Through Software Defined Memory
IEEE International Conference on Distributed Computing Systems (ICDCS), 2021
E. K. Ardestani
Changkyu Kim
Seung Jae Lee
Luoshang Pan
Valmiki Rampersad
...
Krishnakumar Nair
Maxim Naumov
Christopher Peterson
M. Smelyanskiy
Vijay Rao
BDL
273
27
0
21 Oct 2021