MP-Rec: Hardware-Software Co-Design to Enable Multi-Path Recommendation

International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2023

Samuel Hsia

Udit Gupta

Bilge Acun

Newsha Ardalani

Pan Zhong

Gu-Yeon Wei

David Brooks

Carole-Jean Wu

245

21 Feb 2023

GPU-based Private Information Retrieval for On-Device Machine Learning Inference

International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2023

Maximilian Lam

...

David Brooks

412

26 Jan 2023

FlexShard: Flexible Sharding for Industry-Scale Sequence Recommendation Models

288

08 Jan 2023

A GPU-specialized Inference Parameter Server for Large-Scale Deep Recommendation Models

ACM Conference on Recommender Systems (RecSys), 2022

135

17 Oct 2022

Merlin HugeCTR: GPU-accelerated Recommender System Training and Inference

ACM Conference on Recommender Systems (RecSys), 2022

...

194

17 Oct 2022

DreamShard: Generalizable Embedding Table Placement for Recommender Systems

Neural Information Processing Systems (NeurIPS), 2022

Daochen Zha

Louis Feng

Bhargav Bhushanam

353

05 Oct 2022

FeatureBox: Feature Engineering on GPUs for Massive-Scale Ads Systems

202

26 Sep 2022

A Comprehensive Survey on Trustworthy Recommender Systems

Xiangyu Zhao

...

Lei Chen

298

21 Sep 2022

Understanding Scaling Laws for Recommendation Models

Bhargav Bhushanam

197

17 Aug 2022

AutoShard: Automated Embedding Table Sharding for Recommender Systems

Knowledge Discovery and Data Mining (KDD), 2022

Daochen Zha

Louis Feng

Bhargav Bhushanam

208

12 Aug 2022

Package for Fast ABC-Boost

Ping Li

Weijie Zhao

309

18 Jul 2022

Nimble GNN Embedding with Tensor-Train Decomposition

Knowledge Discovery and Data Mining (KDD), 2022

George Karypis

262

21 Jun 2022

FEL: High Capacity Learning for Recommendation and Ranking via Federated Ensemble Learning

275

07 Jun 2022

Good Intentions: Adaptive Parameter Management via Intent Signaling

International Conference on Information and Knowledge Management (CIKM), 2022

Alexander Renz-Wieland

Volker Markl

388

01 Jun 2022

Training Personalized Recommendation Systems from (GPU) Scratch: Look Forward not Backwards

International Symposium on Computer Architecture (ISCA), 2022

Youngeun Kwon

Minsoo Rhu

188

10 May 2022

CowClip: Reducing CTR Prediction Model Training Time from 12 hours to 10 minutes on 1 GPU

AAAI Conference on Artificial Intelligence (AAAI), 2022

...

Yang You

398

13 Apr 2022

Heterogeneous Acceleration Pipeline for Recommendation System Training

International Symposium on Computer Architecture (ISCA), 2022

Muhammad Adnan

Yassaman Ebrahimzadeh Maboud

Divyat Mahajan

Shiyang Chen

371

11 Apr 2022

PICASSO: Unleashing the Potential of GPU-centric Training for Wide-and-deep Recommender Systems

IEEE International Conference on Data Engineering (ICDE), 2022

...

Yong Li

219

11 Apr 2022

ORCA: A Network and Architecture Co-design for Offloading μs-scale Datacenter Applications

212

16 Mar 2022

GPU-Initiated On-Demand High-Throughput Storage Access in the BaM System Architecture

International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2022

Zaid Qureshi

Vikram Sharma Mailthody

...

214

09 Mar 2022

RecShard: Statistical Feature-Based Memory Optimization for Industry-Scale Neural Recommendation

International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2022

335

25 Jan 2022

Building a Performance Model for Deep Learning Recommendation Model Training on GPUs

International Conference on High Performance Computing (HiPC), 2022

Louis Feng

190

19 Jan 2022

GCWSNet: Generalized Consistent Weighted Sampling for Scalable and Accurate Training of Neural Networks

International Conference on Information and Knowledge Management (CIKM), 2022

Ping Li

Weijie Zhao

231

07 Jan 2022

Communication-Efficient TeraByte-Scale Model Training Framework for Online Advertising

200

05 Jan 2022

HET: Scaling out Huge Embedding Model Training via Cache-enabled Distributed Framework

Xiaonan Nie

186

14 Dec 2021

End-to-end Adaptive Distributed Training on PaddlePaddle

Dianhai Yu

...

306

06 Dec 2021

HeterPS: Distributed Deep Learning With Reinforcement Learning Based Scheduling in Heterogeneous Environments

Future Generation Computer Systems (FGCS), 2021

Dianhai Yu

279

20 Nov 2021

Persia: An Open, Hybrid System Scaling Deep Learning-based Recommenders up to 100 Trillion Parameters

Knowledge Discovery and Data Mining (KDD), 2021

...

316

10 Nov 2021

Sustainable AI: Environmental Implications, Challenges and Opportunities

Conference on Machine Learning and Systems (MLSys), 2021

...

512

600

30 Oct 2021

Supporting Massive DLRM Inference Through Software Defined Memory

IEEE International Conference on Distributed Computing Systems (ICDCS), 2021

...

273

21 Oct 2021