arXiv: 1912.12953
RecNMP: Accelerating Personalized Recommendation with Near-Memory Processing
International Symposium on Computer Architecture (ISCA), 2019
30 December 2019
Liu Ke
Udit Gupta
Carole-Jean Wu
B. Cho
Mark Hempstead
Brandon Reagen
Xuan Zhang
David Brooks
Vikas Chandra
Utku Diril
A. Firoozshahian
K. Hazelwood
Bill Jia
Hsien-Hsin S. Lee
Meng Li
Bertrand A. Maher
Dheevatsa Mudigere
Maxim Naumov
Martin D. Schatz
M. Smelyanskiy
Xiaodong Wang
Papers citing "RecNMP: Accelerating Personalized Recommendation with Near-Memory Processing"
50 / 57 papers shown
Cocoon: A System Architecture for Differentially Private Training with Correlated Noises
Donghwan Kim
Xin Gu
Jinho Baek
Timothy Lo
Younghoon Min
Kwangsik Shin
Jongryool Kim
J. Park
Kiwan Maeng
130
0
0
08 Oct 2025
Toward Robust and Efficient ML-Based GPU Caching for Modern Inference
Peng Chen
Jiaji Zhang
Hailiang Zhao
Yirong Zhang
Jiahong Yu
...
Jianping Zou
Gang Xiong
Kingsum Chow
Shuibing He
Shuiguang Deng
BDL
164
1
0
25 Sep 2025
L3: DIMM-PIM Integrated Architecture and Coordination for Scalable Long-Context LLM Inference
Qingyuan Liu
Liyan Chen
Yanning Yang
Haoyu Wang
Dong Du
Zhigang Mao
Naifeng Jing
Yubin Xia
Haibo Chen
209
0
0
24 Apr 2025
Ember: A Compiler for Efficient Embedding Operations on Decoupled Access-Execute Architectures
Marco Siracusa
Olivia Hsu
Victor Soria-Pardos
Joshua Randall
Arnaud Grasset
...
Doug Joseph
Randy Allen
Fredrik Kjolstad
Miquel Moretó Planas
Adrià Armejach
247
0
0
14 Apr 2025
Pushing the Performance Envelope of DNN-based Recommendation Systems Inference on GPUs
Micro (MICRO), 2024
Rishabh Jain
Vivek M. Bhasi
Adwait Jog
A. Sivasubramaniam
M. Kandemir
Chita R. Das
154
3
0
29 Oct 2024
PIFS-Rec: Process-In-Fabric-Switch for Large-Scale Recommendation System Inferences
Micro (MICRO), 2024
Pingyi Huo
Anusha Devulapally
Hasan Al Maruf
Minseo Park
Krishnakumar Nair
Meena Arunachalam
Gulsum Gudukbay Akbulut
M. Kandemir
Vijaykrishnan Narayanan
168
5
0
25 Sep 2024
UpDLRM: Accelerating Personalized Recommendation using Real-World PIM Architecture
Sitian Chen
Haobin Tan
Amelie Chi Zhou
Yusen Li
Pavan Balaji
AI4CE
80
12
0
20 Jun 2024
Tender: Accelerating Large Language Models via Tensor Decomposition and Runtime Requantization
Jungi Lee
Wonbeom Lee
Jaewoong Sim
MQ
252
33
0
16 Jun 2024
PreSto: An In-Storage Data Preprocessing System for Training Recommendation Models
Yunjae Lee
Hyeseong Kim
Minsoo Rhu
161
7
0
11 Jun 2024
ElasticRec: A Microservice-based Model Serving Architecture Enabling Elastic Resource Scaling for Recommendation Models
Yujeong Choi
Jiin Kim
Minsoo Rhu
205
2
0
11 Jun 2024
PID-Comm: A Fast and Flexible Collective Communication Framework for Commodity Processing-in-DIMM Devices
Si Ung Noh
Junguk Hong
Chaemin Lim
Seong-Yeol Park
Jeehyun Kim
Hanjun Kim
Youngsok Kim
Jinho Lee
231
13
0
13 Apr 2024
LazyDP: Co-Designing Algorithm-Software for Scalable Training of Differentially Private Recommendation Models
Juntaek Lim
Youngeun Kwon
Ranggi Hwang
Kiwan Maeng
Edward Suh
Minsoo Rhu
SyDa
198
1
0
12 Apr 2024
Accelerating Recommender Model Training by Dynamically Skipping Stale Embeddings
Yassaman Ebrahimzadeh Maboud
Muhammad Adnan
Divyat Mahajan
Shiyang Chen
AI4TS
216
0
0
22 Mar 2024
MAD Max Beyond Single-Node: Enabling Large Machine Learning Model Acceleration on Distributed Systems
International Symposium on Computer Architecture (ISCA), 2023
Samuel Hsia
Alicia Golden
Bilge Acun
Newsha Ardalani
Zach DeVito
Gu-Yeon Wei
David Brooks
Carole-Jean Wu
MoE
320
14
0
04 Oct 2023
SimplePIM: A Software Framework for Productive and Efficient Processing-in-Memory
International Conference on Parallel Architectures and Compilation Techniques (PACT), 2023
Jinfan Chen
Juan Gómez Luna
I. E. Hajj
Yu-Yin Guo
Onur Mutlu
135
31
0
03 Oct 2023
Mem-Rec: Memory Efficient Recommendation System using Alternative Representation
Asian Conference on Machine Learning (ACML), 2023
Gopu Krishna Jha
Anthony Thomas
Nilesh Jain
Sameh Gobriel
Tajana Rosing
Ravi Iyer
191
4
0
12 May 2023
CHASE: Accelerating Distributed Pointer-Traversals on Disaggregated Memory
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2023
Yupeng Tang
Meng Qi
Anurag Khandelwal
GNN
160
0
0
03 May 2023
Hera: A Heterogeneity-Aware Multi-Tenant Inference Server for Personalized Recommendations
Yujeong Choi
John Kim
Minsoo Rhu
158
1
0
23 Feb 2023
MP-Rec: Hardware-Software Co-Design to Enable Multi-Path Recommendation
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2023
Samuel Hsia
Udit Gupta
Bilge Acun
Newsha Ardalani
Pan Zhong
Gu-Yeon Wei
David Brooks
Carole-Jean Wu
201
20
0
21 Feb 2023
On Memory Codelets: Prefetching, Recoding, Moving and Streaming Data
D. Fox
J. M. Diaz
Xiaoming Li
77
2
0
31 Jan 2023
Failure Tolerant Training with Persistent Memory Disaggregation over CXL
IEEE Micro (IEEE Micro), 2023
Miryeong Kwon
Junhyeok Jang
Hanjin Choi
Sangwon Lee
Myoungsoo Jung
180
12
0
14 Jan 2023
FlexShard: Flexible Sharding for Industry-Scale Sequence Recommendation Models
Geet Sethi
Pallab Bhattacharya
Dhruv Choudhary
Carole-Jean Wu
Christos Kozyrakis
221
5
0
08 Jan 2023
DisaggRec: Architecting Disaggregated Systems for Large-Scale Personalized Recommendation
Liu Ke
Xuan Zhang
Benjamin C. Lee
G. E. Suh
Hsien-Hsin S. Lee
137
10
0
02 Dec 2022
A Comprehensive Survey on Trustworthy Recommender Systems
Wenqi Fan
Xiangyu Zhao
Xiao Chen
Jingran Su
Jingtong Gao
...
Qidong Liu
Yiqi Wang
Hanfeng Xu
Lei Chen
Qing Li
FaML
236
60
0
21 Sep 2022
An Experimental Evaluation of Machine Learning Training on a Real Processing-in-Memory System
Juan Gómez Luna
Yu-Yin Guo
Sylvan Brocard
Julien Legriel
Remy Cimadomo
Geraldo F. Oliveira
Gagandeep Singh
O. Mutlu
VLM
233
18
0
16 Jul 2022
Heterogeneous Data-Centric Architectures for Modern Data-Intensive Applications: Case Studies in Machine Learning and Databases
IEEE Computer Society Annual Symposium on VLSI (VLSI), 2022
Geraldo F. Oliveira
Amirali Boroumand
Saugata Ghose
Juan Gómez Luna
O. Mutlu
155
8
0
29 May 2022
SmartSAGE: Training Large-scale Graph Neural Networks using In-Storage Processing Architectures
International Symposium on Computer Architecture (ISCA), 2022
Yunjae Lee
Jin-Won Chung
Minsoo Rhu
GNN
154
64
0
10 May 2022
Training Personalized Recommendation Systems from (GPU) Scratch: Look Forward not Backwards
International Symposium on Computer Architecture (ISCA), 2022
Youngeun Kwon
Minsoo Rhu
127
30
0
10 May 2022
Heterogeneous Acceleration Pipeline for Recommendation System Training
International Symposium on Computer Architecture (ISCA), 2022
Muhammad Adnan
Yassaman Ebrahimzadeh Maboud
Divyat Mahajan
Shiyang Chen
206
23
0
11 Apr 2022
ORCA: A Network and Architecture Co-design for Offloading us-scale Datacenter Applications
Yifan Yuan
Jing-yu Huang
Yan Sun
Tianchen Wang
Jacob Nelson
Dan R. K. Ports
Yipeng Wang
Ren Wang
Charlie Tai
Nam Sung Kim
175
2
0
16 Mar 2022
Hercules: Heterogeneity-Aware Inference Serving for At-Scale Personalized Recommendation
International Symposium on High-Performance Computer Architecture (HPCA), 2022
Liu Ke
Udit Gupta
Mark Hempstead
Carole-Jean Wu
Hsien-Hsin S. Lee
Xuan Zhang
154
31
0
14 Mar 2022
BagPipe: Accelerating Deep Recommendation Model Training
Symposium on Operating Systems Principles (SOSP), 2022
Saurabh Agarwal
Chengpo Yan
Ziyi Zhang
Shivaram Venkataraman
201
26
0
24 Feb 2022
RecShard: Statistical Feature-Based Memory Optimization for Industry-Scale Neural Recommendation
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2022
Geet Sethi
Bilge Acun
Niket Agarwal
Christos Kozyrakis
Caroline Trippel
Carole-Jean Wu
229
78
0
25 Jan 2022
SparseP: Towards Efficient Sparse Matrix Vector Multiplication on Real Processing-In-Memory Systems
Christina Giannoula
Ivan Fernandez
Juan Gómez Luna
N. Koziris
G. Goumas
O. Mutlu
MoE
357
27
0
13 Jan 2022
GNNear: Accelerating Full-Batch Training of Graph Neural Networks with Near-Memory Processing
International Conference on Parallel Architectures and Compilation Techniques (PACT), 2021
Zhe Zhou
Cong Li
Xuechao Wei
Xiaoyang Wang
Guangyu Sun
GNN
175
34
0
01 Nov 2021
Sustainable AI: Environmental Implications, Challenges and Opportunities
Conference on Machine Learning and Systems (MLSys), 2021
Carole-Jean Wu
Ramya Raghavendra
Udit Gupta
Bilge Acun
Newsha Ardalani
...
Maximilian Balandat
Joe Spisak
R. Jain
Michael G. Rabbat
K. Hazelwood
381
532
0
30 Oct 2021
Google Neural Network Models for Edge Devices: Analyzing and Mitigating Machine Learning Inference Bottlenecks
Amirali Boroumand
Saugata Ghose
Berkin Akin
Ravi Narayanaswami
Geraldo F. Oliveira
Xiaoyu Ma
Eric Shiu
O. Mutlu
172
98
0
29 Sep 2021
Neuro-Symbolic AI: An Emerging Class of AI Workloads and their Characterization
Zachary Susskind
Bryce Arden
L. John
Patrick A Stockton
E. John
NAI
152
45
0
13 Sep 2021
Accelerating Weather Prediction using Near-Memory Reconfigurable Fabric
ACM Transactions on Reconfigurable Technology and Systems (TRETS), 2021
Gagandeep Singh
D. Diamantopoulos
Juan Gómez Luna
C. Hagleitner
S. Stuijk
Henk Corporaal
O. Mutlu
206
30
0
19 Jul 2021
RecPipe: Co-designing Models and Hardware to Jointly Optimize Recommendation Quality and Performance
Micro (MICRO), 2021
Udit Gupta
Samuel Hsia
J. Zhang
Mark Wilkening
Javin Pombra
Hsien-Hsin S. Lee
Gu-Yeon Wei
Carole-Jean Wu
David Brooks
233
33
0
18 May 2021
DAMOV: A New Methodology and Benchmark Suite for Evaluating Data Movement Bottlenecks
IEEE Access (IEEE Access), 2021
Geraldo F. Oliveira
Juan Gómez Luna
Lois Orosa
Saugata Ghose
Nandita Vijaykumar
Ivan Fernandez
Mohammad Sadrosadati
O. Mutlu
407
94
0
08 May 2021
Continual Learning Approach for Improving the Data and Computation Mapping in Near-Memory Processing System
Pritam Majumder
Jiayi Huang
Sungkeun Kim
A. Muzahid
Dylan Siegers
Chia-Che Tsai
Eun Jung Kim
200
2
0
28 Apr 2021
Mitigating Edge Machine Learning Inference Bottlenecks: An Empirical Study on Accelerating Google Edge Models
Amirali Boroumand
Saugata Ghose
Berkin Akin
Ravi Narayanaswami
Geraldo F. Oliveira
Xiaoyu Ma
Eric Shiu
O. Mutlu
127
31
0
01 Mar 2021
Accelerating Recommendation System Training by Leveraging Popular Choices
Proceedings of the VLDB Endowment (PVLDB), 2021
Muhammad Adnan
Yassaman Ebrahimzadeh Maboud
Divyat Mahajan
Shiyang Chen
218
67
0
01 Mar 2021
RecSSD: Near Data Processing for Solid State Drive Based Recommendation Inference
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2021
Mark Wilkening
Udit Gupta
Samuel Hsia
Caroline Trippel
Carole-Jean Wu
David Brooks
Gu-Yeon Wei
155
125
0
29 Jan 2021
TT-Rec: Tensor Train Compression for Deep Learning Recommendation Models
Conference on Machine Learning and Systems (MLSys), 2021
Chunxing Yin
Bilge Acun
Xing Liu
Carole-Jean Wu
247
114
0
25 Jan 2021
SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning
International Symposium on High-Performance Computer Architecture (HPCA), 2020
Hanrui Wang
Zhekai Zhang
Song Han
451
493
0
17 Dec 2020
Understanding Training Efficiency of Deep Learning Recommendation Models at Scale
Bilge Acun
Matthew Murphy
Xiaodong Wang
Jade Nie
Carole-Jean Wu
K. Hazelwood
212
123
0
11 Nov 2020
CPR: Understanding and Improving Failure Tolerant Training for Deep Learning Recommendation with Partial Recovery
Kiwan Maeng
Shivam Bharuka
Isabel Gao
M. C. Jeffrey
V. Saraph
...
Caroline Trippel
Jiyan Yang
Michael G. Rabbat
Brandon Lucia
Carole-Jean Wu
OffRL
251
39
0
05 Nov 2020
Understanding Capacity-Driven Scale-Out Neural Recommendation Inference
Michael Lui
Yavuz Yetim
Özgür Özkan
Zhuoran Zhao
Shin-Yeh Tsai
Carole-Jean Wu
Mark Hempstead
GNN
BDL
LRM
260
57
0
04 Nov 2020