2012.06421
Cited By
When is Memorization of Irrelevant Training Data Necessary for High-Accuracy Learning?
Symposium on the Theory of Computing (STOC), 2020
11 December 2020
Gavin Brown
Mark Bun
Vitaly Feldman
Adam D. Smith
Kunal Talwar
Papers citing
"When is Memorization of Irrelevant Training Data Necessary for High-Accuracy Learning?"
50 / 92 papers shown
Extracting alignment data in open models
Federico Barbero
Xiangming Gu
Christopher A. Choquette-Choo
Chawin Sitawarin
Matthew Jagielski
Itay Yona
Petar Velickovic
Ilia Shumailov
Jamie Hayes
282
3
0
21 Oct 2025
AI Agents as Universal Task Solvers
Alessandro Achille
Stefano Soatto
LRM
133
2
1
14 Oct 2025
A Law of Data Reconstruction for Random Features (and Beyond)
Leonardo Iurada
Simone Bombari
Tatiana Tommasi
Marco Mondelli
157
0
0
26 Sep 2025
Efficiently Attacking Memorization Scores
Tue Do
Varun Chandrasekaran
Daniel Alabi
TDI
AAML
281
0
0
24 Sep 2025
Synth-MIA: A Testbed for Auditing Privacy Leakage in Tabular Data Synthesis
Joshua Ward
Xiaofeng Lin
Chi-Hua Wang
Guang Cheng
167
5
0
22 Sep 2025
Access Paths for Efficient Ordering with Large Language Models
Fuheng Zhao
Jiayue Chen
Yiming Pan
Tahseen Rabbani
D. Agrawal
A. El Abbadi
Paritosh Aggarwal
Anupam Datta
Dimitris Tsirogiannis
215
0
0
30 Aug 2025
Unveiling Over-Memorization in Finetuning LLMs for Reasoning Tasks
Zhiwen Ruan
Yun-Nung Chen
Yutao Hou
Peng Li
Yang Liu
Guanhua Chen
229
2
0
06 Aug 2025
A Common Pool of Privacy Problems: Legal and Technical Lessons from a Large-Scale Web-Scraped Machine Learning Dataset
Rachel Hong
Jevan Hutson
William Agnew
Imaad Huda
Tadayoshi Kohno
Jamie Morgenstern
AILaw
391
5
0
20 Jun 2025
Black-Box Privacy Attacks on Shared Representations in Multitask Learning
John Abascal
Nicolás Berrios
Alina Oprea
Jonathan R. Ullman
Adam D. Smith
Matthew Jagielski
MLAU
276
0
0
19 Jun 2025
Memorization in Language Models through the Lens of Intrinsic Dimension
Stefan Arnold
PILM
364
3
0
11 Jun 2025
Trade-offs in Data Memorization via Strong Data Processing Inequalities
Annual Conference Computational Learning Theory (COLT), 2025
Vitaly Feldman
Guy Kornowski
Xin Lyu
TDI
FedML
451
4
0
02 Jun 2025
How much do language models memorize?
John X. Morris
Chawin Sitawarin
Chuan Guo
Narine Kokhlikyan
G. E. Suh
Alexander M. Rush
Kamalika Chaudhuri
Saeed Mahloujifar
KELM
ELM
423
33
0
30 May 2025
Bayesian Perspective on Memorization and Reconstruction
Haim Kaplan
Yishay Mansour
Kobbi Nissim
Uri Stemmer
AAML
268
0
0
29 May 2025
Querying Kernel Methods Suffices for Reconstructing their Training Data
Daniel Barzilai
Yuval Margalit
Eitan Gronich
Gilad Yehudai
Meirav Galun
Ronen Basri
219
0
0
25 May 2025
T1: Tool-integrated Self-verification for Test-time Compute Scaling in Small Language Models
Minki Kang
Jongwon Jeong
Jaewoong Cho
ALM
LRM
324
7
0
07 Apr 2025
Trustworthy Machine Learning via Memorization and the Granular Long-Tail: A Survey on Interactions, Tradeoffs, and Beyond
Qiongxiu Li
Xiaoyu Luo
Yiyi Chen
Johannes Bjerva
565
6
0
10 Mar 2025
Machine Learners Should Acknowledge the Legal Implications of Large Language Models as Personal Data
Henrik Nolte
Michèle Finck
Kristof Meding
AILaw
PILM
488
3
0
03 Mar 2025
The Pitfalls of Memorization: When Memorization Hurts Generalization
International Conference on Learning Representations (ICLR), 2024
Reza Bayat
Mohammad Pezeshki
Elvis Dohmatob
David Lopez-Paz
Pascal Vincent
OOD
362
16
0
10 Dec 2024
Improved Localized Machine Unlearning Through the Lens of Memorization
Reihaneh Torkzadehmahani
Reza Nasirigerdeh
Georgios Kaissis
Daniel Rueckert
Gintare Karolina Dziugaite
Eleni Triantafillou
MU
219
7
0
03 Dec 2024
Slowing Down Forgetting in Continual Learning
Pascal Janetzky
Tobias Schlagenhauf
Stefan Feuerriegel
CLL
456
0
0
11 Nov 2024
Undesirable Memorization in Large Language Models: A Survey
Ali Satvaty
Suzan Verberne
Fatih Turkmen
ELM
PILM
629
25
0
03 Oct 2024
Range Membership Inference Attacks
Jiashu Tao
Reza Shokri
472
9
0
09 Aug 2024
Demystifying Verbatim Memorization in Large Language Models
Jing Huang
Diyi Yang
Christopher Potts
ELM
PILM
MU
338
48
0
25 Jul 2024
A Survey on Machine Unlearning: Techniques and New Emerged Privacy Risks
Journal of Information Security and Applications (JISA), 2024
Hengzhu Liu
Ping Xiong
Tianqing Zhu
Philip S. Yu
248
21
0
10 Jun 2024
Data Reconstruction: When You See It and When You Don't
Edith Cohen
Haim Kaplan
Yishay Mansour
Shay Moran
Kobbi Nissim
Uri Stemmer
Eliad Tsfadia
AAML
315
9
0
24 May 2024
Exploring prompts to elicit memorization in masked language model-based named entity recognition
PLoS ONE, 2024
Yuxi Xia
Anastasiia Sedova
Pedro Henrique Luz de Araujo
Vasiliki Kougia
Lisa Nussbaumer
Benjamin Roth
296
1
0
05 May 2024
Differentially Private Reinforcement Learning with Self-Play
Dan Qiao
Yu Wang
274
0
0
11 Apr 2024
Gradient Descent is Pareto-Optimal in the Oracle Complexity and Memory Tradeoff for Feasibility Problems
IEEE Annual Symposium on Foundations of Computer Science (FOCS), 2024
Moise Blanchard
265
1
0
10 Apr 2024
Unveiling Privacy, Memorization, and Input Curvature Links
Deepak Ravikumar
Efstathia Soufleri
Abolfazl Hashemi
Kaushik Roy
305
13
0
28 Feb 2024
Information Complexity of Stochastic Convex Optimization: Applications to Generalization and Memorization
Idan Attias
Gintare Karolina Dziugaite
Mahdi Haghifam
Roi Livni
Daniel M. Roy
360
11
0
14 Feb 2024
Do LLMs Dream of Ontologies?
ACM Transactions on Intelligent Systems and Technology (ACM TIST), 2024
Marco Bombieri
Paolo Fiorini
Simone Paolo Ponzetto
M. Rospocher
CLL
367
6
0
26 Jan 2024
Memorization in Self-Supervised Learning Improves Downstream Generalization
Wenhao Wang
Muhammad Ahmad Kaleem
Adam Dziedzic
Michael Backes
Nicolas Papernot
Franziska Boenisch
SSL
429
19
0
19 Jan 2024
The Stronger the Diffusion Model, the Easier the Backdoor: Data Poisoning to Induce Copyright Breaches Without Adjusting Finetuning Pipeline
Haonan Wang
Qianli Shen
Yao Tong
Yang Zhang
Kenji Kawaguchi
307
41
0
07 Jan 2024
SoK: Unintended Interactions among Machine Learning Defenses and Risks
Vasisht Duddu
S. Szyller
Nadarajah Asokan
AAML
384
6
0
07 Dec 2023
Differentially Private Non-Convex Optimization under the KL Condition with Optimal Rates
International Conference on Algorithmic Learning Theory (ALT), 2023
Michael Menart
Enayat Ullah
Raman Arora
Raef Bassily
Cristóbal Guzmán
343
2
0
22 Nov 2023
On Retrieval Augmentation and the Limitations of Language Model Training
Ting-Rui Chiang
Xinyan Velocity Yu
Joshua Robinson
Ollie Liu
Isabelle Lee
Dani Yogatama
RALM
227
2
0
16 Nov 2023
Privacy Threats in Stable Diffusion Models
Thomas Cilloni
Charles Fleming
Charles Walter
211
5
0
15 Nov 2023
SoK: Memorisation in machine learning
Dmitrii Usynin
Moritz Knolle
Georgios Kaissis
333
1
0
06 Nov 2023
Why Train More? Effective and Efficient Membership Inference via Memorization
Jihye Choi
Shruti Tople
Varun Chandrasekaran
Somesh Jha
TDI
FedML
276
3
0
12 Oct 2023
What do larger image classifiers memorise?
Michal Lukasik
Vaishnavh Nagarajan
A. S. Rawat
A. Menon
Sanjiv Kumar
268
5
0
09 Oct 2023
Anonymous Learning via Look-Alike Clustering: A Precise Analysis of Model Generalization
Neural Information Processing Systems (NeurIPS), 2023
Adel Javanmard
Vahab Mirrokni
434
3
0
06 Oct 2023
Deconstructing Data Reconstruction: Multiclass, Weight Decay and General Losses
Neural Information Processing Systems (NeurIPS), 2023
G. Buzaglo
Niv Haim
Gilad Yehudai
Gal Vardi
Yakir Oz
Yaniv Nikankin
Michal Irani
279
25
0
04 Jul 2023
Deconstructing Classifiers: Towards A Data Reconstruction Attack Against Text Classification Models
Adel M. Elmahdy
A. Salem
SILM
326
8
0
23 Jun 2023
Memory-Query Tradeoffs for Randomized Convex Optimization
IEEE Annual Symposium on Foundations of Computer Science (FOCS), 2023
Xinyu Chen
Binghui Peng
240
7
0
21 Jun 2023
Machine Unlearning: A Survey
ACM Computing Surveys (ACM Comput. Surv.), 2023
Heng Xu
Tianqing Zhu
Lefeng Zhang
Wanlei Zhou
Philip S. Yu
MU
282
43
0
06 Jun 2023
TMI! Finetuned Models Leak Private Information from their Pretraining Data
Proceedings on Privacy Enhancing Technologies (PoPETs), 2023
John Abascal
Stanley Wu
Alina Oprea
Jonathan R. Ullman
313
23
0
01 Jun 2023
Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-Intensive Tasks
Neural Information Processing Systems (NeurIPS), 2023
Minki Kang
Seanie Lee
Jinheon Baek
Kenji Kawaguchi
Sung Ju Hwang
ALM
LRM
309
98
0
28 May 2023
Private Everlasting Prediction
Neural Information Processing Systems (NeurIPS), 2023
M. Naor
Kobbi Nissim
Uri Stemmer
Chao Yan
213
5
0
16 May 2023
AI Model Disgorgement: Methods and Choices
Proceedings of the National Academy of Sciences of the United States of America (PNAS), 2023
Alessandro Achille
Michael Kearns
Carson Klingenberg
Stefano Soatto
MU
249
17
0
07 Apr 2023
Near Optimal Memory-Regret Tradeoff for Online Learning
IEEE Annual Symposium on Foundations of Computer Science (FOCS), 2023
Binghui Peng
A. Rubinstein
CLL
318
12
0
03 Mar 2023