2311.17035
Scalable Extraction of Training Data from (Production) Language Models
28 November 2023
Milad Nasr
Nicholas Carlini
Jonathan Hayase
Matthew Jagielski
A. Feder Cooper
Daphne Ippolito
Christopher A. Choquette-Choo
Eric Wallace
Florian Tramèr
Katherine Lee
SILM
Papers citing
"Scalable Extraction of Training Data from (Production) Language Models"
50 / 281 papers shown
Private Memorization Editing: Turning Memorization into a Defense to Strengthen Data Privacy in Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Elena Sofia Ruzzetti
Giancarlo A. Xompero
Davide Venditti
Fabio Massimo Zanzotto
KELM
PILM
291
5
0
09 Jun 2025
Quantifying Cross-Modality Memorization in Vision-Language Models
Yuxin Wen
Yangsibo Huang
Tom Goldstein
Ravi Kumar
Badih Ghazi
Chiyuan Zhang
330
2
0
05 Jun 2025
Membership Inference Attacks on Sequence Models
Lorenzo Rossi
Michael Aerni
Jie Zhang
F. Tramèr
273
2
0
05 Jun 2025
Privacy Leaks by Adversaries: Adversarial Iterations for Membership Inference Attack
Jing Xue
Zhishen Sun
Haishan Ye
Luo Luo
Xiangyu Chang
Ivor Tsang
Guang Dai
MIACV
MIALM
313
0
0
03 Jun 2025
ACCESS DENIED INC: The First Benchmark Environment for Sensitivity Awareness
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Dren Fazlija
Arkadij Orlov
Sandipan Sikdar
255
0
0
01 Jun 2025
Existing Large Language Model Unlearning Evaluations Are Inconclusive
Zhili Feng
Yixuan Even Xu
Avi Schwarzschild
Robert Kirk
Xander Davies
Yarin Gal
J. Zico Kolter
MU
ELM
154
6
0
31 May 2025
How much do language models memorize?
John X. Morris
Chawin Sitawarin
Chuan Guo
Narine Kokhlikyan
G. E. Suh
Alexander M. Rush
Kamalika Chaudhuri
Saeed Mahloujifar
KELM
ELM
408
23
0
30 May 2025
Hush! Protecting Secrets During Model Training: An Indistinguishability Approach
Arun Ganesh
Brendan McMahan
Milad Nasr
Thomas Steinke
Abhradeep Thakurta
193
1
0
30 May 2025
Exploring the limits of strong membership inference attacks on large language models
Jamie Hayes
Ilia Shumailov
Christopher A. Choquette-Choo
Matthew Jagielski
G. Kaissis
...
Matthieu Meeus
Yves-Alexandre de Montjoye
Franziska Boenisch
Adam Dziedzic
A. Feder Cooper
341
10
0
24 May 2025
Be Careful When Fine-tuning On Open-Source LLMs: Your Fine-tuning Data Could Be Secretly Stolen!
Zhexin Zhang
Yuhao Sun
Junxiao Yang
Shiyao Cui
Hongning Wang
Shiyu Huang
AAML
319
1
0
21 May 2025
Shared Path: Unraveling Memorization in Multilingual LLMs through Language Similarities
Xiaoyu Luo
Yiyi Chen
Johannes Bjerva
Qiongxiu Li
278
3
0
21 May 2025
Is Your Prompt Safe? Investigating Prompt Injection Attacks Against Open-Source LLMs
Jiawen Wang
Pritha Gupta
Ivan Habernal
Eyke Hüllermeier
SILM
AAML
264
7
0
20 May 2025
Fragments to Facts: Partial-Information Fragment Inference from LLMs
Lucas Rosenblatt
Bin Han
Robert Wolfe
Bill Howe
AAML
332
0
0
20 May 2025
Adversarially Pretrained Transformers May Be Universally Robust In-Context Learners
Soichiro Kumano
Hiroshi Kera
Toshihiko Yamasaki
AAML
549
1
0
20 May 2025
One-Step Offline Distillation of Diffusion-based Models via Koopman Modeling
Nimrod Berman
Ilan Naiman
Moshe Eliasof
Hedi Zisling
Omri Azencot
DiffM
OffRL
474
6
0
19 May 2025
PANORAMA: A synthetic PII-laced dataset for studying sensitive data memorization in LLMs
Sriram Selvam
Anneswa Ghosh
178
0
0
18 May 2025
PIG: Privacy Jailbreak Attack on LLMs via Gradient-based Iterative In-Context Optimization
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Yidan Wang
Yanan Cao
Yubing Ren
Fang Fang
Zheng Lin
Binxing Fang
PILM
502
8
0
15 May 2025
DMRL: Data- and Model-aware Reward Learning for Data Extraction
Zhiqiang Wang
Ruoxi Cheng
182
0
0
07 May 2025
OBLIVIATE: Robust and Practical Machine Unlearning for Large Language Models
Xiaoyu Xu
Minxin Du
Qingqing Ye
Haibo Hu
MU
456
2
0
07 May 2025
Automatic Calibration for Membership Inference Attack on Large Language Models
Saleh Zare Zade
Yao Qiang
Xiangyu Zhou
Hui Zhu
Mohammad Amin Roshani
Prashant Khanduri
Dongxiao Zhu
267
3
0
06 May 2025
Transferable Adversarial Attacks on Black-Box Vision-Language Models
Kai Hu
Weichen Yu
Guang Dai
Alexander Robey
Andy Zou
Chengming Xu
Haoqi Hu
Matt Fredrikson
AAML
VLM
410
5
0
02 May 2025
Towards Harnessing the Collaborative Power of Large and Small Models for Domain Tasks
Yang Liu
Bingjie Yan
Tianyuan Zou
Jianqing Zhang
Zixuan Gu
...
Jiajian Li
Xiaozhou Ye
Ye Ouyang
Qiang Yang
Yanzhe Zhang
ALM
1.0K
4
0
24 Apr 2025
Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction
Vaishnavh Nagarajan
Chen Henry Wu
Charles Ding
Aditi Raghunathan
622
13
0
21 Apr 2025
Antidistillation Sampling
Yash Savani
Asher Trockman
Zhili Feng
Avi Schwarzschild
Alexander Robey
Marc Finzi
J. Zico Kolter
448
8
0
17 Apr 2025
The Hidden Space of Safety: Understanding Preference-Tuned LLMs in Multilingual context
Nikhil Verma
Manasa Bharadwaj
273
3
0
03 Apr 2025
SUV: Scalable Large Language Model Copyright Compliance with Regularized Selective Unlearning
Tianyang Xu
Xiaoze Liu
Feijie Wu
Xiaoqian Wang
Jing Gao
MU
551
5
0
29 Mar 2025
Spend Your Budget Wisely: Towards an Intelligent Distribution of the Privacy Budget in Differentially Private Text Rewriting
Conference on Data and Application Security and Privacy (CODASPY), 2024
Stephen Meisenbacher
Chaeeun Joy Lee
Florian Matthes
263
1
0
28 Mar 2025
How do language models learn facts? Dynamics, curricula and hallucinations
Nicolas Zucchet
J. Bornschein
Stephanie C. Y. Chan
Andrew Kyle Lampinen
Razvan Pascanu
Soham De
KELM
HILM
LRM
367
19
1
27 Mar 2025
Gemma 3 Technical Report
Gemma Team
Aishwarya B Kamath
Johan Ferret
Shreya Pathak
Nino Vieillard
...
Harshal Tushar Lehri
Hussein Hazimeh
Ian Ballantyne
Idan Szpektor
Ivan Nardini
VLM
576
781
0
25 Mar 2025
Language Models May Verbatim Complete Text They Were Not Explicitly Trained On
Katja Filippova
Christopher A. Choquette-Choo
Matthew Jagielski
Peter Kairouz
Sanmi Koyejo
Abigail Z. Jacobs
Nicolas Papernot
472
12
0
21 Mar 2025
In-House Evaluation Is Not Enough: Towards Robust Third-Party Flaw Disclosure for General-Purpose AI
Shayne Longpre
Kevin Klyman
Ruth E. Appel
Sayash Kapoor
Rishi Bommasani
...
Victoria Westerhoff
Yacine Jernite
Rumman Chowdhury
Percy Liang
Arvind Narayanan
ELM
391
8
0
21 Mar 2025
Inspecting the Representation Manifold of Differentially-Private Text
Stefan Arnold
236
1
0
19 Mar 2025
Empirical Privacy Variance
Yuzheng Hu
Fan Wu
Ruicheng Xian
Yuhang Liu
Lydia Zakynthinou
Pritish Kamath
Chiyuan Zhang
David A. Forsyth
509
1
0
16 Mar 2025
Synthesizing Privacy-Preserving Text Data via Finetuning without Finetuning Billion-Scale LLMs
Bowen Tan
Zheng Xu
Eric P. Xing
Zhiting Hu
Shanshan Wu
SyDa
399
9
0
16 Mar 2025
PoisonedParrot: Subtle Data Poisoning Attacks to Elicit Copyright-Infringing Content from Large Language Models
North American Chapter of the Association for Computational Linguistics (NAACL), 2025
Michael-Andrei Panaitescu-Liess
Pankayaraj Pathmanathan
Yigitcan Kaya
Zora Che
Bang An
Sicheng Zhu
Aakriti Agrawal
Furong Huang
AAML
357
2
0
10 Mar 2025
Privacy Auditing of Large Language Models
International Conference on Learning Representations (ICLR), 2025
Ashwinee Panda
Xinyu Tang
Milad Nasr
Christopher A. Choquette-Choo
Prateek Mittal
PILM
349
20
0
09 Mar 2025
Mitigating Memorization in LLMs using Activation Steering
Manan Suri
Nishit Anand
Amisha Bhaskar
LLMSV
345
7
0
08 Mar 2025
Machine Learners Should Acknowledge the Legal Implications of Large Language Models as Personal Data
Henrik Nolte
Michèle Finck
Kristof Meding
AILaw
PILM
451
2
0
03 Mar 2025
Towards Label-Only Membership Inference Attack against Pre-trained Large Language Models
Yu He
Boheng Li
Lu Liu
Zhongjie Ba
Wei Dong
Yiming Li
Zhan Qin
Kui Ren
Chong Chen
MIALM
466
17
0
26 Feb 2025
Merger-as-a-Stealer: Stealing Targeted PII from Aligned LLMs with Model Merging
Lin Lu
Zhigang Zuo
Ziji Sheng
Pan Zhou
MoMe
356
1
0
22 Feb 2025
Protecting Users From Themselves: Safeguarding Contextual Privacy in Interactions with Conversational Agents
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Ivoline Ngong
Swanand Kadhe
Hao Wang
K. Murugesan
Justin D. Weisz
Amit Dhurandhar
Karthikeyan N. Ramamurthy
276
12
0
22 Feb 2025
Privacy Ripple Effects from Adding or Removing Personal Information in Language Model Training
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Jaydeep Borkar
Matthew Jagielski
Katherine Lee
Niloofar Mireshghallah
David A. Smith
Christopher A. Choquette-Choo
PILM
689
6
0
21 Feb 2025
Generative AI Training and Copyright Law
Tim W. Dornis
Sebastian Stober
387
2
0
21 Feb 2025
The Canary's Echo: Auditing Privacy Risks of LLM-Generated Synthetic Text
Matthieu Meeus
Lukas Wutschitz
Santiago Zanella Béguelin
Shruti Tople
Reza Shokri
442
7
0
19 Feb 2025
Commercial LLM Agents Are Already Vulnerable to Simple Yet Dangerous Attacks
Ang Li
Yin Zhou
Vethavikashini Chithrra Raghuram
Tom Goldstein
Micah Goldblum
AAML
347
34
0
12 Feb 2025
LLM Unlearning via Neural Activation Redirection
William F. Shen
Xinchi Qiu
Meghdad Kurmanji
Alex Iacob
Lorenzo Sani
Yihong Chen
Nicola Cancedda
Nicholas D. Lane
MU
359
13
0
11 Feb 2025
MATH-Perturb: Benchmarking LLMs' Math Reasoning Abilities against Hard Perturbations
Kaixuan Huang
Jiacheng Guo
Zihao Li
X. Ji
Jiawei Ge
...
Yangsibo Huang
Chi Jin
Xinyun Chen
Chiyuan Zhang
Mengdi Wang
AAML
LRM
654
53
0
10 Feb 2025
Can We Trust AI Benchmarks? An Interdisciplinary Review of Current Issues in AI Evaluation
Maria Eriksson
Erasmo Purificato
Arman Noroozian
Joao Vinagre
Guillaume Chaslot
Emilia Gomez
David Fernandez-Llorca
ELM
734
25
0
10 Feb 2025
On the Impact of Noise in Differentially Private Text Rewriting
North American Chapter of the Association for Computational Linguistics (NAACL), 2025
Stephen Meisenbacher
Maulik Chevli
Florian Matthes
219
5
0
31 Jan 2025
The Pitfalls of "Security by Obscurity" And What They Mean for Transparent AI
AAAI Conference on Artificial Intelligence (AAAI), 2025
Peter Hall
Olivia Mundahl
Sunoo Park
383
3
0
30 Jan 2025