2202.07646
Quantifying Memorization Across Neural Language Models
15 February 2022
Nicholas Carlini
Daphne Ippolito
Matthew Jagielski
Katherine Lee
Florian Tramèr
Chiyuan Zhang
PILM
Papers citing
"Quantifying Memorization Across Neural Language Models"
50 / 131 papers shown
GP-MoLFormer: A Foundation Model For Molecular Generation
Jerret Ross
Brian M. Belgodere
Samuel C. Hoffman
Vijil Chenthamarakshan
Youssef Mroueh
Payel Das
36
5
0
04 Apr 2024
Min-K%++: Improved Baseline for Detecting Pre-Training Data from Large Language Models
Jingyang Zhang
Jingwei Sun
Eric C. Yeats
Ouyang Yang
Martin Kuo
Jianyi Zhang
Hao Frank Yang
Hai Li
43
41
0
03 Apr 2024
Will GPT-4 Run DOOM?
Adrian de Wynter
LM&Ro
MLLM
41
5
0
08 Mar 2024
Membership Inference Attacks and Privacy in Topic Modeling
Nico Manzonelli
Wanrong Zhang
Salil P. Vadhan
37
1
0
07 Mar 2024
Alpaca against Vicuna: Using LLMs to Uncover Memorization of LLMs
Aly M. Kassem
Omar Mahmoud
Niloofar Mireshghallah
Hyunwoo J. Kim
Yulia Tsvetkov
Yejin Choi
Sherif Saad
Santu Rana
50
18
0
05 Mar 2024
Wisdom of the Silicon Crowd: LLM Ensemble Prediction Capabilities Rival Human Crowd Accuracy
P. Schoenegger
Indre Tuminauskaite
Peter S. Park
Rafael Valdece Sousa Bastos
P. Tetlock
35
24
0
29 Feb 2024
Black-Box Access is Insufficient for Rigorous AI Audits
Stephen Casper
Carson Ezell
Charlotte Siegmann
Noam Kolt
Taylor Lynn Curtis
...
Michael Gerovitch
David Bau
Max Tegmark
David M. Krueger
Dylan Hadfield-Menell
AAML
34
78
0
25 Jan 2024
LLM360: Towards Fully Transparent Open-Source LLMs
Zhengzhong Liu
Aurick Qiao
W. Neiswanger
Hongyi Wang
Bowen Tan
...
Zhiting Hu
Mark Schulze
Preslav Nakov
Timothy Baldwin
Eric P. Xing
40
70
0
11 Dec 2023
DP-OPT: Make Large Language Model Your Privacy-Preserving Prompt Engineer
Junyuan Hong
Jiachen T. Wang
Chenhui Zhang
Zhangheng Li
Bo-wen Li
Zhangyang Wang
45
29
0
27 Nov 2023
Rethinking Benchmark and Contamination for Language Models with Rephrased Samples
Shuo Yang
Wei-Lin Chiang
Lianmin Zheng
Joseph E. Gonzalez
Ion Stoica
ALM
27
110
0
08 Nov 2023
Privately Aligning Language Models with Reinforcement Learning
Fan Wu
Huseyin A. Inan
A. Backurs
Varun Chandrasekaran
Janardhan Kulkarni
Robert Sim
29
6
0
25 Oct 2023
FLTrojan: Privacy Leakage Attacks against Federated Language Models Through Selective Weight Tampering
Md. Rafi Ur Rashid
Vishnu Asutosh Dasu
Kang Gu
Najrin Sultana
Shagufta Mehnaz
AAML
FedML
44
10
0
24 Oct 2023
Fundamental Limits of Membership Inference Attacks on Machine Learning Models
Eric Aubinais
Elisabeth Gassiat
Pablo Piantanida
MIACV
48
2
0
20 Oct 2023
Privacy Preserving Large Language Models: ChatGPT Case Study Based Vision and Framework
Imdad Ullah
Najm Hassan
S. Gill
Basem Suleiman
T. Ahanger
Zawar Shah
Junaid Qadir
S. Kanhere
40
16
0
19 Oct 2023
Unintended Memorization in Large ASR Models, and How to Mitigate It
Lun Wang
Om Thakkar
Rajiv Mathews
33
5
0
18 Oct 2023
A Confederacy of Models: a Comprehensive Evaluation of LLMs on Creative Writing
Carlos Gómez-Rodríguez
Paul Williams
29
65
0
12 Oct 2023
Beyond Memorization: Violating Privacy Via Inference with Large Language Models
Robin Staab
Mark Vero
Mislav Balunović
Martin Vechev
PILM
38
74
0
11 Oct 2023
FTFT: Efficient and Robust Fine-Tuning by Transferring Training Dynamics
Yupei Du
Albert Gatt
Dong Nguyen
24
1
0
10 Oct 2023
Forgetting Private Textual Sequences in Language Models via Leave-One-Out Ensemble
Zhe Liu
Ozlem Kalinli
MU
KELM
26
2
0
28 Sep 2023
Knowledge Sanitization of Large Language Models
Yoichi Ishibashi
Hidetoshi Shimodaira
KELM
29
19
0
21 Sep 2023
Recovering from Privacy-Preserving Masking with Large Language Models
A. Vats
Zhe Liu
Peng Su
Debjyoti Paul
Yingyi Ma
Yutong Pang
Zeeshan Ahmed
Ozlem Kalinli
29
9
0
12 Sep 2023
Quantifying and Analyzing Entity-level Memorization in Large Language Models
Zhenhong Zhou
Jiuyang Xiang
Chao-Yi Chen
Sen Su
PILM
38
8
0
30 Aug 2023
AI-Generated Content (AIGC) for Various Data Modalities: A Survey
Lin Geng Foo
Hossein Rahmani
J. Liu
73
31
0
27 Aug 2023
Position: Key Claims in LLM Research Have a Long Tail of Footnotes
Anna Rogers
A. Luccioni
48
19
0
14 Aug 2023
SimplyRetrieve: A Private and Lightweight Retrieval-Centric Generative AI Tool
Youyang Ng
Daisuke Miyashita
Yasuto Hoshi
Yasuhiro Morioka
Osamu Torii
Tomoya Kodama
J. Deguchi
RALM
15
9
0
08 Aug 2023
Generative Models as a Complex Systems Science: How can we make sense of large language model behavior?
Ari Holtzman
Peter West
Luke Zettlemoyer
AI4CE
30
14
0
31 Jul 2023
Samplable Anonymous Aggregation for Private Federated Data Analysis
Kunal Talwar
Shan Wang
Audra McMillan
Vojta Jina
Vitaly Feldman
...
Congzheng Song
Karl Tarbe
Sebastian Vogt
L. Winstrom
Shundong Zhou
FedML
35
13
0
27 Jul 2023
What can we learn from Data Leakage and Unlearning for Law?
Jaydeep Borkar
PILM
MU
35
10
0
19 Jul 2023
Ethicist: Targeted Training Data Extraction Through Loss Smoothed Soft Prompting and Calibrated Confidence Estimation
Zhexin Zhang
Jiaxin Wen
Minlie Huang
30
29
0
10 Jul 2023
Deconstructing Data Reconstruction: Multiclass, Weight Decay and General Losses
G. Buzaglo
Niv Haim
Gilad Yehudai
Gal Vardi
Yakir Oz
Yaniv Nikankin
Michal Irani
28
10
0
04 Jul 2023
Gradients Look Alike: Sensitivity is Often Overestimated in DP-SGD
Anvith Thudi
Hengrui Jia
Casey Meehan
Ilia Shumailov
Nicolas Papernot
24
3
0
01 Jul 2023
Understanding and Mitigating Copying in Diffusion Models
Gowthami Somepalli
Vasu Singla
Micah Goldblum
Jonas Geiping
Tom Goldstein
DiffM
20
125
0
31 May 2023
Jointly Reparametrized Multi-Layer Adaptation for Efficient and Private Tuning
Umang Gupta
Aram Galstyan
Greg Ver Steeg
6
2
0
30 May 2023
Differentially Private Synthetic Data via Foundation Model APIs 1: Images
Zi-Han Lin
Sivakanth Gopi
Janardhan Kulkarni
Harsha Nori
Sergey Yekhanin
41
36
0
24 May 2023
Token-wise Decomposition of Autoregressive Language Model Hidden States for Analyzing Model Predictions
Byung-Doh Oh
William Schuler
29
2
0
17 May 2023
PaLM 2 Technical Report
Rohan Anil
Andrew M. Dai
Orhan Firat
Melvin Johnson
Dmitry Lepikhin
...
Ce Zheng
Wei Zhou
Denny Zhou
Slav Petrov
Yonghui Wu
ReLM
LRM
80
1,147
0
17 May 2023
Stop Uploading Test Data in Plain Text: Practical Strategies for Mitigating Data Contamination by Evaluation Benchmarks
Alon Jacovi
Avi Caciularu
Omer Goldman
Yoav Goldberg
17
95
0
17 May 2023
Emergent and Predictable Memorization in Large Language Models
Stella Biderman
USVSN Sai Prashanth
Lintang Sutawika
Hailey Schoelkopf
Quentin G. Anthony
Shivanshu Purohit
Edward Raf
24
116
0
21 Apr 2023
Towards Generating Functionally Correct Code Edits from Natural Language Issue Descriptions
Sarah Fakhoury
Saikat Chakraborty
Madan Musuvathi
Shuvendu K. Lahiri
38
21
0
07 Apr 2023
BloombergGPT: A Large Language Model for Finance
Shijie Wu
Ozan Irsoy
Steven Lu
Vadim Dabravolski
Mark Dredze
Sebastian Gehrmann
P. Kambadur
David S. Rosenberg
Gideon Mann
AIFin
71
785
0
30 Mar 2023
Recognition, recall, and retention of few-shot memories in large language models
A. Orhan
LRM
KELM
CLL
36
3
0
30 Mar 2023
Hallucinations in Large Multilingual Translation Models
Nuno M. Guerreiro
Duarte M. Alves
Jonas Waldendorf
Barry Haddow
Alexandra Birch
Pierre Colombo
André F.T. Martins
VLM
HILM
LRM
22
140
0
28 Mar 2023
On the Creativity of Large Language Models
Giorgio Franceschelli
Mirco Musolesi
72
51
0
27 Mar 2023
Koala: An Index for Quantifying Overlaps with Pre-training Corpora
Thuy-Trang Vu
Xuanli He
Gholamreza Haffari
Ehsan Shareghi
CLL
21
12
0
26 Mar 2023
The Shaky Foundations of Clinical Foundation Models: A Survey of Large Language Models and Foundation Models for EMRs
Michael Wornow
Yizhe Xu
Rahul Thapa
Birju S. Patel
E. Steinberg
Scott L. Fleming
M. Pfeffer
Jason Alan Fries
N. Shah
LM&MA
23
32
0
22 Mar 2023
A Comprehensive Survey of AI-Generated Content (AIGC): A History of Generative AI from GAN to ChatGPT
Yihan Cao
Siyu Li
Yixin Liu
Zhiling Yan
Yutong Dai
Philip S. Yu
Lichao Sun
29
506
0
07 Mar 2023
Tight Auditing of Differentially Private Machine Learning
Milad Nasr
Jamie Hayes
Thomas Steinke
Borja Balle
Florian Tramèr
Matthew Jagielski
Nicholas Carlini
Andreas Terzis
FedML
32
52
0
15 Feb 2023
Extracting Training Data from Diffusion Models
Nicholas Carlini
Jamie Hayes
Milad Nasr
Matthew Jagielski
Vikash Sehwag
Florian Tramèr
Borja Balle
Daphne Ippolito
Eric Wallace
DiffM
63
569
0
30 Jan 2023
Generalization on the Unseen, Logic Reasoning and Degree Curriculum
Emmanuel Abbe
Samy Bengio
Aryo Lotfi
Kevin Rizk
LRM
39
48
0
30 Jan 2023
SingSong: Generating musical accompaniments from singing
Chris Donahue
Antoine Caillon
Adam Roberts
Ethan Manilow
P. Esling
...
Mauro Verzetti
Ian Simon
Olivier Pietquin
Neil Zeghidour
Jesse Engel
32
52
0
30 Jan 2023