ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2404.02936
  4. Cited By
Min-K%++: Improved Baseline for Detecting Pre-Training Data from Large Language Models

Min-K%++: Improved Baseline for Detecting Pre-Training Data from Large Language Models

3 April 2024
Jingyang Zhang
Jingwei Sun
Eric C. Yeats
Ouyang Yang
Martin Kuo
Jianyi Zhang
Hao Frank Yang
Hai Li
ArXivPDFHTML

Papers citing "Min-K%++: Improved Baseline for Detecting Pre-Training Data from Large Language Models"

30 / 30 papers shown
Title
Automatic Calibration for Membership Inference Attack on Large Language Models
Automatic Calibration for Membership Inference Attack on Large Language Models
Saleh Zare Zade
Yao Qiang
Xiangyu Zhou
Hui Zhu
Mohammad Amin Roshani
Prashant Khanduri
Dongxiao Zhu
32
1
0
06 May 2025
Beyond Public Access in LLM Pre-Training Data
Beyond Public Access in LLM Pre-Training Data
Sruly Rosenblat
Tim O'Reilly
Ilan Strauss
MLAU
55
0
0
24 Apr 2025
STAMP Your Content: Proving Dataset Membership via Watermarked Rephrasings
STAMP Your Content: Proving Dataset Membership via Watermarked Rephrasings
Saksham Rastogi
Pratyush Maini
Danish Pruthi
42
0
0
18 Apr 2025
Evidencing Unauthorized Training Data from AI Generated Content using Information Isotopes
Evidencing Unauthorized Training Data from AI Generated Content using Information Isotopes
Qi Tao
Yin Jinhua
Cai Dongqi
Xie Yueqi
Wang Huili
...
Zhou Zhili
Wang Shangguang
Lyu Lingjuan
Huang Yongfeng
Lane Nicholas
35
0
0
24 Mar 2025
Learning on LLM Output Signatures for gray-box LLM Behavior Analysis
Learning on LLM Output Signatures for gray-box LLM Behavior Analysis
Guy Bar-Shalom
Fabrizio Frasca
Derek Lim
Yoav Gelberg
Yftah Ziser
Ran El-Yaniv
Gal Chechik
Haggai Maron
62
0
0
18 Mar 2025
Towards Label-Only Membership Inference Attack against Pre-trained Large Language Models
Towards Label-Only Membership Inference Attack against Pre-trained Large Language Models
Yu He
Boheng Li
L. Liu
Zhongjie Ba
Wei Dong
Yiming Li
Z. Qin
Kui Ren
C. L. P. Chen
MIALM
69
0
0
26 Feb 2025
Synthetic Data Can Mislead Evaluations: Membership Inference as Machine Text Detection
Synthetic Data Can Mislead Evaluations: Membership Inference as Machine Text Detection
Ali Naseh
Niloofar Mireshghallah
51
0
0
20 Jan 2025
Synthetic Data Privacy Metrics
Synthetic Data Privacy Metrics
Amy Steier
Lipika Ramaswamy
Andre Manoel
Alexa Haushalter
41
0
0
08 Jan 2025
INCLUDE: Evaluating Multilingual Language Understanding with Regional
  Knowledge
INCLUDE: Evaluating Multilingual Language Understanding with Regional Knowledge
Angelika Romanou
Negar Foroutan
Anna Sotnikova
Zeming Chen
Sree Harsha Nelaturu
...
Mike Zhang
Imanol Schlag
Marzieh Fadaee
Sara Hooker
Antoine Bosselut
ELM
105
6
0
29 Nov 2024
TDDBench: A Benchmark for Training data detection
TDDBench: A Benchmark for Training data detection
Zhihao Zhu
Yi Yang
Defu Lian
44
0
0
05 Nov 2024
Beware of Calibration Data for Pruning Large Language Models
Beware of Calibration Data for Pruning Large Language Models
Yixin Ji
Yang Xiang
Juntao Li
Qingrong Xia
Ping Li
Xinyu Duan
Zhefeng Wang
Min Zhang
34
2
0
23 Oct 2024
Detecting Training Data of Large Language Models via Expectation Maximization
Detecting Training Data of Large Language Models via Expectation Maximization
Gyuwan Kim
Yang Li
Evangelia Spiliopoulou
Jie Ma
Miguel Ballesteros
William Yang Wang
MIALM
90
3
2
10 Oct 2024
Clean Evaluations on Contaminated Visual Language Models
Clean Evaluations on Contaminated Visual Language Models
Hongyuan Lu
Shujie Miao
Wai Lam
MLLM
33
0
0
09 Oct 2024
Fine-tuning can Help Detect Pretraining Data from Large Language Models
Fine-tuning can Help Detect Pretraining Data from Large Language Models
H. Zhang
Songxin Zhang
Bingyi Jing
Hongxin Wei
34
0
0
09 Oct 2024
Position: LLM Unlearning Benchmarks are Weak Measures of Progress
Position: LLM Unlearning Benchmarks are Weak Measures of Progress
Pratiksha Thaker
Shengyuan Hu
Neil Kale
Yash Maurya
Zhiwei Steven Wu
Virginia Smith
MU
45
10
0
03 Oct 2024
Membership Inference Attacks Cannot Prove that a Model Was Trained On Your Data
Membership Inference Attacks Cannot Prove that a Model Was Trained On Your Data
Jie Zhang
Debeshee Das
Gautam Kamath
Florian Tramèr
MIALM
MIACV
223
16
1
29 Sep 2024
Pretraining Data Detection for Large Language Models: A Divergence-based Calibration Method
Pretraining Data Detection for Large Language Models: A Divergence-based Calibration Method
Weichao Zhang
Ruqing Zhang
Jiafeng Guo
Maarten de Rijke
Yixing Fan
Xueqi Cheng
28
7
0
23 Sep 2024
Context-Aware Membership Inference Attacks against Pre-trained Large
  Language Models
Context-Aware Membership Inference Attacks against Pre-trained Large Language Models
Hongyan Chang
Ali Shahin Shamsabadi
Kleomenis Katevas
Hamed Haddadi
Reza Shokri
MIALM
55
6
0
11 Sep 2024
Con-ReCall: Detecting Pre-training Data in LLMs via Contrastive Decoding
Con-ReCall: Detecting Pre-training Data in LLMs via Contrastive Decoding
Cheng Wang
Yiwei Wang
Bryan Hooi
Yujun Cai
Nanyun Peng
Kai-Wei Chang
37
2
0
05 Sep 2024
Exposing Privacy Gaps: Membership Inference Attack on Preference Data for LLM Alignment
Exposing Privacy Gaps: Membership Inference Attack on Preference Data for LLM Alignment
Qizhang Feng
Siva Rajesh Kasa
Santhosh Kumar Kasa
Hyokun Yun
C. Teo
S. Bodapati
84
6
0
08 Jul 2024
ObfuscaTune: Obfuscated Offsite Fine-tuning and Inference of Proprietary LLMs on Private Datasets
ObfuscaTune: Obfuscated Offsite Fine-tuning and Inference of Proprietary LLMs on Private Datasets
Ahmed Frikha
Nassim Walha
Ricardo Mendes
K. K. Nakka
Xue Jiang
Xuebing Zhou
66
2
0
03 Jul 2024
ReCaLL: Membership Inference via Relative Conditional Log-Likelihoods
ReCaLL: Membership Inference via Relative Conditional Log-Likelihoods
Roy Xie
Junlin Wang
Ruomin Huang
Minxing Zhang
Rong Ge
Jian Pei
Neil Zhenqiang Gong
Bhuwan Dhingra
MIALM
40
11
0
23 Jun 2024
Blind Baselines Beat Membership Inference Attacks for Foundation Models
Blind Baselines Beat Membership Inference Attacks for Foundation Models
Debeshee Das
Jie Zhang
Florian Tramèr
MIALM
72
28
1
23 Jun 2024
RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language
  Models
RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models
Zhuoran Jin
Pengfei Cao
Chenhao Wang
Zhitao He
Hongbang Yuan
Jiachun Li
Yubo Chen
Kang Liu
Jun Zhao
KELM
MU
37
12
0
16 Jun 2024
Semantic Membership Inference Attack against Large Language Models
Semantic Membership Inference Attack against Large Language Models
Hamid Mozaffari
Virendra J. Marathe
MIALM
45
3
0
14 Jun 2024
Is My Data in Your Retrieval Database? Membership Inference Attacks Against Retrieval Augmented Generation
Is My Data in Your Retrieval Database? Membership Inference Attacks Against Retrieval Augmented Generation
Maya Anderson
Guy Amit
Abigail Goldsteen
AAML
37
13
0
30 May 2024
Pandora's White-Box: Precise Training Data Detection and Extraction in
  Large Language Models
Pandora's White-Box: Precise Training Data Detection and Extraction in Large Language Models
Jeffrey G. Wang
Jason Wang
Marvin Li
Seth Neel
MIALM
46
0
0
26 Feb 2024
Rethinking Machine Unlearning for Large Language Models
Rethinking Machine Unlearning for Large Language Models
Sijia Liu
Yuanshun Yao
Jinghan Jia
Stephen Casper
Nathalie Baracaldo
...
Hang Li
Kush R. Varshney
Mohit Bansal
Sanmi Koyejo
Yang Liu
AILaw
MU
65
81
0
13 Feb 2024
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
248
1,986
0
31 Dec 2020
Extracting Training Data from Large Language Models
Extracting Training Data from Large Language Models
Nicholas Carlini
Florian Tramèr
Eric Wallace
Matthew Jagielski
Ariel Herbert-Voss
...
Tom B. Brown
D. Song
Ulfar Erlingsson
Alina Oprea
Colin Raffel
MLAU
SILM
267
1,808
0
14 Dec 2020
1