
Ethicist: Targeted Training Data Extraction Through Loss Smoothed Soft Prompting and Calibrated Confidence Estimation
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
10 July 2023
Zhexin Zhang, Jiaxin Wen, Shiyu Huang
arXiv: 2307.04401

Papers citing "Ethicist: Targeted Training Data Extraction Through Loss Smoothed Soft Prompting and Calibrated Confidence Estimation"

24 citing papers shown.

CoSPED: Consistent Soft Prompt Targeted Data Extraction and Defense
Yang Zhuochen, Fok Kar Wai, Thing Vrizlynn
AAML, SILM
13 Oct 2025

Current State in Privacy-Preserving Text Preprocessing for Domain-Agnostic NLP
Abhirup Sinha, Pritilata Saha, Tithi Saha
AILaw
05 Aug 2025

SoK: The Privacy Paradox of Large Language Models: Advancements, Privacy Risks, and Mitigation
ACM Asia Conference on Computer and Communications Security (AsiaCCS), 2025
Yashothara Shanmugarasa, Ming Ding, M. Chamikara, Thierry Rakotoarivelo
PILM, AILaw
15 Jun 2025

Be Careful When Fine-tuning On Open-Source LLMs: Your Fine-tuning Data Could Be Secretly Stolen!
Zhexin Zhang, Yuhao Sun, Junxiao Yang, Shiyao Cui, Hongning Wang, Shiyu Huang, Minlie Huang
AAML
21 May 2025

How Should We Enhance the Safety of Large Reasoning Models: An Empirical Study
Zhexin Zhang, Xian Qi Loye, Victor Shea-Jay Huang, Junxiao Yang, Qi Zhu, ..., Fei Mi, Lifeng Shang, Yingkang Wang, Hongning Wang, Shiyu Huang
ReLM, LRM
21 May 2025

AISafetyLab: A Comprehensive Framework for AI Safety Evaluation and Improvement
Zhexin Zhang, Leqi Lei, Junxiao Yang, Xijie Huang, Yida Lu, ..., Xianqi Lei, Changzai Pan, Lei Sha, Han Wang, Shiyu Huang
AAML
24 Feb 2025

R.R.: Unveiling LLM Training Privacy through Recollection and Ranking
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Wenlong Meng, Zhenyuan Guo, Lenan Wu, Chen Gong, Wenyan Liu, Weixian Li, Chengkun Wei, Wenzhi Chen
PILM
18 Feb 2025

Enhancing Safety in Reinforcement Learning with Human Feedback via Rectified Policy Optimization
Xiyue Peng, Hengquan Guo, Jiawei Zhang, Dongqing Zou, Ziyu Shao, Honghao Wei, Xin Liu
25 Oct 2024

MIBench: A Comprehensive Framework for Benchmarking Model Inversion Attack and Defense
Yixiang Qiu, Hongyao Yu, Hao Fang, Wenbo Yu, Bin Chen, Shu-Tao Xia, Ke Xu
AAML
07 Oct 2024

Undesirable Memorization in Large Language Models: A Survey
Ali Satvaty, Suzan Verberne, Fatih Turkmen
ELM, PILM
03 Oct 2024

PII-Compass: Guiding LLM training data extraction prompts towards the target PII via grounding
Krishna Kanth Nakka, Ahmed Frikha, Ricardo Mendes, Xue Jiang, Xuebing Zhou
03 Jul 2024

Unique Security and Privacy Threats of Large Language Models: A Comprehensive Survey
Shang Wang, Tianqing Zhu, B. Liu, Ming Ding, Dayong Ye, Wanlei Zhou
PILM
12 Jun 2024

Deconstructing The Ethics of Large Language Models from Long-standing Issues to New-emerging Dilemmas
Chengyuan Deng, Yiqun Duan, Xin Jin, Heng Chang, Yijun Tian, ..., Kuofeng Gao, Sihong He, Jun Zhuang, Lu Cheng, Haohan Wang
AILaw
08 Jun 2024

Exploring the Privacy Protection Capabilities of Chinese Large Language Models
Yuqi Yang, Xiaowen Huang, Jitao Sang
ELM, PILM, AILaw
27 Mar 2024

Concerned with Data Contamination? Assessing Countermeasures in Code Language Model
Jialun Cao, Wuqi Zhang, Shing-Chi Cheung
25 Mar 2024

ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors
Zhexin Zhang, Yida Lu, Jingyuan Ma, Di Zhang, Rui Li, ..., Hao Sun, Lei Sha, Zhifang Sui, Hongning Wang, Shiyu Huang
26 Feb 2024

A Survey on Large Language Model (LLM) Security and Privacy: The Good, the Bad, and the Ugly
High-Confidence Computing (HC), 2023
Yifan Yao, Jinhao Duan, Kaidi Xu, Yuanfang Cai, Eric Sun, Yue Zhang
PILM, ELM
04 Dec 2023

Unveiling the Implicit Toxicity in Large Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Jiaxin Wen, Pei Ke, Hao Sun, Zhexin Zhang, Chengfei Li, Jinfeng Bai, Shiyu Huang
29 Nov 2023

Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Zhexin Zhang, Junxiao Yang, Pei Ke, Fei Mi, Hongning Wang, Shiyu Huang
AAML
15 Nov 2023

The Janus Interface: How Fine-Tuning in Large Language Models Amplifies the Privacy Risks
Conference on Computer and Communications Security (CCS), 2023
Xiaoyi Chen, Siyuan Tang, Rui Zhu, Shijun Yan, Lei Jin, Zihao Wang, Liya Su, Zhikun Zhang, Luyi Xing, Haixu Tang
AAML, PILM
24 Oct 2023

Privacy in Large Language Models: Attacks, Defenses and Future Directions
Haoran Li, Yulin Chen, Jinglong Luo, Weijing Chen, Xiaojin Zhang, Qi Hu, Chunkit Chan, Yangqiu Song
PILM
16 Oct 2023

Identifying and Mitigating Privacy Risks Stemming from Language Models: A Survey
Victoria Smith, Ali Shahin Shamsabadi, Carolyn Ashurst, Adrian Weller
PILM
27 Sep 2023

SafetyBench: Evaluating the Safety of Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Zhexin Zhang, Leqi Lei, Lindong Wu, Rui Sun, Yongkang Huang, Chong Long, Xiao Liu, Xuanyu Lei, Jie Tang, Shiyu Huang
LRM, LM&MA, ELM
13 Sep 2023

What Neural Networks Memorize and Why: Discovering the Long Tail via Influence Estimation
Neural Information Processing Systems (NeurIPS), 2020
Vitaly Feldman, Chiyuan Zhang
TDI
09 Aug 2020