Unique Security and Privacy Threats of Large Language Model: A Comprehensive Survey

12 June 2024
Shang Wang
Tianqing Zhu
Bo Liu
Ming Ding
Xu Guo
Dayong Ye
Wanlei Zhou
Philip S. Yu
PILM
arXiv: 2406.07973

Papers citing "Unique Security and Privacy Threats of Large Language Model: A Comprehensive Survey"

29 / 29 papers shown

LLM Security: Vulnerabilities, Attacks, Defenses, and Countermeasures
Francisco Aguilera-Martínez, Fernando Berzal (PILM), 02 May 2025

VeriDebug: A Unified LLM for Verilog Debugging via Contrastive Embedding and Guided Correction
N. Wang, Bingkun Yao, Jie Zhou, Yuchen Hu, Xi Wang, Nan Guan, Zhe Jiang, 27 Apr 2025

SURGE: On the Potential of Large Language Models as General-Purpose Surrogate Code Executors
Bohan Lyu, Siqiao Huang, Zichen Liang, Qi-An Sun, Jiaming Zhang (ELM, LRM), 16 Feb 2025

Typos that Broke the RAG's Back: Genetic Attack on RAG Pipeline by Simulating Documents in the Wild via Low-level Perturbations
Sukmin Cho, Soyeong Jeong, Jeongyeon Seo, Taeho Hwang, Jong C. Park (SILM, AAML), 22 Apr 2024

BadEdit: Backdooring large language models by model editing
Yanzhou Li, Tianlin Li, Kangjie Chen, Jian Zhang, Shangqing Liu, Wenhan Wang, Tianwei Zhang, Yang Liu (SyDa, AAML, KELM), 20 Mar 2024

Securing Large Language Models: Threats, Vulnerabilities and Responsible Practices
Sara Abdali, Richard Anarfi, C. Barberan, Jia He (PILM), 19 Mar 2024

On Protecting the Data Privacy of Large Language Models (LLMs): A Survey
Biwei Yan, Kun Li, Minghui Xu, Yueyan Dong, Yue Zhang, Zhaochun Ren, Xiuzhen Cheng (AILaw, PILM), 08 Mar 2024

Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey
Zhichen Dong, Zhanhui Zhou, Chao Yang, Jing Shao, Yu Qiao (ELM), 14 Feb 2024

Risk Taxonomy, Mitigation, and Assessment Benchmarks of Large Language Model Systems
Tianyu Cui, Yanling Wang, Chuanpu Fu, Yong Xiao, Sijia Li, ..., Junwu Xiong, Xinyu Kong, Zujie Wen, Ke Xu, Qi Li, 11 Jan 2024

Large Language Models Can Be Good Privacy Protection Learners
Yijia Xiao, Yiqiao Jin, Yushi Bai, Yue Wu, Xianjun Yang, ..., Xujiang Zhao, Yanchi Liu, Haifeng Chen, Wei Wang, Wei Cheng (PILM), 03 Oct 2023

Who's Harry Potter? Approximate Unlearning in LLMs
Ronen Eldan, M. Russinovich (MU, MoMe), 03 Oct 2023

Sentence Embedding Leaks More Information than You Expect: Generative Embedding Inversion Attack to Recover the Whole Sentence
Haoran Li, Mingshi Xu, Yangqiu Song, 04 May 2023

Robust Multi-bit Natural Language Watermarking through Invariant Features
Kiyoon Yoo, Wonhyuk Ahn, Jiho Jang, Nojun Kwak (WaLM), 03 May 2023

Prompt as Triggers for Backdoor Attack: Examining the Vulnerability in Language Models
Shuai Zhao, Jinming Wen, Anh Tuan Luu, J. Zhao, Jie Fu (SILM), 02 May 2023

Poisoning Language Models During Instruction Tuning
Alexander Wan, Eric Wallace, Sheng Shen, Dan Klein (SILM), 01 May 2023

ChatGPT as an Attack Tool: Stealthy Textual Backdoor Attack via Blackbox Generative Model Trigger
Jiazhao Li, Yijin Yang, Zhuofeng Wu, V. Vydiswaran, Chaowei Xiao (SILM), 27 Apr 2023

Stealing the Decoding Algorithms of Language Models
A. Naseh, Kalpesh Krishna, Mohit Iyyer, Amir Houmansadr (MLAU), 08 Mar 2023

Improving alignment of dialogue agents via targeted human judgements
Amelia Glaese, Nat McAleese, Maja Trębacz, John Aslanides, Vlad Firoiu, ..., John F. J. Mellor, Demis Hassabis, Koray Kavukcuoglu, Lisa Anne Hendricks, G. Irving (ALM, AAML), 28 Sep 2022

Defending Against Stealthy Backdoor Attacks
Sangeet Sagar, Abhinav Bhatt, Abhijith Srinivas Bidaralli (AAML), 27 May 2022

You Don't Know My Favorite Color: Preventing Dialogue Representations from Revealing Speakers' Private Personas
Haoran Li, Yangqiu Song, Lixin Fan, 26 Apr 2022

Protecting Intellectual Property of Language Generation APIs with Lexical Watermark
Xuanli He, Qiongkai Xu, Lingjuan Lyu, Fangzhao Wu, Chenguang Wang (WaLM), 05 Dec 2021

P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks
Xiao Liu, Kaixuan Ji, Yicheng Fu, Weng Lam Tam, Zhengxiao Du, Zhilin Yang, Jie Tang (VLM), 14 Oct 2021

Mind the Style of Text! Adversarial and Backdoor Attacks Based on Text Style Transfer
Fanchao Qi, Yangyi Chen, Xurui Zhang, Mukai Li, Zhiyuan Liu, Maosong Sun (AAML, SILM), 14 Oct 2021

Text Detoxification using Large Pre-trained Neural Models
David Dale, Anton Voronov, Daryna Dementieva, V. Logacheva, Olga Kozlova, Nikita Semenov, Alexander Panchenko, 18 Sep 2021

Challenges in Detoxifying Language Models
Johannes Welbl, Amelia Glaese, J. Uesato, Sumanth Dathathri, John F. J. Mellor, Lisa Anne Hendricks, Kirsty Anderson, Pushmeet Kohli, Ben Coppin, Po-Sen Huang (LM&MA), 15 Sep 2021

The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester, Rami Al-Rfou, Noah Constant (VPVLM), 18 Apr 2021

Gradient-based Adversarial Attacks against Text Transformers
Chuan Guo, Alexandre Sablayrolles, Hervé Jégou, Douwe Kiela (SILM), 15 Apr 2021

Extracting Training Data from Large Language Models
Nicholas Carlini, Florian Tramèr, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, ..., Tom B. Brown, D. Song, Ulfar Erlingsson, Alina Oprea, Colin Raffel (MLAU, SILM), 14 Dec 2020

Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler, Nisan Stiennon, Jeff Wu, Tom B. Brown, Alec Radford, Dario Amodei, Paul Christiano, G. Irving (ALM), 18 Sep 2019