arXiv: 2305.17826
Cited By
NOTABLE: Transferable Backdoor Attacks Against Prompt-based NLP Models
28 May 2023
Kai Mei
Zheng Li
Zhenting Wang
Yang Zhang
Shiqing Ma
AAML
SILM
Papers citing "NOTABLE: Transferable Backdoor Attacks Against Prompt-based NLP Models"
39 / 39 papers shown
NLP Security and Ethics, in the Wild
Heather Lent
Erick Galinkin
Yiyi Chen
Jens Myrup Pedersen
Leon Derczynski
Johannes Bjerva
SILM
42
0
0
09 Apr 2025
RAG-Thief: Scalable Extraction of Private Data from Retrieval-Augmented Generation Applications with Agent-based Attacks
Changyue Jiang
Xudong Pan
Geng Hong
Chenfu Bao
Min Yang
SILM
72
7
0
21 Nov 2024
Data-centric NLP Backdoor Defense from the Lens of Memorization
Zhenting Wang
Zhizhi Wang
Mingyu Jin
Mengnan Du
Juan Zhai
Shiqing Ma
29
3
0
21 Sep 2024
The Dark Side of Human Feedback: Poisoning Large Language Models via User Inputs
Bocheng Chen
Hanqing Guo
Guangjing Wang
Yuanda Wang
Qiben Yan
AAML
37
4
0
01 Sep 2024
MEGen: Generative Backdoor in Large Language Models via Model Editing
Jiyang Qiu
Xinbei Ma
Zhuosheng Zhang
Hai Zhao
AAML
KELM
SILM
23
3
0
20 Aug 2024
Can LLMs be Fooled? Investigating Vulnerabilities in LLMs
Sara Abdali
Jia He
C. Barberan
Richard Anarfi
29
7
0
30 Jul 2024
SeqMIA: Sequential-Metric Based Membership Inference Attack
Hao Li
Zheng Li
Siyuan Wu
Chengrui Hu
Yutong Ye
Min Zhang
Dengguo Feng
Yang Zhang
25
3
0
21 Jul 2024
Uncertainty is Fragile: Manipulating Uncertainty in Large Language Models
Qingcheng Zeng
Mingyu Jin
Qinkai Yu
Zhenting Wang
Wenyue Hua
...
Felix Juefei-Xu
Kaize Ding
Fan Yang
Ruixiang Tang
Yongfeng Zhang
AAML
31
9
0
15 Jul 2024
Unique Security and Privacy Threats of Large Language Model: A Comprehensive Survey
Shang Wang
Tianqing Zhu
Bo Liu
Ming Ding
Xu Guo
Dayong Ye
Wanlei Zhou
Philip S. Yu
PILM
52
16
0
12 Jun 2024
Preemptive Answer "Attacks" on Chain-of-Thought Reasoning
Rongwu Xu
Zehan Qi
Wei Xu
LRM
SILM
48
6
0
31 May 2024
Cross-Context Backdoor Attacks against Graph Prompt Learning
Xiaoting Lyu
Yufei Han
Wei Wang
Hangwei Qian
Ivor Tsang
Xiangliang Zhang
SILM
AAML
33
14
0
28 May 2024
Invisible Backdoor Attack against Self-supervised Learning
Hanrong Zhang
Zhenting Wang
Tingxu Han
Mingyu Jin
Chenlu Zhan
Mengnan Du
Hongwei Wang
Shiqing Ma
AAML
SSL
38
2
0
23 May 2024
BadActs: A Universal Backdoor Defense in the Activation Space
Biao Yi
Sishuo Chen
Yiming Li
Tong Li
Baolei Zhang
Zheli Liu
AAML
22
5
0
18 May 2024
Advances and Open Challenges in Federated Learning with Foundation Models
Chao Ren
Han Yu
Hongyi Peng
Xiaoli Tang
Anran Li
...
A. Tan
Bo Zhao
Xiaoxiao Li
Zengxiang Li
Qiang Yang
FedML
AIFin
AI4CE
68
4
0
23 Apr 2024
Dialectical Alignment: Resolving the Tension of 3H and Security Threats of LLMs
Shu Yang
Jiayuan Su
Han Jiang
Mengdi Li
Keyuan Cheng
Muhammad Asif Ali
Lijie Hu
Di Wang
16
5
0
30 Mar 2024
Shortcuts Arising from Contrast: Effective and Covert Clean-Label Attacks in Prompt-Based Learning
Xiaopeng Xie
Ming Yan
Xiwen Zhou
Chenlong Zhao
Suli Wang
Yong Zhang
Joey Tianyi Zhou
AAML
25
0
0
30 Mar 2024
Securing Large Language Models: Threats, Vulnerabilities and Responsible Practices
Sara Abdali
Richard Anarfi
C. Barberan
Jia He
PILM
65
22
0
19 Mar 2024
BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models
Zhen Xiang
Fengqing Jiang
Zidi Xiong
Bhaskar Ramasubramanian
Radha Poovendran
Bo Li
LRM
SILM
24
38
0
20 Jan 2024
TrojFSP: Trojan Insertion in Few-shot Prompt Tuning
Meng Zheng
Jiaqi Xue
Xun Chen
Yanshan Wang
Qian Lou
Lei Jiang
AAML
17
6
0
16 Dec 2023
The Philosopher's Stone: Trojaning Plugins of Large Language Models
Tian Dong
Minhui Xue
Guoxing Chen
Rayne Holland
Shaofeng Li
Yan Meng
Zhen Liu
Haojin Zhu
AAML
13
9
0
01 Dec 2023
Grounding Foundation Models through Federated Transfer Learning: A General Framework
Yan Kang
Tao Fan
Hanlin Gu
Xiaojin Zhang
Lixin Fan
Qiang Yang
AI4CE
68
19
0
29 Nov 2023
Beyond Boundaries: A Comprehensive Survey of Transferable Attacks on AI Systems
Guangjing Wang
Ce Zhou
Yuanda Wang
Bocheng Chen
Hanqing Guo
Qiben Yan
AAML
SILM
48
3
0
20 Nov 2023
PoisonPrompt: Backdoor Attack on Prompt-based Large Language Models
Hongwei Yao
Jian Lou
Zhan Qin
SILM
AAML
49
30
0
19 Oct 2023
Privacy in Large Language Models: Attacks, Defenses and Future Directions
Haoran Li
Yulin Chen
Jinglong Luo
Yan Kang
Xiaojin Zhang
Qi Hu
Chunkit Chan
Yangqiu Song
PILM
38
39
0
16 Oct 2023
PETA: Parameter-Efficient Trojan Attacks
Lauren Hong
Ting Wang
AAML
28
1
0
01 Oct 2023
GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts
Jiahao Yu
Xingwei Lin
Zheng Yu
Xinyu Xing
SILM
110
292
0
19 Sep 2023
Backdoor Attacks and Countermeasures in Natural Language Processing Models: A Comprehensive Security Review
Pengzhou Cheng
Zongru Wu
Wei Du
Haodong Zhao
Wei Lu
Gongshen Liu
SILM
AAML
18
16
0
12 Sep 2023
MasterKey: Automated Jailbreak Across Multiple Large Language Model Chatbots
Gelei Deng
Yi Liu
Yuekang Li
Kailong Wang
Ying Zhang
Zefeng Li
Haoyu Wang
Tianwei Zhang
Yang Liu
SILM
28
118
0
16 Jul 2023
Prompt Injection attack against LLM-integrated Applications
Yi Liu
Gelei Deng
Yuekang Li
Kailong Wang
Zihao Wang
...
Tianwei Zhang
Yepang Liu
Haoyu Wang
Yanhong Zheng
Yang Liu
SILM
10
310
0
08 Jun 2023
Alteration-free and Model-agnostic Origin Attribution of Generated Images
Zhenting Wang
Chen Chen
Yi Zeng
Lingjuan Lyu
Shiqing Ma
8
5
0
29 May 2023
BppAttack: Stealthy and Efficient Trojan Attacks against Deep Neural Networks via Image Quantization and Contrastive Adversarial Learning
Zhenting Wang
Juan Zhai
Shiqing Ma
AAML
116
97
0
26 May 2022
Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh
Albert Webson
Colin Raffel
Stephen H. Bach
Lintang Sutawika
...
T. Bers
Stella Biderman
Leo Gao
Thomas Wolf
Alexander M. Rush
LRM
203
1,651
0
15 Oct 2021
P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks
Xiao Liu
Kaixuan Ji
Yicheng Fu
Weng Lam Tam
Zhengxiao Du
Zhilin Yang
Jie Tang
VLM
236
780
0
14 Oct 2021
Mind the Style of Text! Adversarial and Backdoor Attacks Based on Text Style Transfer
Fanchao Qi
Yangyi Chen
Xurui Zhang
Mukai Li
Zhiyuan Liu
Maosong Sun
AAML
SILM
77
171
0
14 Oct 2021
Making Pre-trained Language Models Better Few-shot Learners
Tianyu Gao
Adam Fisch
Danqi Chen
241
1,898
0
31 Dec 2020
Extracting Training Data from Large Language Models
Nicholas Carlini
Florian Tramèr
Eric Wallace
Matthew Jagielski
Ariel Herbert-Voss
...
Tom B. Brown
D. Song
Úlfar Erlingsson
Alina Oprea
Colin Raffel
MLAU
SILM
267
1,798
0
14 Dec 2020
Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference
Timo Schick
Hinrich Schütze
258
1,584
0
21 Jan 2020
FreeLB: Enhanced Adversarial Training for Natural Language Understanding
Chen Zhu
Yu Cheng
Zhe Gan
S. Sun
Tom Goldstein
Jingjing Liu
AAML
211
430
0
25 Sep 2019
Language Models as Knowledge Bases?
Fabio Petroni
Tim Rocktäschel
Patrick Lewis
A. Bakhtin
Yuxiang Wu
Alexander H. Miller
Sebastian Riedel
KELM
AI4MH
398
2,576
0
03 Sep 2019