BadPre: Task-agnostic Backdoor Attacks to Pre-trained NLP Foundation Models
Kangjie Chen, Yuxian Meng, Xiaofei Sun, Shangwei Guo, Tianwei Zhang, Jiwei Li, Chun Fan
arXiv:2110.02467, 6 October 2021. (SILM)

Papers citing "BadPre: Task-agnostic Backdoor Attacks to Pre-trained NLP Foundation Models" (showing 50 of 76)

SteganoBackdoor: Stealthy and Data-Efficient Backdoor Attacks on Language Models
Eric Xue, Ruiyi Zhang, Zijun Zhang. 18 Nov 2025. (AAML)

Unmasking Backdoors: An Explainable Defense via Gradient-Attention Anomaly Scoring for Pre-trained Language Models
Anindya Sundar Das, Kangjie Chen, M. Bhuyan. 05 Oct 2025. (SILM, AAML)

Trigger Where It Hurts: Unveiling Hidden Backdoors through Sensitivity with Sensitron
Gejian Zhao, Hanzhou Wu, Xinpeng Zhang. 23 Sep 2025.

Backdoor Samples Detection Based on Perturbation Discrepancy Consistency in Pre-trained Language Models
Zuquan Peng, Jianming Fu, Lixin Zou, Li Zheng, Yanzhen Ren, Guojun Peng. Neural Networks (NN), 2025. 30 Aug 2025. (AAML)

Pruning Strategies for Backdoor Defense in LLMs
Santosh Chapagain, S. M. Hamdi, S. F. Boubrahimi. 27 Aug 2025. (AAML)

A Systematic Review of Poisoning Attacks Against Large Language Models
Neil Fendley, Edward W. Staley, Joshua Carney, William Redman, Marie Chau, Nathan G. Drenkow. 06 Jun 2025. (AAML, PILM)

The Ripple Effect: On Unforeseen Complications of Backdoor Attacks
Rui Zhang, Yun Shen, Hongwei Li, Wenbo Jiang, Hanxiao Chen, Yuan Zhang, Guowen Xu, Yang Zhang. 16 May 2025. (SILM, AAML)

BadLingual: A Novel Lingual-Backdoor Attack against Large Language Models
Liang Luo, Hongwei Li, Rui Zhang, Wenbo Jiang, Kangjie Chen, Tianwei Zhang, Qingchuan Zhao, Guowen Xu. 06 May 2025. (AAML)

GaussTrap: Stealthy Poisoning Attacks on 3D Gaussian Splatting for Targeted Scene Confusion
Jiaxin Hong, Sixu Chen, Shuoyang Sun, Hongyao Yu, Hao Fang, Yuqi Tan, Bin Chen, Shuhan Qi, Jiawei Li. 29 Apr 2025. (3DGS, AAML)

The Ultimate Cookbook for Invisible Poison: Crafting Subtle Clean-Label Text Backdoors with Style Attributes
Wencong You, Daniel Lowd. 24 Apr 2025.

SSD: A State-based Stealthy Backdoor Attack For Navigation System in UAV Route Planning
Liang Luo, Yang Li, J.N. Zhang, Xingshuo Han, Kangbo Liu, Lyu Yang, Yuan Zhou, Tianwei Zhang, Quan Pan. 27 Feb 2025. (AAML)

Quantized Delta Weight Is Safety Keeper
Yule Liu, Zhen Sun, Xinlei He, Xinyi Huang. 29 Nov 2024.

New Emerged Security and Privacy of Pre-trained Model: a Survey and Outlook
Meng Yang, Tianqing Zhu, Chi Liu, Wanlei Zhou, Shui Yu, Philip S. Yu. 12 Nov 2024. (AAML, ELM, PILM)

CAT: Concept-level backdoor ATtacks for Concept Bottleneck Models
Keming Wu, Jiayu Yang, Yu Huang, Lijie Hu, Tianlang Xue, Zhangyi Hu, Jiaxu Li, Haicheng Liao, Yutao Yue. 07 Oct 2024.

Obliviate: Neutralizing Task-agnostic Backdoors within the Parameter-efficient Fine-tuning Paradigm
Jaehan Kim, Minkyoo Song, S. Na, Seungwon Shin. North American Chapter of the Association for Computational Linguistics (NAACL), 2024. 21 Sep 2024. (AAML)

The Dark Side of Human Feedback: Poisoning Large Language Models via User Inputs
Bocheng Chen, Hanqing Guo, Guangjing Wang, Yuanda Wang, Qiben Yan. 01 Sep 2024. (AAML)

Rethinking Backdoor Detection Evaluation for Language Models
Jun Yan, Wenjie Jacky Mo, Xiang Ren, Robin Jia. 31 Aug 2024. (ELM)

Turning Generative Models Degenerate: The Power of Data Poisoning Attacks
Shuli Jiang, S. Kadhe, Yi Zhou, Praneet Adusumilli, Ling Cai, Nathalie Baracaldo. 17 Jul 2024. (SILM, AAML)

Hey, That's My Model! Introducing Chain & Hash, An LLM Fingerprinting Technique
M. Russinovich, Ahmed Salem. 15 Jul 2024.

Distributed Backdoor Attacks on Federated Graph Learning and Certified Defenses
Yuxin Yang, Qiang Li, Jinyuan Jia, Yuan Hong, Binghui Wang. 12 Jul 2024. (AAML, FedML)

Defending Code Language Models against Backdoor Attacks with Deceptive Cross-Entropy Loss
Guang Yang, Yu Zhou, Xiang Chen, Xiangyu Zhang, Terry Yue Zhuo, David Lo, Taolue Chen. 12 Jul 2024. (AAML)

Unique Security and Privacy Threats of Large Language Models: A Comprehensive Survey
Shang Wang, Tianqing Zhu, B. Liu, Ming Ding, Dayong Ye, Wanlei Zhou. 12 Jun 2024. (PILM)

An LLM-Assisted Easy-to-Trigger Backdoor Attack on Code Completion Models: Injecting Disguised Vulnerabilities against Strong Detection
Shenao Yan, Shen Wang, Yue Duan, Hanbin Hong, Kiho Lee, Doowon Kim, Yuan Hong. 10 Jun 2024. (AAML, SILM)

BadAgent: Inserting and Activating Backdoor Attacks in LLM Agents
Yifei Wang, Dizhan Xue, Shengjie Zhang, Shengsheng Qian. 05 Jun 2024. (AAML, LLMAG)

Cross-Context Backdoor Attacks against Graph Prompt Learning
Xiaoting Lyu, Yufei Han, Wei Wang, Hangwei Qian, Ivor Tsang, Xiangliang Zhang. 28 May 2024. (SILM, AAML)

TrojFM: Resource-efficient Backdoor Attacks against Very Large Foundation Models
Yuzhou Nie, Yanting Wang, Jinyuan Jia, Michael J. De Lucia, Nathaniel D. Bastian, Wenbo Guo, Dawn Song. 27 May 2024. (SILM, AAML)

SEEP: Training Dynamics Grounds Latent Representation Search for Mitigating Backdoor Poisoning Attacks
Xuanli He, Xingliang Yuan, Jun Wang, Benjamin I. P. Rubinstein, Trevor Cohn. Transactions of the Association for Computational Linguistics (TACL), 2024. 19 May 2024. (AAML)

BadEdit: Backdooring large language models by model editing
Yanzhou Li, Tianlin Li, Kangjie Chen, Jian Zhang, Shangqing Liu, Wenhan Wang, Tianwei Zhang, Yang Liu. 20 Mar 2024. (SyDa, AAML, KELM)

WARDEN: Multi-Directional Backdoor Watermarks for Embedding-as-a-Service Copyright Protection
Anudeex Shetty, Yue Teng, Ke He, Xingliang Yuan. 03 Mar 2024. (WaLM)

Double-I Watermark: Protecting Model Copyright for LLM Fine-tuning
Shen Li, Liuyi Yao, Jinyang Gao, Lan Zhang, Yaliang Li. 22 Feb 2024.

Purifying Large Language Models by Ensembling a Small Language Model
Tianlin Li, Qian Liu, Tianyu Pang, Chao Du, Qing Guo, Yang Liu, Min Lin. 19 Feb 2024.

Test-Time Backdoor Attacks on Multimodal Large Language Models
Dong Lu, Tianyu Pang, Chao Du, Qian Liu, Xianjun Yang, Min Lin. 13 Feb 2024. (AAML)

OrderBkd: Textual backdoor attack through repositioning
Irina Alekseevskaia, Konstantin Arkhipenko. 12 Feb 2024.

Pre-trained Trojan Attacks for Visual Recognition
Aishan Liu, Xinwei Zhang, Yisong Xiao, Yuguang Zhou, Yaning Tan, Jinyang Guo, Xianglong Liu, Xiaochun Cao, Dacheng Tao. 23 Dec 2023. (AAML)

Forcing Generative Models to Degenerate Ones: The Power of Data Poisoning Attacks
Shuli Jiang, S. Kadhe, Yi Zhou, Ling Cai, Nathalie Baracaldo. 07 Dec 2023. (SILM, AAML)

Foundation Models for Weather and Climate Data Understanding: A Comprehensive Survey
Shengchao Chen, Guodong Long, Jing Jiang, Dikai Liu, Chengqi Zhang. 05 Dec 2023. (SyDa, AI4CE)

Grounding Foundation Models through Federated Transfer Learning: A General Framework
Weijing Chen, Tao Fan, Hanlin Gu, Xiaojin Zhang, Lixin Fan, Qiang Yang. ACM Transactions on Intelligent Systems and Technology (ACM TIST), 2023. 29 Nov 2023. (AI4CE)

Beyond Boundaries: A Comprehensive Survey of Transferable Attacks on AI Systems
Guangjing Wang, Ce Zhou, Yuanda Wang, Bocheng Chen, Hanqing Guo, Qiben Yan. 20 Nov 2023. (AAML, SILM)

TextGuard: Provable Defense against Backdoor Attacks on Text Classification
Hengzhi Pei, Jinyuan Jia, Wenbo Guo, Yue Liu, Dawn Song. 19 Nov 2023. (SILM)

Watermarking Vision-Language Pre-trained Models for Multi-modal Embedding as a Service
Yuanmin Tang, Jing Yu, Keke Gai, Xiangyang Qu, Yue Hu, Gang Xiong, Qi Wu. 10 Nov 2023. (AAML, WaLM, VLM)

Last One Standing: A Comparative Analysis of Security and Privacy of Soft Prompt Tuning, LoRA, and In-Context Learning
Rui Wen, Tianhao Wang, Michael Backes, Yang Zhang, Ahmed Salem. 17 Oct 2023. (AAML)

Privacy in Large Language Models: Attacks, Defenses and Future Directions
Haoran Li, Yulin Chen, Jinglong Luo, Weijing Chen, Xiaojin Zhang, Qi Hu, Chunkit Chan, Yangqiu Song. 16 Oct 2023. (PILM)

AFLOW: Developing Adversarial Examples under Extremely Noise-limited Settings
Renyang Liu, Jinhong Zhang, Haoran Li, Jin Zhang, Yuanyu Wang, Wei Zhou. 15 Oct 2023. (AAML)

Composite Backdoor Attacks Against Large Language Models
Hai Huang, Subrat Kishore Dutta, Michael Backes, Yun Shen, Yang Zhang. 11 Oct 2023. (AAML)

Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!
Xiangyu Qi, Yi Zeng, Tinghao Xie, Pin-Yu Chen, Ruoxi Jia, Prateek Mittal, Peter Henderson. International Conference on Learning Representations (ICLR), 2023. 05 Oct 2023. (SILM)

PETA: Parameter-Efficient Trojan Attacks
Lauren Hong, Ting Wang. 01 Oct 2023. (AAML)

Backdoor Attacks and Countermeasures in Natural Language Processing Models: A Comprehensive Security Review
Pengzhou Cheng, Zongru Wu, Wei Du, Haodong Zhao, Wei Lu, Gongshen Liu. IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2023. 12 Sep 2023. (SILM, AAML)

A Comprehensive Overview of Backdoor Attacks in Large Language Models within Communication Networks
Haomiao Yang, Kunlan Xiang, Mengyu Ge, Hongwei Li, Rongxing Lu, Shui Yu. IEEE Network (IEEE Netw.), 2023. 28 Aug 2023. (SILM)

LMSanitator: Defending Prompt-Tuning Against Task-Agnostic Backdoors
Chengkun Wei, Wenlong Meng, Zhikun Zhang, M. Chen, Ming-Hui Zhao, Wenjing Fang, Lei Wang, Zihui Zhang, Wenzhi Chen. Network and Distributed System Security Symposium (NDSS), 2023. 26 Aug 2023. (AAML)

Use of LLMs for Illicit Purposes: Threats, Prevention Measures, and Vulnerabilities
Maximilian Mozes, Xuanli He, Bennett Kleinberg, Lewis D. Griffin. 24 Aug 2023.