A backdoor attack against LSTM-based text classification systems
Jiazhu Dai, Chuanshuai Chen
arXiv:1905.12457 · 29 May 2019 · SILM

Papers citing "A backdoor attack against LSTM-based text classification systems" (50 of 195 papers shown)

On the Relevance of Byzantine Robust Optimization Against Data Poisoning
Sadegh Farhadkhani, R. Guerraoui, Nirupam Gupta, Rafael Pinot
AAML · 01 May 2024

A Clean-graph Backdoor Attack against Graph Convolutional Networks with Poisoned Label Only
Jiazhu Dai, Haoyu Sun
AAML · 19 Apr 2024

SpamDam: Towards Privacy-Preserving and Adversary-Resistant SMS Spam Detection
Yekai Li, Rufan Zhang, Wenxin Rong, Xianghang Mi
15 Apr 2024

Backdoor Attack on Multilingual Machine Translation
Jun Wang, Qiongkai Xu, Xuanli He, Benjamin I. P. Rubinstein, Trevor Cohn
03 Apr 2024

Two Heads are Better than One: Nested PoE for Robust Defense Against Multi-Backdoors
Victoria Graf, Qin Liu, Muhao Chen
AAML · 02 Apr 2024

Shortcuts Arising from Contrast: Effective and Covert Clean-Label Attacks in Prompt-Based Learning
Xiaopeng Xie, Ming Yan, Xiwen Zhou, Chenlong Zhao, Suli Wang, Yong Zhang, Joey Tianyi Zhou
AAML · 30 Mar 2024

Task-Agnostic Detector for Insertion-Based Backdoor Attacks
Weimin Lyu, Xiao Lin, Songzhu Zheng, Lu Pang, Haibin Ling, Susmit Jha, Chao Chen
25 Mar 2024

WARDEN: Multi-Directional Backdoor Watermarks for Embedding-as-a-Service Copyright Protection
Anudeex Shetty, Yue Teng, Ke He, Qiongkai Xu
WaLM · 03 Mar 2024

Here's a Free Lunch: Sanitizing Backdoored Models with Model Merge
Ansh Arora, Xuanli He, Maximilian Mozes, Srinibas Swain, Mark Dras, Qiongkai Xu
SILM, MoMe, AAML · 29 Feb 2024

Mitigating Fine-tuning based Jailbreak Attack with Backdoor Enhanced Safety Alignment
Jiong Wang, Jiazhao Li, Yiquan Li, Xiangyu Qi, Junjie Hu, Yixuan Li, P. McDaniel, Muhao Chen, Bo Li, Chaowei Xiao
AAML, SILM · 22 Feb 2024

Learning to Poison Large Language Models During Instruction Tuning
Yao Qiang, Xiangyu Zhou, Saleh Zare Zade, Mohammad Amin Roshani, Douglas Zytko, Dongxiao Zhu
AAML, SILM · 21 Feb 2024

Defending Against Weight-Poisoning Backdoor Attacks for Parameter-Efficient Fine-Tuning
Shuai Zhao, Leilei Gan, Anh Tuan Luu, Jie Fu, Lingjuan Lyu, Meihuizi Jia, Jinming Wen
AAML · 19 Feb 2024

Acquiring Clean Language Models from Backdoor Poisoned Datasets by Downscaling Frequency Space
Zongru Wu, Zhuosheng Zhang, Pengzhou Cheng, Gongshen Liu
AAML · 19 Feb 2024

Instruction Backdoor Attacks Against Customized LLMs
Rui Zhang, Hongwei Li, Rui Wen, Wenbo Jiang, Yuan Zhang, Michael Backes, Yun Shen, Yang Zhang
AAML, SILM · 14 Feb 2024

Test-Time Backdoor Attacks on Multimodal Large Language Models
Dong Lu, Tianyu Pang, Chao Du, Qian Liu, Xianjun Yang, Min Lin
AAML · 13 Feb 2024

OrderBkd: Textual backdoor attack through repositioning
Irina Alekseevskaia, Konstantin Arkhipenko
12 Feb 2024

Instructional Fingerprinting of Large Language Models
Lyne Tchapmi, Fei Wang, Mingyu Derek Ma, Pang Wei Koh, Chaowei Xiao, Muhao Chen
WaLM · 21 Jan 2024

Object-oriented backdoor attack against image captioning
Meiling Li, Nan Zhong, Xinpeng Zhang, Zhenxing Qian, Sheng Li
05 Jan 2024

Effective backdoor attack on graph neural networks in link prediction tasks
Jiazhu Dai, Haoyu Sun
GNN · 05 Jan 2024

Punctuation Matters! Stealthy Backdoor Attack for Language Models
Xuan Sheng, Zhicheng Li, Zhaoyang Han, Xiangmao Chang, Piji Li
26 Dec 2023

Unveiling Backdoor Risks Brought by Foundation Models in Heterogeneous Federated Learning
Xi Li, Chen Henry Wu, Jiaqi Wang
AAML · 30 Nov 2023

Efficient Trigger Word Insertion
Yueqi Zeng, Ziqiang Li, Pengfei Xia, Lei Liu, Bin Li
AAML · 23 Nov 2023

TextGuard: Provable Defense against Backdoor Attacks on Text Classification
Hengzhi Pei, Jinyuan Jia, Wenbo Guo, Bo Li, Dawn Song
SILM · 19 Nov 2023

RLHFPoison: Reward Poisoning Attack for Reinforcement Learning with Human Feedback in Large Language Models
Jiong Wang, Junlin Wu, Muhao Chen, Yevgeniy Vorobeychik, Chaowei Xiao
AAML · 16 Nov 2023

Test-time Backdoor Mitigation for Black-Box Large Language Models with Defensive Demonstrations
Wenjie Mo, Lyne Tchapmi, Qin Liu, Jiong Wang, Jun Yan, Chaowei Xiao, Muhao Chen
AAML · 16 Nov 2023

Stealthy and Persistent Unalignment on Large Language Models via Backdoor Injections
Yuanpu Cao, Bochuan Cao, Jinghui Chen
15 Nov 2023

Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game
Sam Toyer, Olivia Watkins, Ethan Mendes, Justin Svegliato, Luke Bailey, ..., Karim Elmaaroufi, Pieter Abbeel, Trevor Darrell, Alan Ritter, Stuart J. Russell
02 Nov 2023

Backdoor Threats from Compromised Foundation Models to Federated Learning
Xi Li, Songhe Wang, Chen Henry Wu, Hao Zhou, Jiaqi Wang
31 Oct 2023

Setting the Trap: Capturing and Defeating Backdoors in Pretrained Language Models through Honeypots
Ruixiang Tang, Jiayi Yuan, Yiming Li, Zirui Liu, Rui Chen, Xia Hu
AAML · 28 Oct 2023

Large Language Models Are Better Adversaries: Exploring Generative Clean-Label Backdoor Attacks Against Text Classifiers
Wencong You, Zayd Hammoudeh, Daniel Lowd
AAML · 28 Oct 2023

Attention-Enhancing Backdoor Attacks Against BERT-based Models
Weimin Lyu, Songzhu Zheng, Lu Pang, Haibin Ling, Chao Chen
23 Oct 2023

Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!
Xiangyu Qi, Yi Zeng, Tinghao Xie, Pin-Yu Chen, Ruoxi Jia, Prateek Mittal, Peter Henderson
SILM · 05 Oct 2023

PETA: Parameter-Efficient Trojan Attacks
Lauren Hong, Ting Wang
AAML · 01 Oct 2023

The Trickle-down Impact of Reward (In-)consistency on RLHF
Lingfeng Shen, Sihao Chen, Linfeng Song, Lifeng Jin, Baolin Peng, Haitao Mi, Daniel Khashabi, Dong Yu
28 Sep 2023

Defending Pre-trained Language Models as Few-shot Learners against Backdoor Attacks
Zhaohan Xi, Tianyu Du, Changjiang Li, Ren Pang, S. Ji, Jinghui Chen, Fenglong Ma, Ting Wang
AAML · 23 Sep 2023

Backdoor Attacks and Countermeasures in Natural Language Processing Models: A Comprehensive Security Review
Pengzhou Cheng, Zongru Wu, Wei Du, Haodong Zhao, Wei Lu, Gongshen Liu
SILM, AAML · 12 Sep 2023

Use of LLMs for Illicit Purposes: Threats, Prevention Measures, and Vulnerabilities
Maximilian Mozes, Xuanli He, Bennett Kleinberg, Lewis D. Griffin
24 Aug 2023

Temporal-Distributed Backdoor Attack Against Video Based Action Recognition
Xi Li, Songhe Wang, Rui Huang, Mahanth K. Gowda, G. Kesidis
AAML · 21 Aug 2023

Backdoor Mitigation by Correcting the Distribution of Neural Activations
Xi Li, Zhen Xiang, David J. Miller, G. Kesidis
AAML · 18 Aug 2023

TIJO: Trigger Inversion with Joint Optimization for Defending Multimodal Backdoored Models
Indranil Sur, Karan Sikka, Matthew Walmer, K. Koneripalli, Anirban Roy, Xiaoyu Lin, Ajay Divakaran, Susmit Jha
07 Aug 2023

ParaFuzz: An Interpretability-Driven Technique for Detecting Poisoned Samples in NLP
Lu Yan, Zhuo Zhang, Guanhong Tao, Kaiyuan Zhang, Xuan Chen, Guangyu Shen, Xiangyu Zhang
AAML, SILM · 04 Aug 2023

Backdooring Instruction-Tuned Large Language Models with Virtual Prompt Injection
Jun Yan, Vikas Yadav, Shiyang Li, Lichang Chen, Zheng Tang, Hai Wang, Vijay Srinivasan, Xiang Ren, Hongxia Jin
SILM · 31 Jul 2023

Backdoor Attacks for In-Context Learning with Language Models
Nikhil Kandpal, Matthew Jagielski, Florian Tramèr, Nicholas Carlini
SILM, AAML · 27 Jul 2023

Federated Distributionally Robust Optimization with Non-Convex Objectives: Algorithm and Analysis
Yang Jiao, Kai Yang, Dongjin Song
25 Jul 2023

Differential Analysis of Triggers and Benign Features for Black-Box DNN Backdoor Detection
Hao Fu, Prashanth Krishnamurthy, S. Garg, Farshad Khorrami
AAML · 11 Jul 2023

Deception by Omission: Using Adversarial Missingness to Poison Causal Structure Learning
D. Koyuncu, Alex Gittens, B. Yener, M. Yung
AAML, CML · 31 May 2023

IMBERT: Making BERT Immune to Insertion-based Backdoor Attacks
Xuanli He, Jun Wang, Benjamin I. P. Rubinstein, Trevor Cohn
SILM · 25 May 2023

From Shortcuts to Triggers: Backdoor Defense with Denoised PoE
Qin Liu, Fei Wang, Chaowei Xiao, Muhao Chen
AAML · 24 May 2023

Debiasing Made State-of-the-art: Revisiting the Simple Seed-based Weak Supervision for Text Classification
Chengyu Dong, Zihan Wang, Jingbo Shang
24 May 2023

Instructions as Backdoors: Backdoor Vulnerabilities of Instruction Tuning for Large Language Models
Lyne Tchapmi, Mingyu Derek Ma, Fei Wang, Chaowei Xiao, Muhao Chen
SILM · 24 May 2023