Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2310.18633
Cited By
Setting the Trap: Capturing and Defeating Backdoors in Pretrained Language Models through Honeypots
28 October 2023
Ruixiang Tang
Jiayi Yuan
Yiming Li
Zirui Liu
Rui Chen
Xia Hu
AAML
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Setting the Trap: Capturing and Defeating Backdoors in Pretrained Language Models through Honeypots"
9 / 9 papers shown
Title
When Backdoors Speak: Understanding LLM Backdoor Attacks Through Model-Generated Explanations
Huaizhi Ge
Yiming Li
Qifan Wang
Yongfeng Zhang
Ruixiang Tang
AAML
SILM
72
0
0
19 Nov 2024
Unelicitable Backdoors in Language Models via Cryptographic Transformer Circuits
Andis Draguns
Andrew Gritsevskiy
S. Motwani
Charlie Rogers-Smith
Jeffrey Ladish
Christian Schroeder de Witt
40
2
0
03 Jun 2024
ChatGPT as an Attack Tool: Stealthy Textual Backdoor Attack via Blackbox Generative Model Trigger
Jiazhao Li
Yijin Yang
Zhuofeng Wu
V. Vydiswaran
Chaowei Xiao
SILM
44
42
0
27 Apr 2023
A Study of the Attention Abnormality in Trojaned BERTs
Weimin Lyu
Songzhu Zheng
Teng Ma
Chao Chen
51
56
0
13 May 2022
Few-Shot Backdoor Attacks on Visual Object Tracking
Yiming Li
Haoxiang Zhong
Xingjun Ma
Yong Jiang
Shutao Xia
AAML
34
53
0
31 Jan 2022
Measure and Improve Robustness in NLP Models: A Survey
Xuezhi Wang
Haohan Wang
Diyi Yang
139
130
0
15 Dec 2021
Mind the Style of Text! Adversarial and Backdoor Attacks Based on Text Style Transfer
Fanchao Qi
Yangyi Chen
Xurui Zhang
Mukai Li
Zhiyuan Liu
Maosong Sun
AAML
SILM
77
175
0
14 Oct 2021
Generating Syntactically Controlled Paraphrases without Using Annotated Parallel Pairs
Kuan-Hao Huang
Kai-Wei Chang
148
68
0
26 Jan 2021
Mitigating backdoor attacks in LSTM-based Text Classification Systems by Backdoor Keyword Identification
Chuanshuai Chen
Jiazhu Dai
SILM
53
126
0
11 Jul 2020
1