Backdoor Attacks on Pre-trained Models by Layerwise Weight Poisoning (arXiv:2108.13888)
Linyang Li, Demin Song, Xiaonan Li, Jiehang Zeng, Ruotian Ma, Xipeng Qiu
31 August 2021
Papers citing "Backdoor Attacks on Pre-trained Models by Layerwise Weight Poisoning" (38 of 88 papers shown):
Large Language Model Alignment: A Survey. Tianhao Shen, Renren Jin, Yufei Huang, Chuang Liu, Weilong Dong, Zishan Guo, Xinwei Wu, Yan Liu, Deyi Xiong. LM&MA. 26 Sep 2023.
Defending Pre-trained Language Models as Few-shot Learners against Backdoor Attacks. Zhaohan Xi, Tianyu Du, Changjiang Li, Ren Pang, S. Ji, Jinghui Chen, Fenglong Ma, Ting Wang. AAML. 23 Sep 2023.
Backdoor Attacks and Countermeasures in Natural Language Processing Models: A Comprehensive Security Review. Pengzhou Cheng, Zongru Wu, Wei Du, Haodong Zhao, Wei Lu, Gongshen Liu. SILM, AAML. 12 Sep 2023.
A Comprehensive Overview of Backdoor Attacks in Large Language Models within Communication Networks. Haomiao Yang, Kunlan Xiang, Mengyu Ge, Hongwei Li, Rongxing Lu, Shui Yu. SILM. 28 Aug 2023.
LMSanitator: Defending Prompt-Tuning Against Task-Agnostic Backdoors. Chengkun Wei, Wenlong Meng, Zhikun Zhang, M. Chen, Ming-Hui Zhao, Wenjing Fang, Lei Wang, Zihui Zhang, Wenzhi Chen. AAML. 26 Aug 2023.
ParaFuzz: An Interpretability-Driven Technique for Detecting Poisoned Samples in NLP. Lu Yan, Zhuo Zhang, Guanhong Tao, Kaiyuan Zhang, Xuan Chen, Guangyu Shen, Xiangyu Zhang. AAML, SILM. 04 Aug 2023.
On the Trustworthiness Landscape of State-of-the-art Generative Models: A Survey and Outlook. Mingyuan Fan, Chengyu Wang, Cen Chen, Yang Liu, Jun Huang. HILM. 31 Jul 2023.
From Shortcuts to Triggers: Backdoor Defense with Denoised PoE. Qin Liu, Fei Wang, Chaowei Xiao, Muhao Chen. AAML. 24 May 2023.
Watermarking Text Data on Large Language Models for Dataset Copyright. Yixin Liu, Hongsheng Hu, Xun Chen, Xuyun Zhang, Lichao Sun. WaLM. 22 May 2023.
Mitigating Backdoor Poisoning Attacks through the Lens of Spurious Correlation. Xuanli He, Qiongkai Xu, Jun Wang, Benjamin I. P. Rubinstein, Trevor Cohn. AAML. 19 May 2023.
UOR: Universal Backdoor Attacks on Pre-trained Language Models. Wei Du, Peixuan Li, Bo-wen Li, Haodong Zhao, Gongshen Liu. AAML. 16 May 2023.
Diffusion Theory as a Scalpel: Detecting and Purifying Poisonous Dimensions in Pre-trained Language Models Caused by Backdoor or Bias. Zhiyuan Zhang, Deli Chen, Hao Zhou, Fandong Meng, Jie Zhou, Xu Sun. 08 May 2023.
Defending against Insertion-based Textual Backdoor Attacks via Attribution. Jiazhao Li, Zhuofeng Wu, Wei Ping, Chaowei Xiao, V. Vydiswaran. 03 May 2023.
Prompt as Triggers for Backdoor Attack: Examining the Vulnerability in Language Models. Shuai Zhao, Jinming Wen, Anh Tuan Luu, J. Zhao, Jie Fu. SILM. 02 May 2023.
ChatGPT as an Attack Tool: Stealthy Textual Backdoor Attack via Blackbox Generative Model Trigger. Jiazhao Li, Yijin Yang, Zhuofeng Wu, V. Vydiswaran, Chaowei Xiao. SILM. 27 Apr 2023.
Origin Tracing and Detecting of LLMs. Linyang Li, Pengyu Wang, Kerong Ren, Tianxiang Sun, Xipeng Qiu. LLMAG. 27 Apr 2023.
Backdoor Attacks with Input-unique Triggers in NLP. Xukun Zhou, Jiwei Li, Tianwei Zhang, Lingjuan Lyu, Muqiao Yang, Jun He. SILM, AAML. 25 Mar 2023.
Stealing the Decoding Algorithms of Language Models. A. Naseh, Kalpesh Krishna, Mohit Iyyer, Amir Houmansadr. MLAU. 08 Mar 2023.
Aegis: Mitigating Targeted Bit-flip Attacks against Deep Neural Networks. Jialai Wang, Ziyuan Zhang, Meiqi Wang, Han Qiu, Tianwei Zhang, Qi Li, Zongpeng Li, Tao Wei, Chao Zhang. AAML. 27 Feb 2023.
Attacks in Adversarial Machine Learning: A Systematic Survey from the Life-cycle Perspective. Baoyuan Wu, Zihao Zhu, Li Liu, Qingshan Liu, Zhaofeng He, Siwei Lyu. AAML. 19 Feb 2023.
Training-free Lexical Backdoor Attacks on Language Models. Yujin Huang, Terry Yue Zhuo, Qiongkai Xu, Han Hu, Xingliang Yuan, Chunyang Chen. SILM. 08 Feb 2023.
Backdoor Vulnerabilities in Normally Trained Deep Learning Models. Guanhong Tao, Zhenting Wang, Shuyang Cheng, Shiqing Ma, Shengwei An, Yingqi Liu, Guangyu Shen, Zhuo Zhang, Yunshu Mao, Xiangyu Zhang. SILM. 29 Nov 2022.
On the Security Vulnerabilities of Text-to-SQL Models. Xutan Peng, Yipeng Zhang, Jingfeng Yang, Mark Stevenson. SILM. 28 Nov 2022.
BadPrompt: Backdoor Attacks on Continuous Prompts. Xiangrui Cai, Haidong Xu, Sihan Xu, Ying Zhang, Xiaojie Yuan. SILM. 27 Nov 2022.
A Survey on Backdoor Attack and Defense in Natural Language Processing. Xuan Sheng, Zhaoyang Han, Piji Li, Xiangmao Chang. SILM. 22 Nov 2022.
Rickrolling the Artist: Injecting Backdoors into Text Encoders for Text-to-Image Synthesis. Lukas Struppek, Dominik Hintersdorf, Kristian Kersting. SILM. 04 Nov 2022.
Fine-mixing: Mitigating Backdoors in Fine-tuned Language Models. Zhiyuan Zhang, Lingjuan Lyu, Xingjun Ma, Chenguang Wang, Xu Sun. AAML. 18 Oct 2022.
Expose Backdoors on the Way: A Feature-Based Efficient Defense against Textual Backdoor Attacks. Sishuo Chen, Wenkai Yang, Zhiyuan Zhang, Xiaohan Bi, Xu Sun. SILM, AAML. 14 Oct 2022.
Watermarking Pre-trained Language Models with Backdooring. Chenxi Gu, Chengsong Huang, Xiaoqing Zheng, Kai-Wei Chang, Cho-Jui Hsieh. WaLM. 14 Oct 2022.
BAFFLE: Hiding Backdoors in Offline Reinforcement Learning Datasets. Chen Gong, Zhou Yang, Yunru Bai, Junda He, Jieke Shi, ..., Arunesh Sinha, Bowen Xu, Xinwen Hou, David Lo, Guoliang Fan. AAML, OffRL. 07 Oct 2022.
A Unified Evaluation of Textual Backdoor Learning: Frameworks and Benchmarks. Ganqu Cui, Lifan Yuan, Bingxiang He, Yangyi Chen, Zhiyuan Liu, Maosong Sun. AAML, ELM, SILM. 17 Jun 2022.
Threats to Pre-trained Language Models: Survey and Taxonomy. Shangwei Guo, Chunlong Xie, Jiwei Li, Lingjuan Lyu, Tianwei Zhang. PILM. 14 Feb 2022.
Triggerless Backdoor Attack for NLP Tasks with Clean Labels. Leilei Gan, Jiwei Li, Tianwei Zhang, Xiaoya Li, Yuxian Meng, Fei Wu, Yi Yang, Shangwei Guo, Chun Fan. AAML, SILM. 15 Nov 2021.
BadPre: Task-agnostic Backdoor Attacks to Pre-trained NLP Foundation Models. Kangjie Chen, Yuxian Meng, Xiaofei Sun, Shangwei Guo, Tianwei Zhang, Jiwei Li, Chun Fan. SILM. 06 Oct 2021.
Red Alarm for Pre-trained Models: Universal Vulnerability to Neuron-Level Backdoor Attacks. Zhengyan Zhang, Guangxuan Xiao, Yongwei Li, Tian Lv, Fanchao Qi, Zhiyuan Liu, Yasheng Wang, Xin Jiang, Maosong Sun. AAML. 18 Jan 2021.
Backdoor Learning: A Survey. Yiming Li, Yong Jiang, Zhifeng Li, Shutao Xia. AAML. 17 Jul 2020.
Pre-trained Models for Natural Language Processing: A Survey. Xipeng Qiu, Tianxiang Sun, Yige Xu, Yunfan Shao, Ning Dai, Xuanjing Huang. LM&MA, VLM. 18 Mar 2020.
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding. Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman. ELM. 20 Apr 2018.