Hidden Backdoors in Human-Centric Language Models
Shaofeng Li, Hui Liu, Tian Dong, Benjamin Zi Hao Zhao, Minhui Xue, Haojin Zhu, Jialiang Lu
arXiv: 2105.00164 · SILM · 1 May 2021

Papers citing "Hidden Backdoors in Human-Centric Language Models" (19 of 19 papers shown)

BadLingual: A Novel Lingual-Backdoor Attack against Large Language Models
Ziqi Wang, Hongwei Li, Rui Zhang, Wenbo Jiang, Kangjie Chen, Tianwei Zhang, Qingchuan Zhao, Jiawei Li
AAML · 06 May 2025

LLM Security: Vulnerabilities, Attacks, Defenses, and Countermeasures
Francisco Aguilera-Martínez, Fernando Berzal
PILM · 02 May 2025

Char-mander Use mBackdoor! A Study of Cross-lingual Backdoor Attacks in Multilingual LLMs
Himanshu Beniwal, Sailesh Panda, Birudugadda Srivibhav, Mayank Singh
24 Feb 2025

Context is the Key: Backdoor Attacks for In-Context Learning with Vision Transformers
Gorka Abad, S. Picek, Lorenzo Cavallaro, A. Urbieta
SILM · 06 Sep 2024

ModelShield: Adaptive and Robust Watermark against Model Extraction Attack
Kaiyi Pang, Tao Qi, Chuhan Wu, Minhao Bai, Minghu Jiang, Yongfeng Huang
AAML, WaLM · 03 May 2024

Threats, Attacks, and Defenses in Machine Unlearning: A Survey
Ziyao Liu, Huanyi Ye, Chen Chen, Yongsen Zheng, K. Lam
AAML, MU · 20 Mar 2024

Punctuation Matters! Stealthy Backdoor Attack for Language Models
Xuan Sheng, Zhicheng Li, Zhaoyang Han, Xiangmao Chang, Piji Li
26 Dec 2023

A Comprehensive Overview of Backdoor Attacks in Large Language Models within Communication Networks
Haomiao Yang, Kunlan Xiang, Mengyu Ge, Hongwei Li, Rongxing Lu, Shui Yu
SILM · 28 Aug 2023

From Prompt Injections to SQL Injection Attacks: How Protected is Your LLM-Integrated Web Application?
Rodrigo Pedro, Daniel Castro, Paulo Carreira, Nuno Santos
SILM, AAML · 03 Aug 2023

Tricking LLMs into Disobedience: Formalizing, Analyzing, and Detecting Jailbreaks
Abhinav Rao, S. Vashistha, Atharva Naik, Somak Aditya, Monojit Choudhury
24 May 2023

A Survey of Safety and Trustworthiness of Large Language Models through the Lens of Verification and Validation
Xiaowei Huang, Wenjie Ruan, Wei Huang, Gao Jin, Yizhen Dong, ..., Sihao Wu, Peipei Xu, Dengyu Wu, André Freitas, Mustafa A. Mustafa
ALM · 19 May 2023

Text-to-Image Diffusion Models can be Easily Backdoored through Multimodal Data Poisoning
Shengfang Zhai, Yinpeng Dong, Qingni Shen, Shi Pu, Yuejian Fang, Hang Su
07 May 2023

Attacks in Adversarial Machine Learning: A Systematic Survey from the Life-cycle Perspective
Baoyuan Wu, Zihao Zhu, Li Liu, Qingshan Liu, Zhaofeng He, Siwei Lyu
AAML · 19 Feb 2023

BDMMT: Backdoor Sample Detection for Language Models through Model Mutation Testing
Jiali Wei, Ming Fan, Wenjing Jiao, Wuxia Jin, Ting Liu
AAML · 25 Jan 2023

Mind Your Heart: Stealthy Backdoor Attack on Dynamic Deep Neural Network in Edge Computing
Tian Dong, Ziyuan Zhang, Han Qiu, Tianwei Zhang, Hewu Li, T. Wang
AAML · 22 Dec 2022

Poison Attack and Defense on Deep Source Code Processing Models
Jia Li, Zhuo Li, Huangzhao Zhang, Ge Li, Zhi Jin, Xing Hu, Xin Xia
AAML · 31 Oct 2022

Constrained Optimization with Dynamic Bound-scaling for Effective NLP Backdoor Defense
Guangyu Shen, Yingqi Liu, Guanhong Tao, Qiuling Xu, Zhuo Zhang, Shengwei An, Shiqing Ma, Xinming Zhang
AAML · 11 Feb 2022

Spinning Language Models: Risks of Propaganda-As-A-Service and Countermeasures
Eugene Bagdasaryan, Vitaly Shmatikov
SILM, AAML · 09 Dec 2021

Extracting Training Data from Large Language Models
Nicholas Carlini, Florian Tramèr, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, ..., Tom B. Brown, D. Song, Úlfar Erlingsson, Alina Oprea, Colin Raffel
MLAU, SILM · 14 Dec 2020