ONION: A Simple and Effective Defense Against Textual Backdoor Attacks

20 November 2020
Fanchao Qi, Yangyi Chen, Mukai Li, Yuan Yao, Zhiyuan Liu, Maosong Sun
AAML

Papers citing "ONION: A Simple and Effective Defense Against Textual Backdoor Attacks"

50 / 164 papers shown

BadLingual: A Novel Lingual-Backdoor Attack against Large Language Models
Z. Wang, Hongwei Li, Rui Zhang, Wenbo Jiang, Kangjie Chen, Tianwei Zhang, Qingchuan Zhao, Guowen Xu
AAML · 06 May 2025

A Chaos Driven Metric for Backdoor Attack Detection
Hema Karnam Surendrababu, Nithin Nagaraj
AAML · 06 May 2025

The Ultimate Cookbook for Invisible Poison: Crafting Subtle Clean-Label Text Backdoors with Style Attributes
Wencong You, Daniel Lowd
24 Apr 2025

BadMoE: Backdooring Mixture-of-Experts LLMs via Optimizing Routing Triggers and Infecting Dormant Experts
Qingyue Wang, Qi Pang, Xixun Lin, Shuai Wang, Daoyuan Wu
MoE · 24 Apr 2025

Propaganda via AI? A Study on Semantic Backdoors in Large Language Models
Nay Myat Min, Long H. Pham, Yige Li, Jun Sun
AAML · 15 Apr 2025

Never Start from Scratch: Expediting On-Device LLM Personalization via Explainable Model Selection
Haoming Wang, Boyuan Yang, Xiangyu Yin, Wei Gao
15 Apr 2025

Exploring Backdoor Attack and Defense for LLM-empowered Recommendations
Liangbo Ning, Wenqi Fan, Qing Li
AAML, SILM · 15 Apr 2025

NLP Security and Ethics, in the Wild
Heather Lent, Erick Galinkin, Yiyi Chen, Jens Myrup Pedersen, Leon Derczynski, Johannes Bjerva
SILM · 09 Apr 2025

Defending Deep Neural Networks against Backdoor Attacks via Module Switching
Weijun Li, Ansh Arora, Xuanli He, Mark Dras, Qiongkai Xu
AAML, MoMe · 08 Apr 2025

ShadowCoT: Cognitive Hijacking for Stealthy Reasoning Backdoors in LLMs
Gejian Zhao, Hanzhou Wu, Xinpeng Zhang, Athanasios V. Vasilakos
LRM · 08 Apr 2025

The H-Elena Trojan Virus to Infect Model Weights: A Wake-Up Call on the Security Risks of Malicious Fine-Tuning
Virilo Tejedor, Cristina Zuheros, Carlos Peláez-González, David Herrera-Poyatos, Andrés Herrera-Poyatos, F. Herrera
04 Apr 2025

Exposing the Ghost in the Transformer: Abnormal Detection for Large Language Models via Hidden State Forensics
Shide Zhou, K. Wang, Ling Shi, H. Wang
01 Apr 2025

NaviDet: Efficient Input-level Backdoor Detection on Text-to-Image Synthesis via Neuron Activation Variation
Shengfang Zhai, Jiajun Li, Yue Liu, Huanran Chen, Zhihua Tian, Wenjie Qu, Qingni Shen, Ruoxi Jia, Yinpeng Dong, Jiaheng Zhang
AAML · 09 Mar 2025

BadJudge: Backdoor Vulnerabilities of LLM-as-a-Judge
Terry Tong, Fei-Yue Wang, Zhe Zhao, M. Chen
AAML, ELM · 01 Mar 2025

Beyond Natural Language Perplexity: Detecting Dead Code Poisoning in Code Generation Datasets
Chichien Tsai, Chiamu Yu, Yingdar Lin, Yusung Wu, Weibin Lee
AAML · 27 Feb 2025

Show Me Your Code! Kill Code Poisoning: A Lightweight Method Based on Code Naturalness
Weisong Sun, Yuchen Chen, Mengzhe Yuan, Chunrong Fang, Zhenpeng Chen, Chong Wang, Yang Liu, Baowen Xu, Zhenyu Chen
AAML · 20 Feb 2025

Poisoned Source Code Detection in Code Models
Ehab Ghannoum, Mohammad Ghafari
AAML · 19 Feb 2025

UniGuardian: A Unified Defense for Detecting Prompt Injection, Backdoor Attacks and Adversarial Attacks in Large Language Models
Huawei Lin, Yingjie Lao, Tong Geng, Tan Yu, Weijie Zhao
AAML, SILM · 18 Feb 2025

BoT: Breaking Long Thought Processes of o1-like Large Language Models through Backdoor Attack
Zihao Zhu, Hongbao Zhang, Mingda Zhang, Ruotong Wang, Guanzong Wu, Ke Xu, Baoyuan Wu
AAML, LRM · 16 Feb 2025

Cut the Deadwood Out: Post-Training Model Purification with Selective Module Substitution
Yao Tong, Weijun Li, Xuanli He, Haolan Zhan, Qiongkai Xu
AAML · 31 Dec 2024

Double Landmines: Invisible Textual Backdoor Attacks based on Dual-Trigger
Yang Hou, Qiuling Yue, Lujia Chai, Guozhao Liao, Wenbao Han, Wei Ou
23 Dec 2024

Gracefully Filtering Backdoor Samples for Generative Large Language Models without Retraining
Zongru Wu, Pengzhou Cheng, Lingyong Fang, Zhuosheng Zhang, Gongshen Liu
AAML, SILM · 03 Dec 2024

Neutralizing Backdoors through Information Conflicts for Large Language Models
Chen Chen, Yuchen Sun, Xueluan Gong, Jiaxin Gao, K. Lam
KELM, AAML · 27 Nov 2024

TrojanRobot: Physical-World Backdoor Attacks Against VLM-based Robotic Manipulation
X. U. Wang, Hewen Pan, Hangtao Zhang, Minghui Li, Shengshan Hu, ..., Peijin Guo, Yichen Wang, Wei Wan, Aishan Liu, L. Zhang
AAML · 18 Nov 2024

CROW: Eliminating Backdoors from Large Language Models via Internal Consistency Regularization
Nay Myat Min, Long H. Pham, Yige Li, Jun Sun
AAML · 18 Nov 2024

BackdoorMBTI: A Backdoor Learning Multimodal Benchmark Tool Kit for Backdoor Defense Evaluation
Haiyang Yu, Tian Xie, Jiaping Gui, Pengyang Wang, P. Yi, Yue Wu
17 Nov 2024

CodePurify: Defend Backdoor Attacks on Neural Code Models via Entropy-based Purification
Fangwen Mu, Junjie Wang, Zhuohao Yu, Lin Shi, Song Wang, Mingyang Li, Qing Wang
AAML · 26 Oct 2024

Expose Before You Defend: Unifying and Enhancing Backdoor Defenses via Exposed Models
Yige Li, Hanxun Huang, Jiaming Zhang, Xingjun Ma, Yu-Gang Jiang
AAML · 25 Oct 2024

AdvBDGen: Adversarially Fortified Prompt-Specific Fuzzy Backdoor Generator Against LLM Alignment
Pankayaraj Pathmanathan, Udari Madhushani Sehwag, Michael-Andrei Panaitescu-Liess, Furong Huang
SILM, AAML · 15 Oct 2024

ASPIRER: Bypassing System Prompts With Permutation-based Backdoors in LLMs
Lu Yan, Siyuan Cheng, Xuan Chen, Kaiyuan Zhang, Guangyu Shen, Zhuo Zhang, Xiangyu Zhang
AAML, SILM · 05 Oct 2024

Demonstration Attack against In-Context Learning for Code Intelligence
Yifei Ge, Weisong Sun, Yihang Lou, Chunrong Fang, Yiran Zhang, Yiming Li, Xiaofang Zhang, Yang Liu, Zhihong Zhao, Zhenyu Chen
AAML · 03 Oct 2024

BadCM: Invisible Backdoor Attack Against Cross-Modal Learning
Zheng Zhang, Xu Yuan, Lei Zhu, Jingkuan Song, Liqiang Nie
AAML · 03 Oct 2024

Mitigating Backdoor Threats to Large Language Models: Advancement and Challenges
Qin Liu, Wenjie Mo, Terry Tong, Jiashu Xu, Fei Wang, Chaowei Xiao, Muhao Chen
AAML · 30 Sep 2024

Data-centric NLP Backdoor Defense from the Lens of Memorization
Zhenting Wang, Zhizhi Wang, Mingyu Jin, Mengnan Du, Juan Zhai, Shiqing Ma
21 Sep 2024

Obliviate: Neutralizing Task-agnostic Backdoors within the Parameter-efficient Fine-tuning Paradigm
Jaehan Kim, Minkyoo Song, S. Na, Seungwon Shin
AAML · 21 Sep 2024

Exploiting the Vulnerability of Large Language Models via Defense-Aware Architectural Backdoor
Abdullah Arafat Miah, Yu Bi
AAML, SILM · 03 Sep 2024

CLIBE: Detecting Dynamic Backdoors in Transformer-based NLP Models
Rui Zeng, Xi Chen, Yuwen Pu, Xuhong Zhang, Tianyu Du, Shouling Ji
02 Sep 2024

The Dark Side of Human Feedback: Poisoning Large Language Models via User Inputs
Bocheng Chen, Hanqing Guo, Guangjing Wang, Yuanda Wang, Qiben Yan
AAML · 01 Sep 2024

DAMe: Personalized Federated Social Event Detection with Dual Aggregation Mechanism
Xiaoyan Yu, Yifan Wei, Pu Li, Shuaishuai Zhou, Hao Peng, Li Sun, Liehuang Zhu, Philip S. Yu
FedML · 01 Sep 2024

Rethinking Backdoor Detection Evaluation for Language Models
Jun Yan, Wenjie Jacky Mo, Xiang Ren, Robin Jia
ELM · 31 Aug 2024

Large Language Models are Good Attackers: Efficient and Stealthy Textual Backdoor Attacks
Ziqiang Li, Yueqi Zeng, Pengfei Xia, Lei Liu, Zhangjie Fu, Bin Li
SILM, AAML · 21 Aug 2024

FDI: Attack Neural Code Generation Systems through User Feedback Channel
Zhensu Sun, Xiaoning Du, Xiapu Luo, Fu Song, David Lo, Li Li
AAML · 08 Aug 2024

Compromising Embodied Agents with Contextual Backdoor Attacks
Aishan Liu, Yuguang Zhou, Xianglong Liu, Tianyuan Zhang, Siyuan Liang, ..., Tianlin Li, Junqi Zhang, Wenbo Zhou, Qing-Wu Guo, Dacheng Tao
LLMAG, AAML · 06 Aug 2024

Can LLMs be Fooled? Investigating Vulnerabilities in LLMs
Sara Abdali, Jia He, C. Barberan, Richard Anarfi
30 Jul 2024

Know Your Limits: A Survey of Abstention in Large Language Models
Bingbing Wen, Jihan Yao, Shangbin Feng, Chenjun Xu, Yulia Tsvetkov, Bill Howe, Lucy Lu Wang
25 Jul 2024

Operationalizing a Threat Model for Red-Teaming Large Language Models (LLMs)
Apurv Verma, Satyapriya Krishna, Sebastian Gehrmann, Madhavan Seshadri, Anu Pradhan, Tom Ault, Leslie Barrett, David Rabinowitz, John Doucette, Nhathai Phan
20 Jul 2024

Turning Generative Models Degenerate: The Power of Data Poisoning Attacks
Shuli Jiang, S. Kadhe, Yi Zhou, Farhan Ahmed, Ling Cai, Nathalie Baracaldo
SILM, AAML · 17 Jul 2024

Uncertainty is Fragile: Manipulating Uncertainty in Large Language Models
Qingcheng Zeng, Mingyu Jin, Qinkai Yu, Zhenting Wang, Wenyue Hua, ..., Felix Juefei Xu, Kaize Ding, Fan Yang, Ruixiang Tang, Yongfeng Zhang
AAML · 15 Jul 2024

DeCE: Deceptive Cross-Entropy Loss Designed for Defending Backdoor Attacks
Guang Yang, Yu Zhou, Xiang Chen, Xiangyu Zhang, Terry Yue Zhuo, David Lo, Taolue Chen
AAML · 12 Jul 2024

Defense Against Syntactic Textual Backdoor Attacks with Token Substitution
Xinglin Li, Xianwen He, Yao Li, Minhao Cheng
04 Jul 2024