arXiv: 2305.11596
Mitigating Backdoor Poisoning Attacks through the Lens of Spurious Correlation

19 May 2023
Xuanli He
Qiongkai Xu
Jun Wang
Benjamin I. P. Rubinstein
Trevor Cohn
    AAML

Papers citing "Mitigating Backdoor Poisoning Attacks through the Lens of Spurious Correlation"

17 / 17 papers shown
NLP Security and Ethics, in the Wild
Heather Lent
Erick Galinkin
Yiyi Chen
Jens Myrup Pedersen
Leon Derczynski
Johannes Bjerva
SILM
42
0
0
09 Apr 2025
Hallucination Detection using Multi-View Attention Features
Yuya Ogasa
Yuki Arase
26
0
0
06 Apr 2025
Cut the Deadwood Out: Post-Training Model Purification with Selective Module Substitution
Yao Tong
Weijun Li
Xuanli He
Haolan Zhan
Qiongkai Xu
AAML
28
1
0
31 Dec 2024
Mitigating Backdoor Threats to Large Language Models: Advancement and Challenges
Qin Liu
Wenjie Mo
Terry Tong
Jiashu Xu
Fei Wang
Chaowei Xiao
Muhao Chen
AAML
31
4
0
30 Sep 2024
Rethinking Backdoor Detection Evaluation for Language Models
Jun Yan
Wenjie Jacky Mo
Xiang Ren
Robin Jia
ELM
35
1
0
31 Aug 2024
Securing Multi-turn Conversational Language Models Against Distributed Backdoor Triggers
Terry Tong
Jiashu Xu
Qin Liu
Muhao Chen
AAML
SILM
32
1
0
04 Jul 2024
SEEP: Training Dynamics Grounds Latent Representation Search for Mitigating Backdoor Poisoning Attacks
Xuanli He
Qiongkai Xu
Jun Wang
Benjamin I. P. Rubinstein
Trevor Cohn
AAML
27
4
0
19 May 2024
Here's a Free Lunch: Sanitizing Backdoored Models with Model Merge
Ansh Arora
Xuanli He
Maximilian Mozes
Srinibas Swain
Mark Dras
Qiongkai Xu
SILM
MoMe
AAML
54
12
0
29 Feb 2024
Spurious Correlations in Machine Learning: A Survey
Wenqian Ye
Guangtao Zheng
Xu Cao
Yunsheng Ma
Aidong Zhang
OOD
AAML
CML
29
33
0
20 Feb 2024
Test-time Backdoor Mitigation for Black-Box Large Language Models with Defensive Demonstrations
Wenjie Mo
Jiashu Xu
Qin Liu
Jiong Wang
Jun Yan
Chaowei Xiao
Muhao Chen
AAML
49
17
0
16 Nov 2023
Backdoor Attacks and Countermeasures in Natural Language Processing Models: A Comprehensive Security Review
Pengzhou Cheng
Zongru Wu
Wei Du
Haodong Zhao
Wei Lu
Gongshen Liu
SILM
AAML
18
16
0
12 Sep 2023
Use of LLMs for Illicit Purposes: Threats, Prevention Measures, and Vulnerabilities
Maximilian Mozes
Xuanli He
Bennett Kleinberg
Lewis D. Griffin
31
75
0
24 Aug 2023
IMBERT: Making BERT Immune to Insertion-based Backdoor Attacks
Xuanli He
Jun Wang
Benjamin I. P. Rubinstein
Trevor Cohn
SILM
10
12
0
25 May 2023
BFClass: A Backdoor-free Text Classification Framework
Zichao Li
Dheeraj Mekala
Chengyu Dong
Jingbo Shang
SILM
56
27
0
22 Sep 2021
Competency Problems: On Finding and Removing Artifacts in Language Data
Matt Gardner
William Merrill
Jesse Dodge
Matthew E. Peters
Alexis Ross
Sameer Singh
Noah A. Smith
161
106
0
17 Apr 2021
Stanza: A Python Natural Language Processing Toolkit for Many Human Languages
Peng Qi
Yuhao Zhang
Yuhui Zhang
Jason Bolton
Christopher D. Manning
AI4TS
199
1,638
0
16 Mar 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,927
0
20 Apr 2018