Weight Poisoning Attacks on Pre-trained Models
arXiv:2004.06660, 14 April 2020
Keita Kurita, Paul Michel, Graham Neubig
Tags: AAML, SILM
Papers citing "Weight Poisoning Attacks on Pre-trained Models" (50 of 65 papers shown)
• Adversarial Attacks in Multimodal Systems: A Practitioner's Survey [AAML] (06 May 2025)
  Shashank Kapoor, Sanjay Surendranath Girija, Lakshit Arora, Dipen Pradhan, Ankit Shetgaonkar, Aman Raj
• A Survey on Privacy Risks and Protection in Large Language Models [AILaw, PILM] (04 May 2025)
  Kang Chen, Xiuze Zhou, Yuanguo Lin, Shibo Feng, Li Shen, Pengcheng Wu
• Backdoor Attacks Against Patch-based Mixture of Experts [AAML, MoE] (03 May 2025)
  Cedric Chan, Jona te Lintelo, S. Picek
• GaussTrap: Stealthy Poisoning Attacks on 3D Gaussian Splatting for Targeted Scene Confusion [3DGS, AAML] (29 Apr 2025)
  Jiaxin Hong, Sixu Chen, Shuoyang Sun, Hongyao Yu, Hao Fang, Yuqi Tan, B. Chen, Shuhan Qi, Jiawei Li
• BadMoE: Backdooring Mixture-of-Experts LLMs via Optimizing Routing Triggers and Infecting Dormant Experts [MoE] (24 Apr 2025)
  Qingyue Wang, Qi Pang, Xixun Lin, Shuai Wang, Daoyuan Wu
• PR-Attack: Coordinated Prompt-RAG Attacks on Retrieval-Augmented Generation in Large Language Models via Bilevel Optimization [AAML, SILM] (10 Apr 2025)
  Yang Jiao, X. Wang, Kai Yang
• Revisiting Backdoor Attacks on Time Series Classification in the Frequency Domain [AI4TS, AAML] (12 Mar 2025)
  Y. Huang, Mi Zhang, Z. Wang, Wenxuan Li, Min Yang
• Tracking the Copyright of Large Vision-Language Models through Parameter Learning Adversarial Images [AAML] (23 Feb 2025)
  Yubo Wang, Jianting Tang, Chaohu Liu, Linli Xu
• When Backdoors Speak: Understanding LLM Backdoor Attacks Through Model-Generated Explanations [AAML, SILM] (19 Nov 2024)
  Huaizhi Ge, Yiming Li, Qifan Wang, Yongfeng Zhang, Ruixiang Tang
• NoiseAttack: An Evasive Sample-Specific Multi-Targeted Backdoor Attack Through White Gaussian Noise [AAML, DiffM] (03 Sep 2024)
  Abdullah Arafat Miah, Kaan Icer, Resit Sendag, Yu Bi
• Best-of-Venom: Attacking RLHF by Injecting Poisoned Preference Data [AAML] (08 Apr 2024)
  Tim Baumgärtner, Yang Gao, Dana Alon, Donald Metzler
• Towards Urban General Intelligence: A Review and Outlook of Urban Foundation Models [AI4CE] (30 Jan 2024)
  Weijiao Zhang, Jindong Han, Zhao Xu, Hang Ni, Hao Liu, Hui Xiong
• Punctuation Matters! Stealthy Backdoor Attack for Language Models (26 Dec 2023)
  Xuan Sheng, Zhicheng Li, Zhaoyang Han, Xiangmao Chang, Piji Li
• Universal Jailbreak Backdoors from Poisoned Human Feedback (24 Nov 2023)
  Javier Rando, Florian Tramèr
• Efficient Trigger Word Insertion [AAML] (23 Nov 2023)
  Yueqi Zeng, Ziqiang Li, Pengfei Xia, Lei Liu, Bin Li
• RLHFPoison: Reward Poisoning Attack for Reinforcement Learning with Human Feedback in Large Language Models [AAML] (16 Nov 2023)
  Jiong Wang, Junlin Wu, Muhao Chen, Yevgeniy Vorobeychik, Chaowei Xiao
• Test-time Backdoor Mitigation for Black-Box Large Language Models with Defensive Demonstrations [AAML] (16 Nov 2023)
  Wenjie Mo, Jiashu Xu, Qin Liu, Jiong Wang, Jun Yan, Chaowei Xiao, Muhao Chen
• Beyond Detection: Unveiling Fairness Vulnerabilities in Abusive Language Models (15 Nov 2023)
  Yueqing Liang, Lu Cheng, Ali Payani, Kai Shu
• Everyone Can Attack: Repurpose Lossy Compression as a Natural Backdoor Attack [AAML, DiffM] (31 Aug 2023)
  Sze Jue Yang, Q. Nguyen, Chee Seng Chan, Khoa D. Doan
• NOTABLE: Transferable Backdoor Attacks Against Prompt-based NLP Models [AAML, SILM] (28 May 2023)
  Kai Mei, Zheng Li, Zhenting Wang, Yang Zhang, Shiqing Ma
• Mitigating Backdoor Poisoning Attacks through the Lens of Spurious Correlation [AAML] (19 May 2023)
  Xuanli He, Qiongkai Xu, Jun Wang, Benjamin I. P. Rubinstein, Trevor Cohn
• A Survey of Safety and Trustworthiness of Large Language Models through the Lens of Verification and Validation [ALM] (19 May 2023)
  Xiaowei Huang, Wenjie Ruan, Wei Huang, Gao Jin, Yizhen Dong, ..., Sihao Wu, Peipei Xu, Dengyu Wu, André Freitas, Mustafa A. Mustafa
• Diffusion Theory as a Scalpel: Detecting and Purifying Poisonous Dimensions in Pre-trained Language Models Caused by Backdoor or Bias (08 May 2023)
  Zhiyuan Zhang, Deli Chen, Hao Zhou, Fandong Meng, Jie Zhou, Xu Sun
• Text-to-Image Diffusion Models can be Easily Backdoored through Multimodal Data Poisoning (07 May 2023)
  Shengfang Zhai, Yinpeng Dong, Qingni Shen, Shih-Chieh Pu, Yuejian Fang, Hang Su
• Backdoor Learning on Sequence to Sequence Models [SILM] (03 May 2023)
  Lichang Chen, Minhao Cheng, Heng-Chiao Huang
• An Empirical Study of Pre-Trained Model Reuse in the Hugging Face Deep Learning Model Registry (05 Mar 2023)
  Wenxin Jiang, Nicholas Synovic, Matt Hyatt, Taylor R. Schorlemmer, R. Sethi, Yung-Hsiang Lu, George K. Thiruvathukal, James C. Davis
• Backdoor Attacks to Pre-trained Unified Foundation Models [AAML] (18 Feb 2023)
  Zenghui Yuan, Yixin Liu, Kai Zhang, Pan Zhou, Lichao Sun
• Backdoor Learning for NLP: Recent Advances, Challenges, and Future Research Directions [SILM, AAML] (14 Feb 2023)
  Marwan Omar
• Explainable AI does not provide the explanations end-users are asking for [XAI] (25 Jan 2023)
  Savio Rozario, G. Cevora
• BDMMT: Backdoor Sample Detection for Language Models through Model Mutation Testing [AAML] (25 Jan 2023)
  Jiali Wei, Ming Fan, Wenjing Jiao, Wuxia Jin, Ting Liu
• Circumventing interpretability: How to defeat mind-readers (21 Dec 2022)
  Lee D. Sharkey
• UPTON: Preventing Authorship Leakage from Public Text Release via Data Poisoning (17 Nov 2022)
  Ziyao Wang, Thai Le, Dongwon Lee
• MSDT: Masked Language Model Scoring Defense in Text Domain [AAML] (10 Nov 2022)
  Jaechul Roh, Minhao Cheng, Yajun Fang
• Dormant Neural Trojans [AAML] (02 Nov 2022)
  Feisi Fu, Panagiota Kiourti, Wenchao Li
• Poison Attack and Defense on Deep Source Code Processing Models [AAML] (31 Oct 2022)
  Jia Li, Zhuo Li, Huangzhao Zhang, Ge Li, Zhi Jin, Xing Hu, Xin Xia
• Marksman Backdoor: Backdoor Attacks with Arbitrary Target Class (17 Oct 2022)
  Khoa D. Doan, Yingjie Lao, Ping Li
• Detecting Backdoors in Deep Text Classifiers [SILM] (11 Oct 2022)
  Youyan Guo, Jun Wang, Trevor Cohn
• BAFFLE: Hiding Backdoors in Offline Reinforcement Learning Datasets [AAML, OffRL] (07 Oct 2022)
  Chen Gong, Zhou Yang, Yunru Bai, Junda He, Jieke Shi, ..., Arunesh Sinha, Bowen Xu, Xinwen Hou, David Lo, Guoliang Fan
• PromptAttack: Prompt-based Attack for Language Models via Gradient Search [AAML, SILM] (05 Sep 2022)
  Yundi Shi, Piji Li, Changchun Yin, Zhaoyang Han, Lu Zhou, Zhe Liu
• Shortcut Learning of Large Language Models in Natural Language Understanding [KELM, OffRL] (25 Aug 2022)
  Mengnan Du, Fengxiang He, Na Zou, Dacheng Tao, Xia Hu
• DECK: Model Hardening for Defending Pervasive Backdoors [AAML] (18 Jun 2022)
  Guanhong Tao, Yingqi Liu, Shuyang Cheng, Shengwei An, Zhuo Zhang, Qiuling Xu, Guangyu Shen, Xiangyu Zhang
• Is Multi-Modal Necessarily Better? Robustness Evaluation of Multi-modal Fake News Detection [AAML] (17 Jun 2022)
  Jinyin Chen, Chengyu Jia, Haibin Zheng, Ruoxi Chen, Chenbo Fu
• BadDet: Backdoor Attacks on Object Detection [AAML] (28 May 2022)
  Shih-Han Chan, Yinpeng Dong, Junyi Zhu, Xiaolu Zhang, Jun Zhou
• WeDef: Weakly Supervised Backdoor Defense for Text Classification [AAML] (24 May 2022)
  Lesheng Jin, Zihan Wang, Jingbo Shang
• ET-BERT: A Contextualized Datagram Representation with Pre-training Transformers for Encrypted Traffic Classification (13 Feb 2022)
  Xinjie Lin, G. Xiong, Gaopeng Gou, Zhen Li, Junzheng Shi, J. Yu
• Neighboring Backdoor Attacks on Graph Convolutional Network [GNN, AAML] (17 Jan 2022)
  Liang Chen, Qibiao Peng, Jintang Li, Yang Liu, Jiawei Chen, Yong Li, Zibin Zheng
• Security for Machine Learning-based Software Systems: a survey of threats, practices and challenges [AAML] (12 Jan 2022)
  Huaming Chen, Muhammad Ali Babar
• Measure and Improve Robustness in NLP Models: A Survey (15 Dec 2021)
  Xuezhi Wang, Haohan Wang, Diyi Yang
• Spinning Language Models: Risks of Propaganda-As-A-Service and Countermeasures [SILM, AAML] (09 Dec 2021)
  Eugene Bagdasaryan, Vitaly Shmatikov
• Towards Practical Deployment-Stage Backdoor Attack on Deep Neural Networks [AAML] (25 Nov 2021)
  Xiangyu Qi, Tinghao Xie, Ruizhe Pan, Jifeng Zhu, Yong-Liang Yang, Kai Bu