Learning to Poison Large Language Models During Instruction Tuning

21 February 2024

Papers citing "Learning to Poison Large Language Models During Instruction Tuning"

Title
PROMPTFUZZ: Harnessing Fuzzing Techniques for Robust Testing of Prompt Injection in LLMs | Jiahao Yu, Yangguang Shao, Hanwen Miao, Junzheng Shi | SILM AAML | 45 3 0 | 23 Sep 2024
Poisoning Language Models During Instruction Tuning | Alexander Wan, Eric Wallace, Sheng Shen, Dan Klein | SILM | 90 124 0 | 01 May 2023
TrojText: Test-time Invisible Textual Trojan Insertion | Qiang Lou, Ye Liu, Bo Feng | | 24 21 0 | 03 Mar 2023
Training language models to follow instructions with human feedback | Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe | OSLM ALM | 301 11,730 0 | 04 Mar 2022
Multitask Prompted Training Enables Zero-Shot Task Generalization | Victor Sanh, Albert Webson, Colin Raffel, Stephen H. Bach, Lintang Sutawika, ..., T. Bers, Stella Biderman, Leo Gao, Thomas Wolf, Alexander M. Rush | LRM | 203 1,651 0 | 15 Oct 2021
The Power of Scale for Parameter-Efficient Prompt Tuning | Brian Lester, Rami Al-Rfou, Noah Constant | VPVLM | 275 3,784 0 | 18 Apr 2021
Clean-Label Backdoor Attacks on Video Recognition Models | Shihao Zhao, Xingjun Ma, Xiang Zheng, James Bailey, Jingjing Chen, Yu-Gang Jiang | AAML | 165 252 0 | 06 Mar 2020
Analyzing Federated Learning through an Adversarial Lens | A. Bhagoji, Supriyo Chakraborty, Prateek Mittal, S. Calo | FedML | 169 1,014 0 | 29 Nov 2018