Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection

23 February 2023
Kai Greshake
Sahar Abdelnabi
Shailesh Mishra
C. Endres
Thorsten Holz
Mario Fritz
    SILM
ArXiv · PDF · HTML

Papers citing "Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection"

39 / 289 papers shown
Attack Prompt Generation for Red Teaming and Defending Large Language Models
Boyi Deng
Wenjie Wang
Fuli Feng
Yang Deng
Qifan Wang
Xiangnan He
AAML
14
48
0
19 Oct 2023
LLMs as Hackers: Autonomous Linux Privilege Escalation Attacks
A. Happe
Aaron Kaplan
Jürgen Cito
19
16
0
17 Oct 2023
Survey of Vulnerabilities in Large Language Models Revealed by Adversarial Attacks
Erfan Shayegani
Md Abdullah Al Mamun
Yu Fu
Pedram Zaree
Yue Dong
Nael B. Abu-Ghazaleh
AAML
147
144
0
16 Oct 2023
Privacy in Large Language Models: Attacks, Defenses and Future Directions
Haoran Li
Yulin Chen
Jinglong Luo
Yan Kang
Xiaojin Zhang
Qi Hu
Chunkit Chan
Yangqiu Song
PILM
38
40
0
16 Oct 2023
Digital Deception: Generative Artificial Intelligence in Social Engineering and Phishing
Marc Schmitt
Ivan Flechais
13
34
0
15 Oct 2023
Misusing Tools in Large Language Models With Visual Adversarial Examples
Xiaohan Fu
Zihan Wang
Shuheng Li
Rajesh K. Gupta
Niloofar Mireshghallah
Taylor Berg-Kirkpatrick
Earlence Fernandes
AAML
21
24
0
04 Oct 2023
Functional trustworthiness of AI systems by statistically valid testing
Bernhard Nessler
Thomas Doms
Sepp Hochreiter
23
0
0
04 Oct 2023
Effective Long-Context Scaling of Foundation Models
Wenhan Xiong
Jingyu Liu
Igor Molybog
Hejia Zhang
Prajjwal Bhargava
...
Dániel Baráth
Sergey Edunov
Mike Lewis
Sinong Wang
Hao Ma
29
204
0
27 Sep 2023
Goal-Oriented Prompt Attack and Safety Evaluation for LLMs
Chengyuan Liu
Fubang Zhao
Lizhi Qing
Yangyang Kang
Changlong Sun
Kun Kuang
Fei Wu
AAML
26
15
0
21 Sep 2023
LLM Platform Security: Applying a Systematic Evaluation Framework to OpenAI's ChatGPT Plugins
Umar Iqbal
Tadayoshi Kohno
Franziska Roesner
ELM
SILM
63
45
0
19 Sep 2023
GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts
Jiahao Yu
Xingwei Lin
Zheng Yu
Xinyu Xing
SILM
113
300
0
19 Sep 2023
Adversarial Attacks on Tables with Entity Swap
A. Koleva
Martin Ringsquandl
Volker Tresp
AAML
12
3
0
15 Sep 2023
Demystifying RCE Vulnerabilities in LLM-Integrated Apps
Tong Liu
Zizhuang Deng
Guozhu Meng
Yuekang Li
Kai Chen
SILM
34
19
0
06 Sep 2023
Donkii: Can Annotation Error Detection Methods Find Errors in Instruction-Tuning Datasets?
Leon Weber-Genzel
Robert Litschko
Ekaterina Artemova
Barbara Plank
14
2
0
04 Sep 2023
Baseline Defenses for Adversarial Attacks Against Aligned Language Models
Neel Jain
Avi Schwarzschild
Yuxin Wen
Gowthami Somepalli
John Kirchenbauer
Ping Yeh-Chiang
Micah Goldblum
Aniruddha Saha
Jonas Geiping
Tom Goldstein
AAML
31
335
0
01 Sep 2023
Image Hijacks: Adversarial Images can Control Generative Models at Runtime
Luke Bailey
Euan Ong
Stuart J. Russell
Scott Emmons
VLM
MLLM
16
78
0
01 Sep 2023
RatGPT: Turning online LLMs into Proxies for Malware Attacks
Mika Beckerich
L. Plein
Sergio Coronado
SILM
24
17
0
17 Aug 2023
From Prompt Injections to SQL Injection Attacks: How Protected is Your LLM-Integrated Web Application?
Rodrigo Pedro
Daniel Castro
Paulo Carreira
Nuno Santos
SILM
AAML
36
50
0
03 Aug 2023
Backdooring Instruction-Tuned Large Language Models with Virtual Prompt Injection
Jun Yan
Vikas Yadav
Shiyang Li
Lichang Chen
Zheng Tang
Hai Wang
Vijay Srinivasan
Xiang Ren
Hongxia Jin
SILM
15
75
0
31 Jul 2023
The Ethics of AI Value Chains
Blair Attard-Frost
D. Widder
22
1
0
31 Jul 2023
On the Trustworthiness Landscape of State-of-the-art Generative Models: A Survey and Outlook
Mingyuan Fan
Chengyu Wang
Cen Chen
Yang Liu
Jun Huang
HILM
31
3
0
31 Jul 2023
LLM Censorship: A Machine Learning Challenge or a Computer Security Problem?
David Glukhov
Ilia Shumailov
Y. Gal
Nicolas Papernot
V. Papyan
AAML
ELM
21
56
0
20 Jul 2023
Abusing Images and Sounds for Indirect Instruction Injection in Multi-Modal LLMs
Eugene Bagdasaryan
Tsung-Yin Hsieh
Ben Nassi
Vitaly Shmatikov
8
78
0
19 Jul 2023
Latent Jailbreak: A Benchmark for Evaluating Text Safety and Output Robustness of Large Language Models
Huachuan Qiu
Shuai Zhang
Anqi Li
Hongliang He
Zhenzhong Lan
ALM
37
48
0
17 Jul 2023
MasterKey: Automated Jailbreak Across Multiple Large Language Model Chatbots
Gelei Deng
Yi Liu
Yuekang Li
Kailong Wang
Ying Zhang
Zefeng Li
Haoyu Wang
Tianwei Zhang
Yang Liu
SILM
33
118
0
16 Jul 2023
Effective Prompt Extraction from Language Models
Yiming Zhang
Nicholas Carlini
Daphne Ippolito
MIACV
SILM
25
35
0
13 Jul 2023
Opportunities and Risks of LLMs for Scalable Deliberation with Polis
Christopher T. Small
Ivan Vendrov
Esin Durmus
Hadjar Homaei
Elizabeth Barry
Julien Cornebise
Ted Suzman
Deep Ganguli
Colin Megill
24
26
0
20 Jun 2023
Prompt Injection attack against LLM-integrated Applications
Yi Liu
Gelei Deng
Yuekang Li
Kailong Wang
Zihao Wang
...
Tianwei Zhang
Yepang Liu
Haoyu Wang
Yanhong Zheng
Yang Liu
SILM
15
312
0
08 Jun 2023
ReviewerGPT? An Exploratory Study on Using Large Language Models for Paper Reviewing
Ryan Liu
Nihar B. Shah
ELM
36
63
0
01 Jun 2023
GenSpectrum Chat: Data Exploration in Public Health Using Large Language Models
C. Chen
T. Stadler
LM&MA
22
3
0
23 May 2023
How Language Model Hallucinations Can Snowball
Muru Zhang
Ofir Press
William Merrill
Alisa Liu
Noah A. Smith
HILM
LRM
78
252
0
22 May 2023
Why So Gullible? Enhancing the Robustness of Retrieval-Augmented Models against Counterfactual Noise
Giwon Hong
Jeonghwan Kim
Junmo Kang
Sung-Hyon Myaeng
Joyce Jiyoung Whang
RALM
AAML
22
19
0
02 May 2023
Emergent autonomous scientific research capabilities of large language models
Daniil A. Boiko
R. MacKnight
Gabe Gomes
ELM
LM&Ro
AI4CE
LLMAG
101
117
0
11 Apr 2023
Generative Agents: Interactive Simulacra of Human Behavior
J. Park
Joseph C. O'Brien
Carrie J. Cai
Meredith Ringel Morris
Percy Liang
Michael S. Bernstein
LM&Ro
AI4CE
215
1,727
0
07 Apr 2023
GPT is becoming a Turing machine: Here are some ways to program it
A. Jojic
Zhen Wang
Nebojsa Jojic
LRM
47
17
0
25 Mar 2023
ReAct: Synergizing Reasoning and Acting in Language Models
Shunyu Yao
Jeffrey Zhao
Dian Yu
Nan Du
Izhak Shafran
Karthik Narasimhan
Yuan Cao
LLMAG
ReLM
LRM
233
2,470
0
06 Oct 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
303
11,881
0
04 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,402
0
28 Jan 2022
Structural Persistence in Language Models: Priming as a Window into Abstract Language Representations
Arabella J. Sinclair
Jaap Jumelet
Willem H. Zuidema
Raquel Fernández
49
37
0
30 Sep 2021