Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection

23 February 2023

Sahar Abdelnabi

Mario Fritz

Papers citing "Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection"

50 / 289 papers shown

Mind Your Questions! Towards Backdoor Attacks on Text-to-Visualization Models Shuaimin Li Yuanfeng Song Xuanang Chen Anni Peng Zhuoyue Wan Chen Jason Zhang Raymond Chi-Wing Wong SILM 29 0 0 09 Oct 2024
Prompt Infection: LLM-to-LLM Prompt Injection within Multi-Agent Systems Donghyun Lee Mo Tiwari LLMAG 31 9 0 09 Oct 2024
Bridging Today and the Future of Humanity: AI Safety in 2024 and Beyond Shanshan Han 73 1 0 09 Oct 2024
Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates Xiaosen Zheng Tianyu Pang Chao Du Qian Liu Jing Jiang Min-Bin Lin 33 8 0 09 Oct 2024
Instructional Segment Embedding: Improving LLM Safety with Instruction Hierarchy Tong Wu Shujian Zhang Kaiqiang Song Silei Xu Sanqiang Zhao Ravi Agrawal Sathish Indurthi Chong Xiang Prateek Mittal Wenxuan Zhou 37 7 0 09 Oct 2024
Permissive Information-Flow Analysis for Large Language Models Shoaib Ahmed Siddiqui Radhika Gaonkar Boris Köpf David M. Krueger Andrew J. Paverd Ahmed Salem Shruti Tople Lukas Wutschitz Menglin Xia Santiago Zanella Béguelin 26 1 0 04 Oct 2024
Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents Hanrong Zhang Jingyuan Huang Kai Mei Yifei Yao Zhenting Wang Chenlu Zhan Hongwei Wang Yongfeng Zhang AAML LLMAG ELM 51 18 0 03 Oct 2024
VLMGuard: Defending VLMs against Malicious Prompts via Unlabeled Data Xuefeng Du Reshmi Ghosh Robert Sim Ahmed Salem Vitor Carvalho Emily Lawton Yixuan Li Jack W. Stokes VLM AAML 38 5 0 01 Oct 2024
GenTel-Safe: A Unified Benchmark and Shielding Framework for Defending Against Prompt Injection Attacks Rongchang Li Minjie Chen Chang Hu Han Chen Wenpeng Xing Meng Han SILM ELM 31 1 0 29 Sep 2024
System-Level Defense against Indirect Prompt Injection Attacks: An Information Flow Control Perspective Fangzhou Wu Ethan Cecchetti Chaowei Xiao 39 12 0 27 Sep 2024
Steward: Natural Language Web Automation Brian Tang Kang G. Shin LLMAG 29 1 0 23 Sep 2024
Attack Atlas: A Practitioner's Perspective on Challenges and Pitfalls in Red Teaming GenAI Ambrish Rawat Stefan Schoepf Giulio Zizzo Giandomenico Cornacchia Muhammad Zaid Hameed ... Elizabeth M. Daly Mark Purcell P. Sattigeri Pin-Yu Chen Kush R. Varshney AAML 40 7 0 23 Sep 2024
PROMPTFUZZ: Harnessing Fuzzing Techniques for Robust Testing of Prompt Injection in LLMs Jiahao Yu Yangguang Shao Hanwen Miao Junzheng Shi SILM AAML 67 4 0 23 Sep 2024
On the Feasibility of Fully AI-automated Vishing Attacks João Figueiredo Afonso Carvalho Daniel Castro Daniel Gonçalves Nuno Santos 19 2 0 20 Sep 2024
Applying Pre-trained Multilingual BERT in Embeddings for Improved Malicious Prompt Injection Attacks Detection M. Rahman Hossain Shahriar Fan Wu A. Cuzzocrea AAML 36 4 0 20 Sep 2024
Trustworthiness in Retrieval-Augmented Generation Systems: A Survey Yujia Zhou Yan Liu Xiaoxi Li Jiajie Jin Hongjin Qian Zheng Liu Chaozhuo Li Zhicheng Dou Tsung-Yi Ho Philip S. Yu 3DV RALM 50 26 0 16 Sep 2024
Exploring LLMs for Malware Detection: Review, Framework Design, and Countermeasure Approaches Jamal N. Al-Karaki Muhammad Al-Zafar Khan Marwan Omar 29 6 0 11 Sep 2024
On the Weaknesses of Backdoor-based Model Watermarking: An Information-theoretic Perspective Aoting Hu Yanzhi Chen Renjie Xie Adrian Weller 38 0 0 10 Sep 2024
Recent Advances in Attack and Defense Approaches of Large Language Models Jing Cui Yishi Xu Zhewei Huang Shuchang Zhou Jianbin Jiao Junge Zhang PILM AAML 52 1 0 05 Sep 2024
SafeEmbodAI: a Safety Framework for Mobile Robots in Embodied AI Systems Wenxiao Zhang Xiangrui Kong Thomas Braunl Jin B. Hong 31 2 0 03 Sep 2024
ContextCite: Attributing Model Generation to Context Benjamin Cohen-Wang Harshay Shah Kristian Georgiev Aleksander Madry LRM 30 18 0 01 Sep 2024
Characterizing and Evaluating the Reliability of LLMs against Jailbreak Attacks Kexin Chen Yi Liu Dongxia Wang Jiaying Chen Wenhai Wang 44 1 0 18 Aug 2024
SAGE-RT: Synthetic Alignment data Generation for Safety Evaluation and Red Teaming Anurakt Kumar Divyanshu Kumar Jatan Loya Nitin Aravind Birur Tanay Baswa Sahil Agarwal P. Harshangi SyDa 37 5 0 14 Aug 2024
A Jailbroken GenAI Model Can Cause Substantial Harm: GenAI-powered Applications are Vulnerable to PromptWares Stav Cohen Ron Bitton Ben Nassi SILM 33 5 0 09 Aug 2024
ConfusedPilot: Confused Deputy Risks in RAG-based LLMs Ayush RoyChowdhury Mulong Luo Prateek Sahu Sarbartha Banerjee Mohit Tiwari SILM 43 0 0 09 Aug 2024
FDI: Attack Neural Code Generation Systems through User Feedback Channel Zhensu Sun Xiaoning Du Xiapu Luo Fu Song David Lo Li Li AAML 23 3 0 08 Aug 2024
Practical Attacks against Black-box Code Completion Engines Slobodan Jenko Jingxuan He Niels Mündler Mark Vero Martin Vechev ELM AAML SILM 27 3 0 05 Aug 2024
Risks, Causes, and Mitigations of Widespread Deployments of Large Language Models (LLMs): A Survey Md. Nazmus Sakib Md Athikul Islam Royal Pathak Md Mashrur Arifin ALM PILM 29 2 0 01 Aug 2024
Breaking Agents: Compromising Autonomous LLM Agents Through Malfunction Amplification Boyang Zhang Yicong Tan Yun Shen Ahmed Salem Michael Backes Savvas Zannettou Yang Zhang LLMAG AAML 40 14 0 30 Jul 2024
Can LLMs be Fooled? Investigating Vulnerabilities in LLMs Sara Abdali Jia He C. Barberan Richard Anarfi 29 7 0 30 Jul 2024
The Emerged Security and Privacy of LLM Agent: A Survey with Case Studies Feng He Tianqing Zhu Dayong Ye Bo Liu Wanlei Zhou Philip S. Yu PILM LLMAG ELM 68 23 0 28 Jul 2024
AgentPeerTalk: Empowering Students through Agentic-AI-Driven Discernment of Bullying and Joking in Peer Interactions in Schools Aditya Paul Chi Lok Yu Eva Adelina Susanto Nicholas Wai Long Lau Gwenyth Isobel Meadows LLMAG 35 3 0 27 Jul 2024
LLMmap: Fingerprinting For Large Language Models Dario Pasquini Evgenios M. Kornaropoulos G. Ateniese 50 6 0 22 Jul 2024
Operationalizing a Threat Model for Red-Teaming Large Language Models (LLMs) Apurv Verma Satyapriya Krishna Sebastian Gehrmann Madhavan Seshadri Anu Pradhan Tom Ault Leslie Barrett David Rabinowitz John Doucette Nhathai Phan 47 8 0 20 Jul 2024
Systematic Categorization, Construction and Evaluation of New Attacks against Multi-modal Mobile GUI Agents Yulong Yang Xinshan Yang Shuaidong Li Chenhao Lin Zhengyu Zhao Chao Shen Tianwei Zhang 40 1 0 12 Jul 2024
AI Safety in Generative AI Large Language Models: A Survey Jaymari Chua Yun Yvonna Li Shiyi Yang Chen Wang Lina Yao LM&MA 34 12 0 06 Jul 2024
Securing Multi-turn Conversational Language Models Against Distributed Backdoor Triggers Terry Tong Jiashu Xu Qin Liu Muhao Chen AAML SILM 37 1 0 04 Jul 2024
Soft Begging: Modular and Efficient Shielding of LLMs against Prompt Injection and Jailbreaking based on Prompt Tuning Simon Ostermann Kevin Baum Christoph Endres Julia Masloh P. Schramowski AAML 41 1 0 03 Jul 2024
SOS! Soft Prompt Attack Against Open-Source Large Language Models Ziqing Yang Michael Backes Yang Zhang Ahmed Salem AAML 38 6 0 03 Jul 2024
ValueScope: Unveiling Implicit Norms and Values via Return Potential Model of Social Interactions Chan Young Park Shuyue Stella Li Hayoung Jung Svitlana Volkova Tanushree Mitra David Jurgens Yulia Tsvetkov 47 5 0 02 Jul 2024
A Survey on Failure Analysis and Fault Injection in AI Systems Guangba Yu Gou Tan Haojia Huang Zhenyu Zhang Pengfei Chen Roberto Natella Zibin Zheng 34 3 0 28 Jun 2024
A Survey on Privacy Attacks Against Digital Twin Systems in AI-Robotics Ivan A. Fernandez Subash Neupane Trisha Chakraborty Shaswata Mitra Sudip Mittal Nisha Pillai Jingdao Chen Shahram Rahimi 52 1 0 27 Jun 2024
Adversarial Search Engine Optimization for Large Language Models Fredrik Nestaas Edoardo Debenedetti Florian Tramèr AAML 38 4 0 26 Jun 2024
AgentDojo: A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents Edoardo Debenedetti Jie Zhang Mislav Balunović Luca Beurer-Kellner Marc Fischer Florian Tramèr LLMAG AAML 43 25 1 19 Jun 2024
Supporting Human Raters with the Detection of Harmful Content using Large Language Models Kurt Thomas Patrick Gage Kelley David Tao Sarah Meiklejohn Owen Vallis Shunwen Tan Blaz Bratanic Felipe Tiengo Ferreira Vijay Eranti Elie Bursztein 28 2 0 18 Jun 2024
The Power of LLM-Generated Synthetic Data for Stance Detection in Online Political Discussions Stefan Sylvius Wagner Maike Behrendt Marc Ziegele Stefan Harmeling 32 9 0 18 Jun 2024
IDs for AI Systems Alan Chan Noam Kolt Peter Wills Usman Anwar Christian Schroeder de Witt Nitarshan Rajkumar Lewis Hammond David M. Krueger Lennart Heim Markus Anderljung 41 6 0 17 Jun 2024
garak: A Framework for Security Probing Large Language Models Leon Derczynski Erick Galinkin Jeffrey Martin Subho Majumdar Nanna Inie AAML ELM 33 16 0 16 Jun 2024
Unique Security and Privacy Threats of Large Language Model: A Comprehensive Survey Shang Wang Tianqing Zhu Bo Liu Ming Ding Xu Guo Dayong Ye Wanlei Zhou Philip S. Yu PILM 57 16 0 12 Jun 2024
Dataset and Lessons Learned from the 2024 SaTML LLM Capture-the-Flag Competition Edoardo Debenedetti Javier Rando Daniel Paleka Silaghi Fineas Florin Dragos Albastroiu ... Stefan Kraft Mario Fritz Florian Tramèr Sahar Abdelnabi Lea Schonherr 46 9 0 12 Jun 2024