Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2402.11755
Cited By
SPML: A DSL for Defending Language Models Against Prompt Attacks
19 February 2024
Reshabh K Sharma
Vinayak Gupta
Dan Grossman
AAML
Re-assign community
ArXiv
PDF
HTML
Papers citing
"SPML: A DSL for Defending Language Models Against Prompt Attacks"
9 / 9 papers shown
Title
Attack and defense techniques in large language models: A survey and new perspectives
Zhiyu Liao
Kang Chen
Yuanguo Lin
Kangkang Li
Yunxuan Liu
Hefeng Chen
Xingwang Huang
Yuanhui Yu
AAML
52
0
0
02 May 2025
Diversity Helps Jailbreak Large Language Models
Weiliang Zhao
Daniel Ben-Levi
Wei Hao
Junfeng Yang
Chengzhi Mao
AAML
48
0
0
06 Nov 2024
BlueSuffix: Reinforced Blue Teaming for Vision-Language Models Against Jailbreak Attacks
Yunhan Zhao
Xiang Zheng
Lin Luo
Yige Li
Xingjun Ma
Yu-Gang Jiang
VLM
AAML
41
3
0
28 Oct 2024
Survey of Vulnerabilities in Large Language Models Revealed by Adversarial Attacks
Erfan Shayegani
Md Abdullah Al Mamun
Yu Fu
Pedram Zaree
Yue Dong
Nael B. Abu-Ghazaleh
AAML
135
139
0
16 Oct 2023
LLM Platform Security: Applying a Systematic Evaluation Framework to OpenAI's ChatGPT Plugins
Umar Iqbal
Tadayoshi Kohno
Franziska Roesner
ELM
SILM
51
41
0
19 Sep 2023
We're Afraid Language Models Aren't Modeling Ambiguity
Alisa Liu
Zhaofeng Wu
Julian Michael
Alane Suhr
Peter West
Alexander Koller
Swabha Swayamdipta
Noah A. Smith
Yejin Choi
51
87
0
27 Apr 2023
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Sébastien Bubeck
Varun Chandrasekaran
Ronen Eldan
J. Gehrke
Eric Horvitz
...
Scott M. Lundberg
Harsha Nori
Hamid Palangi
Marco Tulio Ribeiro
Yi Zhang
ELM
AI4MH
AI4CE
ALM
197
2,953
0
22 Mar 2023
Stealing the Decoding Algorithms of Language Models
A. Naseh
Kalpesh Krishna
Mohit Iyyer
Amir Houmansadr
MLAU
50
20
0
08 Mar 2023
Extracting Training Data from Large Language Models
Nicholas Carlini
Florian Tramèr
Eric Wallace
Matthew Jagielski
Ariel Herbert-Voss
...
Tom B. Brown
D. Song
Ulfar Erlingsson
Alina Oprea
Colin Raffel
MLAU
SILM
264
1,798
0
14 Dec 2020
1