2406.07954
Cited By
Dataset and Lessons Learned from the 2024 SaTML LLM Capture-the-Flag Competition
12 June 2024
Edoardo Debenedetti
Javier Rando
Daniel Paleka
Silaghi Fineas Florin
Dragos Albastroiu
Niv Cohen
Yuval Lemberg
Reshmi Ghosh
Rui Wen
Ahmed Salem
Giovanni Cherubin
Santiago Zanella Béguelin
Robin Schmid
Victor Klemm
Takahiro Miki
Chenhao Li
Stefan Kraft
Mario Fritz
Florian Tramèr
Sahar Abdelnabi
Lea Schönherr
Papers citing
"Dataset and Lessons Learned from the 2024 SaTML LLM Capture-the-Flag Competition"
9 / 9 papers shown
Title
Attention Tracker: Detecting Prompt Injection Attacks in LLMs
Kuo-Han Hung
Ching-Yun Ko
Ambrish Rawat
I-Hsin Chung
Winston H. Hsu
Pin-Yu Chen
46
7
0
01 Nov 2024
Hey GPT, Can You be More Racist? Analysis from Crowdsourced Attempts to Elicit Biased Content from Generative AI
Hangzhi Guo
Pranav Narayanan Venkit
Eunchae Jang
Mukund Srinath
Wenbo Zhang
Bonam Mingole
Vipul Gupta
Kush R. Varshney
S. Shyam Sundar
A. Yadav
27
3
0
20 Oct 2024
Applying Refusal-Vector Ablation to Llama 3.1 70B Agents
Simon Lermen
Mateusz Dziemian
Govind Pimpale
LLMAG
15
4
0
08 Oct 2024
Position: LLM Unlearning Benchmarks are Weak Measures of Progress
Pratiksha Thaker
Shengyuan Hu
Neil Kale
Yash Maurya
Zhiwei Steven Wu
Virginia Smith
MU
39
10
0
03 Oct 2024
Prompt Obfuscation for Large Language Models
David Pape
Thorsten Eisenhofer
Lea Schönherr
AAML
31
2
0
17 Sep 2024
AgentDojo: A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents
Edoardo Debenedetti
Jie Zhang
Mislav Balunović
Luca Beurer-Kellner
Marc Fischer
Florian Tramèr
LLMAG
AAML
43
25
1
19 Jun 2024
Are you still on track!? Catching LLM Task Drift with Activations
Sahar Abdelnabi
Aideen Fay
Giovanni Cherubin
Ahmed Salem
Mario Fritz
Andrew J. Paverd
55
7
0
02 Jun 2024
The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions
Eric Wallace
Kai Y. Xiao
R. Leike
Lilian Weng
Johannes Heidecke
Alex Beutel
SILM
47
113
0
19 Apr 2024
Whispers in the Machine: Confidentiality in LLM-integrated Systems
Jonathan Evertz
Merlin Chlosta
Lea Schönherr
Thorsten Eisenhofer
69
15
0
10 Feb 2024