arXiv: 2412.10321 (v2, latest)
AdvPrefix: An Objective for Nuanced LLM Jailbreaks
13 December 2024
Sicheng Zhu, Brandon Amos, Yuandong Tian, Chuan Guo, Ivan Evtimov
AAML
Links: arXiv (abs) · PDF · HTML · GitHub (35★)
Papers citing "AdvPrefix: An Objective for Nuanced LLM Jailbreaks" (8 papers shown)
AdversariaLLM: A Unified and Modular Toolbox for LLM Robustness Research
Tim Beyer, Jonas Dornbusch, Jakob Steimle, Moritz Ladenburger, Leo Schwinn, Stephan Günnemann
AAML · 06 Nov 2025
MetaDefense: Defending Finetuning-based Jailbreak Attack Before and During Generation
Weisen Jiang, Sinno Jialin Pan
AAML · 09 Oct 2025
Where to Start Alignment? Diffusion Large Language Model May Demand a Distinct Position
Zhixin Xie, Xurui Song, Jun Luo
17 Aug 2025
Jailbreak Distillation: Renewable Safety Benchmarking
Jingyu Zhang, Ahmed Elgohary, Xiawei Wang, A S M Iftekhar, Ahmed Magooda, Benjamin Van Durme, Daniel Khashabi, Kyle Jackson
AAML · ALM · 28 May 2025
Security Concerns for Large Language Models: A Survey
Miles Q. Li, Benjamin C. M. Fung
PILM · ELM · 24 May 2025
Checkpoint-GCG: Auditing and Attacking Fine-Tuning-Based Prompt Injection Defenses
Xiaoxue Yang, Bozhidar Stevanoski, Matthieu Meeus, Yves-Alexandre de Montjoye
AAML · 21 May 2025
LLM-Safety Evaluations Lack Robustness
Tim Beyer, Sophie Xhonneux, Simon Geisler, Gauthier Gidel, Leo Schwinn, Stephan Günnemann
ALM · ELM · 04 Mar 2025
REINFORCE Adversarial Attacks on Large Language Models: An Adaptive, Distributional, and Semantic Objective
Simon Geisler, Tom Wollschlager, M. H. I. Abdalla, Vincent Cohen-Addad, Johannes Gasteiger, Stephan Günnemann
AAML · 24 Feb 2025