AdvPrefix: An Objective for Nuanced LLM Jailbreaks

13 December 2024
Sicheng Zhu, Brandon Amos, Yuandong Tian, Chuan Guo, Ivan Evtimov
AAML
arXiv (abs) · PDF · HTML · GitHub (35★)

Papers citing "AdvPrefix: An Objective for Nuanced LLM Jailbreaks"

8 papers shown

AdversariaLLM: A Unified and Modular Toolbox for LLM Robustness Research
Tim Beyer, Jonas Dornbusch, Jakob Steimle, Moritz Ladenburger, Leo Schwinn, Stephan Günnemann
AAML · 06 Nov 2025

MetaDefense: Defending Finetuning-based Jailbreak Attack Before and During Generation
Weisen Jiang, Sinno Jialin Pan
AAML · 09 Oct 2025

Where to Start Alignment? Diffusion Large Language Model May Demand a Distinct Position
Zhixin Xie, Xurui Song, Jun Luo
17 Aug 2025

Jailbreak Distillation: Renewable Safety Benchmarking
Jingyu Zhang, Ahmed Elgohary, Xiawei Wang, A S M Iftekhar, Ahmed Magooda, Benjamin Van Durme, Daniel Khashabi, Kyle Jackson
AAML, ALM · 28 May 2025

Security Concerns for Large Language Models: A Survey
Miles Q. Li, Benjamin C. M. Fung
PILM, ELM · 24 May 2025

Checkpoint-GCG: Auditing and Attacking Fine-Tuning-Based Prompt Injection Defenses
Xiaoxue Yang, Bozhidar Stevanoski, Matthieu Meeus, Yves-Alexandre de Montjoye
AAML · 21 May 2025

LLM-Safety Evaluations Lack Robustness
Tim Beyer, Sophie Xhonneux, Simon Geisler, Gauthier Gidel, Leo Schwinn, Stephan Günnemann
ALM, ELM · 04 Mar 2025

REINFORCE Adversarial Attacks on Large Language Models: An Adaptive, Distributional, and Semantic Objective
Simon Geisler, Tom Wollschlager, M. H. I. Abdalla, Vincent Cohen-Addad, Johannes Gasteiger, Stephan Günnemann
AAML · 24 Feb 2025