
AttackEval: How to Evaluate the Effectiveness of Jailbreak Attacking on Large Language Models (arXiv:2401.09002)

17 January 2024
Dong Shu
Mingyu Jin
Suiyuan Zhu
Beichen Wang
Zihao Zhou
Chong Zhang
Yongfeng Zhang
ELM

Papers citing "AttackEval: How to Evaluate the Effectiveness of Jailbreak Attacking on Large Language Models"

13 / 13 papers shown
T2VShield: Model-Agnostic Jailbreak Defense for Text-to-Video Models
Siyuan Liang
Jiayang Liu
Jiecheng Zhai
Tianmeng Fang
Rongcheng Tu
A. Liu
Xiaochun Cao
Dacheng Tao
VGen
49
0
0
22 Apr 2025
PiCo: Jailbreaking Multimodal Large Language Models via Pictorial Code Contextualization
Aofan Liu
Lulu Tang
Ting Pan
Yuguo Yin
Bin Wang
Ao Yang
MLLM
AAML
40
0
0
02 Apr 2025
Iterative Prompting with Persuasion Skills in Jailbreaking Large Language Models
Shih-Wen Ke
Guan-Yu Lai
Guo-Lin Fang
Hsi-Yuan Kao
SILM
79
0
0
26 Mar 2025
JAILJUDGE: A Comprehensive Jailbreak Judge Benchmark with Multi-Agent Enhanced Explanation Evaluation Framework
Fan Liu
Yue Feng
Zhao Xu
Lixin Su
Xinyu Ma
Dawei Yin
Hao Liu
ELM
19
7
0
11 Oct 2024
PathSeeker: Exploring LLM Security Vulnerabilities with a Reinforcement Learning-Based Jailbreak Approach
Zhihao Lin
Wei Ma
Mingyi Zhou
Yanjie Zhao
Haoyu Wang
Yang Liu
Jun Wang
Li Li
AAML
30
5
0
21 Sep 2024
JailbreakEval: An Integrated Toolkit for Evaluating Jailbreak Attempts Against Large Language Models
Delong Ran
Jinyuan Liu
Yichen Gong
Jingyi Zheng
Xinlei He
Tianshuo Cong
Anyu Wang
ELM
29
10
0
13 Jun 2024
Jailbreak Vision Language Models via Bi-Modal Adversarial Prompt
Zonghao Ying
Aishan Liu
Tianyuan Zhang
Zhengmin Yu
Siyuan Liang
Xianglong Liu
Dacheng Tao
AAML
33
26
0
06 Jun 2024
AutoJailbreak: Exploring Jailbreak Attacks and Defenses through a Dependency Lens
Lin Lu
Hai Yan
Zenghui Yuan
Jiawen Shi
Wenqi Wei
Pin-Yu Chen
Pan Zhou
AAML
38
8
0
06 Jun 2024
Goal-guided Generative Prompt Injection Attack on Large Language Models
Chong Zhang
Mingyu Jin
Qinkai Yu
Chengzhi Liu
Haochen Xue
Xiaobo Jin
AAML
SILM
26
9
0
06 Apr 2024
Generative Models and Connected and Automated Vehicles: A Survey in Exploring the Intersection of Transportation and AI
Dong Shu
Zhouyao Zhu
16
1
0
14 Mar 2024
COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability
Xing-ming Guo
Fangxu Yu
Huan Zhang
Lianhui Qin
Bin Hu
AAML
109
69
0
13 Feb 2024
GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts
Jiahao Yu
Xingwei Lin
Zheng Yu
Xinyu Xing
SILM
110
292
0
19 Sep 2023
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022