ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2405.20653
  4. Cited By
Enhancing Jailbreak Attack Against Large Language Models through Silent
  Tokens

Enhancing Jailbreak Attack Against Large Language Models through Silent Tokens

31 May 2024
Jiahao Yu
Haozheng Luo
Jerry Yao-Chieh Hu
Wenbo Guo
Han Liu
Xinyu Xing
ArXivPDFHTML

Papers citing "Enhancing Jailbreak Attack Against Large Language Models through Silent Tokens"

16 / 16 papers shown
Title
DETAM: Defending LLMs Against Jailbreak Attacks via Targeted Attention Modification
DETAM: Defending LLMs Against Jailbreak Attacks via Targeted Attention Modification
Yu Li
Han Jiang
Zhihua Wei
AAML
29
0
0
18 Apr 2025
EmoAgent: Assessing and Safeguarding Human-AI Interaction for Mental Health Safety
EmoAgent: Assessing and Safeguarding Human-AI Interaction for Mental Health Safety
Jiahao Qiu
Yinghui He
Xinzhe Juan
Y. Wang
Y. Liu
Zixin Yao
Yue Wu
Xun Jiang
L. Yang
Mengdi Wang
AI4MH
65
0
0
13 Apr 2025
DPBloomfilter: Securing Bloom Filters with Differential Privacy
DPBloomfilter: Securing Bloom Filters with Differential Privacy
Yekun Ke
Yingyu Liang
Zhizhou Sha
Zhenmei Shi
Zhao-quan Song
43
1
0
02 Feb 2025
Global Challenge for Safe and Secure LLMs Track 1
Global Challenge for Safe and Secure LLMs Track 1
Xiaojun Jia
Yihao Huang
Yang Liu
Peng Yan Tan
Weng Kuan Yau
...
Yan Wang
Rick Siow Mong Goh
Liangli Zhen
Yingjie Zhang
Zhe Zhao
ELM
AILaw
64
0
0
21 Nov 2024
Attack Atlas: A Practitioner's Perspective on Challenges and Pitfalls in
  Red Teaming GenAI
Attack Atlas: A Practitioner's Perspective on Challenges and Pitfalls in Red Teaming GenAI
Ambrish Rawat
Stefan Schoepf
Giulio Zizzo
Giandomenico Cornacchia
Muhammad Zaid Hameed
...
Elizabeth M. Daly
Mark Purcell
P. Sattigeri
Pin-Yu Chen
Kush R. Varshney
AAML
40
6
0
23 Sep 2024
PROMPTFUZZ: Harnessing Fuzzing Techniques for Robust Testing of Prompt Injection in LLMs
PROMPTFUZZ: Harnessing Fuzzing Techniques for Robust Testing of Prompt Injection in LLMs
Jiahao Yu
Yangguang Shao
Hanwen Miao
Junzheng Shi
SILM
AAML
60
3
0
23 Sep 2024
Differentially Private Kernel Density Estimation
Differentially Private Kernel Density Estimation
Erzhi Liu
Jerry Yao-Chieh Hu
Alex Reneau
Zhao Song
Han Liu
50
3
0
03 Sep 2024
Advanced AI Framework for Enhanced Detection and Assessment of Abdominal
  Trauma: Integrating 3D Segmentation with 2D CNN and RNN Models
Advanced AI Framework for Enhanced Detection and Assessment of Abdominal Trauma: Integrating 3D Segmentation with 2D CNN and RNN Models
Liheng Jiang
Xuechun yang
Chang Yu
Zhizhong Wu
Yuting Wang
40
18
0
23 Jul 2024
Do LLMs dream of elephants (when told not to)? Latent concept
  association and associative memory in transformers
Do LLMs dream of elephants (when told not to)? Latent concept association and associative memory in transformers
Yibo Jiang
Goutham Rajendran
Pradeep Ravikumar
Bryon Aragam
CLL
KELM
26
6
0
26 Jun 2024
Decoupled Alignment for Robust Plug-and-Play Adaptation
Decoupled Alignment for Robust Plug-and-Play Adaptation
Haozheng Luo
Jiahao Yu
Wenxin Zhang
Jialong Li
Jerry Yao-Chieh Hu
Xingyu Xing
Han Liu
41
10
0
03 Jun 2024
Gemma: Open Models Based on Gemini Research and Technology
Gemma: Open Models Based on Gemini Research and Technology
Gemma Team
Gemma Team Thomas Mesnard
Cassidy Hardin
Robert Dadashi
Surya Bhupatiraju
...
Armand Joulin
Noah Fiedel
Evan Senter
Alek Andreev
Kathleen Kenealy
VLM
LLMAG
123
415
0
13 Mar 2024
Attacking Large Language Models with Projected Gradient Descent
Attacking Large Language Models with Projected Gradient Descent
Simon Geisler
Tom Wollschlager
M. H. I. Abdalla
Johannes Gasteiger
Stephan Günnemann
AAML
SILM
42
48
0
14 Feb 2024
GPTFUZZER: Red Teaming Large Language Models with Auto-Generated
  Jailbreak Prompts
GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts
Jiahao Yu
Xingwei Lin
Zheng Yu
Xinyu Xing
SILM
110
292
0
19 Sep 2023
Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors,
  and Lessons Learned
Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
Deep Ganguli
Liane Lovitt
John Kernion
Amanda Askell
Yuntao Bai
...
Nicholas Joseph
Sam McCandlish
C. Olah
Jared Kaplan
Jack Clark
218
441
0
23 Aug 2022
Challenges in Detoxifying Language Models
Challenges in Detoxifying Language Models
Johannes Welbl
Amelia Glaese
J. Uesato
Sumanth Dathathri
John F. J. Mellor
Lisa Anne Hendricks
Kirsty Anderson
Pushmeet Kohli
Ben Coppin
Po-Sen Huang
LM&MA
242
191
0
15 Sep 2021
Open-Ended Multi-Modal Relational Reasoning for Video Question Answering
Open-Ended Multi-Modal Relational Reasoning for Video Question Answering
Haozheng Luo
Ruiyang Qin
Chenwei Xu
Guo Ye
Zening Luo
43
4
0
01 Dec 2020
1