ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2310.13345
  4. Cited By
An LLM can Fool Itself: A Prompt-Based Adversarial Attack

An LLM can Fool Itself: A Prompt-Based Adversarial Attack

20 October 2023
Xilie Xu
Keyi Kong
Ning Liu
Li-zhen Cui
Di Wang
Jingfeng Zhang
Mohan S. Kankanhalli
    AAML
    SILM
ArXivPDFHTML

Papers citing "An LLM can Fool Itself: A Prompt-Based Adversarial Attack"

18 / 18 papers shown
Title
CheatAgent: Attacking LLM-Empowered Recommender Systems via LLM Agent
CheatAgent: Attacking LLM-Empowered Recommender Systems via LLM Agent
Liang-bo Ning
Shijie Wang
Wenqi Fan
Qing Li
Xin Xu
Hao Chen
Feiran Huang
AAML
21
16
0
13 Apr 2025
TH-Bench: Evaluating Evading Attacks via Humanizing AI Text on Machine-Generated Text Detectors
Jingyi Zheng
Junfeng Wang
Zhen Sun
Wenhan Dong
Yule Liu
Xinlei He
AAML
43
0
0
10 Mar 2025
Single-pass Detection of Jailbreaking Input in Large Language Models
Single-pass Detection of Jailbreaking Input in Large Language Models
Leyla Naz Candogan
Yongtao Wu
Elias Abad Rocamora
Grigorios G. Chrysos
V. Cevher
AAML
47
0
0
24 Feb 2025
SMAB: MAB based word Sensitivity Estimation Framework and its Applications in Adversarial Text Generation
SMAB: MAB based word Sensitivity Estimation Framework and its Applications in Adversarial Text Generation
Saurabh Kumar Pandey
S. Vashistha
Debrup Das
Somak Aditya
Monojit Choudhury
AAML
69
0
0
10 Feb 2025
Transferable Adversarial Attacks on SAM and Its Downstream Models
Transferable Adversarial Attacks on SAM and Its Downstream Models
Song Xia
Wenhan Yang
Yi Yu
Xun Lin
Henghui Ding
Lingyu Duan
Xudong Jiang
AAML
SILM
46
6
0
26 Oct 2024
Securing Vision-Language Models with a Robust Encoder Against Jailbreak
  and Adversarial Attacks
Securing Vision-Language Models with a Robust Encoder Against Jailbreak and Adversarial Attacks
Md Zarif Hossain
Ahmed Imteaj
AAML
VLM
38
3
0
11 Sep 2024
Do Large Language Models Exhibit Cognitive Dissonance? Studying the
  Difference Between Revealed Beliefs and Stated Answers
Do Large Language Models Exhibit Cognitive Dissonance? Studying the Difference Between Revealed Beliefs and Stated Answers
Manuel Mondal
Ljiljana Dolamic
Gérôme Bovet
Philippe Cudré-Mauroux
Julien Audiffren
34
2
0
21 Jun 2024
Leveraging Logical Rules in Knowledge Editing: A Cherry on the Top
Leveraging Logical Rules in Knowledge Editing: A Cherry on the Top
Keyuan Cheng
Muhammad Asif Ali
Shu Yang
Gang Lin
Yuxuan Zhai
Haoyang Fei
Ke Xu
Lu Yu
Lijie Hu
Di Wang
KELM
32
7
0
24 May 2024
Dialectical Alignment: Resolving the Tension of 3H and Security Threats
  of LLMs
Dialectical Alignment: Resolving the Tension of 3H and Security Threats of LLMs
Shu Yang
Jiayuan Su
Han Jiang
Mengdi Li
Keyuan Cheng
Muhammad Asif Ali
Lijie Hu
Di Wang
16
5
0
30 Mar 2024
LLM Task Interference: An Initial Study on the Impact of Task-Switch in
  Conversational History
LLM Task Interference: An Initial Study on the Impact of Task-Switch in Conversational History
Akash Gupta
Ivaxi Sheth
Vyas Raina
Mark J. F. Gales
Mario Fritz
30
4
0
28 Feb 2024
Stability Analysis of ChatGPT-based Sentiment Analysis in AI Quality
  Assurance
Stability Analysis of ChatGPT-based Sentiment Analysis in AI Quality Assurance
Tinghui Ouyang
AprilPyone Maungmaung
Koichi Konishi
Yoshiki Seo
Isao Echizen
AI4MH
18
5
0
15 Jan 2024
SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for
  Generative Large Language Models
SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models
Potsawee Manakul
Adian Liusie
Mark J. F. Gales
HILM
LRM
150
390
0
15 Mar 2023
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,402
0
28 Jan 2022
Making Pre-trained Language Models Better Few-shot Learners
Making Pre-trained Language Models Better Few-shot Learners
Tianyu Gao
Adam Fisch
Danqi Chen
241
1,913
0
31 Dec 2020
FreeLB: Enhanced Adversarial Training for Natural Language Understanding
FreeLB: Enhanced Adversarial Training for Natural Language Understanding
Chen Zhu
Yu Cheng
Zhe Gan
S. Sun
Tom Goldstein
Jingjing Liu
AAML
221
436
0
25 Sep 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language
  Understanding
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,943
0
20 Apr 2018
Adversarial Example Generation with Syntactically Controlled Paraphrase
  Networks
Adversarial Example Generation with Syntactically Controlled Paraphrase Networks
Mohit Iyyer
John Wieting
Kevin Gimpel
Luke Zettlemoyer
AAML
GAN
185
711
0
17 Apr 2018
Adversarial examples in the physical world
Adversarial examples in the physical world
Alexey Kurakin
Ian Goodfellow
Samy Bengio
SILM
AAML
250
5,830
0
08 Jul 2016
1