Jailbreaking Generative AI: Empowering Novices to Conduct Phishing Attacks

Abstract

The rapid advancement of generative AI models such as ChatGPT has introduced both significant benefits and new risks within the cybersecurity landscape. This paper investigates the potential misuse of the latest AI model, ChatGPT-4o Mini, in facilitating social engineering attacks, with a particular focus on phishing, one of the most pressing cybersecurity threats today. While existing literature primarily addresses technical aspects such as jailbreaking techniques, no prior work has fully explored how novice users can execute a complete phishing campaign, free of cost and with little effort, using ChatGPT-4o Mini. In this study, we examine the vulnerabilities of AI-driven chatbot services in 2025, specifically how methods such as jailbreaking and reverse psychology can bypass ethical safeguards, allowing ChatGPT to generate phishing content, suggest hacking tools, and assist in carrying out phishing attacks. Our findings underscore the alarming ease with which even inexperienced users can execute sophisticated phishing campaigns, highlighting the urgent need for stronger cybersecurity measures and heightened user awareness in the age of AI.

@article{mishra2025_2503.01395,
  title={Jailbreaking Generative AI: Empowering Novices to Conduct Phishing Attacks},
  author={Rina Mishra and Gaurav Varshney and Shreya Singh},
  journal={arXiv preprint arXiv:2503.01395},
  year={2025}
}