Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2305.08005
Cited By
Beyond the Safeguards: Exploring the Security Risks of ChatGPT
13 May 2023
Erik Derner
Kristina Batistic
SILM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Beyond the Safeguards: Exploring the Security Risks of ChatGPT"
39 / 39 papers shown
Title
AI Ethics and Social Norms: Exploring ChatGPT's Capabilities From What to How
Omid Veisi
Sasan Bahrami
Roman Englert
Claudia Müller
75
0
0
25 Apr 2025
SOK: Exploring Hallucinations and Security Risks in AI-Assisted Software Development with Insights for LLM Deployment
Ariful Haque
Sunzida Siddique
M. Rahman
Ahmed Rafi Hasan
Laxmi Rani Das
Marufa Kamal
Tasnim Masura
Kishor Datta Gupta
51
1
0
31 Jan 2025
AI Safety in Generative AI Large Language Models: A Survey
Jaymari Chua
Yun Yvonna Li
Shiyi Yang
Chen Wang
Lina Yao
LM&MA
34
12
0
06 Jul 2024
The Art of Saying No: Contextual Noncompliance in Language Models
Faeze Brahman
Sachin Kumar
Vidhisha Balachandran
Pradeep Dasigi
Valentina Pyatkin
...
Jack Hessel
Yulia Tsvetkov
Noah A. Smith
Yejin Choi
Hannaneh Hajishirzi
65
20
0
02 Jul 2024
A Complete Survey on LLM-based AI Chatbots
Sumit Kumar Dam
Choong Seon Hong
Yu Qiao
Chaoning Zhang
49
51
0
17 Jun 2024
Is On-Device AI Broken and Exploitable? Assessing the Trust and Ethics in Small Language Models
Kalyan Nakka
Jimmy Dani
Nitesh Saxena
43
1
0
08 Jun 2024
Measure-Observe-Remeasure: An Interactive Paradigm for Differentially-Private Exploratory Analysis
Priyanka Nanayakkara
Hyeok Kim
Yifan Wu
Ali Sarvghad
Narges Mahyar
G. Miklau
Jessica Hullman
26
17
0
04 Jun 2024
Towards Trustworthy AI: A Review of Ethical and Robust Large Language Models
Meftahul Ferdaus
Mahdi Abdelguerfi
Elias Ioup
Kendall N. Niles
Ken Pathak
Steve Sloan
32
10
0
01 Jun 2024
FreezeAsGuard: Mitigating Illegal Adaptation of Diffusion Models via Selective Tensor Freezing
Kai Huang
Wei Gao
32
2
0
24 May 2024
Tagengo: A Multilingual Chat Dataset
P. Devine
29
3
0
21 May 2024
Risks of Practicing Large Language Models in Smart Grid: Threat Modeling and Validation
Jiangnan Li
Yingyuan Yang
Jinyuan Stella Sun
57
3
0
10 May 2024
Large Language Models for Cyber Security: A Systematic Literature Review
HanXiang Xu
Shenao Wang
Ningke Li
K. Wang
Yanjie Zhao
Kai Chen
Ting Yu
Yang Janet Liu
H. Wang
29
23
0
08 May 2024
SmartMem: Layout Transformation Elimination and Adaptation for Efficient DNN Execution on Mobile
Wei Niu
Md. Musfiqur Rahman Sanim
Zhihao Shu
Jiexiong Guan
Xipeng Shen
Miao Yin
Gagan Agrawal
Bin Ren
27
6
0
21 Apr 2024
Risk and Response in Large Language Models: Evaluating Key Threat Categories
Bahareh Harandizadeh
A. Salinas
Fred Morstatter
20
3
0
22 Mar 2024
On Protecting the Data Privacy of Large Language Models (LLMs): A Survey
Biwei Yan
Kun Li
Minghui Xu
Yueyan Dong
Yue Zhang
Zhaochun Ren
Xiuzhen Cheng
AILaw
PILM
70
76
0
08 Mar 2024
Exploring the Potential of Large Language Models for Improving Digital Forensic Investigation Efficiency
Akila Wickramasekara
F. Breitinger
Mark Scanlon
42
7
0
29 Feb 2024
Making Them Ask and Answer: Jailbreaking Large Language Models in Few Queries via Disguise and Reconstruction
Tong Liu
Yingjie Zhang
Zhe Zhao
Yinpeng Dong
Guozhu Meng
Kai Chen
AAML
38
44
0
28 Feb 2024
Farsight: Fostering Responsible AI Awareness During AI Application Prototyping
Zijie J. Wang
Chinmay Kulkarni
Lauren Wilcox
Michael Terry
Michael A. Madaio
38
42
0
23 Feb 2024
Mapping the Ethics of Generative AI: A Comprehensive Scoping Review
Thilo Hagendorff
21
35
0
13 Feb 2024
Whispers in the Machine: Confidentiality in LLM-integrated Systems
Jonathan Evertz
Merlin Chlosta
Lea Schonherr
Thorsten Eisenhofer
69
16
0
10 Feb 2024
Improving Dialog Safety using Socially Aware Contrastive Learning
Souvik Das
R. Srihari
11
1
0
01 Feb 2024
The Ethics of Interaction: Mitigating Security Threats in LLMs
Ashutosh Kumar
Shiv Vignesh Murty
Sagarika Singh
Swathy Ragupathy
10
32
0
22 Jan 2024
A Survey on Large Language Model (LLM) Security and Privacy: The Good, the Bad, and the Ugly
Yifan Yao
Jinhao Duan
Kaidi Xu
Yuanfang Cai
Eric Sun
Yue Zhang
PILM
ELM
24
468
0
04 Dec 2023
From Chatbots to PhishBots? -- Preventing Phishing scams created using ChatGPT, Google Bard and Claude
S. Roy
Poojitha Thota
Krishna Vamsi Naragam
Shirin Nilizadeh
SILM
41
15
0
29 Oct 2023
Ask Again, Then Fail: Large Language Models' Vacillations in Judgment
Qiming Xie
Zengzhi Wang
Yi Feng
Rui Xia
AAML
HILM
16
9
0
03 Oct 2023
Can LLM-Generated Misinformation Be Detected?
Canyu Chen
Kai Shu
DeLMO
29
157
0
25 Sep 2023
Efficient Avoidance of Vulnerabilities in Auto-completed Smart Contract Code Using Vulnerability-constrained Decoding
André Storhaug
Jingyue Li
Tianyuan Hu
AAML
21
14
0
18 Sep 2023
Distilled GPT for Source Code Summarization
Chia-Yi Su
Collin McMillan
22
35
0
28 Aug 2023
GPTEval: A Survey on Assessments of ChatGPT and GPT-4
Rui Mao
Guanyi Chen
Xulang Zhang
Frank Guerin
Erik Cambria
ELM
LM&MA
28
98
0
24 Aug 2023
Using Large Language Models for Cybersecurity Capture-The-Flag Challenges and Certification Questions
W. Tann
Yuancheng Liu
Jun Heng Sim
C. Seah
E. Chang
ELM
14
30
0
21 Aug 2023
RatGPT: Turning online LLMs into Proxies for Malware Attacks
Mika Beckerich
L. Plein
Sergio Coronado
SILM
22
17
0
17 Aug 2023
Learning to Prompt in the Classroom to Understand AI Limits: A pilot study
Emily Theophilou
Cansu Koyuturk
Mona Yavari
Sathya Bursic
Gregor Donabauer
...
Davinia Hernández Leo
Martin Ruskov
D. Taibi
A. Gabbiadini
D. Ognibene
11
28
0
04 Jul 2023
On the Detectability of ChatGPT Content: Benchmarking, Methodology, and Evaluation through the Lens of Academic Writing
Zeyan Liu
Zijun Yao
Fengjun Li
Bo Luo
DeLMO
14
17
0
07 Jun 2023
From Text to MITRE Techniques: Exploring the Malicious Use of Large Language Models for Generating Cyber Attack Payloads
P. Charan
Hrushikesh Chunduri
P. Anand
S. Shukla
14
38
0
24 May 2023
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
303
11,881
0
04 Mar 2022
Adaptive Sampling Strategies to Construct Equitable Training Datasets
William Cai
R. Encarnación
Bobbie Chern
S. Corbett-Davies
Miranda Bogen
Stevie Bergman
Sharad Goel
77
29
0
31 Jan 2022
Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP
Timo Schick
Sahana Udupa
Hinrich Schütze
257
374
0
28 Feb 2021
Understanding the Capabilities, Limitations, and Societal Impact of Large Language Models
Alex Tamkin
Miles Brundage
Jack Clark
Deep Ganguli
AILaw
ELM
192
253
0
04 Feb 2021
Extracting Training Data from Large Language Models
Nicholas Carlini
Florian Tramèr
Eric Wallace
Matthew Jagielski
Ariel Herbert-Voss
...
Tom B. Brown
D. Song
Ulfar Erlingsson
Alina Oprea
Colin Raffel
MLAU
SILM
267
1,808
0
14 Dec 2020
1