ResearchTrend.AI
  • Communities
  • Connect sessions
  • AI calendar
  • Organizations
  • Join Slack
  • Contact Sales
Papers
Communities
Social Events
Terms and Conditions
Pricing
Contact Sales
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Communities
  3. ...

Neighbor communities

0 / 0 papers shown
Title
Top Contributors
Name# Papers# Citations
Social Events
DateLocationEvent
  1. Home
  2. Communities
  3. SILM

Security Issues in Language Models

SILM
More data

LLM security is the investigation of the failure modes of LLMs in use, the conditions that lead to them, and their mitigations. The failure modes include the vulnerabilities of LLM to leak sensitive information or inappropriate contents, inclusion of trojan samples on the web such that an LLM is trained on them to eventually show inappropriate or dangerous behaviours at their deployment, or various potential misuse of LLMs to cause harms and pursue illegal activities.

Neighbor communities

51015

Featured Papers

0 / 0 papers shown
Title

All papers

50 / 917 papers shown
Title
AutoAdv: Automated Adversarial Prompting for Multi-Turn Jailbreaking of Large Language Models
AutoAdv: Automated Adversarial Prompting for Multi-Turn Jailbreaking of Large Language Models
Aashray Reddy
Andrew Zagula
Nicholas Saban
AAMLMUSILM
28
0
0
04 Nov 2025
The SDSC Satellite Reverse Proxy Service for Launching Secure Jupyter Notebooks on High-Performance Computing Systems
The SDSC Satellite Reverse Proxy Service for Launching Secure Jupyter Notebooks on High-Performance Computing Systems
Mary P Thomas
Martin Kandes
James McDougall
Dmitry Mishan
Scott Sakai
Subhashini Sivagnanam
Mahidhar Tatineni
SILMSyDa
73
0
0
03 Nov 2025
Prompt Injection as an Emerging Threat: Evaluating the Resilience of Large Language Models
Prompt Injection as an Emerging Threat: Evaluating the Resilience of Large Language Models
Daniyal Ganiuly
Assel Smaiyl
SILMAAMLELM
37
0
0
03 Nov 2025
Rescuing the Unpoisoned: Efficient Defense against Knowledge Corruption Attacks on RAG Systems
Rescuing the Unpoisoned: Efficient Defense against Knowledge Corruption Attacks on RAG Systems
Minseok Kim
Hankook Lee
Hyungjoon Koo
AAMLSILM
16
0
0
03 Nov 2025
ToxicTextCLIP: Text-Based Poisoning and Backdoor Attacks on CLIP Pre-training
ToxicTextCLIP: Text-Based Poisoning and Backdoor Attacks on CLIP Pre-training
Xin Yao
Haiyang Zhao
Yimin Chen
Jiawei Guo
Kecheng Huang
Ming Zhao
CLIPSILMVLM
20
0
0
01 Nov 2025
Enhancing Adversarial Transferability by Balancing Exploration and Exploitation with Gradient-Guided Sampling
Enhancing Adversarial Transferability by Balancing Exploration and Exploitation with Gradient-Guided Sampling
Zenghao Niu
Weicheng Xie
Siyang Song
Zitong Yu
Feng Liu
Linlin Shen
AAMLSILM
96
0
0
01 Nov 2025
DRIP: Defending Prompt Injection via De-instruction Training and Residual Fusion Model Architecture
DRIP: Defending Prompt Injection via De-instruction Training and Residual Fusion Model Architecture
Ruofan Liu
Yun Lin
Jin Song Dong
AAMLSILM
16
0
0
01 Nov 2025
Prevalence of Security and Privacy Risk-Inducing Usage of AI-based Conversational Agents
Prevalence of Security and Privacy Risk-Inducing Usage of AI-based Conversational Agents
Kathrin Grosse
Nico Ebert
SILM
80
0
0
31 Oct 2025
Secure Retrieval-Augmented Generation against Poisoning Attacks
Secure Retrieval-Augmented Generation against Poisoning Attacks
Zirui Cheng
Jikai Sun
Anjun Gao
Yueyang Quan
Zhuqing Liu
Xiaohua Hu
Minghong Fang
SILMAAML
35
0
0
28 Oct 2025
Do Chatbots Walk the Talk of Responsible AI?
Do Chatbots Walk the Talk of Responsible AI?
Susan Ariel Aaronson
Michael Moreno
SILMAI4MH
158
0
0
28 Oct 2025
S3C2 Summit 2025-03: Industry Secure Supply Chain Summit
S3C2 Summit 2025-03: Industry Secure Supply Chain Summit
Elizabeth Lin
Jonah Ghebremichael
William Enck
Yasemin Acar
Michel Cukier
A. Kapravelos
Christian Kastner
Laurie A. Williams
SILMELM
89
0
0
28 Oct 2025
AutoPrompt: Automated Red-Teaming of Text-to-Image Models via LLM-Driven Adversarial Prompts
AutoPrompt: Automated Red-Teaming of Text-to-Image Models via LLM-Driven Adversarial Prompts
Yufan Liu
Wanqian Zhang
Huashan Chen
Lin Wang
Xiaojun Jia
Zheng Lin
Weiping Wang
SILM
124
0
0
28 Oct 2025
Is Your Prompt Poisoning Code? Defect Induction Rates and Security Mitigation Strategies
Is Your Prompt Poisoning Code? Defect Induction Rates and Security Mitigation Strategies
Bin Wang
Y. Zhong
MiDi Wan
W. Yu
YuanBing Ouyang
Y. Huang
Hui Li
SILMAAML
56
0
0
27 Oct 2025
RefleXGen:The unexamined code is not worth using
RefleXGen:The unexamined code is not worth usingIEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Bin Wang
Hui Li
Aofan Liu
BoTao Yang
Ao Yang
Y. Zhong
Weixiang Huang
Y. Zhang
Runhuai Huang
Weimin Zeng
SILM
80
0
0
27 Oct 2025
Jailbreak Mimicry: Automated Discovery of Narrative-Based Jailbreaks for Large Language Models
Jailbreak Mimicry: Automated Discovery of Narrative-Based Jailbreaks for Large Language Models
Pavlos Ntais
AAMLSILM
137
0
0
24 Oct 2025
The Trojan Example: Jailbreaking LLMs through Template Filling and Unsafety Reasoning
The Trojan Example: Jailbreaking LLMs through Template Filling and Unsafety Reasoning
Mingrui Liu
Sixiao Zhang
Cheng Long
Kwok Yan Lam
SILM
57
0
0
24 Oct 2025
NeuroGenPoisoning: Neuron-Guided Attacks on Retrieval-Augmented Generation of LLM via Genetic Optimization of External Knowledge
NeuroGenPoisoning: Neuron-Guided Attacks on Retrieval-Augmented Generation of LLM via Genetic Optimization of External Knowledge
Hanyu Zhu
Lance Fiondella
Jiawei Yuan
K. Zeng
Long Jiao
SILMAAMLKELM
108
0
0
24 Oct 2025
A New Type of Adversarial Examples
A New Type of Adversarial Examples
Xingyang Nie
Guojie Xiao
Su Pan
Biao Wang
Huilin Ge
Tao Fang
AAMLSILM
110
0
0
22 Oct 2025
RESCUE: Retrieval Augmented Secure Code Generation
RESCUE: Retrieval Augmented Secure Code Generation
Jiahao Shi
Tianyi Zhang
SILM
97
0
0
21 Oct 2025
CourtGuard: A Local, Multiagent Prompt Injection Classifier
CourtGuard: A Local, Multiagent Prompt Injection Classifier
Isaac Wu
Michael Maslowski
LLMAGAAMLSILM
90
0
0
20 Oct 2025
The Hidden Cost of Modeling P(X): Vulnerability to Membership Inference Attacks in Generative Text Classifiers
The Hidden Cost of Modeling P(X): Vulnerability to Membership Inference Attacks in Generative Text Classifiers
Owais Makroo
Siva Rajesh Kasa
Sumegh Roychowdhury
Karan Gupta
Nikhil Pattisapu
Santhosh Kumar Kasa
Sumit Negi
SILM
56
0
0
17 Oct 2025
Are My Optimized Prompts Compromised? Exploring Vulnerabilities of LLM-based Optimizers
Are My Optimized Prompts Compromised? Exploring Vulnerabilities of LLM-based Optimizers
Andrew Zhao
Reshmi Ghosh
Vitor Carvalho
Emily Lawton
Keegan Hines
Gao Huang
Jack W. Stokes
AAMLSILM
44
1
0
16 Oct 2025
Securing U.S. Critical Infrastructure: Lessons from Stuxnet and the Ukraine Power Grid Attacks
Securing U.S. Critical Infrastructure: Lessons from Stuxnet and the Ukraine Power Grid Attacks
Jack Vanlyssel
SILM
44
0
0
16 Oct 2025
Open Shouldn't Mean Exempt: Open-Source Exceptionalism and Generative AI
Open Shouldn't Mean Exempt: Open-Source Exceptionalism and Generative AI
David Atkinson
SILM
68
0
0
16 Oct 2025
In-Browser LLM-Guided Fuzzing for Real-Time Prompt Injection Testing in Agentic AI Browsers
In-Browser LLM-Guided Fuzzing for Real-Time Prompt Injection Testing in Agentic AI Browsers
Avihay Cohen
SILMLLMAGAI4CE
81
0
0
15 Oct 2025
PromptLocate: Localizing Prompt Injection Attacks
PromptLocate: Localizing Prompt Injection Attacks
Yuqi Jia
Yupei Liu
Zedian Shao
Jinyuan Jia
Neil Zhenqiang Gong
SILMAAML
165
2
0
14 Oct 2025
Generative AI for Biosciences: Emerging Threats and Roadmap to Biosecurity
Generative AI for Biosciences: Emerging Threats and Roadmap to Biosecurity
Zaixi Zhang
Souradip Chakraborty
Amrit Singh Bedi
Emilin Mathew
Varsha Saravanan
...
Eric Xing
R. Altman
George Church
M. Y. Wang
Mengdi Wang
SILM
164
0
0
13 Oct 2025
CoSPED: Consistent Soft Prompt Targeted Data Extraction and Defense
CoSPED: Consistent Soft Prompt Targeted Data Extraction and Defense
Yang Zhuochen
Fok Kar Wai
Thing Vrizlynn
AAMLSILM
53
0
0
13 Oct 2025
RAG-Pull: Imperceptible Attacks on RAG Systems for Code Generation
RAG-Pull: Imperceptible Attacks on RAG Systems for Code Generation
Vasilije Stambolic
Aritra Dhar
Lukas Cavigelli
AAMLSILM
70
0
0
13 Oct 2025
Safeguarding Efficacy in Large Language Models: Evaluating Resistance to Human-Written and Algorithmic Adversarial Prompts
Safeguarding Efficacy in Large Language Models: Evaluating Resistance to Human-Written and Algorithmic Adversarial Prompts
Tiarnaigh Downey-Webb
Olamide Jogunola
Oluwaseun Ajao
SILMAAMLELM
48
0
0
12 Oct 2025
One Token Embedding Is Enough to Deadlock Your Large Reasoning Model
One Token Embedding Is Enough to Deadlock Your Large Reasoning Model
Mohan Zhang
Yihua Zhang
Jinghan Jia
Zhangyang Wang
Sijia Liu
Tianlong Chen
SILMLRM
52
0
0
12 Oct 2025
RIPRAG: Hack a Black-box Retrieval-Augmented Generation Question-Answering System with Reinforcement Learning
RIPRAG: Hack a Black-box Retrieval-Augmented Generation Question-Answering System with Reinforcement Learning
Meng Xi
Sihan Lv
Yechen Jin
Guanjie Cheng
Naibo Wang
Ying Li
Jianwei Yin
SILMAAML
98
0
0
11 Oct 2025
Text Prompt Injection of Vision Language Models
Text Prompt Injection of Vision Language Models
Ruizhe Zhu
SILMVLM
120
0
0
10 Oct 2025
Exploiting Web Search Tools of AI Agents for Data Exfiltration
Exploiting Web Search Tools of AI Agents for Data Exfiltration
Dennis Rall
Bernhard Bauer
Mohit Mittal
Thomas Fraunholz
SILMAAML
134
0
0
10 Oct 2025
Poisoning Attacks on LLMs Require a Near-constant Number of Poison Samples
Poisoning Attacks on LLMs Require a Near-constant Number of Poison Samples
Alexandra Souly
Javier Rando
Ed Chapman
Xander Davies
Burak Hasircioglu
...
Erik Jones
Chris Hicks
Nicholas Carlini
Y. Gal
Robert Kirk
AAMLSILM
56
3
0
08 Oct 2025
Differentially Private Synthetic Text Generation for Retrieval-Augmented Generation (RAG)
Differentially Private Synthetic Text Generation for Retrieval-Augmented Generation (RAG)
Junki Mori
Kazuya Kakizaki
Taiki Miyagawa
Jun Sakuma
SILMSyDa
92
0
0
08 Oct 2025
RL Is a Hammer and LLMs Are Nails: A Simple Reinforcement Learning Recipe for Strong Prompt Injection
RL Is a Hammer and LLMs Are Nails: A Simple Reinforcement Learning Recipe for Strong Prompt Injection
Yuxin Wen
Arman Zharmagambetov
Ivan Evtimov
Narine Kokhlikyan
Tom Goldstein
Kamalika Chaudhuri
Chuan Guo
OffRLSILM
76
1
0
06 Oct 2025
Unmasking Backdoors: An Explainable Defense via Gradient-Attention Anomaly Scoring for Pre-trained Language Models
Unmasking Backdoors: An Explainable Defense via Gradient-Attention Anomaly Scoring for Pre-trained Language Models
Anindya Sundar Das
Kangjie Chen
M. Bhuyan
SILMAAML
56
0
0
05 Oct 2025
Backdoor-Powered Prompt Injection Attacks Nullify Defense Methods
Backdoor-Powered Prompt Injection Attacks Nullify Defense Methods
Yulin Chen
Haoran Li
Yuan Sui
Yangqiu Song
Bryan Hooi
SILMAAML
115
0
0
04 Oct 2025
External Data Extraction Attacks against Retrieval-Augmented Large Language Models
External Data Extraction Attacks against Retrieval-Augmented Large Language Models
Yu He
Y. Chen
Y. Li
Shuo Shao
Leyi Qi
Boheng Li
Dacheng Tao
Zhan Qin
AAMLSILM
137
0
0
03 Oct 2025
Bypassing Prompt Guards in Production with Controlled-Release Prompting
Bypassing Prompt Guards in Production with Controlled-Release Prompting
Jaiden Fairoze
Sanjam Garg
Keewoo Lee
Mingyuan Wang
SILMAAML
128
0
0
02 Oct 2025
InvThink: Towards AI Safety via Inverse Reasoning
InvThink: Towards AI Safety via Inverse Reasoning
Yubin Kim
Taehan Kim
Lizhou Fan
Chunjong Park
C. Breazeal
Daniel J. McDuff
Hae Won Park
ReLMSILMMULRMAI4CE
108
0
0
02 Oct 2025
A Call to Action for a Secure-by-Design Generative AI Paradigm
A Call to Action for a Secure-by-Design Generative AI Paradigm
Dalal Alharthi
Ivan Roberto Kawaminami Garcia
SILMAAML
48
0
0
01 Oct 2025
SecInfer: Preventing Prompt Injection via Inference-time Scaling
SecInfer: Preventing Prompt Injection via Inference-time Scaling
Yupei Liu
Yanting Wang
Yuqi Jia
Jinyuan Jia
Neil Zhenqiang Gong
SILMAAMLLRM
148
2
0
29 Sep 2025
Privy: Envisioning and Mitigating Privacy Risks for Consumer-facing AI Product Concepts
Privy: Envisioning and Mitigating Privacy Risks for Consumer-facing AI Product Concepts
Hao-Ping Lee
Yu-Ju Yang
Matthew Bilik
Isadora Krsek
Thomas Serban Von Davier
Kyzyl Monteiro
Jason Lin
Shivani Agarwal
Jodi Forlizzi
Sauvik Das
SILM
32
0
0
27 Sep 2025
Reinforcement Learning-Based Prompt Template Stealing for Text-to-Image Models
Reinforcement Learning-Based Prompt Template Stealing for Text-to-Image Models
Xiaotian Zou
SILMVPVLM
51
0
0
27 Sep 2025
Your RAG is Unfair: Exposing Fairness Vulnerabilities in Retrieval-Augmented Generation via Backdoor Attacks
Your RAG is Unfair: Exposing Fairness Vulnerabilities in Retrieval-Augmented Generation via Backdoor Attacks
Gaurav R. Bagwe
Saket S. Chaturvedi
Xiaolong Ma
Xiaoyong Yuan
Kuang-Ching Wang
Lan Zhang
SILM
56
1
0
26 Sep 2025
ChatInject: Abusing Chat Templates for Prompt Injection in LLM Agents
ChatInject: Abusing Chat Templates for Prompt Injection in LLM Agents
Hwan Chang
Yonghyun Jun
Hwanhee Lee
SILM
72
0
0
26 Sep 2025
RAG Security and Privacy: Formalizing the Threat Model and Attack Surface
RAG Security and Privacy: Formalizing the Threat Model and Attack Surface
Atousa Arzanipour
R. Behnia
Reza Ebrahimi
Kaushik Dutta
SILM
104
0
0
24 Sep 2025
Investigating Security Implications of Automatically Generated Code on the Software Supply Chain
Investigating Security Implications of Automatically Generated Code on the Software Supply Chain
Xiaofan Li
Xing Gao
SILMAAML
68
0
0
24 Sep 2025
Loading #Papers per Month with "SILM"
Past speakers
Name (-)
Top Contributors
Name (-)
Top Organizations at ResearchTrend.AI
Name (-)
Social Events
DateLocationEvent
No social events available