arXiv:2501.16378
Internal Activation Revision: Safeguarding Vision Language Models Without Parameter Update
AAAI Conference on Artificial Intelligence (AAAI), 2025
24 January 2025
Qing Li
Fauzan Farooqui
Zongxiong Chen
Kun Song
Lei Ma
Fakhri Karray
KELM
LLMSV
Papers citing
"Internal Activation Revision: Safeguarding Vision Language Models Without Parameter Update"
11 / 11 papers shown
Activation Steering Meets Preference Optimization: Defense Against Jailbreaks in Vision Language Models
Sihao Wu
Gaojie Jin
Wei Huang
Jianhong Wang
Xiaowei Huang
LLMSV
96
0
0
30 Aug 2025
Learning to Steer: Input-dependent Steering for Multimodal LLMs
Jayneel Parekh
Pegah Khayatan
Mustafa Shukor
Arnaud Dapogny
A. Newson
Matthieu Cord
LLMSV
312
2
0
18 Aug 2025
A Survey on Training-free Alignment of Large Language Models
Birong Pan
Yongqi Li
Jiasheng Si
Sibo Wei
Mayi Xu
Shen Zhou
Yuanyuan Zhu
Ming Zhong
T. Qian
3DV
LM&MA
316
0
0
12 Aug 2025
Secure Tug-of-War (SecTOW): Iterative Defense-Attack Training with Reinforcement Learning for Multimodal Model Security
Muzhi Dai
Shixuan Liu
Zhiyuan Zhao
Junyu Gao
Hao Sun
Xuelong Li
AAML
88
4
0
29 Jul 2025
DAVSP: Safety Alignment for Large Vision-Language Models via Deep Aligned Visual Safety Prompt
Yitong Zhang
Jia Li
L. Cai
Ge Li
VLM
228
3
0
11 Jun 2025
Con Instruction: Universal Jailbreaking of Multimodal Large Language Models via Non-Textual Modalities
Volume 1 (V1), 2025
Fauzan Farooqui
Thy Thy Tran
Preslav Nakov
Iryna Gurevych
MLLM
AAML
106
0
0
31 May 2025
VSCBench: Bridging the Gap in Vision-Language Model Safety Calibration
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Fauzan Farooqui
Qing Li
Zongxiong Chen
Yuxia Wang
Derui Zhu
Zhuohan Xie
Chenyang Lyu
Xiuying Chen
Preslav Nakov
Fakhri Karray
VLM
165
3
0
26 May 2025
Language Models Are Capable of Metacognitive Monitoring and Control of Their Internal Activations
Li Ji-An
Hua-Dong Xiong
Robert C. Wilson
Marcelo G. Mattar
M. Benna
283
10
0
19 May 2025
A Comprehensive Analysis for Visual Object Hallucination in Large Vision-Language Models
Liqiang Jing
Guiming Hardy Chen
Ehsan Aghazadeh
Xin Eric Wang
Xinya Du
230
2
0
04 May 2025
SAUCE: Selective Concept Unlearning in Vision-Language Models with Sparse Autoencoders
Qing Li
Fauzan Farooqui
Derui Zhu
Fengyu Cai
Chenyang Lyu
Fakhri Karray
MU
245
3
0
16 Mar 2025
A Comprehensive Survey of Machine Unlearning Techniques for Large Language Models
Fauzan Farooqui
Qing Li
Herbert Woisetschlaeger
Zongxiong Chen
Longji Xu
Preslav Nakov
Hans-Arno Jacobsen
Fakhri Karray
MU
279
14
0
22 Feb 2025