Internal Activation Revision: Safeguarding Vision Language Models Without Parameter Update
AAAI Conference on Artificial Intelligence (AAAI), 2025
24 January 2025 · arXiv: 2501.16378
Qing Li, Fauzan Farooqui, Zongxiong Chen, Kun Song, Lei Ma, Fakhri Karray

Papers citing "Internal Activation Revision: Safeguarding Vision Language Models Without Parameter Update"

11 citing papers
Activation Steering Meets Preference Optimization: Defense Against Jailbreaks in Vision Language Models
Sihao Wu, Gaojie Jin, Wei Huang, Jianhong Wang, Xiaowei Huang
30 Aug 2025
Learning to Steer: Input-dependent Steering for Multimodal LLMs
Jayneel Parekh, Pegah Khayatan, Mustafa Shukor, Arnaud Dapogny, A. Newson, Matthieu Cord
18 Aug 2025
A Survey on Training-free Alignment of Large Language Models
Birong Pan, Yongqi Li, Jiasheng Si, Sibo Wei, Mayi Xu, Shen Zhou, Yuanyuan Zhu, Ming Zhong, T. Qian
12 Aug 2025
Secure Tug-of-War (SecTOW): Iterative Defense-Attack Training with Reinforcement Learning for Multimodal Model Security
Muzhi Dai, Shixuan Liu, Zhiyuan Zhao, Junyu Gao, Hao Sun, Xuelong Li
29 Jul 2025
DAVSP: Safety Alignment for Large Vision-Language Models via Deep Aligned Visual Safety Prompt
Yitong Zhang, Jia Li, L. Cai, Ge Li
11 Jun 2025
Con Instruction: Universal Jailbreaking of Multimodal Large Language Models via Non-Textual Modalities
Volume 1 (V1), 2025
Fauzan Farooqui, Thy Thy Tran, Preslav Nakov, Iryna Gurevych
31 May 2025
VSCBench: Bridging the Gap in Vision-Language Model Safety Calibration
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Fauzan Farooqui, Qing Li, Zongxiong Chen, Yuxia Wang, Derui Zhu, Zhuohan Xie, Chenyang Lyu, Xiuying Chen, Preslav Nakov, Fakhri Karray
26 May 2025
Language Models Are Capable of Metacognitive Monitoring and Control of Their Internal Activations
Li Ji-An, Hua-Dong Xiong, Robert C. Wilson, Marcelo G. Mattar, M. Benna
19 May 2025
A Comprehensive Analysis for Visual Object Hallucination in Large Vision-Language Models
Liqiang Jing, Guiming Hardy Chen, Ehsan Aghazadeh, Xin Eric Wang, Xinya Du
04 May 2025
SAUCE: Selective Concept Unlearning in Vision-Language Models with Sparse Autoencoders
Qing Li, Fauzan Farooqui, Derui Zhu, Fengyu Cai, Chenyang Lyu, Fakhri Karray
16 Mar 2025
A Comprehensive Survey of Machine Unlearning Techniques for Large Language Models
Fauzan Farooqui, Qing Li, Herbert Woisetschlaeger, Zongxiong Chen, Longji Xu, Preslav Nakov, Hans-Arno Jacobsen, Fakhri Karray
22 Feb 2025