AnyAttack: Towards Large-scale Self-supervised Adversarial Attacks on Vision-language Models

7 October 2024
Jiaming Zhang
Junhong Ye
Xingjun Ma
Yige Li
Yunfan Yang
Yunhao Chen
Jitao Sang
Dit-Yan Yeung
    AAML
    VLM
Abstract

Due to their multimodal capabilities, Vision-Language Models (VLMs) have found numerous impactful applications in real-world scenarios. However, recent studies have revealed that VLMs are vulnerable to image-based adversarial attacks. Traditional targeted adversarial attacks require specific targets and labels, limiting their real-world applicability. We present AnyAttack, a self-supervised framework that transcends the limitations of conventional attacks through a novel foundation model approach. By pre-training on the massive LAION-400M dataset without label supervision, AnyAttack achieves unprecedented flexibility, enabling any image to be transformed into an attack vector targeting any desired output across different VLMs. This approach fundamentally changes the threat landscape, making adversarial capabilities accessible at an unprecedented scale. Our extensive validation across five open-source VLMs (CLIP, BLIP, BLIP2, InstructBLIP, and MiniGPT-4) demonstrates AnyAttack's effectiveness across diverse multimodal tasks. Most concerning, AnyAttack seamlessly transfers to commercial systems including Google Gemini, Claude Sonnet, Microsoft Copilot, and OpenAI GPT, revealing a systemic vulnerability requiring immediate attention.
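To make the threat model concrete, below is a minimal Python sketch (using the Hugging Face transformers CLIP API) of the generic idea behind targeted embedding-matching attacks on VLM image encoders: a PGD-style optimization that nudges an arbitrary image so its embedding aligns with an attacker-chosen target text. This is not the AnyAttack method itself, which pre-trains a self-supervised model on LAION-400M rather than optimizing per image and per target; the model checkpoint, hyperparameters, and the targeted_pgd helper below are illustrative assumptions only.

import torch
import torch.nn.functional as F
from transformers import CLIPModel, CLIPProcessor

# Illustrative only: a per-image PGD attack, not AnyAttack's pre-trained generator.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").eval()
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def targeted_pgd(image, target_text, eps=8 / 255, alpha=1 / 255, steps=100):
    """Perturb a PIL `image` so its CLIP image embedding moves toward `target_text`.

    The L-inf budget `eps` is applied in the processor's normalized pixel space
    for simplicity; a faithful attack would clip in raw [0, 1] pixel space.
    """
    pixel_values = processor(images=image, return_tensors="pt")["pixel_values"]
    text_inputs = processor(text=[target_text], return_tensors="pt", padding=True)
    with torch.no_grad():
        target_emb = F.normalize(model.get_text_features(**text_inputs), dim=-1)

    delta = torch.zeros_like(pixel_values, requires_grad=True)
    for _ in range(steps):
        image_emb = F.normalize(
            model.get_image_features(pixel_values=pixel_values + delta), dim=-1
        )
        loss = 1.0 - (image_emb * target_emb).sum()  # cosine distance to the target
        loss.backward()
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()  # step toward the target embedding
            delta.clamp_(-eps, eps)             # keep the perturbation small
            delta.grad.zero_()
    return (pixel_values + delta).detach()

In AnyAttack, by contrast, the adversarial capability comes from pre-training on LAION-400M without label supervision, so any image can be turned into an attack vector without the costly per-image, per-target optimization loop sketched above.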

View on arXiv
@article{zhang2025_2410.05346,
  title={AnyAttack: Towards Large-scale Self-supervised Adversarial Attacks on Vision-language Models},
  author={Jiaming Zhang and Junhong Ye and Xingjun Ma and Yige Li and Yunfan Yang and Yunhao Chen and Jitao Sang and Dit-Yan Yeung},
  journal={arXiv preprint arXiv:2410.05346},
  year={2025}
}