Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models

19 February 2024
Christian Schlarmann, Naman D. Singh, Francesco Croce, Matthias Hein
VLM, AAML

Papers citing "Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models"

37 papers shown

TAROT: Towards Essentially Domain-Invariant Robustness with Theoretical Justification
Dongyoon Yang, Jihu Lee, Yongdai Kim
10 May 2025

X-Transfer Attacks: Towards Super Transferable Adversarial Attacks on CLIP
Hanxun Huang, Sarah Monazam Erfani, Yige Li, Xingjun Ma, James Bailey
08 May 2025 · AAML

Text-to-Decision Agent: Learning Generalist Policies from Natural Language Supervision
Shilin Zhang, Zican Hu, Wenhao Wu, Xinyi Xie, Jianxiang Tang, Chunlin Chen, Daoyi Dong, Yu Cheng, Zhenhong Sun, Zhi Wang
21 Apr 2025 · OffRL

R-TPT: Improving Adversarial Robustness of Vision-Language Models through Test-Time Prompt Tuning
Lijun Sheng, Jian Liang, Z. Wang, Ran He
15 Apr 2025 · AAML, VLM

Mind the Trojan Horse: Image Prompt Adapter Enabling Scalable and Deceptive Jailbreaking
Junxi Chen, Junhao Dong, Xiaohua Xie
08 Apr 2025

PaperBench: Evaluating AI's Ability to Replicate AI Research
Giulio Starace, Oliver Jaffe, Dane Sherburn, James Aung, Jun Shern Chan, ..., Benjamin Kinsella, Wyatt Thompson, Johannes Heidecke, Amelia Glaese, Tejal Patwardhan
02 Apr 2025 · ALM, ELM

AdPO: Enhancing the Adversarial Robustness of Large Vision-Language Models with Preference Optimization
Chaohu Liu, Tianyi Gui, Yu Liu, Linli Xu
02 Apr 2025 · VLM, AAML

Safeguarding Vision-Language Models: Mitigating Vulnerabilities to Gaussian Noise in Perturbation-based Attacks
Jiawei Wang, Yushen Zuo, Yuanjun Chai, Z. Liu, Yichen Fu, Yichun Feng, Kin-Man Lam
02 Apr 2025 · AAML, VLM

OCRT: Boosting Foundation Models in the Open World with Object-Concept-Relation Triad
Luyao Tang, Yuxuan Yuan, C. L. P. Chen, Zeyu Zhang, Yue Huang, Kun Zhang
24 Mar 2025

Survey of Adversarial Robustness in Multimodal Large Language Models
Chengze Jiang, Zhuangzhuang Wang, Minjing Dong, Jie Gui
18 Mar 2025 · AAML

Provenance Detection for AI-Generated Images: Combining Perceptual Hashing, Homomorphic Encryption, and AI Detection Models
Shree Singhi, Aayan Yadav, Aayush Gupta, Shariar Ebrahimi, Parisa Hassanizadeh
14 Mar 2025

On the Limitations of Vision-Language Models in Understanding Image Transforms
Ahmad Mustafa Anis, Hasnain Ali, Saquib Sarfraz
12 Mar 2025 · VLM
Presented at ResearchTrend Connect | VLM on 28 Mar 2025

CLIP is Strong Enough to Fight Back: Test-time Counterattacks towards Zero-shot Adversarial Robustness of CLIP
Songlong Xing, Zhengyu Zhao, N. Sebe
05 Mar 2025 · AAML

Adversarial Training for Multimodal Large Language Models against Jailbreak Attacks
Liming Lu, Shuchao Pang, Siyuan Liang, Haotian Zhu, Xiyu Zeng, Aishan Liu, Yunhuai Liu, Yongbin Zhou
05 Mar 2025 · AAML

Stealthy Backdoor Attack in Self-Supervised Learning Vision Encoders for Large Vision Language Models
Zhaoyi Liu, Huan Zhang
25 Feb 2025 · AAML

Robust-LLaVA: On the Effectiveness of Large-Scale Robust Image Encoders for Multi-modal Large Language Models
H. Malik, Fahad Shamshad, Muzammal Naseer, Karthik Nandakumar, F. Khan, Salman Khan
03 Feb 2025 · AAML, MLLM, VLM

Adaptive Concept Bottleneck for Foundation Models Under Distribution Shifts
Jihye Choi, Jayaram Raghuram, Yixuan Li, Somesh Jha
18 Dec 2024

Adversarial Prompt Distillation for Vision-Language Models
Lin Luo, Xin Wang, Bojia Zi, Shihao Zhao, Xingjun Ma, Yu-Gang Jiang
22 Nov 2024 · AAML, VLM

TAPT: Test-Time Adversarial Prompt Tuning for Robust Inference in Vision-Language Models
Xin Wang, Kai-xiang Chen, Jiaming Zhang, Jingjing Chen, Xingjun Ma
20 Nov 2024 · AAML, VPVLM, VLM

Replace-then-Perturb: Targeted Adversarial Attacks With Visual Reasoning for Vision-Language Models
Jonggyu Jang, Hyeonsu Lyu, Jungyeon Koh, H. Yang
01 Nov 2024 · VLM, AAML

Text-Guided Attention is All You Need for Zero-Shot Robustness in Vision-Language Models
Lu Yu, Haiyang Zhang, Changsheng Xu
29 Oct 2024 · AAML, VLM

Hiding-in-Plain-Sight (HiPS) Attack on CLIP for Targetted Object Removal from Images
Arka Daw, Megan Hong-Thanh Chung, Maria Mahbub, Amir Sadovnik
16 Oct 2024 · AAML

Fake It Until You Break It: On the Adversarial Robustness of AI-generated Image Detectors
Sina Mavali, Jonas Ricker, David Pape, Yash Sharma, Asja Fischer, Lea Schönherr
02 Oct 2024 · AAML

VLMGuard: Defending VLMs against Malicious Prompts via Unlabeled Data
Xuefeng Du, Reshmi Ghosh, Robert Sim, Ahmed Salem, Vitor Carvalho, Emily Lawton, Yixuan Li, Jack W. Stokes
01 Oct 2024 · VLM, AAML

Securing Vision-Language Models with a Robust Encoder Against Jailbreak and Adversarial Attacks
Md Zarif Hossain, Ahmed Imteaj
11 Sep 2024 · AAML, VLM

Towards Adversarially Robust Vision-Language Models: Insights from Design Choices and Prompt Formatting Techniques
Rishika Bhagwatkar, Shravan Nayak, Reza Bayat, Alexis Roger, Daniel Z Kaplan, P. Bashivan, Irina Rish
15 Jul 2024 · AAML, VLM

Investigating the Semantic Robustness of CLIP-based Zero-Shot Anomaly Segmentation
Kevin Stangl, Marius Arvinte, Weilin Xu, Cory Cornelius
13 May 2024 · VLM, UQCV

Omniview-Tuning: Boosting Viewpoint Invariance of Vision-Language Pre-training Models
Shouwei Ruan, Yinpeng Dong, Hanqing Liu, Yao Huang, Hang Su, Xingxing Wei
18 Apr 2024 · VLM

As Firm As Their Foundations: Can open-sourced foundation models be used to create adversarial examples for downstream tasks?
Anjun Hu, Jindong Gu, Francesco Pinto, Konstantinos Kamnitsas, Philip H. S. Torr
19 Mar 2024 · AAML, SILM

Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast
Xiangming Gu, Xiaosen Zheng, Tianyu Pang, Chao Du, Qian Liu, Ye Wang, Jing Jiang, Min-Bin Lin
13 Feb 2024 · LLMAG, LM&Ro

On the Adversarial Robustness of Multi-Modal Foundation Models
Christian Schlarmann, Matthias Hein
21 Aug 2023 · AAML

Enhancing Adversarial Contrastive Learning via Adversarial Invariant Regularization
Xilie Xu, Jingfeng Zhang, Feng Liu, Masashi Sugiyama, Mohan S. Kankanhalli
30 Apr 2023 · AAML

Rethinking the Effect of Data Augmentation in Adversarial Contrastive Learning
Rundong Luo, Yifei Wang, Yisen Wang
02 Mar 2023

BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li, Dongxu Li, Silvio Savarese, Steven C. H. Hoi
30 Jan 2023 · VLM, MLLM

Pre-trained Adversarial Perturbations
Y. Ban, Yinpeng Dong
07 Oct 2022 · AAML

Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
Pan Lu, Swaroop Mishra, Tony Xia, Liang Qiu, Kai-Wei Chang, Song-Chun Zhu, Oyvind Tafjord, Peter Clark, A. Kalyan
20 Sep 2022 · ELM, ReLM, LRM

WIT: Wikipedia-based Image Text Dataset for Multimodal Multilingual Machine Learning
Krishna Srinivasan, K. Raman, Jiecao Chen, Michael Bendersky, Marc Najork
02 Mar 2021 · VLM