Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2310.10591
Cited By
Interpreting and Controlling Vision Foundation Models via Text Explanations
16 October 2023
Haozhe Chen
Junfeng Yang
Carl Vondrick
Chengzhi Mao
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Interpreting and Controlling Vision Foundation Models via Text Explanations"
8 / 8 papers shown
Title
Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations
Nick Jiang
Anish Kachinthaya
Suzie Petryk
Yossi Gandelsman
VLM
32
14
0
03 Oct 2024
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
265
4,229
0
30 Jan 2023
Natural Language Descriptions of Deep Visual Features
Evan Hernandez
Sarah Schwettmann
David Bau
Teona Bagashvili
Antonio Torralba
Jacob Andreas
MILM
196
116
0
26 Jan 2022
Editing a classifier by rewriting its prediction rules
Shibani Santurkar
Dimitris Tsipras
Mahalaxmi Elango
David Bau
Antonio Torralba
A. Madry
KELM
175
89
0
02 Dec 2021
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding
Hu Xu
Gargi Ghosh
Po-Yao (Bernie) Huang
Dmytro Okhonko
Armen Aghajanyan
Florian Metze
Luke Zettlemoyer
Florian Metze Luke Zettlemoyer Christoph Feichtenhofer
CLIP
VLM
245
558
0
28 Sep 2021
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
303
5,773
0
29 Apr 2021
Open-vocabulary Object Detection via Vision and Language Knowledge Distillation
Xiuye Gu
Tsung-Yi Lin
Weicheng Kuo
Yin Cui
VLM
ObjD
223
898
0
28 Apr 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
293
3,689
0
11 Feb 2021
1