Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2212.01758
Cited By
Improving Zero-shot Generalization and Robustness of Multi-modal Models
4 December 2022
Yunhao Ge
Jie Jessie Ren
Andrew Gallagher
Yuxiao Wang
Ming Yang
Hartwig Adam
Laurent Itti
Balaji Lakshminarayanan
Jiaping Zhao
VLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Improving Zero-shot Generalization and Robustness of Multi-modal Models"
32 / 32 papers shown
Title
Post-pre-training for Modality Alignment in Vision-Language Foundation Models
Shinýa Yamaguchi
Dewei Feng
Sekitoshi Kanai
Kazuki Adachi
Daiki Chijiwa
VLM
34
0
0
17 Apr 2025
FLOSS: Free Lunch in Open-vocabulary Semantic Segmentation
Yasser Benigmim
Mohammad Fahes
Tuan-Hung Vu
Andrei Bursuc
Raoul de Charette
VLM
32
0
0
14 Apr 2025
Crossmodal Knowledge Distillation with WordNet-Relaxed Text Embeddings for Robust Image Classification
Chenqi Guo
Mengshuo Rong
Qianli Feng
Rongfan Feng
Yinglong Ma
VLM
63
0
0
31 Mar 2025
An Iterative Feedback Mechanism for Improving Natural Language Class Descriptions in Open-Vocabulary Object Detection
Louis Y. Kim
Michelle Karker
Victoria Valledor
Seiyoung C. Lee
Karl F. Brzoska
Margaret Duff
Anthony Palladino
VLM
ObjD
51
0
0
21 Mar 2025
Enhancing Zero-Shot Image Recognition in Vision-Language Models through Human-like Concept Guidance
Hui Liu
Wenya Wang
Kecheng Chen
Jie Liu
Yibing Liu
Tiexin Qin
Peisong He
Xinghao Jiang
Haoliang Li
BDL
VLM
135
0
0
20 Mar 2025
Rethinking Multimodal Learning from the Perspective of Mitigating Classification Ability Disproportion
Qingyuan Jiang
Longfei Huang
Yang Yang
57
0
0
27 Feb 2025
ACE: Action Concept Enhancement of Video-Language Models in Procedural Videos
Reza Ghoddoosian
Nakul Agarwal
Isht Dwivedi
Behzad Darisuh
62
0
0
23 Nov 2024
Unveiling Ontological Commitment in Multi-Modal Foundation Models
Mert Keser
Gesina Schwalbe
Niki Amini-Naieni
Matthias Rottmann
Alois Knoll
21
1
0
25 Sep 2024
Benchmarking VLMs' Reasoning About Persuasive Atypical Images
Sina Malakouti
Aysan Aghazadeh
Ashmit Khandelwal
Adriana Kovashka
VLM
26
2
0
16 Sep 2024
How Does Diverse Interpretability of Textual Prompts Impact Medical Vision-Language Zero-Shot Tasks?
Sicheng Wang
Che Liu
Rossella Arcucci
VLM
MedIm
36
0
0
31 Aug 2024
Surprisingly Fragile: Assessing and Addressing Prompt Instability in Multimodal Foundation Models
Ian Stewart
Sameera Horawalavithana
Brendan Kennedy
Sai Munikoti
Karl Pazdernik
AAML
24
2
0
26 Aug 2024
I can listen but cannot read: An evaluation of two-tower multimodal systems for instrument recognition
Yannis Vasilakis
Rachel M. Bittner
Johan Pauwels
40
0
0
25 Jul 2024
AWT: Transferring Vision-Language Models via Augmentation, Weighting, and Transportation
Yuhan Zhu
Yuyang Ji
Zhiyu Zhao
Gangshan Wu
Limin Wang
VLM
39
7
0
05 Jul 2024
Review of Zero-Shot and Few-Shot AI Algorithms in The Medical Domain
Maged Badawi
Mohammedyahia Abushanab
Sheethal Bhat
Andreas K. Maier
VLM
39
2
0
23 Jun 2024
BaFTA: Backprop-Free Test-Time Adaptation For Zero-Shot Vision-Language Models
Xuefeng Hu
Ke Zhang
Min Sun
Albert Y. C. Chen
Cheng-Hao Kuo
Ram Nevatia
VLM
22
2
0
17 Jun 2024
Mitigating Noisy Correspondence by Geometrical Structure Consistency Learning
Zihua Zhao
Mengxi Chen
Tianjie Dai
Jiangchao Yao
Bo han
Ya-Qin Zhang
Yanfeng Wang
NoLa
34
3
0
27 May 2024
SHiNe: Semantic Hierarchy Nexus for Open-vocabulary Object Detection
Mingxuan Liu
Tyler L. Hayes
Elisa Ricci
G. Csurka
Riccardo Volpi
ObjD
45
1
0
16 May 2024
On the test-time zero-shot generalization of vision-language models: Do we really need prompt learning?
Maxime Zanella
Ismail Ben Ayed
VLM
MLLM
35
22
0
03 May 2024
Renovating Names in Open-Vocabulary Segmentation Benchmarks
Haiwen Huang
Songyou Peng
Dan Zhang
Andreas Geiger
VLM
29
3
0
14 Mar 2024
Democratizing Fine-grained Visual Recognition with Large Language Models
Mingxuan Liu
Subhankar Roy
Wenjing Li
Zhun Zhong
N. Sebe
Elisa Ricci
VLM
27
10
0
24 Jan 2024
A Simple Recipe for Language-guided Domain Generalized Segmentation
Mohammad Fahes
Tuan-Hung Vu
Andrei Bursuc
Patrick Pérez
Raoul de Charette
VLM
21
14
0
29 Nov 2023
HGCLIP: Exploring Vision-Language Models with Graph Representations for Hierarchical Understanding
Peng Xia
Xingtong Yu
Ming Hu
Lie Ju
Zhiyong Wang
Peibo Duan
Zongyuan Ge
VLM
43
9
0
23 Nov 2023
TinyCLIP: CLIP Distillation via Affinity Mimicking and Weight Inheritance
Kan Wu
Houwen Peng
Zhenghong Zhou
Bin Xiao
Mengchen Liu
...
Xi
Xi Chen
Xinggang Wang
Hongyang Chao
Han Hu
VLM
OODD
26
53
0
21 Sep 2023
Robustness Analysis on Foundational Segmentation Models
Madeline Chantry Schiappa
Shehreen Azad
V. Sachidanand
Yunhao Ge
O. Mikšík
Y. S. Rawat
Vibhav Vineet
OOD
VLM
AAML
22
5
0
15 Jun 2023
A Simple Zero-shot Prompt Weighting Technique to Improve Prompt Ensembling in Text-Image Models
J. Allingham
Jie Jessie Ren
Michael W. Dusenberry
Xiuye Gu
Yin Cui
Dustin Tran
J. Liu
Balaji Lakshminarayanan
LLMAG
VLM
24
32
0
13 Feb 2023
Test-Time Prompt Tuning for Zero-Shot Generalization in Vision-Language Models
Manli Shu
Weili Nie
De-An Huang
Zhiding Yu
Tom Goldstein
Anima Anandkumar
Chaowei Xiao
VLM
VPVLM
180
280
0
15 Sep 2022
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM
BDL
LRM
AI4CE
297
3,236
0
21 Mar 2022
Learning to Prompt for Vision-Language Models
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VPVLM
CLIP
VLM
325
2,261
0
02 Sep 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
253
4,774
0
24 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
293
3,689
0
11 Feb 2021
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles
Balaji Lakshminarayanan
Alexander Pritzel
Charles Blundell
UQCV
BDL
270
5,660
0
05 Dec 2016
Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning
Y. Gal
Zoubin Ghahramani
UQCV
BDL
252
9,134
0
06 Jun 2015
1