GET: Unlocking the Multi-modal Potential of CLIP for Generalized Category Discovery

15 March 2024

Ming-Ming Cheng

Papers citing "GET: Unlocking the Multi-modal Potential of CLIP for Generalized Category Discovery"

Title | Authors | Tags | Counts | Date
The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims | Shu Zhou, Xin Wang, Zhengda Zhou, Haohan Yi, Xuhui Zheng, Hao Wan | - | 63 0 0 | 21 Nov 2024
Multimodal Generalized Category Discovery | Yuchang Su, Renping Zhou, Siyu Huang, Xingjian Li, Tianyang Wang, Ziyue Wang, Min Xu | - | 27 0 0 | 18 Sep 2024
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models | Junnan Li, Dongxu Li, Silvio Savarese, Steven C. H. Hoi | VLM MLLM | 244 4,186 0 | 30 Jan 2023
Linearly Mapping from Image to Text Space | Jack Merullo, Louis Castricato, Carsten Eickhoff, Ellie Pavlick | VLM | 153 104 0 | 30 Sep 2022
VLP: A Survey on Vision-Language Pre-training | Feilong Chen, Duzhen Zhang, Minglun Han, Xiuyi Chen, Jing Shi, Shuang Xu, Bo Xu | VLM | 74 208 0 | 18 Feb 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation | Junnan Li, Dongxu Li, Caiming Xiong, S. Hoi | MLLM BDL VLM CLIP | 378 4,010 0 | 28 Jan 2022
Open-Set Recognition: a Good Closed-Set Classifier is All You Need? | S. Vaze, Kai Han, Andrea Vedaldi, Andrew Zisserman | BDL | 151 401 0 | 12 Oct 2021
Emerging Properties in Self-Supervised Vision Transformers | Mathilde Caron, Hugo Touvron, Ishan Misra, Hervé Jégou, Julien Mairal, Piotr Bojanowski, Armand Joulin | - | 283 4,299 0 | 29 Apr 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision | Chao Jia, Yinfei Yang, Ye Xia, Yi-Ting Chen, Zarana Parekh, Hieu H. Pham, Quoc V. Le, Yun-hsuan Sung, Zhen Li, Tom Duerig | VLM CLIP | 291 2,875 0 | 11 Feb 2021