VQA4CIR: Boosting Composed Image Retrieval with Visual Question Answering

19 December 2023

Wangmeng Zuo

Rick Siow Mong Goh

Papers citing "VQA4CIR: Boosting Composed Image Retrieval with Visual Question Answering"

OmniCount: Multi-label Object Counting with Semantic-Geometric Priors
Anindya Mondal, Sauradip Nag, Xiatian Zhu, Anjan Dutta
08 Mar 2024

mPLUG-Owl: Modularization Empowers Large Language Models with Multimodality
Qinghao Ye, Haiyang Xu, Guohai Xu, Jiabo Ye, Ming Yan, ..., Junfeng Tian, Qiang Qi, Ji Zhang, Feiyan Huang, Jingren Zhou
Tags: VLM, MLLM
27 Apr 2023

BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li, Dongxu Li, Silvio Savarese, Steven C. H. Hoi
Tags: VLM, MLLM
30 Jan 2023

Training language models to follow instructions with human feedback
Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe
Tags: OSLM, ALM
04 Mar 2022