Image Retrieval on Real-life Images with Pre-trained Vision-and-Language Models

IEEE International Conference on Computer Vision (ICCV), 2021

9 August 2021

Zheyuan Liu

Cristian Rodriguez-Opazo

Damien Teney

Stephen Gould

VLM

Papers citing "Image Retrieval on Real-life Images with Pre-trained Vision-and-Language Models"

50 / 154 papers shown

Generative Editing in the Joint Vision-Language Space for Zero-Shot Composed Image Retrieval

239

01 Dec 2025

UNION: A Lightweight Target Representation for Efficient Zero-Shot Image-Guided Retrieval with Optional Textual Queries

104

27 Nov 2025

FIGROTD: A Friendly-to-Handle Dataset for Image Guided Retrieval with Optional Text

120

27 Nov 2025

Reasoning Guided Embeddings: Leveraging MLLM Reasoning for Improved Multimodal Retrieval

219

20 Nov 2025

MoRA: Missing Modality Low-Rank Adaptation for Visual Recognition

208

09 Nov 2025

UME-R1: Exploring Reasoning-Driven Generative Multimodal Embeddings

283

01 Nov 2025

Instance-Level Composed Image Retrieval

216

29 Oct 2025

Enhanced MLLM Black-Box Jailbreaking Attacks and Defenses

197

24 Oct 2025

MCA: Modality Composition Awareness for Robust Composed Multimodal Retrieval

133

17 Oct 2025

NExT-OMNI: Towards Any-to-Any Omnimodal Foundation Models with Discrete Flow Matching

314

15 Oct 2025

MRMR: A Realistic and Expert-Level Multidisciplinary Benchmark for Reasoning-Intensive Multimodal Retrieval

193

10 Oct 2025

CIR-CoT: Towards Interpretable Composed Image Retrieval via End-to-End Chain-of-Thought Reasoning

206

09 Oct 2025

(Token-Level) InfoRMIA: Stronger Membership Inference and Memorization Assessment for LLMs

Jiashu Tao

Reza Shokri

165

07 Oct 2025

SQUARE: Semantic Query-Augmented Fusion and Efficient Batch Reranking for Training-free Zero-Shot Composed Image Retrieval

235

30 Sep 2025

^2-Bench: Going Beyond Matching to Reasoning in Multimodal Retrieval

...

174

30 Sep 2025

SETR: A Two-Stage Semantic-Enhanced Framework for Zero-Shot Composed Image Retrieval

Yuqi Xiao

Yingying Zhu

131

30 Sep 2025

GRAPE: Let GPRO Supervise Query Rewriting by Ranking for Retrieval

...

175

27 Sep 2025

OmniBridge: Unified Multimodal Understanding, Generation, and Retrieval via Latent Space Alignment

Teng Xiao

Zuchao Li

Lefei Zhang

317

23 Sep 2025

Chain-of-Thought Re-ranking for Image Retrieval Tasks

177

18 Sep 2025

Recurrence Meets Transformers for Universal Multimodal Retrieval

293

10 Sep 2025

EVENT-Retriever: Event-Aware Multimodal Image Retrieval for Realistic Captions

31 Aug 2025

Disentangling Latent Embeddings with Sparse Linear Concept Subspaces (SLiCS)

191

27 Aug 2025

Beyond Simple Edits: Composed Video Retrieval with Dense Modifications

153

19 Aug 2025

Enhancing Supervised Composed Image Retrieval via Reasoning-Augmented Representation Engineering

Shaoguo Liu

Tingting Gao

LRM

239

15 Aug 2025

Composed Object Retrieval: Object-level Retrieval via Composed Expressions

270

06 Aug 2025

Agentic Personalized Fashion Recommendation in the Age of Generative AI: Challenges, Opportunities, and Evaluation

Yashar Deldjoo

Nima Rafiee

Mahdyar Ravanbakhsh

160

04 Aug 2025

On The Role of Pretrained Language Models in General-Purpose Text Embeddings: A Survey

371

28 Jul 2025

U-MARVEL: Unveiling Key Factors for Universal Multimodal Retrieval via Embedding Learning with MLLMs

340

20 Jul 2025

SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation

396

17 Jul 2025

Visual Re-Ranking with Non-Visual Side Information

Scandinavian Conference on Image Analysis (SCIA), 2025

Gustav Hanning

Gabrielle Flood

Viktor Larsson

221

01 Jul 2025

Zero Shot Composed Image Retrieval

Santhosh Kakarla

Gautama Shastry Bulusu Venkata

232

07 Jun 2025

From Play to Replay: Composed Video Retrieval for Temporally Fine-Grained Videos

409

05 Jun 2025

SORCE: Small Object Retrieval in Complex Environments

242

30 May 2025

ConText-CIR: Learning from Concepts in Text for Composed Image Retrieval

Computer Vision and Pattern Recognition (CVPR), 2025

361

27 May 2025

MLLM-Guided VLM Fine-Tuning with Joint Inference for Zero-Shot Composed Image Retrieval

431

26 May 2025

DetailFusion: A Dual-branch Framework with Detail Enhancement for Composed Image Retrieval

...

523

23 May 2025

InstructPart: Task-Oriented Part Segmentation with Instruction Reasoning

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

259

23 May 2025

From Mapping to Composing: A Two-Stage Framework for Zero-shot Composed Image Retrieval

871

25 Apr 2025

TMCIR: Token Merge Benefits Composed Image Retrieval

405

15 Apr 2025

MIEB: Massive Image Embedding Benchmark

593

14 Apr 2025

NCL-CIR: Noise-aware Contrastive Learning for Composed Image Retrieval

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025

358

06 Apr 2025

Scaling Prompt Instructed Zero Shot Composed Image Retrieval with Image-Only Data

Yiqun Duan

Sameera Ramasinghe

Stephen Gould

Ajanthan Thalaiyasingam

473

01 Apr 2025

IDMR: Towards Instance-Driven Precise Visual Correspondence in Multimodal Retrieval

466

01 Apr 2025

AutoComPose: Automatic Generation of Pose Transition Descriptions for Composed Pose Retrieval Using Multimodal LLMs

Shuvra S. Bhattacharyya

414

28 Mar 2025

FineCIR: Explicit Parsing of Fine-Grained Modification Semantics for Composed Image Retrieval

456

27 Mar 2025

Fine-grained Textual Inversion Network for Zero-Shot Composed Image Retrieval

Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2024

483

25 Mar 2025

good4cir: Generating Detailed Synthetic Captions for Composed Image Retrieval

240

22 Mar 2025

Missing Target-Relevant Information Prediction with World Model for Accurate Zero-Shot Composed Image Retrieval

Computer Vision and Pattern Recognition (CVPR), 2025

708

21 Mar 2025

Scale Efficient Training for Large Datasets

Computer Vision and Pattern Recognition (CVPR), 2025

Qing Zhou

Junyu Gao

Qi Wang

377

17 Mar 2025

ImageScope: Unifying Language-Guided Image Retrieval via Large Multimodal Model Collective Reasoning

The Web Conference (WWW), 2025

421

13 Mar 2025