Missing Target-Relevant Information Prediction with World Model for Accurate Zero-Shot Composed Image Retrieval

Computer Vision and Pattern Recognition (CVPR), 2025

21 March 2025

Papers citing "Missing Target-Relevant Information Prediction with World Model for Accurate Zero-Shot Composed Image Retrieval"

CrossJEPA: Cross-Modal Joint-Embedding Predictive Architecture for Efficient 3D Representation Learning from 2D Images

176

23 Nov 2025

Self-Correction Distillation for Structured Data Question Answering

...

209

11 Nov 2025

SAIL-Embedding Technical Report: Omni-modal Embedding Foundation Model

...

248

14 Oct 2025

CIR-CoT: Towards Interpretable Composed Image Retrieval via End-to-End Chain-of-Thought Reasoning

147

09 Oct 2025

HLFormer: Enhancing Partially Relevant Video Retrieval with Hyperbolic Learning

242

23 Jul 2025

DetailFusion: A Dual-branch Framework with Detail Enhancement for Composed Image Retrieval

...

464

23 May 2025

Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs

903

24 Apr 2025

Fine-grained Textual Inversion Network for Zero-Shot Composed Image Retrieval

Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2024

421

25 Mar 2025

Composed Multi-modal Retrieval: A Survey of Approaches and Applications

...

402

03 Mar 2025

A Comprehensive Survey on Composed Image Retrieval

479

19 Feb 2025

Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval

Computer Vision and Pattern Recognition (CVPR), 2024

400

15 Dec 2024

Pseudo-triplet Guided Few-shot Composed Image Retrieval

312

08 Jul 2024

Zero-shot Composed Image Retrieval Considering Query-target Relationship Leveraging Masked Image-text Pairs

213

27 Jun 2024

Spherical Linear Interpolation and Text-Anchoring for Zero-shot Composed Image Retrieval

Ser-Nam Lim

342

01 May 2024

Visual Delta Generator with Large Multi-modal Models for Semi-supervised Composed Image Retrieval

Ser-Nam Lim

188

23 Apr 2024

MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions

Siyuan Qiao

295

28 Mar 2024

Knowledge-Enhanced Dual-stream Zero-shot Composed Image Retrieval

Computer Vision and Pattern Recognition (CVPR), 2024

Yuchen Suo

Fan Ma

Linchao Zhu

Yi Yang

238

24 Mar 2024

Image2Sentence based Asymmetrical Zero-shot Composed Image Retrieval

Min Wang

149

03 Mar 2024

Language-only Efficient Training of Zero-shot Composed Image Retrieval

Computer Vision and Pattern Recognition (CVPR), 2023

352

04 Dec 2023

Pretrain like Your Inference: Masked Tuning Improves Zero-Shot Composed Image Retrieval

Junyang Chen

Hanjiang Lai

VLM

455

13 Nov 2023

Vision-by-Language for Training-Free Compositional Image Retrieval

367

13 Oct 2023

Learning Interactive Real-World Simulators

International Conference on Learning Representations (ICLR), 2023

Pieter Abbeel

345

330

09 Oct 2023

Context-I2W: Mapping Images to Context-dependent Words for Accurate Zero-Shot Composed Image Retrieval

AAAI Conference on Artificial Intelligence (AAAI), 2023

Qi Wu

203

28 Sep 2023

GeneCIS: A Benchmark for General Conditional Image Similarity

Computer Vision and Pattern Recognition (CVPR), 2023

247

13 Jun 2023

Zero-Shot Composed Image Retrieval with Textual Inversion

IEEE International Conference on Computer Vision (ICCV), 2023

Alberto Baldrati

Lorenzo Agnolucci

Marco Bertini

278

160

27 Mar 2023

CompoDiff: Versatile Composed Image Retrieval With Latent Diffusion

550

21 Mar 2023

Pic2Word: Mapping Pictures to Words for Zero-shot Composed Image Retrieval

Computer Vision and Pattern Recognition (CVPR), 2023

Kuniaki Saito

Kihyuk Sohn

Xiang Zhang

Chun-Liang Li

Chen-Yu Lee

Kate Saenko

Tomas Pfister

308

166

06 Feb 2023

BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models

International Conference on Machine Learning (ICML), 2023

Silvio Savarese

1.3K

6,661

30 Jan 2023

Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture

Computer Vision and Pattern Recognition (CVPR), 2023

Pascal Vincent

465

569

19 Jan 2023

Flamingo: a Visual Language Model for Few-Shot Learning

Neural Information Processing Systems (NeurIPS), 2022

Jean-Baptiste Alayrac

...

695

4,861

29 Apr 2022

Conditional Prompt Learning for Vision-Language Models

Computer Vision and Pattern Recognition (CVPR), 2022

508

1,867

10 Mar 2022

BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

International Conference on Machine Learning (ICML), 2022

1.3K

5,760

28 Jan 2022

High-Resolution Image Synthesis with Latent Diffusion Models

Computer Vision and Pattern Recognition (CVPR), 2021

3.0K

21,096

20 Dec 2021

High Fidelity Visualization of What Your Self-Supervised Representation Knows About

Florian Bordes

Randall Balestriero

Pascal Vincent

DiffM

260

16 Dec 2021

SimMIM: A Simple Framework for Masked Image Modeling

Jianmin Bao

433

1,637

18 Nov 2021

Masked Autoencoders Are Scalable Vision Learners

Computer Vision and Pattern Recognition (CVPR), 2021

Piotr Dollár

2.5K

10,037

11 Nov 2021

Finetuned Language Models Are Zero-Shot Learners

1.7K

4,618

03 Sep 2021

Image Retrieval on Real-life Images with Pre-trained Vision-and-Language Models

IEEE International Conference on Computer Vision (ICCV), 2021

Zheyuan Liu

Cristian Rodriguez-Opazo

Damien Teney

Stephen Gould

VLM

296

285

09 Aug 2021

Learning Transferable Visual Models From Natural Language Supervision

International Conference on Machine Learning (ICML), 2021

...

2.0K

41,259

26 Feb 2021

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Alexey Dosovitskiy

...

1.4K

55,030

22 Oct 2020

The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization

...

Justin Gilmer

991

2,103

29 Jun 2020

Language Models are Few-Shot Learners

Neural Information Processing Systems (NeurIPS), 2020

...

2.0K

52,526

28 May 2020

ReZero is All You Need: Fast Convergence at Large Depth

Conference on Uncertainty in Artificial Intelligence (UAI), 2020

Thomas C. Bachlechner

Bodhisattwa Prasad Majumder

363

326

10 Mar 2020

Dream to Control: Learning Behaviors by Latent Imagination

International Conference on Learning Representations (ICLR), 2019

Jimmy Ba

580

1,613

03 Dec 2019

Composing Text and Image for Image Retrieval - An Empirical Odyssey

Li Fei-Fei

208

423

18 Dec 2018

Microsoft COCO: Common Objects in Context

European Conference on Computer Vision (ECCV), 2014

Piotr Dollár

17.8K

49,453

01 May 2014