2503.17109
Missing Target-Relevant Information Prediction with World Model for Accurate Zero-Shot Composed Image Retrieval
Computer Vision and Pattern Recognition (CVPR), 2025
21 March 2025
Yuanmin Tang
Jing Yu
Keke Gai
Jiamin Zhuang
Gang Xiong
Gaopeng Gou
Qi Wu
VGen
Papers citing
"Missing Target-Relevant Information Prediction with World Model for Accurate Zero-Shot Composed Image Retrieval"
46 / 46 papers shown
CrossJEPA: Cross-Modal Joint-Embedding Predictive Architecture for Efficient 3D Representation Learning from 2D Images
Avishka Perera
Kumal Hewagamage
Saeedha Nazar
Kavishka Abeywardana
Hasitha Gallella
Ranga Rodrigo
Mohamed Afham
3DV
176
0
0
23 Nov 2025
Self-Correction Distillation for Structured Data Question Answering
Yushan Zhu
Wen Zhang
Long Jin
Mengshu Sun
Ling Zhong
...
Juan-Zi Li
Lei Liang
Chong Long
Chao Deng
Junlan Feng
209
0
0
11 Nov 2025
SAIL-Embedding Technical Report: Omni-modal Embedding Foundation Model
Lin Lin
Jiefeng Long
Zhihe Wan
Y. Wang
Dingkang Yang
...
Yan Qiu
Haiyang Yu
Xiao Liang
Hongsheng Li
Chao Feng
248
3
0
14 Oct 2025
CIR-CoT: Towards Interpretable Composed Image Retrieval via End-to-End Chain-of-Thought Reasoning
Weihuang Lin
Yiwei Ma
Jinfa Huang
Xiaoshuai Sun
Rongrong Ji
LRM
147
0
0
09 Oct 2025
HLFormer: Enhancing Partially Relevant Video Retrieval with Hyperbolic Learning
Jun Li
Jinpeng Wang
Chaolei Tan
Niu Lian
Long Chen
Yaowei Wang
Min Zhang
Shu-Tao Xia
Bin Chen
242
4
0
23 Jul 2025
DetailFusion: A Dual-branch Framework with Detail Enhancement for Composed Image Retrieval
Yuxin Yang
Yinan Zhou
Yuxin Chen
Ziqi Zhang
Zongyang Ma
...
Bing Li
Lin Song
Jun Gao
Peng Li
Weiming Hu
464
1
0
23 May 2025
Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs
Tiancheng Gu
Kaicheng Yang
Ziyong Feng
Xingjun Wang
Yanzhao Zhang
Dingkun Long
Yingda Chen
Weidong Cai
Jiankang Deng
VLM
903
35
0
24 Apr 2025
Fine-grained Textual Inversion Network for Zero-Shot Composed Image Retrieval
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2024
Haoqiang Lin
Haokun Wen
Xuemeng Song
Meng Liu
Yupeng Hu
Liqiang Nie
421
28
0
25 Mar 2025
Composed Multi-modal Retrieval: A Survey of Approaches and Applications
Kun Zhang
Jingyu Li
Zhiyu Li
Jingjing Zhang
F. Li
...
Nan Chen
Lei Zhang
Yongdong Zhang
Zhendong Mao
S.Kevin Zhou
402
1
0
03 Mar 2025
A Comprehensive Survey on Composed Image Retrieval
Xuemeng Song
Haoqiang Lin
Haokun Wen
Bohan Hou
Mingzhu Xu
Liqiang Nie
479
7
0
19 Feb 2025
Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval
Computer Vision and Pattern Recognition (CVPR), 2024
Yuanmin Tang
Xiaoting Qin
Jing Zhang
Jing Yu
Gaopeng Gou
Gang Xiong
Qingwei Ling
Saravan Rajmohan
Dongmei Zhang
Qi Wu
LRM
400
11
0
15 Dec 2024
Pseudo-triplet Guided Few-shot Composed Image Retrieval
Bohan Hou
Haoqiang Lin
Haokun Wen
Meng Liu
Xuemeng Song
312
5
0
08 Jul 2024
Zero-shot Composed Image Retrieval Considering Query-target Relationship Leveraging Masked Image-text Pairs
Huaying Zhang
Rintaro Yanagi
Ren Togo
Takahiro Ogawa
Miki Haseyama
213
11
0
27 Jun 2024
Spherical Linear Interpolation and Text-Anchoring for Zero-shot Composed Image Retrieval
Young Kyun Jang
Dat Huynh
Ashish Shah
Wen-Kai Chen
Ser-Nam Lim
342
32
0
01 May 2024
Visual Delta Generator with Large Multi-modal Models for Semi-supervised Composed Image Retrieval
Young Kyun Jang
Donghyun Kim
Zihang Meng
Dat Huynh
Ser-Nam Lim
188
18
0
23 Apr 2024
MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions
Kai Zhang
Yi Luan
Hexiang Hu
Kenton Lee
Siyuan Qiao
Wenhu Chen
Yu-Chuan Su
Ming-Wei Chang
VLM
LRM
295
73
0
28 Mar 2024
Knowledge-Enhanced Dual-stream Zero-shot Composed Image Retrieval
Computer Vision and Pattern Recognition (CVPR), 2024
Yuchen Suo
Fan Ma
Linchao Zhu
Yi Yang
238
42
0
24 Mar 2024
Image2Sentence based Asymmetrical Zero-shot Composed Image Retrieval
Yongchao Du
Min Wang
Wen-gang Zhou
Shuping Hui
Houqiang Li
149
18
0
03 Mar 2024
Language-only Efficient Training of Zero-shot Composed Image Retrieval
Computer Vision and Pattern Recognition (CVPR), 2023
Geonmo Gu
Sanghyuk Chun
Wonjae Kim
Yoohoon Kang
Sangdoo Yun
352
31
0
04 Dec 2023
Pretrain like Your Inference: Masked Tuning Improves Zero-Shot Composed Image Retrieval
Junyang Chen
Hanjiang Lai
VLM
455
16
0
13 Nov 2023
Vision-by-Language for Training-Free Compositional Image Retrieval
Shyamgopal Karthik
Karsten Roth
Goran Frehse
Zeynep Akata
CoGe
367
88
0
13 Oct 2023
Learning Interactive Real-World Simulators
International Conference on Learning Representations (ICLR), 2023
Mengjiao Yang
Yilun Du
Kamyar Ghasemipour
Jonathan Tompson
Leslie Kaelbling
Dale Schuurmans
Pieter Abbeel
LM&Ro
PINN
345
330
0
09 Oct 2023
Context-I2W: Mapping Images to Context-dependent Words for Accurate Zero-Shot Composed Image Retrieval
AAAI Conference on Artificial Intelligence (AAAI), 2023
Yuanmin Tang
Jing Yu
Keke Gai
Jiamin Zhuang
Gang Xiong
Yue Hu
Qi Wu
203
54
0
28 Sep 2023
GeneCIS: A Benchmark for General Conditional Image Similarity
Computer Vision and Pattern Recognition (CVPR), 2023
S. Vaze
Nicolas Carion
Ishan Misra
VLM
DiffM
247
40
0
13 Jun 2023
Zero-Shot Composed Image Retrieval with Textual Inversion
IEEE International Conference on Computer Vision (ICCV), 2023
Alberto Baldrati
Lorenzo Agnolucci
Marco Bertini
Alberto Del Bimbo
278
160
0
27 Mar 2023
CompoDiff: Versatile Composed Image Retrieval With Latent Diffusion
Geonmo Gu
Sanghyuk Chun
Wonjae Kim
HeeJae Jun
Yoohoon Kang
Sangdoo Yun
DiffM
550
77
0
21 Mar 2023
Pic2Word: Mapping Pictures to Words for Zero-shot Composed Image Retrieval
Computer Vision and Pattern Recognition (CVPR), 2023
Kuniaki Saito
Kihyuk Sohn
Xiang Zhang
Chun-Liang Li
Chen-Yu Lee
Kate Saenko
Tomas Pfister
308
166
0
06 Feb 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
International Conference on Machine Learning (ICML), 2023
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
1.3K
6,661
0
30 Jan 2023
Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture
Computer Vision and Pattern Recognition (CVPR), 2023
Mahmoud Assran
Quentin Duval
Ishan Misra
Piotr Bojanowski
Pascal Vincent
Michael G. Rabbat
Yann LeCun
Nicolas Ballas
SSL
AI4TS
MDE
465
569
0
19 Jan 2023
Flamingo: a Visual Language Model for Few-Shot Learning
Neural Information Processing Systems (NeurIPS), 2022
Jean-Baptiste Alayrac
Jeff Donahue
Pauline Luc
Antoine Miech
Iain Barr
...
Mikolaj Binkowski
Ricardo Barreira
Oriol Vinyals
Andrew Zisserman
Karen Simonyan
MLLM
VLM
695
4,861
0
29 Apr 2022
Conditional Prompt Learning for Vision-Language Models
Computer Vision and Pattern Recognition (CVPR), 2022
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VLM
CLIP
VPVLM
508
1,867
0
10 Mar 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
International Conference on Machine Learning (ICML), 2022
Junnan Li
Dongxu Li
Caiming Xiong
Steven C. H. Hoi
MLLM
BDL
VLM
CLIP
1.3K
5,760
0
28 Jan 2022
High-Resolution Image Synthesis with Latent Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2021
Robin Rombach
A. Blattmann
Dominik Lorenz
Patrick Esser
Bjorn Ommer
DiffM
3.0K
21,096
0
20 Dec 2021
High Fidelity Visualization of What Your Self-Supervised Representation Knows About
Florian Bordes
Randall Balestriero
Pascal Vincent
DiffM
260
71
0
16 Dec 2021
SimMIM: A Simple Framework for Masked Image Modeling
Zhenda Xie
Zheng Zhang
Yue Cao
Yutong Lin
Jianmin Bao
Zhuliang Yao
Qi Dai
Han Hu
433
1,637
0
18 Nov 2021
Masked Autoencoders Are Scalable Vision Learners
Computer Vision and Pattern Recognition (CVPR), 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
2.5K
10,037
0
11 Nov 2021
Finetuned Language Models Are Zero-Shot Learners
Jason W. Wei
Maarten Bosma
Vincent Zhao
Kelvin Guu
Adams Wei Yu
Brian Lester
Nan Du
Andrew M. Dai
Quoc V. Le
ALM
UQCV
1.7K
4,618
0
03 Sep 2021
Image Retrieval on Real-life Images with Pre-trained Vision-and-Language Models
IEEE International Conference on Computer Vision (ICCV), 2021
Zheyuan Liu
Cristian Rodriguez-Opazo
Damien Teney
Stephen Gould
VLM
296
285
0
09 Aug 2021
Learning Transferable Visual Models From Natural Language Supervision
International Conference on Machine Learning (ICML), 2021
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
2.0K
41,259
0
26 Feb 2021
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
1.4K
55,030
0
22 Oct 2020
The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization
Dan Hendrycks
Steven Basart
Norman Mu
Saurav Kadavath
Frank Wang
...
Samyak Parajuli
Mike Guo
Basel Alomair
Jacob Steinhardt
Justin Gilmer
OOD
991
2,103
0
29 Jun 2020
Language Models are Few-Shot Learners
Neural Information Processing Systems (NeurIPS), 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
2.0K
52,526
0
28 May 2020
ReZero is All You Need: Fast Convergence at Large Depth
Conference on Uncertainty in Artificial Intelligence (UAI), 2020
Thomas C. Bachlechner
Bodhisattwa Prasad Majumder
H. H. Mao
G. Cottrell
Julian McAuley
AI4CE
363
326
0
10 Mar 2020
Dream to Control: Learning Behaviors by Latent Imagination
International Conference on Learning Representations (ICLR), 2019
Danijar Hafner
Timothy Lillicrap
Jimmy Ba
Mohammad Norouzi
VLM
580
1,613
0
03 Dec 2019
Composing Text and Image for Image Retrieval - An Empirical Odyssey
Nam S. Vo
Lu Jiang
Chen Sun
Kevin Patrick Murphy
Li-Jia Li
Li Fei-Fei
James Hays
CoGe
208
423
0
18 Dec 2018
Microsoft COCO: Common Objects in Context
European Conference on Computer Vision (ECCV), 2014
Tsung-Yi Lin
Michael Maire
Serge J. Belongie
Lubomir Bourdev
Ross B. Girshick
James Hays
Pietro Perona
Deva Ramanan
C. L. Zitnick
Piotr Dollár
ObjD
17.8K
49,453
0
01 May 2014