Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2410.05160
Cited By
VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks
3 January 2025
Ziyan Jiang
Rui Meng
Xinyi Yang
Semih Yavuz
Yingbo Zhou
Wenhu Chen
MLLM
VLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks"
13 / 13 papers shown
Title
Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs
Tiancheng Gu
Kaicheng Yang
Ziyong Feng
Xingjun Wang
Yanzhao Zhang
Dingkun Long
Yingda Chen
Weidong Cai
Jiankang Deng
VLM
63
0
0
24 Apr 2025
MIEB: Massive Image Embedding Benchmark
Chenghao Xiao
Isaac Chung
Imene Kerboua
Jamie Stirling
Xin Zhang
Márton Kardos
Roman Solomatin
Noura Al Moubayed
K. Enevoldsen
Niklas Muennighoff
VLM
35
0
0
14 Apr 2025
FocalLens: Instruction Tuning Enables Zero-Shot Conditional Image Representations
Cheng-Yu Hsieh
Pavan Kumar Anasosalu Vasu
Fartash Faghri
Raviteja Vemulapalli
Chun-Liang Li
Ranjay Krishna
Oncel Tuzel
Hadi Pouransari
VLM
49
0
0
11 Apr 2025
IDMR: Towards Instance-Driven Precise Visual Correspondence in Multimodal Retrieval
Bangwei Liu
Yicheng Bao
Shaohui Lin
Xuhong Wang
Xin Tan
Y. Wang
Yuan Xie
Chaochao Lu
48
0
0
01 Apr 2025
Fwd2Bot: LVLM Visual Token Compression with Double Forward Bottleneck
Adrian Bulat
Yassine Ouali
Georgios Tzimiropoulos
43
0
0
27 Mar 2025
VLog: Video-Language Models by Generative Retrieval of Narration Vocabulary
Kevin Qinghong Lin
Mike Zheng Shou
VGen
50
1
0
12 Mar 2025
A Novel Trustworthy Video Summarization Algorithm Through a Mixture of LoRA Experts
Wenzhuo Du
G. Wang
Guancheng Chen
Hang Zhao
X. Li
Jian Gao
57
0
0
08 Mar 2025
ABC: Achieving Better Control of Multimodal Embeddings using VLMs
Benjamin Schneider
Florian Kerschbaum
Wenhu Chen
31
0
0
01 Mar 2025
Joint Fusion and Encoding: Advancing Multimodal Retrieval from the Ground Up
Lang Huang
Qiyu Wu
Zhongtao Miao
T. Yamasaki
45
0
0
27 Feb 2025
Any Information Is Just Worth One Single Screenshot: Unifying Search With Visualized Information Retrieval
Ze Liu
Zhengyang Liang
Junjie Zhou
Zheng Liu
Defu Lian
OffRL
46
0
0
17 Feb 2025
GME: Improving Universal Multimodal Retrieval by Multimodal LLMs
Xin Zhang
Yanzhao Zhang
Wen Xie
Mingxin Li
Ziqi Dai
Dingkun Long
Pengjun Xie
Meishan Zhang
Wenjie Li
M. Zhang
95
7
0
22 Dec 2024
Beyond Training: Dynamic Token Merging for Zero-Shot Video Understanding
Yiming Zhang
Zhuokai Zhao
Zhaorun Chen
Zenghui Ding
Xianjun Yang
Yining Sun
79
1
0
21 Nov 2024
ColPali: Efficient Document Retrieval with Vision Language Models
Manuel Faysse
Hugues Sibille
Tony Wu
Bilel Omrani
Gautier Viaud
C´eline Hudelot
Pierre Colombo
VLM
43
21
0
27 Jun 2024
1