
Jina-ColBERT-v2: A General-Purpose Multilingual Late Interaction Retriever

Main: 5 Pages
1 Figure
Bibliography: 3 Pages
7 Tables
Abstract

Multi-vector dense models, such as ColBERT, have proven highly effective in information retrieval. ColBERT's late interaction scoring approximates the joint query-document attention seen in cross-encoders while keeping inference efficiency closer to that of traditional dense retrieval models, thanks to its bi-encoder architecture and recent optimizations in indexing and search. In this paper, we introduce a novel architecture and a training framework to support long context windows and multilingual retrieval. Our new model, Jina-ColBERT-v2, demonstrates strong performance across a range of English and multilingual retrieval tasks,
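
As a rough illustration of the late interaction scoring mentioned above, the following is a minimal sketch of the MaxSim operator commonly associated with ColBERT-style models: each query token embedding is matched to its most similar document token embedding, and those maxima are summed. The function name `maxsim_score` and the toy vectors are assumptions made for illustration; they are not taken from the paper, and a real system would obtain the token embeddings from a trained encoder.

```python
import numpy as np


def maxsim_score(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    """Late interaction (MaxSim) score for one query and one document.

    query_vecs: array of shape (num_query_tokens, dim)
    doc_vecs:   array of shape (num_doc_tokens, dim)
    """
    # L2-normalize each token vector so dot products are cosine similarities.
    q = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)

    # Token-level similarity matrix: (num_query_tokens, num_doc_tokens).
    sim = q @ d.T

    # For each query token keep its best-matching document token, then sum.
    return float(sim.max(axis=1).sum())


# Toy (hypothetical) token embeddings: two query tokens, two document tokens.
query = np.array([[1.0, 0.0], [0.0, 1.0]])
doc_a = np.array([[1.0, 0.0], [0.0, 2.0]])  # matches both query tokens
doc_b = np.array([[1.0, 0.0]])              # matches only the first query token

score_a = maxsim_score(query, doc_a)
score_b = maxsim_score(query, doc_b)
```

Because query and document embeddings are computed independently, document token vectors can be indexed ahead of time, and only the inexpensive similarity-and-max step runs at query time.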
