arXiv:2308.03281
Towards General Text Embeddings with Multi-stage Contrastive Learning
7 August 2023
Zehan Li
Xin Zhang
Yanzhao Zhang
Dingkun Long
Pengjun Xie
Meishan Zhang
Papers citing
"Towards General Text Embeddings with Multi-stage Contrastive Learning"
41 / 41 papers shown
UKElectionNarratives: A Dataset of Misleading Narratives Surrounding Recent UK General Elections
Fatima Haouari
Carolina Scarton
Nicolò Faggiani
Nikolaos Nikolaidis
Bonka Kotseva
Ibrahim Abu Farha
Jens Linge
Kalina Bontcheva
34
0
0
08 May 2025
PropRAG: Guiding Retrieval with Beam Search over Proposition Paths
Jingjin Wang
LRM
51
0
0
25 Apr 2025
Don't Retrieve, Generate: Prompting LLMs for Synthetic Training Data in Dense Retrieval
Aarush Sinha
RALM
73
0
0
20 Apr 2025
From Large to Super-Tiny: End-to-End Optimization for Cost-Efficient LLMs
Jiliang Ni
Jiachen Pu
Zhongyi Yang
Kun Zhou
Hui Wang
Xiaoliang Xiao
Dakui Wang
Xin Li
Jingfeng Luo
Conggang Hu
32
0
0
18 Apr 2025
GOLLuM: Gaussian Process Optimized LLMs -- Reframing LLM Finetuning through Bayesian Optimization
Bojana Ranković
P. Schwaller
BDL
74
0
0
08 Apr 2025
Distillation and Refinement of Reasoning in Small Language Models for Document Re-ranking
Chris Samarinas
Hamed Zamani
ALM
LRM
66
0
0
04 Apr 2025
CASE -- Condition-Aware Sentence Embeddings for Conditional Semantic Textual Similarity Measurement
Gaifan Zhang
Yi Zhou
Danushka Bollegala
61
0
0
21 Mar 2025
MultiConIR: Towards multi-condition Information Retrieval
Xuan Lu
Sifan Liu
Bochao Yin
Y. K. Li
Xinghao Chen
Hui Su
Yaohui Jin
Wenjun Zeng
Xiaoyu Shen
64
0
0
13 Mar 2025
SurveyForge: On the Outline Heuristics, Memory-Driven Generation, and Multi-dimensional Evaluation for Automated Survey Writing
Xiangchao Yan
Shiyang Feng
Jiakang Yuan
Renqiu Xia
Bin Wang
Bo Zhang
Lei Bai
60
2
0
06 Mar 2025
Following the Autoregressive Nature of LLM Embeddings via Compression and Alignment
Jingcheng Deng
Zhongtao Jiang
Liang Pang
Liwei Chen
Kun Xu
Zihao Wei
Huawei Shen
Xueqi Cheng
49
1
0
17 Feb 2025
FinMTEB: Finance Massive Text Embedding Benchmark
Yixuan Tang
Yi Yang
AIFin
58
0
0
16 Feb 2025
Ask in Any Modality: A Comprehensive Survey on Multimodal Retrieval-Augmented Generation
Mohammad Mahdi Abootorabi
Amirhosein Zobeiri
Mahdi Dehghani
Mohammadali Mohammadkhani
Bardia Mohammadi
Omid Ghahroodi
M. Baghshah
Ehsaneddin Asgari
RALM
95
4
0
12 Feb 2025
VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks
Ziyan Jiang
Rui Meng
Xinyi Yang
Semih Yavuz
Yingbo Zhou
Wenhu Chen
MLLM
VLM
51
18
0
03 Jan 2025
LUSIFER: Language Universal Space Integration for Enhanced Multilingual Embeddings with Large Language Models
Hieu Man
Nghia Trung Ngo
Viet Dac Lai
Ryan Rossi
Franck Dernoncourt
T. Nguyen
64
0
0
01 Jan 2025
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
Wenqi Zhang
Hang Zhang
Xin Li
Jiashuo Sun
Yongliang Shen
Weiming Lu
Deli Zhao
Yueting Zhuang
Lidong Bing
VLM
37
2
0
01 Jan 2025
GME: Improving Universal Multimodal Retrieval by Multimodal LLMs
Xin Zhang
Yanzhao Zhang
Wen Xie
Mingxin Li
Ziqi Dai
Dingkun Long
Pengjun Xie
Meishan Zhang
Wenjie Li
M. Zhang
111
7
0
22 Dec 2024
Evaluating Creative Short Story Generation in Humans and Large Language Models
Mete Ismayilzada
Claire Stevenson
Lonneke van der Plas
LM&MA
LRM
30
3
0
04 Nov 2024
TurboRAG: Accelerating Retrieval-Augmented Generation with Precomputed KV Caches for Chunked Text
Songshuo Lu
Hua Wang
Yutian Rong
Zhi Chen
Yaohua Tang
VLM
28
11
0
10 Oct 2024
Task-Adaptive Pretrained Language Models via Clustered-Importance Sampling
David Grangier
Simin Fan
Skyler Seto
Pierre Ablin
34
3
0
30 Sep 2024
Do We Need Domain-Specific Embedding Models? An Empirical Investigation
Yixuan Tang
Yi Yang
AIFin
38
3
0
27 Sep 2024
The Russian-focused embedders' exploration: ruMTEB benchmark and Russian embedding model design
Artem Snegirev
Maria Tikhonova
Anna Maksimova
Alena Fenogenova
Alexander Abramov
19
4
0
22 Aug 2024
Understanding Generative AI Content with Embedding Models
Max Vargas
Reilly Cannon
A. Engel
Anand D. Sarwate
Tony Chiang
42
3
0
19 Aug 2024
Moonshine: Distilling Game Content Generators into Steerable Generative Models
Yuhe Nie
Michael Middleton
Tim Merino
Nidhushan Kanagaraja
Ashutosh Kumar
Zhan Zhuang
Julian Togelius
45
0
0
18 Aug 2024
RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework
Kunlun Zhu
Yifan Luo
Dingling Xu
Ruobing Wang
Shi Yu
...
Yishan Li
Zhiyuan Liu
Xu Han
Zhiyuan Liu
Maosong Sun
29
17
0
02 Aug 2024
TAROT: Task-Oriented Authorship Obfuscation Using Policy Optimization Methods
Gabriel Loiseau
Damien Sileo
Damien Riquet
Maxime Meyer
Marc Tommasi
25
0
0
31 Jul 2024
NV-Retriever: Improving text embedding models with effective hard-negative mining
G. D. S. P. Moreira
Radek Osmulski
Mengyao Xu
Ronay Ak
Benedikt D. Schifferer
Even Oldridge
RALM
41
30
0
22 Jul 2024
ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities
Peng-Tao Xu
Wei Ping
Xianchao Wu
Zihan Liu
Mohammad Shoeybi
Bryan Catanzaro
RALM
44
14
0
19 Jul 2024
CoIR: A Comprehensive Benchmark for Code Information Retrieval Models
Xiangyang Li
Kuicai Dong
Yi Quan Lee
Wei Xia
Yichun Yin
Xinyi Dai
Yasheng Wang
Ruiming Tang
47
15
0
03 Jul 2024
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models
Chankyu Lee
Rajarshi Roy
Mengyao Xu
Jonathan Raiman
M. Shoeybi
Bryan Catanzaro
Wei Ping
RALM
38
137
0
27 May 2024
Gecko: Versatile Text Embeddings Distilled from Large Language Models
Jinhyuk Lee
Zhuyun Dai
Xiaoqi Ren
Blair Chen
Daniel Matthew Cer
...
Aditya Kusupati
Prateek Jain
Siddhartha Reddy Jonnalagadda
Ming-Wei Chang
Iftekhar Naim
RALM
VLM
SyDa
30
40
0
29 Mar 2024
WebLINX: Real-World Website Navigation with Multi-Turn Dialogue
Xing Han Lù
Zdeněk Kasner
Siva Reddy
22
59
0
08 Feb 2024
Convincing Rationales for Visual Question Answering Reasoning
Kun Li
G. Vosselman
Michael Ying Yang
34
1
0
06 Feb 2024
UNSEE: Unsupervised Non-contrastive Sentence Embeddings
Ömer Veysel Çagatan
SSL
12
0
0
27 Jan 2024
Data-CUBE: Data Curriculum for Instruction-based Sentence Representation Learning
Yingqian Min
Kun Zhou
Dawei Gao
Wayne Xin Zhao
He Hu
Yaliang Li
21
1
0
07 Jan 2024
A Comprehensive Survey of Sentence Representations: From the BERT Epoch to the ChatGPT Era and Beyond
Abhinav Ramesh Kashyap
Thang-Tung Nguyen
Viktor Schlegel
Stefan Winkler
See-Kiong Ng
Soujanya Poria
AI4TS
3DV
SSL
29
6
0
22 May 2023
Description-Based Text Similarity
Shauli Ravfogel
Valentina Pyatkin
Amir D. N. Cohen
Avshalom Manevich
Yoav Goldberg
14
5
0
21 May 2023
RetroMAE: Pre-Training Retrieval-oriented Language Models Via Masked Auto-Encoder
Shitao Xiao
Zheng Liu
Yingxia Shao
Zhao Cao
RALM
115
105
0
24 May 2022
Text and Code Embeddings by Contrastive Pre-Training
Arvind Neelakantan
Tao Xu
Raul Puri
Alec Radford
Jesse Michael Han
...
Tabarak Khan
Toki Sherbakov
Joanne Jang
Peter Welinder
Lilian Weng
SSL
AI4TS
204
412
0
24 Jan 2022
PAIR: Leveraging Passage-Centric Similarity Relation for Improving Dense Passage Retrieval
Ruiyang Ren
Shangwen Lv
Yingqi Qu
Jing Liu
Wayne Xin Zhao
Qiaoqiao She
Hua-Hong Wu
Haifeng Wang
Ji-Rong Wen
118
90
0
13 Aug 2021
Unsupervised Corpus Aware Language Model Pre-training for Dense Passage Retrieval
Luyu Gao
Jamie Callan
RALM
152
326
0
12 Aug 2021
BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models
Nandan Thakur
Nils Reimers
Andreas Rucklé
Abhishek Srivastava
Iryna Gurevych
VLM
229
961
0
17 Apr 2021