2212.03533
Cited By
Text Embeddings by Weakly-Supervised Contrastive Pre-training
7 December 2022
Liang Wang
Nan Yang
Xiaolong Huang
Binxing Jiao
Linjun Yang
Daxin Jiang
Rangan Majumder
Furu Wei
VLM
Papers citing "Text Embeddings by Weakly-Supervised Contrastive Pre-training"
31 / 81 papers shown
Title
Transport of Algebraic Structure to Latent Embeddings
Samuel Pfrommer
Brendon G. Anderson
Somayeh Sojoudi
21
0
0
27 May 2024
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models
Chankyu Lee
Rajarshi Roy
Mengyao Xu
Jonathan Raiman
M. Shoeybi
Bryan Catanzaro
Wei Ping
RALM
38
137
0
27 May 2024
FlashRAG: A Modular Toolkit for Efficient Retrieval-Augmented Generation Research
Jiajie Jin
Yutao Zhu
Xinyu Yang
Chenghao Zhang
Tong Zhao
Zhao Yang
Zhicheng Dou
Ji-Rong Wen
VLM
73
45
0
22 May 2024
BMRetriever: Tuning Large Language Models as Better Biomedical Text Retrievers
Ran Xu
Wenqi Shi
Yue Yu
Yuchen Zhuang
Yanqiao Zhu
M. D. Wang
Joyce C. Ho
Chao Zhang
Carl Yang
LM&MA
40
19
0
29 Apr 2024
From Matching to Generation: A Survey on Generative Information Retrieval
Xiaoxi Li
Jiajie Jin
Yujia Zhou
Yuyao Zhang
Peitian Zhang
Yutao Zhu
Zhicheng Dou
3DV
67
45
0
23 Apr 2024
Generalized Contrastive Learning for Multi-Modal Retrieval and Ranking
Tianyu Zhu
M. Jung
Jesse Clark
83
1
0
12 Apr 2024
Gecko: Versatile Text Embeddings Distilled from Large Language Models
Jinhyuk Lee
Zhuyun Dai
Xiaoqi Ren
Blair Chen
Daniel Matthew Cer
...
Aditya Kusupati
Prateek Jain
Siddhartha Reddy Jonnalagadda
Ming-Wei Chang
Iftekhar Naim
RALM
VLM
SyDa
33
40
0
29 Mar 2024
IR2: Information Regularization for Information Retrieval
Jianyou Wang
Kaicheng Wang
Xiaoyue Wang
Weili Cao
R. Paturi
Leon Bergen
43
1
0
25 Feb 2024
Deep Learning-based Computational Job Market Analysis: A Survey on Skill Extraction and Classification from Job Postings
Elena Senger
Mike Zhang
Rob van der Goot
Barbara Plank
21
7
0
08 Feb 2024
ConFit: Improving Resume-Job Matching using Data Augmentation and Contrastive Learning
Xiao Yu
Jinzhong Zhang
Zhou Yu
25
1
0
29 Jan 2024
UNSEE: Unsupervised Non-contrastive Sentence Embeddings
Ömer Veysel Çagatan
SSL
19
0
0
27 Jan 2024
Knowledge Fusion of Large Language Models
Fanqi Wan
Xinting Huang
Deng Cai
Xiaojun Quan
Wei Bi
Shuming Shi
MoMe
22
61
0
19 Jan 2024
Learning High-Quality and General-Purpose Phrase Representations
Lihu Chen
Gaël Varoquaux
Fabian M. Suchanek
12
3
0
18 Jan 2024
Data-CUBE: Data Curriculum for Instruction-based Sentence Representation Learning
Yingqian Min
Kun Zhou
Dawei Gao
Wayne Xin Zhao
He Hu
Yaliang Li
24
1
0
07 Jan 2024
RETSim: Resilient and Efficient Text Similarity
Marina Zhang
Owen Vallis
Aysegul Bumin
Tanay Vakharia
Elie Bursztein
10
1
0
28 Nov 2023
A Setwise Approach for Effective and Highly Efficient Zero-shot Ranking with Large Language Models
Shengyao Zhuang
Honglei Zhuang
Bevan Koopman
Guido Zuccon
30
21
0
14 Oct 2023
ReLLa: Retrieval-enhanced Large Language Models for Lifelong Sequential Behavior Comprehension in Recommendation
Jianghao Lin
Rongjie Shan
Chenxu Zhu
Kounianhua Du
Bo Chen
Shigang Quan
Ruiming Tang
Yong Yu
Weinan Zhang
LRM
21
79
0
22 Aug 2023
RaLLe: A Framework for Developing and Evaluating Retrieval-Augmented Large Language Models
Yasuto Hoshi
Daisuke Miyashita
Youyang Ng
Kento Tatsuno
Yasuhiro Morioka
Osamu Torii
J. Deguchi
LRM
27
11
0
21 Aug 2023
SimplyRetrieve: A Private and Lightweight Retrieval-Centric Generative AI Tool
Youyang Ng
Daisuke Miyashita
Yasuto Hoshi
Yasuhiro Morioka
Osamu Torii
Tomoya Kodama
J. Deguchi
RALM
8
9
0
08 Aug 2023
Large Language Models as Batteries-Included Zero-Shot ESCO Skills Matchers
Benjamin Clavié
Guillaume Soulié
13
10
0
07 Jul 2023
Description-Based Text Similarity
Shauli Ravfogel
Valentina Pyatkin
Amir D. N. Cohen
Avshalom Manevich
Yoav Goldberg
17
5
0
21 May 2023
Curating corpora with classifiers: A case study of clean energy sentiment online
M. V. Arnold
P. Dodds
C. Danforth
13
0
0
04 May 2023
The MiniPile Challenge for Data-Efficient Language Models
Jean Kaddour
MoE
ALM
10
41
0
17 Apr 2023
Text and Code Embeddings by Contrastive Pre-Training
Arvind Neelakantan
Tao Xu
Raul Puri
Alec Radford
Jesse Michael Han
...
Tabarak Khan
Toki Sherbakov
Joanne Jang
Peter Welinder
Lilian Weng
SSL
AI4TS
204
412
0
24 Jan 2022
RocketQAv2: A Joint Training Method for Dense Passage Retrieval and Passage Re-ranking
Ruiyang Ren
Yingqi Qu
Jing Liu
Wayne Xin Zhao
Qiaoqiao She
Hua-Hong Wu
Haifeng Wang
Ji-Rong Wen
124
244
0
14 Oct 2021
Salient Phrase Aware Dense Retrieval: Can a Dense Retriever Imitate a Sparse One?
Xilun Chen
Kushal Lakhotia
Barlas Oğuz
Anchit Gupta
Patrick Lewis
Stanislav Peshterliev
Yashar Mehdad
Sonal Gupta
Wen-tau Yih
48
67
0
13 Oct 2021
Deduplicating Training Data Makes Language Models Better
Katherine Lee
Daphne Ippolito
A. Nystrom
Chiyuan Zhang
Douglas Eck
Chris Callison-Burch
Nicholas Carlini
SyDa
237
588
0
14 Jul 2021
BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models
Nandan Thakur
Nils Reimers
Andreas Rucklé
Abhishek Srivastava
Iryna Gurevych
VLM
229
961
0
17 Apr 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
293
3,683
0
11 Feb 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
245
1,977
0
31 Dec 2020
Efficient Estimation of Word Representations in Vector Space
Tomáš Mikolov
Kai Chen
G. Corrado
J. Dean
3DV
228
31,150
0
16 Jan 2013