CASE -- Condition-Aware Sentence Embeddings for Conditional Semantic Textual Similarity Measurement

21 March 2025

Gaifan Zhang

Yi Zhou

Danushka Bollegala

Main: 8 Pages

7 Figures

Bibliography: 3 Pages

11 Tables

Appendix: 4 Pages

Abstract

The meaning conveyed by a sentence often depends on the context in which it appears. Despite the progress of sentence embedding methods, it remains unclear how best to modify a sentence embedding conditioned on its context. To address this problem, we propose Condition-Aware Sentence Embeddings (CASE), an efficient and accurate method to create an embedding for a sentence under a given condition. First, CASE creates an embedding for the condition using a Large Language Model (LLM) encoder, where the sentence influences the attention scores computed for the tokens in the condition during pooling. Next, a supervised method is learnt to align the LLM-based text embeddings with the Conditional Semantic Textual Similarity (C-STS) task. We find that subtracting the condition embedding consistently improves the C-STS performance of LLM-based text embeddings by improving the isotropy of the embedding space. Moreover, our supervised projection method significantly improves the performance of LLM-based embeddings while requiring only a small number of embedding dimensions.
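
The subtraction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`cosine`, `conditional_similarity`) and the toy three-dimensional vectors are assumptions made for illustration, and in practice the embeddings would come from an LLM encoder. The sketch assumes that a condition-aware embedding is already available for each sentence and that the condition embedding is subtracted from both before cosine similarity is computed.

```python
import numpy as np


def cosine(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine similarity between two non-zero vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def conditional_similarity(emb_s1: np.ndarray,
                           emb_s2: np.ndarray,
                           emb_c: np.ndarray) -> float:
    """Similarity of two sentence embeddings after removing the
    condition embedding from each (hypothetical helper)."""
    residual_1 = emb_s1 - emb_c
    residual_2 = emb_s2 - emb_c
    return cosine(residual_1, residual_2)


# Toy vectors (assumed values, not real model outputs). Both sentence
# embeddings share a large component with the condition embedding.
emb_c = np.array([1.0, 1.0, 0.0])
emb_s1 = np.array([2.0, 1.0, 0.0])
emb_s2 = np.array([1.0, 2.0, 0.0])

similarity_raw = cosine(emb_s1, emb_s2)
similarity_conditioned = conditional_similarity(emb_s1, emb_s2, emb_c)
print(similarity_raw, similarity_conditioned)
```

In this toy case the raw similarity is high because both vectors are dominated by the shared condition component, while the similarity after subtraction reflects only how the two sentences differ beyond that shared component. The supervised projection mentioned in the abstract is not shown here.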
