
Out-of-the-Box Conditional Text Embeddings from Large Language Models

Abstract

Conditional text embedding is a representation that captures how the perspective on a text shifts when it is conditioned on a specific aspect. Previous methods rely on extensive training data to fine-tune models, incurring substantial labor and resource costs. We propose PonTE, a novel unsupervised conditional text embedding method that leverages a causal large language model and a conditional prompt. Through experiments on conditional semantic text similarity and text clustering, we demonstrate that PonTE generates useful conditional text embeddings and achieves performance comparable to supervised methods without any fine-tuning. We also demonstrate the interpretability of PonTE embeddings by analyzing the words generated following the prompts and by visualizing the embeddings.
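The core idea of prompt-based conditional embedding can be sketched as follows. The template below is a hypothetical illustration, not the exact prompt verified in the paper: the text and the conditioning aspect are inserted into a prompt that asks a causal LLM to compress the text into one word, and the hidden state of the final prompt token would then serve as the conditional embedding.

```python
def conditional_prompt(text: str, condition: str) -> str:
    """Build a conditional prompt for a causal LLM (hypothetical template).

    With a real causal language model (e.g. loaded via Hugging Face
    transformers with output_hidden_states=True), the last-token hidden
    state of this prompt would be used as the conditional embedding.
    """
    return (
        f'This sentence: "{text}", in terms of {condition}, '
        f'means in one word: "'
    )


# The same sentence yields different prompts, and hence different
# embeddings, depending on the conditioning aspect.
prompt = conditional_prompt("A man is playing a guitar on stage.", "location")
```

Because no parameters are updated, the same frozen model can serve arbitrary aspects simply by swapping the condition string, which is what makes the method unsupervised and "out of the box".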

@article{yamada2025_2504.16411,
  title={Out-of-the-Box Conditional Text Embeddings from Large Language Models},
  author={Kosuke Yamada and Peinan Zhang},
  journal={arXiv preprint arXiv:2504.16411},
  year={2025}
}