f-MICL: Understanding and Generalizing InfoNCE-based Contrastive Learning

15 February 2024
Yiwei Lu
Guojun Zhang
Sun Sun
Hongyu Guo
Yaoliang Yu
arXiv:2402.10150
Abstract

In self-supervised contrastive learning, a widely adopted objective function is InfoNCE, which uses the heuristic cosine similarity to compare representations and is closely related to maximizing Kullback-Leibler (KL)-based mutual information. In this paper, we aim to answer two intriguing questions: (1) Can we go beyond the KL-based objective? (2) Besides the popular cosine similarity, can we design a better similarity function? We answer both questions by generalizing KL-based mutual information to f-Mutual Information in Contrastive Learning (f-MICL) using f-divergences. To answer the first question, we provide a wide range of f-MICL objectives that share the desirable properties of InfoNCE (e.g., alignment and uniformity) while achieving similar or even superior performance. For the second question, assuming that the joint feature distribution is proportional to the Gaussian kernel, we derive an f-Gaussian similarity with better interpretability and empirical performance. Finally, we identify close relationships between the f-MICL objective and several popular InfoNCE-based objectives. Using benchmark tasks from both vision and natural language, we empirically evaluate f-MICL with different f-divergences on various architectures (SimCLR, MoCo, and MoCo v3) and datasets. We observe that f-MICL generally outperforms the benchmarks, and that the best-performing f-divergence is task- and dataset-dependent.
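To make the contrast between the two objectives concrete, below is a minimal PyTorch-style sketch: the standard InfoNCE loss with cosine similarity, next to an illustrative f-MICL-style loss built from the generic variational lower bound on an f-divergence between the joint feature distribution and the product of marginals. The choice of KL as the f-divergence, the Gaussian-kernel similarity (standing in for the paper's f-Gaussian similarity), the bandwidth sigma, and all function names are assumptions for illustration; this is not the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(z1, z2, temperature=0.5):
    """Standard InfoNCE (SimCLR-style) with cosine similarity, for reference."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature               # pairwise cosine similarities
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)

def f_micl_style_loss(z1, z2, sigma=1.0):
    """Illustrative f-MICL-style loss (KL case); an assumed sketch, not the paper's objective.

    Uses the generic variational lower bound on an f-divergence between the
    joint feature distribution and the product of marginals,
        I_f >= E_joint[T] - E_product[f*(T)],
    with f*(t) = exp(t - 1) (convex conjugate of KL) and a Gaussian-kernel
    similarity playing the role of the critic T.
    """
    # Gaussian-kernel similarity between all pairs (hypothetical stand-in
    # for the paper's f-Gaussian similarity).
    sim = torch.exp(-torch.cdist(z1, z2).pow(2) / (2 * sigma ** 2))
    n = z1.size(0)
    pos = sim.diagonal()                             # positive pairs ~ joint distribution
    neg = sim[~torch.eye(n, dtype=torch.bool, device=z1.device)]  # negatives ~ product of marginals
    f_star = lambda t: torch.exp(t - 1.0)            # convex conjugate of KL
    # Negate the lower bound so minimizing the loss maximizes f-mutual information.
    return -(pos.mean() - f_star(neg).mean())

# Usage on random features standing in for two augmented views of a batch.
z1, z2 = torch.randn(8, 128), torch.randn(8, 128)
print(info_nce_loss(z1, z2).item(), f_micl_style_loss(z1, z2).item())
```

Swapping in a different f-divergence only changes the convex conjugate f*, which is what makes the f-MICL family a drop-in generalization of the KL-based InfoNCE setup.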
