Four Algorithms for Correlation Clustering: A Survey

24 August 2022

Abstract

In the Correlation Clustering problem, we are given a set of objects with pairwise similarity information. Our aim is to partition these objects into clusters that match this information as closely as possible. More specifically, the pairwise information is given as a weighted graph $G$ whose edges are labelled "similar" or "dissimilar" by a binary classifier. The goal is to produce a clustering that minimizes the weight of "disagreements": the sum of the weights of similar edges across clusters and dissimilar edges within clusters. In this exposition we focus on the case where $G$ is complete and unweighted, and we explore four approximation algorithms for the Correlation Clustering problem under this assumption: (i) the $17429$-approximation algorithm by Bansal, Blum, and Chawla, (ii) the $4$-approximation algorithm by Charikar, Guruswami, and Wirth, (iii) the $3$-approximation algorithm by Ailon, Charikar, and Newman, and (iv) the $2.06$-approximation algorithm by Chawla, Makarychev, Schramm, and Yaroslavtsev.
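To make the objective concrete, the following is a minimal Python sketch for the complete, unweighted case. It counts disagreements for a given clustering and then runs a randomized pivot procedure. The abstract does not spell out any of the four algorithms; the sketch assumes the pivot scheme commonly associated with the $3$-approximation in (iii), where the guarantee holds in expectation over the random pivot choices. The names `disagreements` and `pivot_clustering` and the encoding of similar edges as a set of `frozenset` pairs are illustrative choices, not taken from the paper.

```python
import random
from itertools import combinations


def disagreements(nodes, similar, clusters):
    """Count disagreements of `clusters` on a complete unweighted graph.

    `similar` is a set of frozenset({u, v}) pairs labelled "similar";
    every other pair of nodes is treated as "dissimilar" (an assumed encoding).
    """
    cluster_of = {v: i for i, cluster in enumerate(clusters) for v in cluster}
    cost = 0
    for u, v in combinations(nodes, 2):
        together = cluster_of[u] == cluster_of[v]
        is_similar = frozenset((u, v)) in similar
        # A similar pair that is split, or a dissimilar pair that is joined.
        if together != is_similar:
            cost += 1
    return cost


def pivot_clustering(nodes, similar, rng=None):
    """Randomized pivot sketch: pick a random pivot, cluster it with its
    remaining similar neighbours, remove them, and repeat."""
    rng = rng or random.Random()
    remaining = list(nodes)
    clusters = []
    while remaining:
        pivot = rng.choice(remaining)
        cluster = [v for v in remaining
                   if v == pivot or frozenset((pivot, v)) in similar]
        clusters.append(cluster)
        members = set(cluster)
        remaining = [v for v in remaining if v not in members]
    return clusters


if __name__ == "__main__":
    nodes = [0, 1, 2, 3, 4]
    # Two "similar" cliques, {0, 1, 2} and {3, 4}; all other pairs are dissimilar.
    similar = {frozenset(p) for p in [(0, 1), (0, 2), (1, 2), (3, 4)]}
    found = pivot_clustering(nodes, similar, random.Random(0))
    print(found, disagreements(nodes, similar, found))
```

On an instance like the one above, where the similar edges form disjoint cliques, any pivot order recovers the cliques and the cost is zero. On inconsistent labellings the output depends on the pivot order, which is why the guarantee is stated in expectation.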
