
Fair Representation Clustering with Several Protected Classes

Abstract

We study the problem of fair $k$-median where each cluster is required to have a fair representation of individuals from different groups. In the fair representation $k$-median problem, we are given a set of points $X$ in a metric space. Each point $x \in X$ belongs to one of $\ell$ groups. Further, we are given fair representation parameters $\alpha_j$ and $\beta_j$ for each group $j \in [\ell]$. We say that a $k$-clustering $C_1, \cdots, C_k$ fairly represents all groups if the number of points from group $j$ in cluster $C_i$ is between $\alpha_j |C_i|$ and $\beta_j |C_i|$ for every $j \in [\ell]$ and $i \in [k]$. The goal is to find a set $\mathcal{C}$ of $k$ centers and an assignment $\phi: X \rightarrow \mathcal{C}$ such that the clustering defined by $(\mathcal{C}, \phi)$ fairly represents all groups and minimizes the $\ell_1$-objective $\sum_{x \in X} d(x, \phi(x))$. We present an $O(\log k)$-approximation algorithm that runs in time $n^{O(\ell)}$. Note that the known algorithms for the problem either (i) violate the fairness constraints by an additive term or (ii) run in time that is exponential in both $k$ and $\ell$. We also consider an important special case of the problem where $\alpha_j = \beta_j = \frac{f_j}{f}$ and $f_j, f \in \mathbb{N}$ for all $j \in [\ell]$. For this special case, we present an $O(\log k)$-approximation algorithm that runs in $(kf)^{O(\ell)} \log n + \mathrm{poly}(n)$ time.
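
To make the problem definition concrete, the following Python sketch evaluates a given solution: it computes the $\ell_1$-objective and checks whether every cluster satisfies the bounds $\alpha_j |C_i|$ and $\beta_j |C_i|$. It only verifies a candidate clustering and is not the approximation algorithm from the paper. The function names `kmedian_cost` and `fairly_represents`, the one-dimensional toy instance, and the use of `Fraction` for the parameters are assumptions made for illustration.

```python
from collections import Counter
from fractions import Fraction


def kmedian_cost(points, assignment, dist):
    """Sum of d(x, phi(x)) over all points; assignment[i] is the center of points[i]."""
    return sum(dist(x, c) for x, c in zip(points, assignment))


def fairly_represents(groups, assignment, alpha, beta):
    """Check alpha[j]*|C_i| <= #(group j in C_i) <= beta[j]*|C_i| for all i, j.

    groups[i] is the group label of point i; alpha and beta map each group
    label to its lower and upper representation parameter.
    """
    cluster_size = Counter(assignment)
    group_count = Counter(zip(assignment, groups))
    for center, size in cluster_size.items():
        for j in alpha:
            n_j = group_count[(center, j)]  # Counter returns 0 for missing keys
            if not (alpha[j] * size <= n_j <= beta[j] * size):
                return False
    return True


# Hypothetical toy instance: four points on the real line, two groups,
# k = 2 centers located at 0 and 10, and alpha_j = beta_j = 1/2.
points = [0, 1, 10, 11]
groups = ["a", "b", "a", "b"]
assignment = [0, 0, 10, 10]
half = Fraction(1, 2)
alpha = {"a": half, "b": half}
beta = {"a": half, "b": half}


def dist(x, y):
    return abs(x - y)


cost = kmedian_cost(points, assignment, dist)
is_fair = fairly_represents(groups, assignment, alpha, beta)
```

Exact rationals are used for $\alpha_j$ and $\beta_j$ so that the tight case $\alpha_j = \beta_j = \frac{f_j}{f}$ is not affected by floating-point rounding.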
