This paper introduces a novel algorithm for a class of weakly supervised learning tasks. The considered tasks are posed as joint optimization problems in the continuous model parameters and the (a-priori unknown) discrete label variables. In contrast to prior approaches such as convex relaxations, we decompose the nonconvex problem into purely discrete and purely continuous subproblems in a way that is amenable to distributed optimization by the Alternating Direction Method of Multipliers (ADMM). This approach preserves integrality of the discrete label variables and, for a reparameterized variant of the algorithm using kernels, guarantees global convergence to a critical point. The resulting method implicitly alternates between a discrete and a continuous variable update, however, it is inherently different from a discrete-continuous coordinate descent scheme (hard EM). In diverse experiments we show that our method can learn a classifier from weak supervision that takes the form of hard and soft constraints on the labeling and outperforms hard EM in this task.
View on arXiv