
Optimal Permutation Recovery in Permuted Monotone Matrix Model

Journal of the American Statistical Association (JASA), 2019
Abstract

Motivated by recent research on quantifying bacterial growth dynamics based on genome assemblies, we consider a permuted monotone matrix model $Y=\Theta\Pi+Z$, where the rows represent different samples, the columns represent contigs in genome assemblies and the elements represent log-read counts after preprocessing steps and Guanine-Cytosine (GC) adjustment. In this model, $\Theta$ is an unknown mean matrix with monotone entries for each row, $\Pi$ is a permutation matrix that permutes the columns of $\Theta$, and $Z$ is a noise matrix. This paper studies the problem of estimation/recovery of $\Pi$ given the observed noisy matrix $Y$. We propose an estimator based on the best linear projection, which is shown to be minimax rate-optimal for both exact recovery, as measured by the 0-1 loss, and partial recovery, as quantified by the normalized Kendall's tau distance. Simulation studies demonstrate the superior empirical performance of the proposed estimator over alternative methods. We demonstrate the methods using a synthetic metagenomics dataset of 45 closely related bacterial species and a real metagenomic dataset to compare the bacterial growth dynamics between the responders and the non-responders of the IBD patients after 8 weeks of treatment.
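To make the model concrete, the following is a minimal simulation sketch, not the paper's implementation. It draws a row-monotone $\Theta$, permutes its columns, adds Gaussian noise, and then ranks the columns of $Y$ by a one-dimensional linear projection. The abstract does not specify the projection direction, so using the leading left singular vector of the row-centered data is an assumption here, as are the function names `simulate`, `estimate_ranks`, and `normalized_kendall_tau`, the uniform increments, and the noise level.

```python
import numpy as np


def simulate(n, p, sigma, rng):
    """Draw Y = Theta * Pi + Z with row-wise increasing Theta (assumed design)."""
    # Cumulative sums of positive increments give increasing rows.
    theta = np.cumsum(rng.uniform(0.5, 1.5, size=(n, p)), axis=1)
    perm = rng.permutation(p)
    # Column j of Y is column perm[j] of Theta plus Gaussian noise,
    # so perm[j] is the true rank of column j.
    y = theta[:, perm] + sigma * rng.standard_normal((n, p))
    return y, perm


def estimate_ranks(y):
    """Rank columns of y by a linear projection w^T y (direction is an assumption)."""
    # Center each row, then take the leading left singular vector as w.
    yc = y - y.mean(axis=1, keepdims=True)
    u, _, _ = np.linalg.svd(yc, full_matrices=False)
    w = u[:, 0]
    # Resolve the SVD sign ambiguity: increasing rows imply a positive direction.
    if w.sum() < 0:
        w = -w
    scores = w @ y
    # Double argsort converts scores into 0-based ranks.
    return np.argsort(np.argsort(scores))


def normalized_kendall_tau(r1, r2):
    """Fraction of column pairs ordered differently by the two rankings."""
    r1 = np.asarray(r1)
    r2 = np.asarray(r2)
    p = len(r1)
    d1 = np.sign(r1[:, None] - r1[None, :])
    d2 = np.sign(r2[:, None] - r2[None, :])
    # Each discordant pair is counted twice in the full p-by-p comparison.
    discordant = np.sum(d1 * d2 < 0) // 2
    return discordant / (p * (p - 1) / 2)


rng = np.random.default_rng(0)
y, perm = simulate(n=10, p=30, sigma=0.05, rng=rng)
est = estimate_ranks(y)
exact = bool(np.array_equal(est, perm))      # 0-1 loss is 0 when True
tau = normalized_kendall_tau(est, perm)      # partial-recovery loss in [0, 1]
print(exact, tau)
```

The two printed quantities correspond to the two criteria named in the abstract: exact recovery (0-1 loss) and the normalized Kendall's tau distance. Raising `sigma` in this sketch is a simple way to see the transition from exact to only partial recovery.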
