Detecting change-points in a discrete distribution via model selection

This paper is concerned with the detection of multiple change-points in the joint distribution of independent categorical variables. The procedures introduced rely on model selection and are based on a penalized least-squares criterion. Their performance is assessed from a nonasymptotic point of view. A preliminary estimator is first built using a special collection of models. By an existing model selection theorem, it satisfies an oracle-type inequality. Moreover, thanks to an approximation result proved in this paper, it is also shown to be adaptive in the minimax sense. To eliminate some irrelevant change-points selected by this first estimator, a two-stage procedure is proposed that also enjoys an adaptivity property. In addition, the first estimator can be computed with a complexity that is only linear in the size of the data. A heuristic method allows the second procedure to be implemented quite satisfactorily with the same computational complexity.
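To make the idea of a penalized least-squares criterion concrete, the sketch below segments a sequence of categorical observations by minimizing the within-segment least-squares cost plus a penalty per segment. It is an illustration only and is not the procedure of the paper: it uses a generic exhaustive dynamic program with quadratic cost in the data size, whereas the abstract states that the paper's first estimator is linear, and the constant per-segment `penalty` is a hypothetical simplification, since the abstract does not give the form of the penalty. The function name and parameters are likewise assumptions made for the example.

```python
def detect_change_points(y, n_categories, penalty):
    """Illustrative penalized least-squares segmentation (not the paper's algorithm).

    y            : list of category labels in range(n_categories)
    n_categories : number of possible categories
    penalty      : hypothetical constant cost added for each segment
    Returns the sorted list of change-point positions (segment start indices > 0).
    """
    n = len(y)

    # prefix[t][k] = number of occurrences of category k in y[0:t]
    prefix = [[0] * n_categories]
    for value in y:
        row = prefix[-1][:]
        row[value] += 1
        prefix.append(row)

    def segment_cost(i, j):
        # Least-squares cost of fitting the empirical frequencies on y[i:j]:
        # sum_t ||e_{y_t} - p_hat||^2 = m - sum_k c_k^2 / m,
        # where m is the segment length and c_k the count of category k.
        m = j - i
        squared = sum((prefix[j][k] - prefix[i][k]) ** 2
                      for k in range(n_categories))
        return m - squared / m

    # best[j] = minimal penalized cost of segmenting y[0:j]
    best = [0.0] + [float("inf")] * n
    previous = [0] * (n + 1)
    for j in range(1, n + 1):
        for i in range(j):
            candidate = best[i] + segment_cost(i, j) + penalty
            if candidate < best[j]:
                best[j] = candidate
                previous[j] = i

    # Backtrack through the optimal segment starts.
    change_points = []
    j = n
    while j > 0:
        i = previous[j]
        if i > 0:
            change_points.append(i)
        j = i
    return sorted(change_points)


if __name__ == "__main__":
    data = [0] * 20 + [1] * 20
    print(detect_change_points(data, n_categories=2, penalty=1.0))
```

A larger `penalty` favors fewer segments and a smaller one favors more, which is the trade-off that a model selection criterion is meant to calibrate.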