A simple extension of Azadkia & Chatterjee's rank correlation to
multi-response vectors
Recently, Chatterjee (2023) recognized the lack of a direct generalization of his rank correlation in Azadkia and Chatterjee (2021) to a multi-dimensional response vector. As a natural solution to this problem, we propose an extension of this rank correlation that is applicable to a set of response variables. Our approach converts the original vector-valued problem into a univariate problem and then applies the rank correlation to it. Our novel measure quantifies the scale-invariant extent of functional dependence of a response vector on a vector of predictor variables, characterizes independence of the two vectors as well as perfect dependence of the response vector on the predictors, and hence fulfills all the characteristics of a measure of predictability. Aiming at maximum interpretability, we provide various invariance results for the measure as well as a closed-form expression in multivariate normal models. Building upon the graph-based estimator for the rank correlation in Azadkia and Chatterjee (2021), we obtain a non-parametric, strongly consistent estimator for the new measure and show its asymptotic normality. Based on this estimator, we develop a model-free and dependence-based feature ranking and forward feature selection for multiple-outcome data. Simulation results and real case studies illustrate the measure's broad applicability.
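For orientation, the univariate building block can be sketched as follows. This is a minimal illustration of the nearest-neighbour (graph-based) estimator of Azadkia and Chatterjee (2021) for a single response and no conditioning variables. It is not the multi-response extension proposed in the paper. The function name `ac_dependence` is hypothetical, the neighbour search is brute force, and distance ties are broken by lowest index, which is a simplification.

```python
import numpy as np


def ac_dependence(y, x):
    """Sketch of the nearest-neighbour dependence estimate of a scalar
    response y on predictors x (hypothetical helper, illustration only)."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a 1-D predictor as a single column
    n = y.shape[0]

    # R_i = #{j : y_j <= y_i} and L_i = #{j : y_j >= y_i}
    r = (y[None, :] <= y[:, None]).sum(axis=1)
    l = (y[None, :] >= y[:, None]).sum(axis=1)

    # Index of the nearest neighbour of x_i among the other points
    # (brute-force squared Euclidean distances; ties go to the lowest index).
    d = ((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d, np.inf)
    m = d.argmin(axis=1)

    num = (n * np.minimum(r, r[m]) - l ** 2).sum()
    den = (l * (n - l)).sum()
    return num / den


# Toy data (assumed for illustration): one functional and one independent response.
rng = np.random.default_rng(0)
x_demo = rng.normal(size=500)
t_functional = ac_dependence(np.sin(3 * x_demo), x_demo)
t_independent = ac_dependence(rng.normal(size=500), x_demo)
```

On such toy data the estimate should be close to 1 when the response is a function of the predictor and close to 0 when the two are independent, which matches the two characterizations stated in the abstract.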