Feature Selection and Junta Testing are Statistically Equivalent
Appendix:32 Pages
Abstract
For a function $f \colon \{0,1\}^n \to \{0,1\}$, the junta testing problem asks whether $f$ depends on only $k$ variables. If $f$ depends on only $k$ variables, the feature selection problem asks to find those variables. We prove that these two tasks are statistically equivalent. Specifically, we show that the ``brute-force'' algorithm, which checks for any set of $k$ variables consistent with the sample, is simultaneously sample-optimal for both problems, and the optimal sample size is \[ \Theta\left(\frac 1 \varepsilon \left( \sqrt{2^k \log {n \choose k}} + \log {n \choose k}\right)\right). \]
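To make the consistency check concrete, here is a minimal Python sketch of one possible reading of the brute-force algorithm. It assumes the sample is a list of labeled points `(x, y)` with `x` a tuple of `n` bits, and it treats a set of `k` variables as consistent when no two sample points agree on those variables but carry different labels. The names `consistent_subsets` and `brute_force` are hypothetical and do not come from the paper, and the sketch makes no attempt to reproduce the paper's sample-size analysis.

```python
from itertools import combinations


def consistent_subsets(samples, n, k):
    """Yield each size-k index set S on which the sample looks like a function of x restricted to S.

    samples: iterable of (x, y) pairs, x a length-n tuple of bits, y a label.
    Illustrative assumption: S is "consistent" if no two sample points agree
    on the coordinates in S while having different labels.
    """
    samples = list(samples)
    for subset in combinations(range(n), k):
        seen = {}  # maps the projection of x onto subset to the first label seen
        ok = True
        for x, y in samples:
            key = tuple(x[i] for i in subset)
            if seen.setdefault(key, y) != y:
                ok = False  # two points agree on subset but disagree on the label
                break
        if ok:
            yield subset


def brute_force(samples, n, k):
    """Return the first consistent size-k set, or None if there is none.

    Feature selection reading: the returned set is the candidate answer.
    Testing reading (an assumption here): accept iff the result is not None.
    """
    return next(consistent_subsets(samples, n, k), None)
```

The search enumerates all ${n \choose k}$ candidate sets, so the sketch illustrates the statistical procedure only and is not meant to be computationally efficient.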
