Towards "simultaneous selective inference": post-hoc bounds on the false discovery proportion

Some pitfalls of the false discovery rate (FDR) as an error criterion for multiple testing of hypotheses include that (a) committing to an error level in advance limits its use in exploratory data analysis, and (b) controlling the false discovery proportion (FDP) on average provides no guarantee on its variability. We take a step towards overcoming these barriers using a new perspective we call "simultaneous selective inference." Many FDR procedures (such as Benjamini-Hochberg) can be viewed as carving out a path of potential rejection sets $R_1 \subseteq R_2 \subseteq \cdots \subseteq R_M$, assigning some algorithm-dependent estimate $\widehat{\mathrm{FDP}}(R_k)$ to each one. Then, they choose $k^* = \max\{k : \widehat{\mathrm{FDP}}(R_k) \leq \alpha\}$. We prove that for all these algorithms, given independent null p-values and a confidence level $\alpha$, either the same $\widehat{\mathrm{FDP}}$ or a minor variant thereof bounds the unknown FDP to within a small explicit (algorithm-dependent) constant factor, uniformly across the entire path, with probability $1-\alpha$. Our bounds open up a middle ground between fully simultaneous inference (guarantees for all possible rejection sets) and fully selective inference (guarantees only for $R_{k^*}$). They allow the scientist to spot one or more suitable rejection sets (Select Post-hoc On the algorithm's Trajectory) by picking data-dependent sizes or error levels, after examining the entire path of $\widehat{\mathrm{FDP}}(R_k)$ and the uniform upper band on $\mathrm{FDP}(R_k)$. The price for the additional flexibility of spotting is small; for example, the multiplier for BH corresponding to 95% confidence is approximately 2.
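To make the path viewpoint concrete, below is a minimal Python sketch of BH seen this way: the rejection set $R_k$ collects the $k$ smallest p-values, the implicit estimate is $\widehat{\mathrm{FDP}}(R_k) = m\,p_{(k)}/k$, and a uniform upper band is formed by inflating that estimate by a constant multiplier (taken here as roughly 2 at 95% confidence, as stated in the abstract). The function name, the simulated data, and the generic form of the band are illustrative assumptions, not the paper's exact bound.

```python
import numpy as np

def bh_path_with_uniform_fdp_bound(pvals, c_mult=2.0):
    """Illustrative sketch (not the paper's exact result):
    view BH as a path of nested rejection sets R_1 ⊆ ... ⊆ R_M,
    where R_k contains the k smallest p-values, with BH's implicit
    estimate FDPhat(R_k) = m * p_(k) / k. A uniform upper band on the
    true FDP is sketched as c_mult * FDPhat(R_k), with c_mult ≈ 2
    corresponding to 95% confidence per the abstract; the exact
    constant is algorithm-dependent."""
    pvals = np.asarray(pvals)
    m = len(pvals)
    order = np.argsort(pvals)              # indices defining the path R_1 ⊆ ... ⊆ R_M
    sorted_p = pvals[order]
    ks = np.arange(1, m + 1)
    fdp_hat = m * sorted_p / ks            # BH's implicit FDP estimate along the path
    fdp_band = np.minimum(1.0, c_mult * fdp_hat)  # assumed uniform upper band on FDP
    return order, fdp_hat, fdp_band

# Usage: inspect the whole path and "spot" a rejection set post hoc,
# e.g. the largest k whose uniform band sits below a level chosen after looking.
rng = np.random.default_rng(0)
p = np.concatenate([rng.uniform(size=80), rng.beta(0.1, 1.0, size=20)])  # nulls + signals
order, fdp_hat, fdp_band = bh_path_with_uniform_fdp_bound(p)
ok = np.where(fdp_band <= 0.2)[0]
k_spot = int(ok.max()) + 1 if ok.size else 0
print(f"spotted rejection set size: {k_spot}")
```

Because the band holds simultaneously over the entire path, the scientist may choose the size or target level after examining both curves, which is exactly the post-hoc flexibility the abstract describes.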