Accurate $p$-Value Calculation for Generalized Fisher's Combination Tests Under Dependence

3 March 2020

Abstract

Combining dependent tests of significance has broad applications, but the $p$-value calculation is challenging. Current moment-matching methods (e.g., Brown's approximation) for Fisher's combination test tend to significantly inflate the type I error rate at significance levels below 0.05. This could lead to significant false discoveries in big data analyses. This paper provides several more accurate and computationally efficient $p$-value calculation methods for a general family of Fisher-type statistics, referred to as the GFisher. The GFisher covers Fisher's combination, Good's statistic, Lancaster's statistic, the weighted Z-score combination, etc. It allows a flexible weighting scheme, as well as an omnibus procedure that automatically adapts the weights and degrees of freedom to the given data. The new $p$-value calculation methods are based on novel ideas of moment-ratio matching and joint-distribution surrogating. Systematic simulations show that they are accurate under the multivariate Gaussian distribution, and robust under the generalized linear model and the multivariate $t$-distribution, down to at least the $10^{-6}$ level. We illustrate the usefulness of the GFisher and the new $p$-value calculation methods in analyzing both simulated and real data from gene-based SNP-set association studies in genetics. The relevant computation has been implemented in the R package GFisher.
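For orientation, the baseline that the abstract criticizes can be sketched as follows. This is a minimal illustration of Fisher's combination statistic with a Brown-style two-moment scaled chi-square approximation, not the moment-ratio matching or joint-distribution surrogating methods proposed in the paper, and not the API of the GFisher R package. The function names (`fisher_statistic`, `simulate_term_cov`, `moment_matched_pvalue`) are hypothetical, and estimating the covariances by simulating one-sided Gaussian tests is an assumption made only to keep the example self-contained.

```python
import numpy as np
from scipy import stats


def fisher_statistic(pvalues):
    """Fisher's combination statistic T = -2 * sum(log p_i)."""
    p = np.asarray(pvalues, dtype=float)
    return -2.0 * np.sum(np.log(p))


def simulate_term_cov(corr, n_sim=20000, seed=0):
    """Estimate the null covariance matrix of the terms -2*log(p_i).

    Assumption for illustration: the input tests are one-sided Z-tests
    whose statistics are multivariate Gaussian with correlation `corr`.
    """
    corr = np.asarray(corr, dtype=float)
    rng = np.random.default_rng(seed)
    z = rng.multivariate_normal(np.zeros(corr.shape[0]), corr, size=n_sim)
    terms = -2.0 * np.log(stats.norm.sf(z))  # one-sided p-values
    return np.cov(terms, rowvar=False)


def moment_matched_pvalue(t_obs, cov_terms):
    """Approximate P(T >= t_obs) by a scaled chi-square, c * chi2(f).

    The scale c and degrees of freedom f are chosen so that the first
    two moments of c * chi2(f) equal the mean and variance of T.
    """
    cov_terms = np.asarray(cov_terms, dtype=float)
    n = cov_terms.shape[0]
    mean_t = 2.0 * n            # each term is marginally chi-square(2)
    var_t = cov_terms.sum()     # variances plus all pairwise covariances
    scale = var_t / (2.0 * mean_t)
    df = 2.0 * mean_t ** 2 / var_t
    return stats.chi2.sf(t_obs / scale, df)


# Usage: three equicorrelated one-sided tests (correlation 0.5 is arbitrary).
corr = np.full((3, 3), 0.5) + 0.5 * np.eye(3)
t_obs = fisher_statistic([0.01, 0.02, 0.03])
p_combined = moment_matched_pvalue(t_obs, simulate_term_cov(corr))
```

Under independence the covariance matrix is 4 times the identity, so the scale is 1 and the degrees of freedom are 2n, which recovers the classical Fisher test. Matching only two moments is exactly the kind of approximation that the abstract reports as inflating the type I error rate at small significance levels.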
