scipy.stats.combine_pvalues
scipy.stats.combine_pvalues(pvalues, method='fisher', weights=None)
Combine p-values from independent tests bearing upon the same hypothesis.
- Parameters
- pvalues: array_like, 1-D
Array of p-values assumed to come from independent tests.
- method: {‘fisher’, ‘pearson’, ‘tippett’, ‘stouffer’, ‘mudholkar_george’}, optional
Name of method to use to combine p-values. The following methods are available (default is ‘fisher’):
‘fisher’: Fisher’s method (Fisher’s combined probability test), based on the sum of the logarithms of the p-values
‘pearson’: Pearson’s method (similar to Fisher’s, but uses the complement of each p-value, \(1-p_i\), inside the logarithm)
‘tippett’: Tippett’s method (minimum of p-values)
‘stouffer’: Stouffer’s Z-score method
‘mudholkar_george’: the difference of Fisher’s and Pearson’s test statistics, divided by 2
- weights: array_like, 1-D, optional
Optional array of weights used only for Stouffer’s Z-score method.
- Returns
- statistic: float
The statistic calculated by the specified method.
- pval: float
The combined p-value.
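The call signature and return values above can be illustrated with a minimal sketch. The p-values and weights below are made-up illustrative numbers, not data from any real study, and the variable names are hypothetical.

```python
from scipy.stats import combine_pvalues

# Hypothetical p-values from three independent tests of the same hypothesis.
pvalues = [0.01, 0.20, 0.30]

# With no method given, Fisher's method is used.
# The result unpacks into the statistic and the combined p-value.
statistic, pval = combine_pvalues(pvalues)

# Weights are only used by Stouffer's Z-score method.
# These weights are illustrative; in practice they might reflect study size.
z_statistic, z_pval = combine_pvalues(
    pvalues, method='stouffer', weights=[3.0, 1.0, 1.0]
)

print("Fisher:  ", statistic, pval)
print("Stouffer:", z_statistic, z_pval)
```

Tuple unpacking is used here because it matches the two documented return values.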
Notes
Fisher’s method (also known as Fisher’s combined probability test) [1] uses a chi-squared statistic to compute a combined p-value. The closely related Stouffer’s Z-score method [2] uses Z-scores rather than p-values. The advantage of Stouffer’s method is that it is straightforward to introduce weights, which can make it more powerful than Fisher’s method when the p-values come from studies of different sizes [5] [6]. Pearson’s method uses \(\log(1-p_i)\) inside the sum, whereas Fisher’s method uses \(\log(p_i)\) [4]. For Fisher’s and Pearson’s methods, the sum of the logarithms is multiplied by -2 in the implementation; the resulting quantity has a chi-squared distribution, which determines the p-value. The mudholkar_george method is based on the difference of Fisher’s and Pearson’s test statistics [4]. Each of those statistics includes the -2 factor, but the mudholkar_george statistic omits it. The mudholkar_george test statistic is a sum of logistic random variables, and equation 3.6 in [3] is used to approximate its p-value using Student’s t-distribution.
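As a rough check of the description above, the following sketch recomputes Fisher’s statistic and the weighted Stouffer statistic by hand and compares them with `combine_pvalues`. It assumes the standard formulations: Fisher’s statistic is referred to a chi-squared distribution with 2k degrees of freedom for k p-values, and the weighted Z-score is normalized by the Euclidean norm of the weights. The input values and helper variable names are illustrative assumptions.

```python
import numpy as np
from scipy import stats

pvalues = np.array([0.01, 0.20, 0.30])  # illustrative p-values
weights = np.array([3.0, 1.0, 1.0])     # illustrative weights
k = len(pvalues)

# Fisher: -2 * sum(log p_i), compared against chi-squared with 2k degrees of freedom.
fisher_stat = -2.0 * np.sum(np.log(pvalues))
fisher_pval = stats.chi2.sf(fisher_stat, 2 * k)

# Weighted Stouffer: map each p-value to an upper-tail Z-score,
# form the weighted sum, and normalize so the result is standard normal.
z_scores = stats.norm.isf(pvalues)
stouffer_stat = np.dot(weights, z_scores) / np.linalg.norm(weights)
stouffer_pval = stats.norm.sf(stouffer_stat)

# Compare with the library results.
lib_fisher_stat, lib_fisher_pval = stats.combine_pvalues(pvalues, method='fisher')
lib_stouffer_stat, lib_stouffer_pval = stats.combine_pvalues(
    pvalues, method='stouffer', weights=weights
)

print(np.isclose(fisher_stat, lib_fisher_stat), np.isclose(fisher_pval, lib_fisher_pval))
print(np.isclose(stouffer_stat, lib_stouffer_stat), np.isclose(stouffer_pval, lib_stouffer_pval))
```

Pearson’s, Tippett’s, and the mudholkar_george methods are left out of this sketch on purpose, since the sign and tail conventions used for them may differ between SciPy releases.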
Fisher’s method may be extended to combine p-values from dependent tests [7]. Extensions such as Brown’s method and Kost’s method are not currently implemented.
New in version 0.15.0.
References
- 1
- 2
https://en.wikipedia.org/wiki/Fisher%27s_method#Relation_to_Stouffer.27s_Z-score_method
- 3
George, E. O., and G. S. Mudholkar. “On the convolution of logistic random variables.” Metrika 30.1 (1983): 1-13.
- 4
Heard, N. and Rubin-Delanchy, P. “Choosing between methods of combining p-values.” Biometrika 105.1 (2018): 239-246.
- 5
Whitlock, M. C. “Combining probability from independent tests: the weighted Z-method is superior to Fisher’s approach.” Journal of Evolutionary Biology 18, no. 5 (2005): 1368-1373.
- 6
Zaykin, Dmitri V. “Optimally weighted Z-test is a powerful method for combining probabilities in meta-analysis.” Journal of Evolutionary Biology 24, no. 8 (2011): 1836-1841.
- 7
https://en.wikipedia.org/wiki/Extensions_of_Fisher%27s_method