scipy.stats.wilcoxon

scipy.stats.wilcoxon(x, y=None, zero_method='wilcox', correction=False, alternative='two-sided')
Calculate the Wilcoxon signed-rank test.
The Wilcoxon signed-rank test tests the null hypothesis that two related paired samples come from the same distribution. In particular, it tests whether the distribution of the differences x - y is symmetric about zero. It is a non-parametric version of the paired T-test.
 Parameters
 x : array_like
Either the first set of measurements (in which case y is the second set of measurements), or the differences between two sets of measurements (in which case y is not to be specified). Must be one-dimensional.
 y : array_like, optional
Either the second set of measurements (if x is the first set of measurements), or not specified (if x is the differences between two sets of measurements). Must be one-dimensional.
 zero_method : {“pratt”, “wilcox”, “zsplit”}, optional
How zero-differences are handled. Default is “wilcox”.
 “pratt”:
includes zero-differences in the ranking process, but drops the ranks of the zeros (more conservative), see [3]
 “wilcox”:
discards all zero-differences (the default)
 “zsplit”:
includes zero-differences in the ranking process and splits the zero rank between the positive and negative ones
 correction : bool, optional
If True, apply continuity correction by adjusting the Wilcoxon rank statistic by 0.5 towards the mean value when computing the z-statistic. Default is False.
 alternative : {“two-sided”, “greater”, “less”}, optional
The alternative hypothesis to be tested, see Notes. Default is “two-sided”.
 Returns
 statistic : float
If alternative is “two-sided”, the sum of the ranks of the differences above or below zero, whichever is smaller. Otherwise the sum of the ranks of the differences above zero.
 pvalue : float
The p-value for the test, depending on alternative.
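To make the three zero_method options described under Parameters more concrete, here is a minimal pure-Python sketch of how the rank sums of positive and negative differences might be formed under each option. The helpers average_ranks and signed_rank_sums are hypothetical names introduced for illustration only; this follows the descriptions above and is not SciPy's actual implementation.

```python
def average_ranks(values):
    """Return 1-based ranks of `values`; tied entries share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # average of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def signed_rank_sums(d, zero_method="wilcox"):
    """Hypothetical helper: (sum of ranks of d > 0, sum of ranks of d < 0)."""
    if zero_method not in ("wilcox", "pratt", "zsplit"):
        raise ValueError("zero_method must be 'wilcox', 'pratt' or 'zsplit'")
    if zero_method == "wilcox":
        # Zero-differences are discarded before ranking.
        d = [v for v in d if v != 0]
    ranks = average_ranks([abs(v) for v in d])
    r_plus = sum(r for v, r in zip(d, ranks) if v > 0)
    r_minus = sum(r for v, r in zip(d, ranks) if v < 0)
    if zero_method == "zsplit":
        # The ranks taken by the zeros are split evenly between both sums.
        r_zero = sum(r for v, r in zip(d, ranks) if v == 0)
        r_plus += r_zero / 2
        r_minus += r_zero / 2
    # For "pratt", the zeros took part in the ranking but their ranks are dropped.
    return r_plus, r_minus


sample = [0, 0, 1, -2, 3]  # made-up differences containing two zeros
for method in ("wilcox", "pratt", "zsplit"):
    print(method, signed_rank_sums(sample, method))
```

For this made-up sample the three options give rank sums of (4.0, 2.0), (8.0, 4.0) and (9.5, 5.5) respectively, which shows how much the treatment of zeros can shift the statistic.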
Notes
The test was introduced in [4]. Given n independent samples (x_i, y_i) from a bivariate distribution (i.e. paired samples), it computes the differences d_i = x_i - y_i. One assumption of the test is that the differences are symmetric, see [2]. The two-sided test has the null hypothesis that the median of the differences is zero against the alternative that it is different from zero. The one-sided test has the null hypothesis that the median is positive against the alternative that it is negative (alternative == 'less'), or vice versa (alternative == 'greater').
The test uses a normal approximation to derive the p-value (if zero_method == 'pratt', the approximation is adjusted as in [5]). A typical rule is to require that n > 20 ([2], p. 383). For smaller n, exact tables can be used to find critical values.
References
 1
 2
Conover, W.J., Practical Nonparametric Statistics, 1971.
 3
Pratt, J.W., Remarks on Zeros and Ties in the Wilcoxon Signed Rank Procedures, Journal of the American Statistical Association, Vol. 54, 1959, pp. 655-667. DOI:10.1080/01621459.1959.10501526
 4
Wilcoxon, F., Individual Comparisons by Ranking Methods, Biometrics Bulletin, Vol. 1, 1945, pp. 80-83. DOI:10.2307/3001968
 5
Cureton, E.E., The Normal Approximation to the Signed-Rank Sampling Distribution When Zero Differences are Present, Journal of the American Statistical Association, Vol. 62, 1967, pp. 1068-1069. DOI:10.1080/01621459.1967.10500917
Examples
In [4], the differences in height between cross- and self-fertilized corn plants are given as follows:
>>> d = [6, 8, 14, 16, 23, 24, 28, 29, 41, -48, 49, 56, 60, -67, 75]
Cross-fertilized plants appear to be higher. To test the null hypothesis that there is no height difference, we can apply the two-sided test:
>>> from scipy.stats import wilcoxon
>>> w, p = wilcoxon(d)
>>> w, p
(24.0, 0.04088813291185591)
Hence, we would reject the null hypothesis at a significance level of 5%, concluding that there is a difference in height between the groups. To confirm that the median of the differences can be assumed to be positive, we use:
>>> w, p = wilcoxon(d, alternative='greater')
>>> w, p
(96.0, 0.020444066455927955)
This shows that the null hypothesis that the median is negative can be rejected at a significance level of 5% in favor of the alternative that the median is greater than zero. The p-value based on the approximation is within the range of 0.019 and 0.054 given in [2]. Note that the statistic changed to 96 in the one-sided case (the sum of ranks of positive differences) whereas it is 24 in the two-sided case (the minimum of the sums of ranks above and below zero).
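As a rough check on the normal approximation mentioned in the Notes, the following pure-Python sketch recomputes the rank sums and p-values for the height differences (with -48 and -67 as the two negative values) without calling SciPy. The helper names normal_sf and normal_approx_pvalue are hypothetical, and the formulas assume the textbook mean n(n+1)/4 and variance n(n+1)(2n+1)/24 of the rank sum, which hold only when there are no ties and no zero differences. It is a sketch under those assumptions, not SciPy's implementation.

```python
import math


def normal_sf(z):
    """Upper-tail probability P(Z > z) of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def normal_approx_pvalue(r_plus, r_minus, n, alternative="two-sided", correction=False):
    """Hypothetical helper; assumes n nonzero differences with no tied ranks."""
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    # Two-sided: the smaller rank sum; one-sided: rank sum of positive differences.
    t = min(r_plus, r_minus) if alternative == "two-sided" else r_plus
    # Optional continuity correction: move the statistic 0.5 towards the mean.
    shift = math.copysign(0.5, t - mean) if correction and t != mean else 0.0
    z = (t - mean - shift) / sd
    if alternative == "two-sided":
        return 2 * normal_sf(abs(z))
    if alternative == "greater":
        return normal_sf(z)
    if alternative == "less":
        return normal_sf(-z)
    raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")


d = [6, 8, 14, 16, 23, 24, 28, 29, 41, -48, 49, 56, 60, -67, 75]

# No ties or zeros here, so the rank of |d_i| is its 1-based position
# once the differences are sorted by absolute value.
by_abs = sorted(d, key=abs)
r_plus = sum(rank for rank, v in enumerate(by_abs, start=1) if v > 0)
r_minus = sum(rank for rank, v in enumerate(by_abs, start=1) if v < 0)

p_two_sided = normal_approx_pvalue(r_plus, r_minus, len(d))
p_greater = normal_approx_pvalue(r_plus, r_minus, len(d), alternative="greater")
print(r_plus, r_minus, p_two_sided, p_greater)
```

Under these assumptions the sketch reproduces the rank sums 96 and 24 and p-values of about 0.0409 (two-sided) and 0.0204 (greater), matching the wilcoxon output shown above. Passing correction=True moves the statistic towards the mean and therefore gives a slightly larger p-value.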