scipy.stats.kruskal#
- scipy.stats.kruskal(*args, nan_policy='propagate', axis=0)[source]#
Compute the Kruskal-Wallis H-test for independent samples.
The Kruskal-Wallis H-test tests the null hypothesis that the population median of all of the groups are equal. It is a non-parametric version of ANOVA. The test works on 2 or more independent samples, which may have different sizes. Note that rejecting the null hypothesis does not indicate which of the groups differs. Post hoc comparisons between groups are required to determine which groups are different.
- Parameters
- sample1, sample2, …array_like
Two or more arrays with the sample measurements can be given as arguments. Samples must be one-dimensional.
- nan_policy{‘propagate’, ‘omit’, ‘raise’}
Defines how to handle input NaNs.
propagate
: if a NaN is present in the axis slice (e.g. row) along which the statistic is computed, the corresponding entry of the output will be NaN.omit
: NaNs will be omitted when performing the calculation. If insufficient data remains in the axis slice along which the statistic is computed, the corresponding entry of the output will be NaN.raise
: if a NaN is present, aValueError
will be raised.
- axisint or None, default: 0
If an int, the axis of the input along which to compute the statistic. The statistic of each axis-slice (e.g. row) of the input will appear in a corresponding element of the output. If
None
, the input will be raveled before computing the statistic.
- Returns
- statisticfloat
The Kruskal-Wallis H statistic, corrected for ties.
- pvaluefloat
The p-value for the test using the assumption that H has a chi square distribution. The p-value returned is the survival function of the chi square distribution evaluated at H.
See also
f_oneway
1-way ANOVA.
mannwhitneyu
Mann-Whitney rank test on two samples.
friedmanchisquare
Friedman test for repeated measurements.
Notes
Due to the assumption that H has a chi square distribution, the number of samples in each group must not be too small. A typical rule is that each sample must have at least 5 measurements.
References
- 1
W. H. Kruskal & W. W. Wallis, “Use of Ranks in One-Criterion Variance Analysis”, Journal of the American Statistical Association, Vol. 47, Issue 260, pp. 583-621, 1952.
- 2
https://en.wikipedia.org/wiki/Kruskal-Wallis_one-way_analysis_of_variance
Examples
>>> from scipy import stats >>> x = [1, 3, 5, 7, 9] >>> y = [2, 4, 6, 8, 10] >>> stats.kruskal(x, y) KruskalResult(statistic=0.2727272727272734, pvalue=0.6015081344405895)
>>> x = [1, 1, 1] >>> y = [2, 2, 2] >>> z = [2, 2] >>> stats.kruskal(x, y, z) KruskalResult(statistic=7.0, pvalue=0.0301973834223185)