scipy.stats.ks_2samp

scipy.stats.ks_2samp(data1, data2, alternative='two-sided', method='auto')

Performs the two-sample Kolmogorov-Smirnov test for goodness of fit.

This test compares the underlying continuous distributions F(x) and G(x) of two independent samples. See Notes for a description of the available null and alternative hypotheses.

Parameters:
data1, data2 : array_like, 1-Dimensional

Two arrays of sample observations assumed to be drawn from a continuous distribution. The sample sizes can be different.

alternative : {‘two-sided’, ‘less’, ‘greater’}, optional

Defines the null and alternative hypotheses. Default is ‘two-sided’. Please see explanations in the Notes below.

method : {‘auto’, ‘exact’, ‘asymp’}, optional

Defines the method used for calculating the p-value. The following options are available (default is ‘auto’):

  • ‘auto’ : use ‘exact’ for small size arrays, ‘asymp’ for large

  • ‘exact’ : use exact distribution of test statistic

  • ‘asymp’ : use asymptotic distribution of test statistic

Returns:
res : KstestResult

An object containing attributes:

statistic : float

KS test statistic.

pvalue : float

One-tailed or two-tailed p-value.

statistic_location : float

Value from data1 or data2 corresponding to the KS statistic; i.e., the distance between the empirical distribution functions is measured at this observation.

statistic_sign : int

+1 if the empirical distribution function of data1 exceeds the empirical distribution function of data2 at statistic_location, otherwise -1.
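The relationship between these attributes can be checked by hand. The following is a minimal sketch, assuming a SciPy version whose result object exposes statistic_location and statistic_sign as described above; the seed, sample sizes, and variable names are illustrative choices, not part of the API.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2024)  # seed fixed only so the sketch is repeatable
x1 = rng.normal(size=80)
x2 = rng.normal(loc=0.5, size=70)

res = stats.ks_2samp(x1, x2)

# Empirical distribution functions of each sample, evaluated at the
# observation reported in statistic_location.
loc = res.statistic_location
ecdf1_at_loc = np.mean(x1 <= loc)
ecdf2_at_loc = np.mean(x2 <= loc)

# The signed gap should have magnitude `statistic` and sign `statistic_sign`.
signed_gap = ecdf1_at_loc - ecdf2_at_loc
print(res.statistic, abs(signed_gap))
print(res.statistic_sign, int(np.sign(signed_gap)))
```

Each printed pair is expected to agree, which is one way to confirm where the two empirical distribution functions are farthest apart.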

Notes

There are three options for the null and corresponding alternative hypothesis that can be selected using the alternative parameter.

  • less: The null hypothesis is that F(x) >= G(x) for all x; the alternative is that F(x) < G(x) for at least one x. The statistic is the magnitude of the minimum (most negative) difference between the empirical distribution functions of the samples.

  • greater: The null hypothesis is that F(x) <= G(x) for all x; the alternative is that F(x) > G(x) for at least one x. The statistic is the maximum (most positive) difference between the empirical distribution functions of the samples.

  • two-sided: The null hypothesis is that the two distributions are identical, F(x)=G(x) for all x; the alternative is that they are not identical. The statistic is the maximum absolute difference between the empirical distribution functions of the samples.
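The three statistics described above can be reproduced directly from the empirical distribution functions. This is a minimal sketch for illustration, not a copy of SciPy's internal implementation; the seed, sample sizes, and names such as pooled, d_less, and d_greater are assumptions made for the example.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(12345)  # seed fixed only so the sketch is repeatable
x1 = rng.normal(size=50)
x2 = rng.normal(loc=0.3, size=60)

# Evaluate both empirical distribution functions at every pooled observation.
pooled = np.concatenate([x1, x2])
ecdf1 = np.searchsorted(np.sort(x1), pooled, side='right') / x1.size
ecdf2 = np.searchsorted(np.sort(x2), pooled, side='right') / x2.size
diff = ecdf1 - ecdf2

d_greater = diff.max()                 # most positive difference
d_less = -diff.min()                   # magnitude of most negative difference
d_two_sided = max(d_greater, d_less)   # maximum absolute difference

for name, d in [('greater', d_greater), ('less', d_less),
                ('two-sided', d_two_sided)]:
    res = stats.ks_2samp(x1, x2, alternative=name)
    print(name, d, res.statistic)
```

Both differences are non-negative because the two empirical distribution functions are equal (both 1) at the largest pooled observation.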

Note that the alternative hypotheses describe the CDFs of the underlying distributions, not the observed values of the data. For example, suppose x1 ~ F and x2 ~ G. If F(x) > G(x) for all x, the values in x1 tend to be less than those in x2.
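To make the direction concrete, the sketch below uses two normal distributions that differ only by a shift; the specific distributions, grid, seed, and sample sizes are assumptions chosen for illustration. Here F lies above G everywhere on the grid, yet the values drawn from F are the smaller ones.

```python
import numpy as np
from scipy import stats

F = stats.norm(loc=0.0)   # distribution of x1
G = stats.norm(loc=1.0)   # distribution of x2, shifted toward greater values

# F(x) > G(x) at every point of this grid.
grid = np.linspace(-4.0, 5.0, 181)
cdf_gap = F.cdf(grid) - G.cdf(grid)
print(cdf_gap.min() > 0)

rng = np.random.default_rng(0)  # seed fixed only so the sketch is repeatable
x1 = F.rvs(size=200, random_state=rng)
x2 = G.rvs(size=200, random_state=rng)

# The sample with the larger CDF has the smaller typical value.
print(np.median(x1), np.median(x2))

# alternative='greater' is the one whose alternative hypothesis is F(x) > G(x).
p_greater = stats.ks_2samp(x1, x2, alternative='greater').pvalue
p_less = stats.ks_2samp(x1, x2, alternative='less').pvalue
print(p_greater, p_less)
```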

If the KS statistic is large, then the p-value will be small, and this may be taken as evidence against the null hypothesis in favor of the alternative.

If method='exact', ks_2samp attempts to compute an exact p-value, that is, the probability under the null hypothesis of obtaining a test statistic value at least as extreme as the value computed from the data. If method='asymp', the asymptotic Kolmogorov-Smirnov distribution is used to compute an approximate p-value. If method='auto', an exact p-value computation is attempted if both sample sizes are less than 10000; otherwise, the asymptotic method is used. In any case, if an exact p-value calculation is attempted and fails, a warning will be emitted, and the asymptotic p-value will be returned.
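The effect of the method parameter can be seen by running the same data through each option. This is a minimal sketch with small, arbitrarily chosen samples (the seed and sizes are assumptions); for samples this small, 'auto' is expected to take the exact path.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)  # seed fixed only so the sketch is repeatable
x1 = rng.normal(size=30)
x2 = rng.normal(loc=0.4, size=40)

res_exact = stats.ks_2samp(x1, x2, method='exact')
res_asymp = stats.ks_2samp(x1, x2, method='asymp')
res_auto = stats.ks_2samp(x1, x2, method='auto')

# The statistic does not depend on the method; only the p-value does.
print(res_exact.statistic, res_asymp.statistic, res_auto.statistic)
print(res_exact.pvalue, res_asymp.pvalue, res_auto.pvalue)
```

The exact and asymptotic p-values generally differ somewhat for small samples, which is why the choice of method matters most in that regime.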

The ‘two-sided’ ‘exact’ computation computes the complementary probability and then subtracts from 1. As such, the minimum probability it can return is about 1e-16. While the algorithm itself is exact, numerical errors may accumulate for large sample sizes. It is most suited to situations in which one of the sample sizes is only a few thousand.

We generally follow Hodges’ treatment of Drion/Gnedenko/Korolyuk [1].

References

[1]

Hodges, J.L. Jr., “The Significance Probability of the Smirnov Two-Sample Test,” Arkiv för Matematik, 3, No. 43 (1958), 469-486.

Examples

Suppose we wish to test the null hypothesis that two samples were drawn from the same distribution. We choose a confidence level of 95%; that is, we will reject the null hypothesis in favor of the alternative if the p-value is less than 0.05.

If the first sample were drawn from a uniform distribution and the second were drawn from the standard normal, we would expect the null hypothesis to be rejected.

>>> import numpy as np
>>> from scipy import stats
>>> rng = np.random.default_rng()
>>> sample1 = stats.uniform.rvs(size=100, random_state=rng)
>>> sample2 = stats.norm.rvs(size=110, random_state=rng)
>>> stats.ks_2samp(sample1, sample2)
KstestResult(statistic=0.5454545454545454, pvalue=7.37417839555191e-15)

Indeed, the p-value is lower than our threshold of 0.05, so we reject the null hypothesis in favor of the default “two-sided” alternative: the data were not drawn from the same distribution.

When both samples are drawn from the same distribution, we expect the data to be consistent with the null hypothesis most of the time.

>>> sample1 = stats.norm.rvs(size=105, random_state=rng)
>>> sample2 = stats.norm.rvs(size=95, random_state=rng)
>>> stats.ks_2samp(sample1, sample2)
KstestResult(statistic=0.10927318295739348, pvalue=0.5438289009927495)

As expected, the p-value of 0.54 is not below our threshold of 0.05, so we cannot reject the null hypothesis.

Suppose, however, that the first sample were drawn from a normal distribution shifted toward greater values. In this case, the cumulative distribution function (CDF) of the underlying distribution tends to be less than the CDF underlying the second sample. Therefore, we would expect the null hypothesis to be rejected with alternative='less':

>>> sample1 = stats.norm.rvs(size=105, loc=0.5, random_state=rng)
>>> stats.ks_2samp(sample1, sample2, alternative='less')
KstestResult(statistic=0.4055137844611529, pvalue=3.5474563068855554e-08)

and indeed, with a p-value smaller than our threshold, we reject the null hypothesis in favor of the alternative.