scipy.stats.ks_1samp

scipy.stats.ks_1samp(x, cdf, args=(), alternative='two-sided', mode='auto')

Performs the one-sample Kolmogorov-Smirnov test for goodness of fit.

This performs a test of the distribution F(x) of an observed random variable against a given distribution G(x). Under the null hypothesis, the two distributions are identical, F(x)=G(x). The alternative hypothesis can be either ‘two-sided’ (default), ‘less’ or ‘greater’. The KS test is only valid for continuous distributions.

Parameters

x : array_like

A 1-D array of observations of iid random variables.

cdf : callable

Callable used to calculate the CDF of the hypothesized distribution. It is called as cdf(x, *args).

args : tuple, sequence, optional

Distribution parameters, used with cdf.

alternative : {‘two-sided’, ‘less’, ‘greater’}, optional

Defines the alternative hypothesis. The following options are available (default is ‘two-sided’):

‘two-sided’

‘less’: one-sided, see explanation in Notes

‘greater’: one-sided, see explanation in Notes

mode : {‘auto’, ‘exact’, ‘approx’, ‘asymp’}, optional

Defines the distribution used for calculating the p-value. The following options are available (default is ‘auto’):

‘auto’ : selects one of the other options.

‘exact’ : uses the exact distribution of the test statistic.

‘approx’ : approximates the two-sided probability with twice the one-sided probability.

‘asymp’ : uses the asymptotic distribution of the test statistic.

Returns

statistic : float

KS test statistic, either D, D+ or D- (depending on the value of ‘alternative’).

pvalue : float

One-tailed or two-tailed p-value.
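
As a rough illustration of the return value and of the mode argument, the sketch below runs the test twice on a synthetic standard-normal sample. The sample, the seed and the variable names are assumptions made for this example. The mode value is passed positionally so that the sketch does not depend on the keyword name.

```python
import numpy as np
from scipy import stats

# Hypothetical data: 100 draws from a standard normal (seed chosen arbitrarily).
rng = np.random.default_rng(0)
sample = rng.normal(size=100)

# Positional arguments in order: x, cdf, args, alternative, mode.
stat_exact, p_exact = stats.ks_1samp(sample, stats.norm.cdf, (), 'two-sided', 'exact')
stat_asymp, p_asymp = stats.ks_1samp(sample, stats.norm.cdf, (), 'two-sided', 'asymp')

# The statistic depends only on the data; only the p-value calculation differs.
print("exact:", stat_exact, p_exact)
print("asymp:", stat_asymp, p_asymp)
```

The result unpacks into the statistic and the p-value, in that order, which matches the tuples shown in the Examples section.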

See also

ks_2samp, kstest

Notes

In the one-sided test, the alternative is that the empirical cumulative distribution function of the random variable is “less” or “greater” than the cumulative distribution function G(x) of the hypothesis, that is, F(x)<=G(x) or F(x)>=G(x), respectively.
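
To make the relationship between D, D+ and D- concrete, the following sketch computes the three statistics by hand with NumPy and compares them with the values returned by ks_1samp. It uses the standard textbook definitions of the one-sided statistics. The synthetic sample, the seed and the variable names are assumptions made for this example.

```python
import numpy as np
from scipy import stats

# Hypothetical data: a normal sample shifted to larger values.
rng = np.random.default_rng(12345)
x = np.sort(rng.normal(loc=0.2, size=100))
n = x.size

# Hypothesized CDF G evaluated at the sorted observations.
g = stats.norm.cdf(x)

# The empirical CDF jumps from (i - 1)/n to i/n at the i-th sorted point.
d_plus = np.max(np.arange(1, n + 1) / n - g)   # largest excess of the ECDF over G
d_minus = np.max(g - np.arange(0, n) / n)      # largest excess of G over the ECDF
d = max(d_plus, d_minus)                       # two-sided statistic

res_greater = stats.ks_1samp(x, stats.norm.cdf, alternative='greater')
res_less = stats.ks_1samp(x, stats.norm.cdf, alternative='less')
res_two_sided = stats.ks_1samp(x, stats.norm.cdf)

print(d_plus, res_greater.statistic)
print(d_minus, res_less.statistic)
print(d, res_two_sided.statistic)
```

With a sample shifted to larger values, the empirical CDF tends to lie below G, so D- is typically the larger of the two one-sided statistics.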

Examples

>>> import numpy as np
>>> from scipy import stats

>>> x = np.linspace(-15, 15, 9)
>>> stats.ks_1samp(x, stats.norm.cdf)
(0.44435602715924361, 0.038850142705171065)

>>> np.random.seed(987654321) # set random seed to get the same result
>>> stats.ks_1samp(stats.norm.rvs(size=100), stats.norm.cdf)
(0.058352892479417884, 0.8653960860778898)

Test against one-sided alternative hypothesis

Shift distribution to larger values, so that CDF(x) < norm.cdf(x):

>>> np.random.seed(987654321)
>>> x = stats.norm.rvs(loc=0.2, size=100)
>>> stats.ks_1samp(x, stats.norm.cdf, alternative='less')
(0.12464329735846891, 0.040989164077641749)

Reject the hypothesis of equal distributions against the alternative hypothesis ‘less’ at the 5% level.

>>> stats.ks_1samp(x, stats.norm.cdf, alternative='greater')
(0.0072115233216311081, 0.98531158590396395)

Don’t reject the hypothesis of equal distributions against the alternative hypothesis ‘greater’.

>>> stats.ks_1samp(x, stats.norm.cdf)
(0.12464329735846891, 0.08197335233541582)

Don’t reject the hypothesis of equal distributions against the alternative hypothesis ‘two-sided’ at the 5% level.

Testing t distributed random variables against normal distribution

With 100 degrees of freedom the t distribution looks close to the normal distribution, and the K-S test does not reject the hypothesis that the sample came from the normal distribution:

>>> np.random.seed(987654321)
>>> stats.ks_1samp(stats.t.rvs(100, size=100), stats.norm.cdf)
(0.072018929165471257, 0.6505883498379312)

With 3 degrees of freedom the t distribution looks sufficiently different from the normal distribution that we can reject the hypothesis that the sample came from the normal distribution at the 10% level:

>>> np.random.seed(987654321)
>>> stats.ks_1samp(stats.t.rvs(3, size=100), stats.norm.cdf)
(0.131016895759829, 0.058826222555312224)
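
The examples above all test against the standard normal distribution. As a minimal sketch of how the args parameter can supply distribution parameters to cdf, the block below tests a sample against a normal distribution with a non-default location and scale. The sample, the seed, the parameter values and the variable names are assumptions made for this example.

```python
import numpy as np
from scipy import stats

# Hypothetical data: 200 draws from a normal with mean 5 and standard deviation 2.
rng = np.random.default_rng(2024)
x = rng.normal(loc=5.0, scale=2.0, size=200)

# args is forwarded to the callable, so this evaluates norm.cdf(x, 5.0, 2.0),
# where the extra positional arguments are loc and scale.
res_args = stats.ks_1samp(x, stats.norm.cdf, args=(5.0, 2.0))

# The same hypothesis expressed through a frozen distribution.
res_frozen = stats.ks_1samp(x, stats.norm(loc=5.0, scale=2.0).cdf)

# Omitting the parameters tests against the standard normal instead.
res_standard = stats.ks_1samp(x, stats.norm.cdf)

print(res_args.statistic, res_args.pvalue)
print(res_frozen.statistic, res_frozen.pvalue)
print(res_standard.statistic, res_standard.pvalue)
```

Passing args and using a frozen distribution should give the same statistic. Testing the same sample against the standard normal yields a statistic close to 1 and a very small p-value.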
