scipy.stats.kstwo
scipy.stats.kstwo(*args, **kwds) = <scipy.stats._continuous_distns.kstwo_gen object>
Kolmogorov-Smirnov two-sided test statistic distribution.
This is the distribution of the two-sided Kolmogorov-Smirnov (KS) statistic \(D_n\) for a finite sample size n (the shape parameter).
As an instance of the rv_continuous class, the kstwo object inherits from it a collection of generic methods (see below for the full list), and completes them with details specific for this particular distribution.
Notes
\(D_n\) is given by
\[D_n = \sup_x |F_n(x) - F(x)|\]
where \(F\) is a (continuous) CDF and \(F_n\) is an empirical CDF. kstwo describes the distribution under the null hypothesis of the KS test that the empirical CDF corresponds to \(n\) i.i.d. random variates with CDF \(F\).
The probability density above is defined in the “standardized” form. To shift and/or scale the distribution use the loc and scale parameters. Specifically, kstwo.pdf(x, n, loc, scale) is identically equivalent to kstwo.pdf(y, n) / scale with y = (x - loc) / scale. Note that shifting the location of a distribution does not make it a “noncentral” distribution; noncentral generalizations of some distributions are available in separate classes.
References
[1] Simard, R., L’Ecuyer, P. “Computing the Two-Sided Kolmogorov-Smirnov Distribution”, Journal of Statistical Software, Vol. 39, Issue 11, 1-18 (2011).
Examples
>>> import numpy as np
>>> from scipy.stats import kstwo
>>> import matplotlib.pyplot as plt
>>> fig, ax = plt.subplots(1, 1)
Calculate the first four moments:
>>> n = 10
>>> mean, var, skew, kurt = kstwo.stats(n, moments='mvsk')
Display the probability density function (pdf):
>>> x = np.linspace(kstwo.ppf(0.01, n),
...                 kstwo.ppf(0.99, n), 100)
>>> ax.plot(x, kstwo.pdf(x, n),
...         'r-', lw=5, alpha=0.6, label='kstwo pdf')
Alternatively, the distribution object can be called (as a function) to fix the shape, location and scale parameters. This returns a “frozen” RV object holding the given parameters fixed.
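As a minimal sketch of how freezing interacts with loc and scale, the snippet below checks the standardized-form identity from the Notes in three equivalent ways. The values of loc, scale and the evaluation points are illustrative assumptions, not taken from the reference.

```python
import numpy as np
from scipy.stats import kstwo

n = 10
loc, scale = 0.5, 2.0          # illustrative values (assumed for this sketch)
x = np.linspace(0.7, 2.4, 6)   # illustrative evaluation points

# Direct call with explicit loc and scale.
direct = kstwo.pdf(x, n, loc=loc, scale=scale)

# Standardized form: evaluate at y and divide by scale.
y = (x - loc) / scale
standardized = kstwo.pdf(y, n) / scale

# Frozen distribution holding n, loc and scale fixed.
rv = kstwo(n, loc=loc, scale=scale)
frozen = rv.pdf(x)

print(np.allclose(direct, standardized), np.allclose(direct, frozen))
```

All three arrays should agree up to floating-point rounding, since the frozen object simply stores the parameters and forwards them to the same methods.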
Freeze the distribution and display the frozen pdf:
>>> rv = kstwo(n)
>>> ax.plot(x, rv.pdf(x), 'k-', lw=2, label='frozen pdf')
Check accuracy of cdf and ppf:
>>> vals = kstwo.ppf([0.001, 0.5, 0.999], n)
>>> np.allclose([0.001, 0.5, 0.999], kstwo.cdf(vals, n))
True
Generate random numbers:
>>> r = kstwo.rvs(n, size=1000)
And compare the histogram:
>>> ax.hist(r, density=True, histtype='stepfilled', alpha=0.2)
>>> ax.legend(loc='best', frameon=False)
>>> plt.show()
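To connect the distribution to the statistic defined in the Notes, the following sketch computes \(D_n\) for a sample against a hypothesized continuous CDF and uses kstwo.sf to obtain a two-sided p-value under the null hypothesis. The helper name ks_two_sided_statistic, the seed, the sample size and the choice of a standard normal reference CDF are assumptions made for illustration; scipy.stats.kstest is used only as a cross-check of the statistic.

```python
import numpy as np
from scipy.stats import kstwo, kstest, norm

def ks_two_sided_statistic(sample, cdf):
    """Return D_n = sup_x |F_n(x) - F(x)| for a continuous reference CDF.

    Hypothetical helper written for this sketch; not part of SciPy.
    """
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    f = cdf(x)
    # The empirical CDF is a step function, so the supremum is reached
    # at a sample point, either just after or just before the jump.
    d_plus = np.max(np.arange(1, n + 1) / n - f)
    d_minus = np.max(f - np.arange(0, n) / n)
    return max(d_plus, d_minus)

rng = np.random.default_rng(12345)   # arbitrary seed
sample = rng.normal(size=25)         # data drawn from the hypothesized CDF

d_n = ks_two_sided_statistic(sample, norm.cdf)
p_value = kstwo.sf(d_n, sample.size)  # P(D_n >= d_n) under the null

# Cross-check the statistic against SciPy's one-sample KS test.
reference = kstest(sample, norm.cdf)
print(d_n, reference.statistic, p_value)
```

The hand-computed statistic should match the one reported by kstest; the p-value is simply the survival function of kstwo evaluated at that statistic with n equal to the sample size.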
Methods
rvs(n, loc=0, scale=1, size=1, random_state=None)
Random variates.
pdf(x, n, loc=0, scale=1)
Probability density function.
logpdf(x, n, loc=0, scale=1)
Log of the probability density function.
cdf(x, n, loc=0, scale=1)
Cumulative distribution function.
logcdf(x, n, loc=0, scale=1)
Log of the cumulative distribution function.
sf(x, n, loc=0, scale=1)
Survival function (also defined as 1 - cdf, but sf is sometimes more accurate).
logsf(x, n, loc=0, scale=1)
Log of the survival function.
ppf(q, n, loc=0, scale=1)
Percent point function (inverse of cdf, giving percentiles).
isf(q, n, loc=0, scale=1)
Inverse survival function (inverse of sf).
moment(order, n, loc=0, scale=1)
Non-central moment of the specified order.
stats(n, loc=0, scale=1, moments='mv')
Mean('m'), variance('v'), skew('s'), and/or kurtosis('k').
entropy(n, loc=0, scale=1)
(Differential) entropy of the RV.
fit(data)
Parameter estimates for generic data. See scipy.stats.rv_continuous.fit for detailed documentation of the keyword arguments.
expect(func, args=(n,), loc=0, scale=1, lb=None, ub=None, conditional=False, **kwds)
Expected value of a function (of one argument) with respect to the distribution.
median(n, loc=0, scale=1)
Median of the distribution.
mean(n, loc=0, scale=1)
Mean of the distribution.
var(n, loc=0, scale=1)
Variance of the distribution.
std(n, loc=0, scale=1)
Standard deviation of the distribution.
interval(alpha, n, loc=0, scale=1)
Endpoints of the range that contains fraction alpha [0, 1] of the distribution.
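A short sketch of three of the methods listed above may help with their calling conventions. It assumes n = 10 and an upper-tail point of 0.95, both chosen only for illustration. Arguments are passed positionally so that the first argument of interval is the probability mass in [0, 1] and the first argument of moment is the order, followed by the shape parameter n.

```python
import numpy as np
from scipy.stats import kstwo

n = 10

# interval: central range holding 95% of the probability mass.
lo, hi = kstwo.interval(0.95, n)
lo_ppf, hi_ppf = kstwo.ppf([0.025, 0.975], n)  # equal-tail quantiles

# moment: the first argument is the order, the second is the shape n.
m1 = kstwo.moment(1, n)
mean = kstwo.mean(n)

# sf versus 1 - cdf far in the upper tail, where cdf is very close to 1
# and the subtraction loses significant digits.
x_tail = 0.95
sf_direct = kstwo.sf(x_tail, n)
sf_via_cdf = 1 - kstwo.cdf(x_tail, n)

print(lo, hi, m1, mean)
print(sf_direct, sf_via_cdf)
```

The interval endpoints should coincide with the 2.5% and 97.5% quantiles, and the first non-central moment with the mean. In the far tail, sf is the safer choice because 1 - cdf is limited by rounding near 1.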