scipy.stats.anderson#

scipy.stats.anderson(x, dist='norm')[source]#

Anderson-Darling test for data coming from a particular distribution.

The Anderson-Darling test tests the null hypothesis that a sample is drawn from a population that follows a particular distribution. For the Anderson-Darling test, the critical values depend on which distribution is being tested against. This function works for normal, exponential, logistic, or Gumbel (Extreme Value Type I) distributions.

Parameters:

xarray_like: Array of sample data.
dist{‘norm’, ‘expon’, ‘logistic’, ‘gumbel’, ‘gumbel_l’, ‘gumbel_r’, ‘extreme1’}, optional: The type of distribution to test against. The default is ‘norm’. The names ‘extreme1’, ‘gumbel_l’ and ‘gumbel’ are synonyms for the same distribution.

Returns:

resultAndersonResult

An object with the following attributes:

statisticfloat: The Anderson-Darling test statistic.
critical_valueslist: The critical values for this distribution.
significance_levellist: The significance levels for the corresponding critical values in percents. The function returns critical values for a differing set of significance levels depending on the distribution that is being tested against.
fit_resultFitResult: An object containing the results of fitting the distribution to the data.

See also

kstest: The Kolmogorov-Smirnov test for goodness-of-fit.

Notes

Critical values provided are for the following significance levels:

normal/exponential: 15%, 10%, 5%, 2.5%, 1%
logistic: 25%, 10%, 5%, 2.5%, 1%, 0.5%
Gumbel: 25%, 10%, 5%, 2.5%, 1%

If the returned statistic is larger than these critical values then for the corresponding significance level, the null hypothesis that the data come from the chosen distribution can be rejected. The returned statistic is referred to as ‘A2’ in the references.

References

[1]

https://www.itl.nist.gov/div898/handbook/prc/section2/prc213.htm

[2]

Stephens, M. A. (1974). EDF Statistics for Goodness of Fit and Some Comparisons, Journal of the American Statistical Association, Vol. 69, pp. 730-737.

[3]

Stephens, M. A. (1976). Asymptotic Results for Goodness-of-Fit Statistics with Unknown Parameters, Annals of Statistics, Vol. 4, pp. 357-369.

[4]

Stephens, M. A. (1977). Goodness of Fit for the Extreme Value Distribution, Biometrika, Vol. 64, pp. 583-588.

[5]

Stephens, M. A. (1977). Goodness of Fit with Special Reference to Tests for Exponentiality , Technical Report No. 262, Department of Statistics, Stanford University, Stanford, CA.

[6]

Stephens, M. A. (1979). Tests of Fit for the Logistic Distribution Based on the Empirical Distribution Function, Biometrika, Vol. 66, pp. 591-595.

Examples

Test the null hypothesis that a random sample was drawn from a normal distribution (with unspecified mean and standard deviation).

>>> import numpy as np
>>> from scipy.stats import anderson
>>> rng = np.random.default_rng()
>>> data = rng.random(size=35)
>>> res = anderson(data)
>>> res.statistic
0.8398018749744764
>>> res.critical_values
array([0.527, 0.6  , 0.719, 0.839, 0.998])
>>> res.significance_level
array([15. , 10. ,  5. ,  2.5,  1. ])

The value of the statistic (barely) exceeds the critical value associated with a significance level of 2.5%, so the null hypothesis may be rejected at a significance level of 2.5%, but not at a significance level of 1%.