Statistical functions (scipy.stats)
This module contains a large number of probability distributions as well as a growing library of statistical functions.
Each univariate distribution is an instance of a subclass of rv_continuous (rv_discrete for discrete distributions):
|
A generic continuous random variable class meant for subclassing. |
|
A generic discrete random variable class meant for subclassing. |
|
Generates a distribution given by a histogram. |
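As a minimal sketch of the statement above, the snippet below checks that two distribution objects are instances of the generic classes and calls a few shared methods. The entry names are not shown in this listing, so `norm`, `poisson`, and `rv_histogram` are assumed here to be the relevant `scipy.stats` names.

```python
import numpy as np
from scipy import stats

# Assumption: `stats.norm` and `stats.poisson` are the normal and Poisson
# distribution objects; each is an instance of a subclass of the generic class.
is_continuous = isinstance(stats.norm, stats.rv_continuous)
is_discrete = isinstance(stats.poisson, stats.rv_discrete)

# Methods shared by continuous distributions.
pdf_at_zero = stats.norm.pdf(0.0)                   # 1 / sqrt(2 * pi)
cdf_at_zero = stats.norm.cdf(0.0)                   # 0.5 by symmetry
median = stats.norm.ppf(0.5, loc=2.0, scale=3.0)    # the median equals loc

# Assumption: `stats.rv_histogram` accepts the (counts, edges) tuple
# returned by np.histogram and exposes the same methods.
counts, edges = np.histogram([0.5, 1.5, 1.5, 2.5], bins=[0, 1, 2, 3])
hist_dist = stats.rv_histogram((counts, edges))
hist_cdf = hist_dist.cdf(2.0)                       # mass of the first two bins
```

The histogram-based object behaves like any other continuous distribution, so `pdf`, `cdf`, `ppf`, and `rvs` are available on it as well.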
Continuous distributions
|
An alpha continuous random variable. |
|
An anglit continuous random variable. |
|
An arcsine continuous random variable. |
|
Argus distribution |
|
A beta continuous random variable. |
|
A beta prime continuous random variable. |
|
A Bradford continuous random variable. |
|
A Burr (Type III) continuous random variable. |
|
A Burr (Type XII) continuous random variable. |
|
A Cauchy continuous random variable. |
|
A chi continuous random variable. |
|
A chi-squared continuous random variable. |
|
A cosine continuous random variable. |
|
Crystalball distribution |
|
A double gamma continuous random variable. |
|
A double Weibull continuous random variable. |
|
An Erlang continuous random variable. |
|
An exponential continuous random variable. |
|
An exponentially modified Normal continuous random variable. |
|
An exponentiated Weibull continuous random variable. |
|
An exponential power continuous random variable. |
|
An F continuous random variable. |
|
A fatigue-life (Birnbaum-Saunders) continuous random variable. |
|
A Fisk continuous random variable. |
|
A folded Cauchy continuous random variable. |
|
A folded normal continuous random variable. |
|
A frechet_r continuous random variable. |
|
A frechet_l continuous random variable. |
|
A generalized logistic continuous random variable. |
|
A generalized normal continuous random variable. |
|
A generalized Pareto continuous random variable. |
|
A generalized exponential continuous random variable. |
|
A generalized extreme value continuous random variable. |
|
A Gauss hypergeometric continuous random variable. |
|
A gamma continuous random variable. |
|
A generalized gamma continuous random variable. |
|
A generalized half-logistic continuous random variable. |
|
A Gibrat continuous random variable. |
|
A Gompertz (or truncated Gumbel) continuous random variable. |
|
A right-skewed Gumbel continuous random variable. |
|
A left-skewed Gumbel continuous random variable. |
|
A Half-Cauchy continuous random variable. |
|
A half-logistic continuous random variable. |
|
A half-normal continuous random variable. |
|
The upper half of a generalized normal continuous random variable. |
|
A hyperbolic secant continuous random variable. |
|
An inverted gamma continuous random variable. |
|
An inverse Gaussian continuous random variable. |
|
An inverted Weibull continuous random variable. |
|
A Johnson SB continuous random variable. |
|
A Johnson SU continuous random variable. |
|
Kappa 4 parameter distribution. |
|
Kappa 3 parameter distribution. |
|
General Kolmogorov-Smirnov one-sided test. |
|
Kolmogorov-Smirnov two-sided test for large N. |
|
A Laplace continuous random variable. |
|
A Levy continuous random variable. |
|
A left-skewed Levy continuous random variable. |
|
A Levy-stable continuous random variable. |
|
A logistic (or Sech-squared) continuous random variable. |
|
A log gamma continuous random variable. |
|
A log-Laplace continuous random variable. |
|
A lognormal continuous random variable. |
|
A Lomax (Pareto of the second kind) continuous random variable. |
|
A Maxwell continuous random variable. |
|
A Mielke Beta-Kappa continuous random variable. |
|
A Moyal continuous random variable. |
|
A Nakagami continuous random variable. |
|
A non-central chi-squared continuous random variable. |
|
A non-central F distribution continuous random variable. |
|
A non-central Student’s t continuous random variable. |
|
A normal continuous random variable. |
|
A Normal Inverse Gaussian continuous random variable. |
|
A Pareto continuous random variable. |
|
A Pearson type III continuous random variable. |
|
A power-function continuous random variable. |
|
A power log-normal continuous random variable. |
|
A power normal continuous random variable. |
|
An R-distributed continuous random variable. |
|
A reciprocal continuous random variable. |
|
A Rayleigh continuous random variable. |
|
A Rice continuous random variable. |
|
A reciprocal inverse Gaussian continuous random variable. |
|
A semicircular continuous random variable. |
|
A skew-normal random variable. |
|
A Student’s t continuous random variable. |
|
A trapezoidal continuous random variable. |
|
A triangular continuous random variable. |
|
A truncated exponential continuous random variable. |
|
A truncated normal continuous random variable. |
|
A Tukey-Lambda continuous random variable. |
|
A uniform continuous random variable. |
|
A Von Mises continuous random variable. |
|
A Von Mises continuous random variable. |
|
A Wald continuous random variable. |
|
Weibull minimum continuous random variable. |
|
Weibull maximum continuous random variable. |
|
A wrapped Cauchy continuous random variable. |
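Continuous distributions take shape parameters plus `loc` and `scale`, and calling a distribution object with fixed parameters yields a "frozen" distribution. The sketch below uses the gamma distribution as an illustration; the name `stats.gamma` and its shape parameter `a` are assumptions, since the listing above omits the entry names.

```python
from scipy import stats

# Freeze a gamma distribution with shape a=2, shifted by loc and stretched by scale.
frozen = stats.gamma(a=2.0, loc=1.0, scale=3.0)

mean = frozen.mean()   # loc + a * scale
var = frozen.var()     # a * scale**2

# ppf is the inverse of cdf, so a round trip recovers the probability.
x90 = frozen.ppf(0.9)
roundtrip = frozen.cdf(x90)

# Draw reproducible samples; every sample lies above loc.
samples = frozen.rvs(size=5, random_state=0)
```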
Multivariate distributions
|
A multivariate normal random variable. |
|
A matrix normal random variable. |
|
A Dirichlet random variable. |
|
A Wishart random variable. |
|
An inverse Wishart random variable. |
|
A multinomial random variable. |
|
A matrix-valued SO(N) random variable. |
|
A matrix-valued O(N) random variable. |
|
A matrix-valued U(N) random variable. |
|
A random correlation matrix. |
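A short sketch of a multivariate distribution in use, assuming the multivariate normal entry corresponds to `stats.multivariate_normal`:

```python
import numpy as np
from scipy import stats

mean = np.array([0.0, 0.0])
cov = np.array([[1.0, 0.0],
                [0.0, 1.0]])

# Freeze a standard bivariate normal distribution.
mvn = stats.multivariate_normal(mean=mean, cov=cov)

# The density at the mean of a standard bivariate normal is 1 / (2 * pi).
density_at_mean = mvn.pdf(mean)

# Each row of `draws` is one 2-dimensional sample.
draws = mvn.rvs(size=4, random_state=0)
```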
Discrete distributions
|
A Bernoulli discrete random variable. |
|
A binomial discrete random variable. |
|
A Boltzmann (Truncated Discrete Exponential) random variable. |
|
A Laplacian discrete random variable. |
|
A geometric discrete random variable. |
|
A hypergeometric discrete random variable. |
|
A Logarithmic (Log-Series, Series) discrete random variable. |
|
A negative binomial discrete random variable. |
|
A Planck discrete exponential random variable. |
|
A Poisson discrete random variable. |
|
A uniform discrete random variable. |
|
A Skellam discrete random variable. |
|
A Zipf discrete random variable. |
|
A Yule-Simon discrete random variable. |
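Discrete distributions expose a probability mass function (`pmf`) in place of a density. The sketch below assumes the binomial and Poisson entries are `stats.binom` and `stats.poisson`.

```python
import math
from scipy import stats

# Number of successes in 4 fair coin flips.
coin = stats.binom(n=4, p=0.5)

p_two = coin.pmf(2)          # C(4, 2) / 2**4
p_at_most_two = coin.cdf(2)  # (1 + 4 + 6) / 16
expected = coin.mean()       # n * p

# Probability of observing zero events when the Poisson rate is 3.
p_zero = stats.poisson.pmf(0, mu=3.0)   # exp(-3)
```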
An overview of statistical functions is given below.
Several of these functions have a similar version in scipy.stats.mstats that works with masked arrays.
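To illustrate the masked-array variants, the sketch below computes a geometric mean with one reading masked out. It uses `scipy.stats.mstats.gmean`, which appears in the masked statistics list further down; the data values are made up for the example.

```python
import numpy as np
from scipy import stats
from scipy.stats import mstats

# Hypothetical readings; the last one is a bad value that has been masked.
readings = np.ma.masked_array(
    [1.0, 4.0, 16.0, 1000.0],
    mask=[False, False, False, True],
)

# The masked version ignores the masked entry: (1 * 4 * 16) ** (1 / 3).
masked_gmean = float(mstats.gmean(readings))

# Passing the raw underlying data includes the bad value in the result.
naive_gmean = float(stats.gmean(readings.data))
```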
Summary statistics
|
Compute several descriptive statistics of the passed array. |
|
Compute the geometric mean along the specified axis. |
|
Calculate the harmonic mean along the specified axis. |
|
Compute the kurtosis (Fisher or Pearson) of a dataset. |
|
Return an array of the modal (most common) value in the passed array. |
|
Calculate the nth moment about the mean for a sample. |
|
Compute the skewness of a data set. |
|
Return the nth k-statistic (1<=n<=4 so far). |
|
Returns an unbiased estimator of the variance of the k-statistic. |
|
Compute the trimmed mean. |
|
Compute the trimmed variance. |
|
Compute the trimmed minimum. |
|
Compute the trimmed maximum. |
|
Compute the trimmed sample standard deviation. |
|
Compute the trimmed standard error of the mean. |
|
Compute the coefficient of variation, the ratio of the biased standard deviation to the mean. |
|
Find repeats and repeat counts. |
|
Return mean of array after trimming distribution from both tails. |
|
Compute the interquartile range of the data along the specified axis. |
|
Calculate the standard error of the mean (or standard error of measurement) of the values in the input array. |
|
Bayesian confidence intervals for the mean, var, and std. |
|
‘Frozen’ distributions for mean, variance, and standard deviation of data. |
|
Calculate the entropy of a distribution for given probability values. |
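A brief sketch of the summary statistics on a small sample, assuming the entries above correspond to `stats.describe`, `stats.sem`, and `stats.iqr`:

```python
from scipy import stats

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# describe returns a named tuple with nobs, minmax, mean, variance,
# skewness, and kurtosis; the variance uses ddof=1 by default.
summary = stats.describe(data)

# Standard error of the mean: sqrt(variance / nobs).
standard_error = stats.sem(data)

# Interquartile range: 75th percentile minus 25th percentile.
spread = stats.iqr(data)
```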
Frequency statistics
|
Return a cumulative frequency histogram, using the histogram function. |
|
|
|
The percentile rank of a score relative to a list of scores. |
|
Calculate the score at a given percentile of the input sequence. |
|
Return a relative frequency histogram, using the histogram function. |
|
Compute a binned statistic for one or more sets of data. |
|
Compute a bidimensional binned statistic for one or more sets of data. |
|
Compute a multidimensional binned statistic for a set of data. |
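The following sketch shows two of the frequency statistics, assuming the entries are `stats.percentileofscore` and `stats.binned_statistic`:

```python
import numpy as np
from scipy import stats

# Percentile rank of the score 3 within the list of scores.
rank = stats.percentileofscore([1, 2, 3, 4], 3)

# Mean of `values` within two equal-width bins of `x` over [0, 4].
x = np.array([0.5, 1.5, 2.5, 3.5])
values = np.array([1.0, 3.0, 5.0, 7.0])
binned = stats.binned_statistic(x, values, statistic="mean",
                                bins=2, range=(0.0, 4.0))
bin_means = binned.statistic
bin_edges = binned.bin_edges
```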
Correlation functions
|
Performs a 1-way ANOVA. |
|
Calculate a Pearson correlation coefficient and the p-value for testing non-correlation. |
|
Calculate a Spearman rank-order correlation coefficient and the p-value to test for non-correlation. |
|
Calculate a point biserial correlation coefficient and its p-value. |
|
Calculate Kendall’s tau, a correlation measure for ordinal data. |
|
Compute a weighted version of Kendall’s \(\tau\). |
|
Calculate a linear least-squares regression for two sets of measurements. |
|
Computes the Siegel estimator for a set of points (x, y). |
|
Computes the Theil-Sen estimator for a set of points (x, y). |
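As an illustration of the correlation functions, the sketch below fits a line and computes a Pearson coefficient on five made-up points. It assumes the entries are `stats.pearsonr` and `stats.linregress`.

```python
from scipy import stats

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

# Pearson correlation coefficient and the p-value for testing non-correlation.
r, p_value = stats.pearsonr(x, y)

# Least-squares line y = slope * x + intercept.
fit = stats.linregress(x, y)
slope, intercept = fit.slope, fit.intercept
```

For this data the sums of squares are Sxx = 10, Syy = 6, and Sxy = 6, so the slope is 0.6, the intercept is 2.2, and r = 6 / sqrt(60).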
Statistical tests
|
Calculate the T-test for the mean of ONE group of scores. |
|
Calculate the T-test for the means of two independent samples of scores. |
|
T-test for means of two independent samples from descriptive statistics. |
|
Calculate the T-test on TWO RELATED samples of scores, a and b. |
|
Perform the Kolmogorov-Smirnov test for goodness of fit. |
|
Calculate a one-way chi square test. |
|
Cressie-Read power divergence statistic and goodness of fit test. |
|
Compute the Kolmogorov-Smirnov statistic on 2 samples. |
|
Compute the Mann-Whitney rank test on samples x and y. |
|
Tie correction factor for ties in the Mann-Whitney U and Kruskal-Wallis H tests. |
|
Assign ranks to data, dealing with ties appropriately. |
|
Compute the Wilcoxon rank-sum statistic for two samples. |
|
Calculate the Wilcoxon signed-rank test. |
|
Compute the Kruskal-Wallis H-test for independent samples. |
|
Compute the Friedman test for repeated measurements. |
|
Computes the Brunner-Munzel test on samples x and y. |
|
Methods for combining the p-values of independent tests bearing upon the same hypothesis. |
|
Perform the Jarque-Bera goodness of fit test on sample data. |
|
Perform the Ansari-Bradley test for equal scale parameters. |
|
Perform Bartlett’s test for equal variances. |
|
Perform Levene test for equal variances. |
|
Perform the Shapiro-Wilk test for normality. |
|
Anderson-Darling test for data coming from a particular distribution. |
|
The Anderson-Darling test for k-samples. |
|
Perform a test that the probability of success is p. |
|
Perform Fligner-Killeen test for equality of variance. |
|
Mood’s median test. |
|
Perform Mood’s test for equal scale parameters. |
|
Test whether the skew is different from the normal distribution. |
|
Test whether a dataset has normal kurtosis. |
|
Test whether a sample differs from a normal distribution. |
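A minimal sketch of a two-sample test, assuming the independent-samples T-test entry is `stats.ttest_ind`. The two samples are made up so that the statistic can be checked by hand.

```python
from scipy import stats

group_a = [5.1, 4.9, 5.0, 5.2, 4.8]   # mean 5.0, sample variance 0.025
group_b = [5.9, 6.1, 6.0, 6.2, 5.8]   # mean 6.0, sample variance 0.025

# Student's t-test with a pooled variance (the default).
pooled = stats.ttest_ind(group_a, group_b)

# Welch's t-test, which does not assume equal variances.
welch = stats.ttest_ind(group_a, group_b, equal_var=False)

t_stat, p_value = pooled.statistic, pooled.pvalue
```

The standard error of the difference is sqrt(0.025 / 5 + 0.025 / 5) = 0.1, so the statistic is (5.0 - 6.0) / 0.1 = -10. With equal sample sizes and equal variances, the Welch statistic matches the pooled one.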
Transformations
|
Return a positive dataset transformed by a Box-Cox power transformation. |
|
Compute optimal Box-Cox transform parameter for input data. |
|
The boxcox log-likelihood function. |
|
Return a dataset transformed by a Yeo-Johnson power transformation. |
|
Compute optimal Yeo-Johnson transform parameter for input data, using maximum likelihood estimation. |
|
The yeojohnson log-likelihood function. |
|
Compute the O’Brien transform on input data (any number of arrays). |
|
Iterative sigma-clipping of array elements. |
|
Slices off a proportion of items from both ends of an array. |
|
Slices off a proportion from ONE end of the passed array distribution. |
|
Calculate the relative z-scores. |
|
Calculate the z score of each value in the sample, relative to the sample mean and standard deviation. |
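The sketch below applies two of the transformations, assuming the entries are `stats.zscore` and `stats.boxcox`. With a fixed parameter of 0, the Box-Cox transform reduces to the natural logarithm.

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# Standardize: subtract the mean and divide by the standard deviation (ddof=0).
z = stats.zscore(x)

# Box-Cox with a fixed lambda returns only the transformed data.
log_like = stats.boxcox(x, lmbda=0.0)

# Without lmbda, the parameter is estimated and returned alongside the data.
transformed, fitted_lambda = stats.boxcox(x)
```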
Statistical distances
|
Compute the first Wasserstein distance between two 1D distributions. |
|
Compute the energy distance between two 1D distributions. |
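A short sketch of both distances on tiny samples, assuming the entries are `stats.wasserstein_distance` and `stats.energy_distance`:

```python
from scipy import stats

# The second sample is the first one shifted by 5, so the first
# Wasserstein distance equals the shift.
w_dist = stats.wasserstein_distance([0, 1, 3], [5, 6, 8])

# For two point masses at 0 and 2, the energy distance is
# sqrt(2 * E|X - Y|) = sqrt(2 * 2).
e_dist = stats.energy_distance([0], [2])
```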
Random variate generation
|
Generate random samples from a probability density function using the ratio-of-uniforms method. |
Circular statistical functions
|
Compute the circular mean for samples in a range. |
|
Compute the circular variance for samples assumed to be in a range. |
|
Compute the circular standard deviation for samples assumed to be in the range [low to high]. |
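Circular statistics matter when values wrap around, as angles do. The sketch below, assuming the circular mean entry is `stats.circmean`, compares it with the arithmetic mean for headings in degrees clustered around 5.

```python
import numpy as np
from scipy import stats

# Three headings in degrees: 10 degrees below 5, exactly 5, and 10 above 5.
headings = np.array([355.0, 5.0, 15.0])

# The arithmetic mean ignores the wrap-around at 360 degrees.
arithmetic_mean = headings.mean()

# The circular mean averages the directions on the unit circle.
circular_mean = stats.circmean(headings, high=360.0, low=0.0)
```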
Contingency table functions
|
Chi-square test of independence of variables in a contingency table. |
|
Compute the expected frequencies from a contingency table. |
|
Return a list of the marginal sums of the array a. |
|
Performs a Fisher exact test on a 2x2 contingency table. |
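A sketch of the contingency table functions on made-up 2x2 tables, assuming the entries are `stats.chi2_contingency`, `stats.contingency.expected_freq`, and `stats.fisher_exact`:

```python
import numpy as np
from scipy import stats
from scipy.stats import contingency

table = np.array([[10, 20],
                  [20, 10]])

# Chi-square test of independence; also returns the degrees of freedom
# and the expected frequencies under independence.
chi2, p_value, dof, expected = stats.chi2_contingency(table)

# The expected frequencies can also be computed directly:
# row total * column total / grand total for each cell.
expected_direct = contingency.expected_freq(table)

# Fisher exact test on a small 2x2 table; the odds ratio is (8 * 5) / (2 * 1).
odds_ratio, fisher_p = stats.fisher_exact([[8, 2], [1, 5]])
```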
Plot-tests
|
Calculate the shape parameter that maximizes the PPCC. |
|
Calculate and optionally plot probability plot correlation coefficient. |
|
Calculate quantiles for a probability plot, and optionally show the plot. |
|
Compute parameters for a Box-Cox normality plot, optionally show it. |
|
Compute parameters for a Yeo-Johnson normality plot, optionally show it. |
Masked statistics functions
- Statistical functions for masked arrays (scipy.stats.mstats)
- Summary statistics
- scipy.stats.mstats.describe
- scipy.stats.mstats.gmean
- scipy.stats.mstats.hmean
- scipy.stats.mstats.kurtosis
- scipy.stats.mstats.mode
- scipy.stats.mstats.mquantiles
- scipy.stats.mstats.hdmedian
- scipy.stats.mstats.hdquantiles
- scipy.stats.mstats.hdquantiles_sd
- scipy.stats.mstats.idealfourths
- scipy.stats.mstats.plotting_positions
- scipy.stats.mstats.meppf
- scipy.stats.mstats.moment
- scipy.stats.mstats.skew
- scipy.stats.mstats.tmean
- scipy.stats.mstats.tvar
- scipy.stats.mstats.tmin
- scipy.stats.mstats.tmax
- scipy.stats.mstats.tsem
- scipy.stats.mstats.variation
- scipy.stats.mstats.find_repeats
- scipy.stats.mstats.sem
- scipy.stats.mstats.trimmed_mean
- scipy.stats.mstats.trimmed_mean_ci
- scipy.stats.mstats.trimmed_std
- scipy.stats.mstats.trimmed_var
- Frequency statistics
- Correlation functions
- scipy.stats.mstats.f_oneway
- scipy.stats.mstats.pearsonr
- scipy.stats.mstats.spearmanr
- scipy.stats.mstats.pointbiserialr
- scipy.stats.mstats.kendalltau
- scipy.stats.mstats.kendalltau_seasonal
- scipy.stats.mstats.linregress
- scipy.stats.mstats.siegelslopes
- scipy.stats.mstats.theilslopes
- scipy.stats.mstats.sen_seasonal_slopes
- Statistical tests
- scipy.stats.mstats.ttest_1samp
- scipy.stats.mstats.ttest_onesamp
- scipy.stats.mstats.ttest_ind
- scipy.stats.mstats.ttest_rel
- scipy.stats.mstats.chisquare
- scipy.stats.mstats.ks_2samp
- scipy.stats.mstats.ks_twosamp
- scipy.stats.mstats.mannwhitneyu
- scipy.stats.mstats.rankdata
- scipy.stats.mstats.kruskal
- scipy.stats.mstats.kruskalwallis
- scipy.stats.mstats.friedmanchisquare
- scipy.stats.mstats.brunnermunzel
- scipy.stats.mstats.skewtest
- scipy.stats.mstats.kurtosistest
- scipy.stats.mstats.normaltest
- Transformations
- Other
- Summary statistics
Univariate and multivariate kernel density estimation (scipy.stats.kde)
|
Representation of a kernel-density estimate using Gaussian kernels. |
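A minimal sketch of a Gaussian kernel density estimate, assuming the entry above is `gaussian_kde` (imported here from `scipy.stats`). The sample is synthetic.

```python
import numpy as np
from scipy import stats

# Synthetic sample from a standard normal distribution.
rng = np.random.default_rng(0)
sample = rng.normal(size=200)

# Build the estimate; the bandwidth is chosen automatically by default.
kde = stats.gaussian_kde(sample)

# Evaluate the estimated density on a grid.
grid = np.linspace(-4.0, 4.0, 81)
density = kde(grid)

# The estimate integrates to approximately 1 over a wide interval.
area = kde.integrate_box_1d(-10.0, 10.0)
```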
For many more statistics-related functions, install the software R and the interface package rpy.