Statistical functions (`scipy.stats`)

This module contains a large number of probability distributions as well as a growing library of statistical functions.

Each univariate distribution is an instance of a subclass of rv_continuous (rv_discrete for discrete distributions):

`rv_continuous`([momtype, a, b, xtol, ...])	A generic continuous random variable class meant for subclassing.
`rv_discrete`([a, b, name, badvalue, ...])	A generic discrete random variable class meant for subclassing.
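As a minimal sketch of what "meant for subclassing" can look like in practice, the example below defines a continuous distribution by overriding `_pdf` and builds a discrete one from a table of values. The class name `LinearDensity`, the density f(x) = 2x on [0, 1], and the loaded-die probabilities are all hypothetical choices made for illustration; they are not part of the module.

```python
from scipy import stats


class LinearDensity(stats.rv_continuous):
    """Hypothetical distribution with density f(x) = 2x on [0, 1]."""

    def _pdf(self, x):
        # Only the density is supplied; the generic base class derives
        # cdf, moments, etc. numerically from it.
        return 2.0 * x


# a and b set the support of the distribution.
linear = LinearDensity(a=0.0, b=1.0, name="linear_density")
cdf_half = linear.cdf(0.5)      # integral of 2x from 0 to 0.5
linear_mean = linear.mean()     # integral of x * 2x from 0 to 1

# A discrete distribution can be built directly from (values, probabilities).
faces = [1, 2, 3, 4, 5, 6]
probs = [0.1, 0.1, 0.1, 0.1, 0.1, 0.5]   # assumed loaded die
die = stats.rv_discrete(name="loaded_die", values=(faces, probs))
p_six = die.pmf(6)
die_mean = die.mean()
```

Supplying only `_pdf` is convenient but relies on numerical integration for everything else, so overriding `_cdf` or `_ppf` as well is usually worthwhile when closed forms are known.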

Continuous distributions

`alpha`	An alpha continuous random variable.
`anglit`	An anglit continuous random variable.
`arcsine`	An arcsine continuous random variable.
`beta`	A beta continuous random variable.
`betaprime`	A beta prime continuous random variable.
`bradford`	A Bradford continuous random variable.
`burr`	A Burr continuous random variable.
`cauchy`	A Cauchy continuous random variable.
`chi`	A chi continuous random variable.
`chi2`	A chi-squared continuous random variable.
`cosine`	A cosine continuous random variable.
`dgamma`	A double gamma continuous random variable.
`dweibull`	A double Weibull continuous random variable.
`erlang`	An Erlang continuous random variable.
`expon`	An exponential continuous random variable.
`exponnorm`	An exponentially modified Normal continuous random variable.
`exponweib`	An exponentiated Weibull continuous random variable.
`exponpow`	An exponential power continuous random variable.
`f`	An F continuous random variable.
`fatiguelife`	A fatigue-life (Birnbaum-Saunders) continuous random variable.
`fisk`	A Fisk continuous random variable.
`foldcauchy`	A folded Cauchy continuous random variable.
`foldnorm`	A folded normal continuous random variable.
`frechet_r`	A Frechet right (or Weibull minimum) continuous random variable.
`frechet_l`	A Frechet left (or Weibull maximum) continuous random variable.
`genlogistic`	A generalized logistic continuous random variable.
`gennorm`	A generalized normal continuous random variable.
`genpareto`	A generalized Pareto continuous random variable.
`genexpon`	A generalized exponential continuous random variable.
`genextreme`	A generalized extreme value continuous random variable.
`gausshyper`	A Gauss hypergeometric continuous random variable.
`gamma`	A gamma continuous random variable.
`gengamma`	A generalized gamma continuous random variable.
`genhalflogistic`	A generalized half-logistic continuous random variable.
`gilbrat`	A Gibrat continuous random variable.
`gompertz`	A Gompertz (or truncated Gumbel) continuous random variable.
`gumbel_r`	A right-skewed Gumbel continuous random variable.
`gumbel_l`	A left-skewed Gumbel continuous random variable.
`halfcauchy`	A half-Cauchy continuous random variable.
`halflogistic`	A half-logistic continuous random variable.
`halfnorm`	A half-normal continuous random variable.
`halfgennorm`	The upper half of a generalized normal continuous random variable.
`hypsecant`	A hyperbolic secant continuous random variable.
`invgamma`	An inverted gamma continuous random variable.
`invgauss`	An inverse Gaussian continuous random variable.
`invweibull`	An inverted Weibull continuous random variable.
`johnsonsb`	A Johnson SB continuous random variable.
`johnsonsu`	A Johnson SU continuous random variable.
`ksone`	General Kolmogorov-Smirnov one-sided test.
`kstwobign`	Kolmogorov-Smirnov two-sided test for large N.
`laplace`	A Laplace continuous random variable.
`logistic`	A logistic (or Sech-squared) continuous random variable.
`loggamma`	A log gamma continuous random variable.
`loglaplace`	A log-Laplace continuous random variable.
`lognorm`	A lognormal continuous random variable.
`lomax`	A Lomax (Pareto of the second kind) continuous random variable.
`maxwell`	A Maxwell continuous random variable.
`mielke`	A Mielke Beta-Kappa continuous random variable.
`nakagami`	A Nakagami continuous random variable.
`ncx2`	A non-central chi-squared continuous random variable.
`ncf`	A non-central F distribution continuous random variable.
`nct`	A non-central Student’s T continuous random variable.
`norm`	A normal continuous random variable.
`pareto`	A Pareto continuous random variable.
`pearson3`	A Pearson type III continuous random variable.
`powerlaw`	A power-function continuous random variable.
`powerlognorm`	A power log-normal continuous random variable.
`powernorm`	A power normal continuous random variable.
`rdist`	An R-distributed continuous random variable.
`reciprocal`	A reciprocal continuous random variable.
`rayleigh`	A Rayleigh continuous random variable.
`rice`	A Rice continuous random variable.
`recipinvgauss`	A reciprocal inverse Gaussian continuous random variable.
`semicircular`	A semicircular continuous random variable.
`t`	A Student’s T continuous random variable.
`triang`	A triangular continuous random variable.
`truncexpon`	A truncated exponential continuous random variable.
`truncnorm`	A truncated normal continuous random variable.
`tukeylambda`	A Tukey-Lambda continuous random variable.
`uniform`	A uniform continuous random variable.
`vonmises`	A Von Mises continuous random variable.
`wald`	A Wald continuous random variable.
`weibull_min`	A Frechet right (or Weibull minimum) continuous random variable.
`weibull_max`	A Frechet left (or Weibull maximum) continuous random variable.
`wrapcauchy`	A wrapped Cauchy continuous random variable.
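The continuous distributions listed above share a common set of methods. The sketch below uses `norm` as a representative; the location, scale, sample size, and seed are arbitrary values assumed for illustration.

```python
from scipy import stats

# "Freeze" a normal distribution with fixed location and scale.
frozen = stats.norm(loc=10.0, scale=2.0)

p_below_12 = frozen.cdf(12.0)      # P(X <= 12), one standard deviation above the mean
upper_975 = frozen.ppf(0.975)      # inverse CDF (quantile function)
density_at_mean = frozen.pdf(10.0)

# Draw a reproducible sample, then recover the parameters by maximum likelihood.
sample = frozen.rvs(size=1000, random_state=0)
loc_hat, scale_hat = stats.norm.fit(sample)
```

The same calls (`pdf`, `cdf`, `ppf`, `rvs`, `fit`) apply to the other continuous distributions, with any shape parameters passed before `loc` and `scale`.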

Multivariate distributions

`multivariate_normal`	A multivariate normal random variable.
`dirichlet`	A Dirichlet random variable.
`wishart`	A Wishart random variable.
`invwishart`	An inverse Wishart random variable.
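A brief sketch of how a multivariate distribution might be used, taking `multivariate_normal` as the example. The mean vector, covariance matrix, and seed are assumed values chosen only for illustration.

```python
import numpy as np
from scipy import stats

mean = np.array([0.0, 0.0])
cov = np.array([[1.0, 0.5],
                [0.5, 2.0]])      # must be symmetric positive semi-definite

mvn = stats.multivariate_normal(mean=mean, cov=cov)

# Density at the mean equals 1 / (2*pi*sqrt(det(cov))) in two dimensions.
density_at_mean = mvn.pdf(mean)

# Each row of `draws` is one 2-D sample.
draws = mvn.rvs(size=500, random_state=0)
```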

Discrete distributions

`bernoulli`	A Bernoulli discrete random variable.
`binom`	A binomial discrete random variable.
`boltzmann`	A Boltzmann (Truncated Discrete Exponential) random variable.
`dlaplace`	A Laplacian discrete random variable.
`geom`	A geometric discrete random variable.
`hypergeom`	A hypergeometric discrete random variable.
`logser`	A Logarithmic (Log-Series, Series) discrete random variable.
`nbinom`	A negative binomial discrete random variable.
`planck`	A Planck discrete exponential random variable.
`poisson`	A Poisson discrete random variable.
`randint`	A uniform discrete random variable.
`skellam`	A Skellam discrete random variable.
`zipf`	A Zipf discrete random variable.
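Discrete distributions expose `pmf` in place of `pdf`. The following sketch uses `binom` with an assumed number of trials and success probability.

```python
from scipy import stats

n_trials, p_success = 10, 0.3    # illustrative parameters

# Probability of exactly 3 successes, and of at most 3 successes.
prob_exactly_3 = stats.binom.pmf(3, n_trials, p_success)
prob_at_most_3 = stats.binom.cdf(3, n_trials, p_success)

# Mean and variance: n*p and n*p*(1-p).
binom_mean, binom_var = stats.binom.stats(n_trials, p_success, moments="mv")
```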

Statistical functions

Several of these functions have a similar version in scipy.stats.mstats that works with masked arrays.

`describe`(a[, axis, ddof])	Computes several descriptive statistics of the passed array.
`gmean`(a[, axis, dtype])	Compute the geometric mean along the specified axis.
`hmean`(a[, axis, dtype])	Calculates the harmonic mean along the specified axis.
`kurtosis`(a[, axis, fisher, bias])	Computes the kurtosis (Fisher or Pearson) of a dataset.
`kurtosistest`(a[, axis])	Tests whether a dataset has normal kurtosis. This function tests the null hypothesis that the kurtosis of the population from which the sample was drawn is that of the normal distribution: `kurtosis = 3(n-1)/(n+1)`.
`mode`(a[, axis])	Returns an array of the modal (most common) value in the passed array.
`moment`(a[, moment, axis])	Calculates the nth moment about the mean for a sample.
`normaltest`(a[, axis])	Tests whether a sample differs from a normal distribution.
`skew`(a[, axis, bias])	Computes the skewness of a data set.
`skewtest`(a[, axis])	Tests whether the skew differs from that of the normal distribution.
`kstat`(data[, n])	Return the nth k-statistic (1<=n<=4 so far).
`kstatvar`(data[, n])	Returns an unbiased estimator of the variance of the k-statistic.
`tmean`(a[, limits, inclusive])	Compute the trimmed mean.
`tvar`(a[, limits, inclusive])	Compute the trimmed variance. This function computes the sample variance of an array of values, while ignoring values which are outside of given limits.
`tmin`(a[, lowerlimit, axis, inclusive])	Compute the trimmed minimum. This function finds the minimum value of an array a along the specified axis, but only considering values greater than a specified lower limit.
`tmax`(a[, upperlimit, axis, inclusive])	Compute the trimmed maximum. This function computes the maximum value of an array along a given axis, while ignoring values larger than a specified upper limit.
`tstd`(a[, limits, inclusive])	Compute the trimmed sample standard deviation. This function finds the sample standard deviation of given values, ignoring values outside the given limits.
`tsem`(a[, limits, inclusive])	Compute the trimmed standard error of the mean.
`nanmean`(*args, **kwds)	`nanmean` is deprecated!
`nanstd`(*args, **kwds)	`nanstd` is deprecated!
`nanmedian`(*args, **kwds)	`nanmedian` is deprecated!
`variation`(a[, axis])	Computes the coefficient of variation, the ratio of the biased standard deviation to the mean.
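The descriptive functions in this group can be illustrated with a short sketch; the four data values are made up for the example.

```python
import numpy as np
from scipy import stats

data = np.array([1.0, 2.0, 4.0, 8.0])   # illustrative data

# describe() bundles size, min/max, mean, variance (ddof=1 by default),
# skewness, and kurtosis into one result.
summary = stats.describe(data)

geometric = stats.gmean(data)   # (1 * 2 * 4 * 8) ** (1/4)
harmonic = stats.hmean(data)    # 4 / (1/1 + 1/2 + 1/4 + 1/8)
```

For positive data the harmonic mean never exceeds the geometric mean, which never exceeds the arithmetic mean; here that ordering is roughly 2.13 < 2.83 < 3.75.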

`cumfreq`(a[, numbins, defaultreallimits, weights])	Returns a cumulative frequency histogram, using the histogram function.
`histogram2`(*args, **kwds)	`histogram2` is deprecated!
`histogram`(a[, numbins, defaultlimits, ...])	Separates the range into several bins and returns the number of instances in each bin.
`itemfreq`(a)	Returns a 2-D array of item frequencies.
`percentileofscore`(a, score[, kind])	The percentile rank of a score relative to a list of scores.
`scoreatpercentile`(a, per[, limit, ...])	Calculate the score at a given percentile of the input sequence.
`relfreq`(a[, numbins, defaultreallimits, weights])	Returns a relative frequency histogram, using the histogram function.

`binned_statistic`(x, values[, statistic, ...])	Compute a binned statistic for a set of data.
`binned_statistic_2d`(x, y, values[, ...])	Compute a bidimensional binned statistic for a set of data.
`binned_statistic_dd`(sample, values[, ...])	Compute a multidimensional binned statistic for a set of data.

`obrientransform`(*args)	Computes the O’Brien transform on input data (any number of arrays).
`signaltonoise`(*args, **kwds)	`signaltonoise` is deprecated!
`bayes_mvs`(data[, alpha])	Bayesian confidence intervals for the mean, var, and std.
`mvsdist`(data)	‘Frozen’ distributions for mean, variance, and standard deviation of data.
`sem`(a[, axis, ddof])	Calculates the standard error of the mean (or standard error of measurement) of the values in the input array.
`zmap`(scores, compare[, axis, ddof])	Calculates the relative z-scores.
`zscore`(a[, axis, ddof])	Calculates the z score of each value in the sample, relative to the sample mean and standard deviation.
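A small sketch of `zscore` and `sem` on made-up scores. Note the different defaults: `zscore` uses `ddof=0` (population standard deviation) while `sem` uses `ddof=1`.

```python
import numpy as np
from scipy import stats

scores = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])  # mean 5, population std 2

# Standardize each value: (x - mean) / std, with ddof=0 by default.
z = stats.zscore(scores)

# Standard error of the mean: std(ddof=1) / sqrt(n).
standard_error = stats.sem(scores)
```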

`sigmaclip`(a[, low, high])	Iterative sigma-clipping of array elements.
`threshold`(a[, threshmin, threshmax, newval])	Clip array to a given value.
`trimboth`(a, proportiontocut[, axis])	Slices off a proportion of items from both ends of an array.
`trim1`(a, proportiontocut[, tail])	Slices off a proportion of items from ONE end of the passed array distribution.

`f_oneway`(*args)	Performs a 1-way ANOVA.
`pearsonr`(x, y)	Calculates a Pearson correlation coefficient and the p-value for testing non-correlation.
`spearmanr`(a[, b, axis])	Calculates a Spearman rank-order correlation coefficient and the p-value to test for non-correlation.
`pointbiserialr`(x, y)	Calculates a point biserial correlation coefficient and the associated p-value.
`kendalltau`(x, y[, initial_lexsort])	Calculates Kendall’s tau, a correlation measure for ordinal data.
`linregress`(x[, y])	Calculate a regression line. This computes a least-squares regression for two sets of measurements.
`theilslopes`(y[, x, alpha])	Computes the Theil-Sen estimator for a set of points (x, y).
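The correlation and regression functions above can be sketched on a tiny data set. The `x` and `y` values are invented so that `y` is close to `2*x + 1`.

```python
import numpy as np
from scipy import stats

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])   # roughly y = 2x + 1, illustrative

# Least-squares fit; the result carries slope, intercept, rvalue,
# pvalue, and stderr.
fit = stats.linregress(x, y)

# Pearson correlation and the p-value for the null of no correlation.
r, p_value = stats.pearsonr(x, y)
```

For a simple linear regression, `fit.rvalue` and the Pearson `r` are the same quantity.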

`ttest_1samp`(a, popmean[, axis])	Calculates the T-test for the mean of ONE group of scores.
`ttest_ind`(a, b[, axis, equal_var])	Calculates the T-test for the means of TWO INDEPENDENT samples of scores.
`ttest_ind_from_stats`(mean1, std1, nobs1, ...)	T-test for means of two independent samples from descriptive statistics.
`ttest_rel`(a, b[, axis])	Calculates the T-test on TWO RELATED samples of scores, a and b.
`kstest`(rvs, cdf[, args, N, alternative, mode])	Perform the Kolmogorov-Smirnov test for goodness of fit.
`chisquare`(f_obs[, f_exp, ddof, axis])	Calculates a one-way chi square test.
`power_divergence`(f_obs[, f_exp, ddof, axis, ...])	Cressie-Read power divergence statistic and goodness of fit test.
`ks_2samp`(data1, data2)	Computes the Kolmogorov-Smirnov statistic on 2 samples.
`mannwhitneyu`(x, y[, use_continuity])	Computes the Mann-Whitney rank test on samples x and y.
`tiecorrect`(rankvals)	Tie correction factor for ties in the Mann-Whitney U and Kruskal-Wallis H tests.
`rankdata`(a[, method])	Assign ranks to data, dealing with ties appropriately.
`ranksums`(x, y)	Compute the Wilcoxon rank-sum statistic for two samples.
`wilcoxon`(x[, y, zero_method, correction])	Calculate the Wilcoxon signed-rank test.
`kruskal`(*args)	Compute the Kruskal-Wallis H-test for independent samples. The Kruskal-Wallis H-test tests the null hypothesis that the population medians of all of the groups are equal.
`friedmanchisquare`(*args)	Computes the Friedman test for repeated measurements. The Friedman test tests the null hypothesis that repeated measurements of the same individuals have the same distribution.
`combine_pvalues`(pvalues[, method, weights])	Methods for combining the p-values of independent tests bearing upon the same hypothesis.
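Most of the tests in this group return a statistic and a p-value. The sketch below runs `ttest_ind` and `ks_2samp` on two synthetic samples whose true means differ by one standard deviation; the sample sizes and seeds are arbitrary assumptions.

```python
from scipy import stats

# Two synthetic samples; group_b is shifted upward by 1.
group_a = stats.norm.rvs(loc=0.0, scale=1.0, size=200, random_state=1)
group_b = stats.norm.rvs(loc=1.0, scale=1.0, size=200, random_state=2)

# Welch's t-test (equal_var=False does not assume equal variances).
t_stat, t_pvalue = stats.ttest_ind(group_a, group_b, equal_var=False)

# Two-sample Kolmogorov-Smirnov test compares the full distributions.
ks_stat, ks_pvalue = stats.ks_2samp(group_a, group_b)
```

With a shift this large relative to the sample size, both tests should reject the null hypothesis comfortably.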

`ansari`(x, y)	Perform the Ansari-Bradley test for equal scale parameters. The Ansari-Bradley test is a non-parametric test for the equality of the scale parameter of the distributions from which two samples were drawn.
`bartlett`(*args)	Perform Bartlett’s test for equal variances. Bartlett’s test tests the null hypothesis that all input samples are from populations with equal variances.
`levene`(*args, **kwds)	Perform Levene test for equal variances.
`shapiro`(x[, a, reta])	Perform the Shapiro-Wilk test for normality.
`anderson`(x[, dist])	Anderson-Darling test for data coming from a particular distribution. The Anderson-Darling test is a modification of the Kolmogorov-Smirnov test `kstest` for the null hypothesis that a sample is drawn from a population that follows a particular distribution.
`anderson_ksamp`(samples[, midrank])	The Anderson-Darling test for k-samples.
`binom_test`(x[, n, p])	Perform a test that the probability of success is p.
`fligner`(*args, **kwds)	Perform Fligner’s test for equal variances.
`median_test`(*args, **kwds)	Mood’s median test.
`mood`(x, y[, axis])	Perform Mood’s test for equal scale parameters.

`boxcox`(x[, lmbda, alpha])	Return a positive dataset transformed by a Box-Cox power transformation.
`boxcox_normmax`(x[, brack, method])	Compute optimal Box-Cox transform parameter for input data.
`boxcox_llf`(lmb, data)	The boxcox log-likelihood function.
`entropy`(pk[, qk, base])	Calculate the entropy of a distribution for given probability values.
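A short sketch of `entropy` and `boxcox`. The probability vectors and the positive data array are invented for illustration.

```python
import numpy as np
from scipy import stats

# Shannon entropy of a uniform distribution over 4 outcomes, in bits.
uniform_bits = stats.entropy([0.25, 0.25, 0.25, 0.25], base=2)

# With a second argument, entropy() returns the Kullback-Leibler divergence.
kl_divergence = stats.entropy([0.5, 0.5], [0.9, 0.1])

# Box-Cox requires strictly positive, one-dimensional data.
positive = np.array([1.0, 2.0, 3.5, 5.0, 8.0, 13.0])

# Without lmbda, boxcox() also returns the lambda that maximizes
# the log-likelihood.
transformed, fitted_lambda = stats.boxcox(positive)

# With lmbda=0 the transform reduces to the natural logarithm.
log_transformed = stats.boxcox(positive, lmbda=0.0)
```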

Circular statistical functions

`circmean`(samples[, high, low, axis])	Compute the circular mean for samples in a range.
`circvar`(samples[, high, low, axis])	Compute the circular variance for samples assumed to be in a range.
`circstd`(samples[, high, low, axis])	Compute the circular standard deviation for samples assumed to be in the range [low to high].
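To see why circular statistics are needed, consider angles that straddle the wrap-around point. The three headings below are assumed example data in degrees.

```python
import numpy as np
from scipy import stats

headings_deg = np.array([10.0, 30.0, 350.0])   # illustrative compass headings

# The ordinary mean ignores wrap-around and lands far from the data.
arithmetic_mean = headings_deg.mean()

# circmean treats 350 degrees as being 20 degrees away from 10 degrees.
circular_mean = stats.circmean(headings_deg, high=360.0, low=0.0)
```

Here the arithmetic mean is 130 degrees, while the circular mean is about 10 degrees, which matches where the headings actually cluster.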

Contingency table functions

`chi2_contingency`(observed[, correction, lambda_])	Chi-square test of independence of variables in a contingency table.
`contingency.expected_freq`(observed)	Compute the expected frequencies from a contingency table.
`contingency.margins`(a)	Return a list of the marginal sums of the array a.
`fisher_exact`(table[, alternative])	Performs a Fisher exact test on a 2x2 contingency table.
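A sketch of the contingency table functions on an assumed 2x2 table of observed counts.

```python
import numpy as np
from scipy import stats

observed = np.array([[10, 20],
                     [20, 10]])   # illustrative counts

# Chi-square test of independence; Yates' continuity correction is
# applied by default for 2x2 tables.
chi2, chi2_pvalue, dof, expected = stats.chi2_contingency(observed)

# Fisher's exact test returns the sample odds ratio (a*d)/(b*c)
# and a p-value.
odds_ratio, fisher_pvalue = stats.fisher_exact(observed)
```

Because every row and column total is 30 out of 60, each expected frequency under independence is 15.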

Plot-tests

`ppcc_max`(x[, brack, dist])	Returns the shape parameter that maximizes the probability plot correlation coefficient for the given data to a one-parameter family of distributions.
`ppcc_plot`(x, a, b[, dist, plot, N])	Calculate and optionally plot probability plot correlation coefficient.
`probplot`(x[, sparams, dist, fit, plot])	Calculate quantiles for a probability plot, and optionally show the plot.
`boxcox_normplot`(x, la, lb[, plot, N])	Compute parameters for a Box-Cox normality plot, optionally show it.
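The plot-test functions can also be used without producing a figure. As a sketch, `probplot` below returns the quantile pairs and a least-squares fit for a synthetic normal sample; the location, scale, size, and seed are assumptions.

```python
from scipy import stats

sample = stats.norm.rvs(loc=5.0, scale=2.0, size=200, random_state=0)

# With no `plot` argument nothing is drawn; the function returns the
# theoretical and ordered sample quantiles plus a line fitted to them.
(theoretical_q, ordered_sample), (slope, intercept, r) = stats.probplot(
    sample, dist="norm"
)
```

For normally distributed data, `slope` and `intercept` approximate the scale and location, and `r` is close to 1.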

Masked statistics functions

Statistical functions for masked arrays (scipy.stats.mstats)

Univariate and multivariate kernel density estimation (`scipy.stats.kde`)

`gaussian_kde`(dataset[, bw_method])	Representation of a kernel-density estimate using Gaussian kernels.
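A minimal sketch of a univariate kernel density estimate, assuming a synthetic standard-normal sample and the default bandwidth rule.

```python
import numpy as np
from scipy import stats

sample = stats.norm.rvs(size=500, random_state=0)   # synthetic data

# Build the estimator; bw_method is left at its default.
kde = stats.gaussian_kde(sample)

# Evaluate the estimated density on a grid.
grid = np.linspace(-4.0, 4.0, 81)
density = kde(grid)

# Probability mass the estimate assigns to a wide interval.
mass = kde.integrate_box_1d(-10.0, 10.0)
```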

For many more statistics-related functions, install the software R and the interface package rpy.
