Statistical functions (`scipy.stats`)

This module contains a large number of probability distributions as well as a growing library of statistical functions.

Each included continuous distribution is an instance of the class `rv_continuous`. For each given name, the following methods are available:

`rv_continuous`([momtype, a, b, xa, xb, xtol, ...])	A generic continuous random variable class meant for subclassing.
`rv_continuous.rvs`(*args, **kwds)	Random variates of given type.
`rv_continuous.pdf`(x, *args, **kwds)	Probability density function at x of the given RV.
`rv_continuous.logpdf`(x, *args, **kwds)	Log of the probability density function at x of the given RV.
`rv_continuous.cdf`(x, *args, **kwds)	Cumulative distribution function at x of the given RV.
`rv_continuous.logcdf`(x, *args, **kwds)	Log of the cumulative distribution function at x of the given RV.
`rv_continuous.sf`(x, *args, **kwds)	Survival function (1-cdf) at x of the given RV.
`rv_continuous.logsf`(x, *args, **kwds)	Log of the survival function of the given RV.
`rv_continuous.ppf`(q, *args, **kwds)	Percent point function (inverse of cdf) at q of the given RV.
`rv_continuous.isf`(q, *args, **kwds)	Inverse survival function at q of the given RV.
`rv_continuous.moment`(n, *args, **kwds)	n’th order non-central moment of the distribution.
`rv_continuous.stats`(*args, **kwds)	Some statistics of the given RV.
`rv_continuous.entropy`(*args, **kwds)	Differential entropy of the RV.
`rv_continuous.fit`(data, *args, **kwds)	Return MLEs for shape, location, and scale parameters from data.
`rv_continuous.expect`([func, args, loc, ...])	Calculate the expected value of a function with respect to the distribution.
`rv_continuous.median`(*args, **kwds)	Median of the distribution.
`rv_continuous.mean`(*args, **kwds)	Mean of the distribution.
`rv_continuous.var`(*args, **kwds)	Variance of the distribution.
`rv_continuous.std`(*args, **kwds)	Standard deviation of the distribution.
`rv_continuous.interval`(alpha, *args, **kwds)	Confidence interval with equal areas around the median.
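
As a rough illustration of how these methods relate to each other, the sketch below calls a few of them on `norm`, one of the continuous distributions in this module. The evaluation point and the variable names are arbitrary choices made for the example.

```python
from scipy import stats

x = 1.5
p = stats.norm.cdf(x)        # P(X <= x) for a standard normal
x_back = stats.norm.ppf(p)   # ppf is the inverse of cdf, so this recovers x
tail = stats.norm.sf(x)      # survival function, equal to 1 - cdf(x)

# Interval with equal areas around the median that holds 95% of the mass.
lo, hi = stats.norm.interval(0.95)

# Mean and variance for a given location and scale.
mean, var = stats.norm.stats(loc=2.0, scale=3.0, moments="mv")
```

Shape parameters, where a distribution has any, are passed positionally before `loc` and `scale`.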

Calling the instance as a function returns a frozen distribution (a frozen RV object) whose shape, location, and scale parameters are fixed.
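
A minimal sketch of freezing, using `gamma` with arbitrarily chosen parameters: the frozen object should give the same results as the unfrozen methods called with the parameters spelled out each time.

```python
from scipy import stats

# Freeze a gamma distribution with shape a=2, loc=0 and scale=3.
rv = stats.gamma(2.0, loc=0.0, scale=3.0)

# The frozen object exposes the same methods without repeating the parameters.
frozen_pdf = rv.pdf(1.0)
direct_pdf = stats.gamma.pdf(1.0, 2.0, loc=0.0, scale=3.0)

mean = rv.mean()   # a * scale for a gamma distribution
var = rv.var()     # a * scale**2
```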

Similarly, each discrete distribution is an instance of the class `rv_discrete`:

`rv_discrete`([a, b, name, badvalue, ...])	A generic discrete random variable class meant for subclassing.
`rv_discrete.rvs`(*args, **kwargs)	Random variates of given type.
`rv_discrete.pmf`(k, *args, **kwds)	Probability mass function at k of the given RV.
`rv_discrete.logpmf`(k, *args, **kwds)	Log of the probability mass function at k of the given RV.
`rv_discrete.cdf`(k, *args, **kwds)	Cumulative distribution function at k of the given RV.
`rv_discrete.logcdf`(k, *args, **kwds)	Log of the cumulative distribution function at k of the given RV.
`rv_discrete.sf`(k, *args, **kwds)	Survival function (1-cdf) at k of the given RV.
`rv_discrete.logsf`(k, *args, **kwds)	Log of the survival function (1-cdf) at k of the given RV.
`rv_discrete.ppf`(q, *args, **kwds)	Percent point function (inverse of cdf) at q of the given RV.
`rv_discrete.isf`(q, *args, **kwds)	Inverse survival function (inverse of sf) at q of the given RV.
`rv_discrete.stats`(*args, **kwds)	Some statistics of the given discrete RV.
`rv_discrete.moment`(n, *args, **kwds)	n’th non-central moment of the distribution.
`rv_discrete.entropy`(*args, **kwds)	Entropy of the RV.
`rv_discrete.expect`([func, args, loc, lb, ...])	Calculate the expected value of a function with respect to the distribution.
`rv_discrete.median`(*args, **kwds)	Median of the distribution.
`rv_discrete.mean`(*args, **kwds)	Mean of the distribution.
`rv_discrete.var`(*args, **kwds)	Variance of the distribution.
`rv_discrete.std`(*args, **kwds)	Standard deviation of the distribution.
`rv_discrete.interval`(alpha, *args, **kwds)	Confidence interval with equal areas around the median.
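
The discrete methods can be used in the same way. The sketch below uses `binom` with made-up values for the number of trials and the success probability. Note that discrete distributions expose `pmf` rather than `pdf`.

```python
from scipy import stats

n, p = 10, 0.3   # number of trials and success probability

pmf3 = stats.binom.pmf(3, n, p)   # P(K == 3)
cdf3 = stats.binom.cdf(3, n, p)   # P(K <= 3)
sf3 = stats.binom.sf(3, n, p)     # P(K > 3), i.e. 1 - cdf

mean = stats.binom.mean(n, p)     # n * p
var = stats.binom.var(n, p)       # n * p * (1 - p)
```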

Continuous distributions

`norm`	A normal continuous random variable.
`alpha`	An alpha continuous random variable.
`anglit`	An anglit continuous random variable.
`arcsine`	An arcsine continuous random variable.
`beta`	A beta continuous random variable.
`betaprime`	A beta prime continuous random variable.
`bradford`	A Bradford continuous random variable.
`burr`	A Burr continuous random variable.
`cauchy`	A Cauchy continuous random variable.
`chi`	A chi continuous random variable.
`chi2`	A chi-squared continuous random variable.
`cosine`	A cosine continuous random variable.
`dgamma`	A double gamma continuous random variable.
`dweibull`	A double Weibull continuous random variable.
`erlang`	An Erlang continuous random variable.
`expon`	An exponential continuous random variable.
`exponweib`	An exponentiated Weibull continuous random variable.
`exponpow`	An exponential power continuous random variable.
`f`	An F continuous random variable.
`fatiguelife`	A fatigue-life (Birnbaum-Saunders) continuous random variable.
`fisk`	A Fisk continuous random variable.
`foldcauchy`	A folded Cauchy continuous random variable.
`foldnorm`	A folded normal continuous random variable.
`frechet_r`	A Frechet right (or Weibull minimum) continuous random variable.
`frechet_l`	A Frechet left (or Weibull maximum) continuous random variable.
`genlogistic`	A generalized logistic continuous random variable.
`genpareto`	A generalized Pareto continuous random variable.
`genexpon`	A generalized exponential continuous random variable.
`genextreme`	A generalized extreme value continuous random variable.
`gausshyper`	A Gauss hypergeometric continuous random variable.
`gamma`	A gamma continuous random variable.
`gengamma`	A generalized gamma continuous random variable.
`genhalflogistic`	A generalized half-logistic continuous random variable.
`gilbrat`	A Gibrat continuous random variable.
`gompertz`	A Gompertz (or truncated Gumbel) continuous random variable.
`gumbel_r`	A right-skewed Gumbel continuous random variable.
`gumbel_l`	A left-skewed Gumbel continuous random variable.
`halfcauchy`	A half-Cauchy continuous random variable.
`halflogistic`	A half-logistic continuous random variable.
`halfnorm`	A half-normal continuous random variable.
`hypsecant`	A hyperbolic secant continuous random variable.
`invgamma`	An inverted gamma continuous random variable.
`invgauss`	An inverse Gaussian continuous random variable.
`invweibull`	An inverted Weibull continuous random variable.
`johnsonsb`	A Johnson SB continuous random variable.
`johnsonsu`	A Johnson SU continuous random variable.
`ksone`	General Kolmogorov-Smirnov one-sided test.
`kstwobign`	Kolmogorov-Smirnov two-sided test for large N.
`laplace`	A Laplace continuous random variable.
`logistic`	A logistic continuous random variable.
`loggamma`	A log gamma continuous random variable.
`loglaplace`	A log-Laplace continuous random variable.
`lognorm`	A lognormal continuous random variable.
`lomax`	A Lomax (Pareto of the second kind) continuous random variable.
`maxwell`	A Maxwell continuous random variable.
`mielke`	A Mielke’s Beta-Kappa continuous random variable.
`nakagami`	A Nakagami continuous random variable.
`ncx2`	A non-central chi-squared continuous random variable.
`ncf`	A non-central F distribution continuous random variable.
`nct`	A non-central Student’s T continuous random variable.
`pareto`	A Pareto continuous random variable.
`powerlaw`	A power-function continuous random variable.
`powerlognorm`	A power log-normal continuous random variable.
`powernorm`	A power normal continuous random variable.
`rdist`	An R-distributed continuous random variable.
`reciprocal`	A reciprocal continuous random variable.
`rayleigh`	A Rayleigh continuous random variable.
`rice`	A Rice continuous random variable.
`recipinvgauss`	A reciprocal inverse Gaussian continuous random variable.
`semicircular`	A semicircular continuous random variable.
`t`	A Student’s T continuous random variable.
`triang`	A triangular continuous random variable.
`truncexpon`	A truncated exponential continuous random variable.
`truncnorm`	A truncated normal continuous random variable.
`tukeylambda`	A Tukey-Lambda continuous random variable.
`uniform`	A uniform continuous random variable.
`vonmises`	A Von Mises continuous random variable.
`wald`	A Wald continuous random variable.
`weibull_min`	A Frechet right (or Weibull minimum) continuous random variable.
`weibull_max`	A Frechet left (or Weibull maximum) continuous random variable.
`wrapcauchy`	A wrapped Cauchy continuous random variable.
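
Each name in this list is reached as an attribute of `scipy.stats`. As a hedged sketch of estimating parameters with the `fit` method, the example below draws synthetic samples and fits them. The sample sizes, seed, and true parameter values are arbitrary, and `floc=0` is used to hold the gamma location fixed at zero.

```python
import numpy as np
from scipy import stats

np.random.seed(0)  # fixed seed so the sketch is repeatable

# Normal data: fit returns (loc, scale).
normal_data = stats.norm.rvs(loc=5.0, scale=2.0, size=1000)
loc_hat, scale_hat = stats.norm.fit(normal_data)

# Gamma data: shape parameters come first, so fit returns (a, loc, scale).
gamma_data = stats.gamma.rvs(2.0, scale=3.0, size=1000)
a_hat, g_loc, g_scale = stats.gamma.fit(gamma_data, floc=0)
```

The estimates will only be close to the true values, not equal to them, since they are computed from a finite sample.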

Discrete distributions

`bernoulli`	A Bernoulli discrete random variable.
`binom`	A binomial discrete random variable.
`boltzmann`	A Boltzmann (Truncated Discrete Exponential) random variable.
`dlaplace`	A Laplacian discrete random variable.
`geom`	A geometric discrete random variable.
`hypergeom`	A hypergeometric discrete random variable.
`logser`	A Logarithmic (Log-Series, Series) discrete random variable.
`nbinom`	A negative binomial discrete random variable.
`planck`	A Planck discrete exponential random variable.
`poisson`	A Poisson discrete random variable.
`randint`	A uniform discrete random variable.
`skellam`	A Skellam discrete random variable.
`zipf`	A Zipf discrete random variable.

Statistical functions

Several of these functions have a similar version in scipy.stats.mstats that works for masked arrays.

`gmean`(a[, axis, dtype])	Compute the geometric mean along the specified axis.
`hmean`(a[, axis, dtype])	Calculates the harmonic mean along the specified axis.
`cmedian`(a[, numbins])	Returns the computed median value of an array.
`mode`(a[, axis])	Returns an array of the modal (most common) value in the passed array.
`tmean`(a[, limits, inclusive])	Compute the trimmed mean
`tvar`(a[, limits, inclusive])	Compute the trimmed variance
`tmin`(a[, lowerlimit, axis, inclusive])	Compute the trimmed minimum
`tmax`(a, upperlimit[, axis, inclusive])	Compute the trimmed maximum
`tstd`(a[, limits, inclusive])	Compute the trimmed sample standard deviation
`tsem`(a[, limits, inclusive])	Compute the trimmed standard error of the mean
`moment`(a[, moment, axis])	Calculates the nth moment about the mean for a sample.
`variation`(a[, axis])	Computes the coefficient of variation, the ratio of the biased standard deviation to the mean.
`skew`(a[, axis, bias])	Computes the skewness of a data set.
`kurtosis`(a[, axis, fisher, bias])	Computes the kurtosis (Fisher or Pearson) of a dataset.
`describe`(a[, axis])	Computes several descriptive statistics of the passed array.
`skewtest`(a[, axis])	Tests whether the skew is different from the normal distribution.
`kurtosistest`(a[, axis])	Tests whether a dataset has normal kurtosis
`normaltest`(a[, axis])	Tests whether a sample differs from a normal distribution.
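
A small sketch of some of the descriptive functions above, applied to made-up data. The result of `describe` is read by position here so that the example does not depend on how the return value is named.

```python
import numpy as np
from scipy import stats

a = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

g = stats.gmean(np.array([1.0, 2.0, 4.0]))   # cube root of 1 * 2 * 4
h = stats.hmean(np.array([1.0, 2.0, 4.0]))   # 3 / (1/1 + 1/2 + 1/4)

s = stats.skew(a)        # symmetric data, so the skewness is zero
k = stats.kurtosis(a)    # Fisher definition by default (a normal gives 0)

# describe bundles size, (min, max), mean, variance, skewness and kurtosis.
result = stats.describe(a)
nobs, minmax, mean, variance = result[0], result[1], result[2], result[3]
```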

`itemfreq`(a)	Returns a 2D array of item frequencies.
`scoreatpercentile`(a, per[, limit, ...])	Calculate the score at the given per percentile of the sequence a.
`percentileofscore`(a, score[, kind])	The percentile rank of a score relative to a list of scores.
`histogram2`(a, bins)	Compute histogram using divisions in bins.
`histogram`(a[, numbins, defaultlimits, ...])	Separates the range into several bins and returns the number of instances of a in each bin.
`cumfreq`(a[, numbins, defaultreallimits, weights])	Returns a cumulative frequency histogram, using the histogram function.
`relfreq`(a[, numbins, defaultreallimits, weights])	Returns a relative frequency histogram, using the histogram function.

`obrientransform`(*args)	Computes a transform on input data (any number of columns).
`signaltonoise`(a[, axis, ddof])	The signal-to-noise ratio of the input data.
`bayes_mvs`(data[, alpha])	Bayesian confidence intervals for the mean, var, and std.
`sem`(a[, axis, ddof])	Calculates the standard error of the mean (or standard error of measurement) of the values in the input array.
`zmap`(scores, compare[, axis, ddof])	Calculates the relative z-scores.
`zscore`(a[, axis, ddof])	Calculates the z score of each value in the sample, relative to the sample mean and standard deviation.
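
To make the difference between `zscore`, `zmap` and `sem` concrete, here is a brief sketch on made-up data. The default `ddof` values noted in the comments are those of the functions as I understand them, so check them against your installed version.

```python
import numpy as np
from scipy import stats

a = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

z = stats.zscore(a)   # (a - mean) / std, with ddof=0 by default
se = stats.sem(a)     # std with ddof=1 by default, divided by sqrt(n)

# Score new values against the mean and standard deviation of `a`.
z_new = stats.zmap(np.array([3.0, 6.0]), a)
```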

`threshold`(a[, threshmin, threshmax, newval])	Clip array to a given value.
`trimboth`(a, proportiontocut)	Slices off a proportion of items from both ends of an array.
`trim1`(a, proportiontocut[, tail])	Slices off a proportion of items from ONE end of the passed array

`f_oneway`(*args)	Performs a 1-way ANOVA.
`pearsonr`(x, y)	Calculates a Pearson correlation coefficient and the p-value for testing non-correlation.
`spearmanr`(a[, b, axis])	Calculates a Spearman rank-order correlation coefficient and the p-value to test for non-correlation.
`pointbiserialr`(x, y)	Calculates a point biserial correlation coefficient and the associated p-value.
`kendalltau`(x, y[, initial_lexsort])	Calculates Kendall’s tau, a correlation measure for ordinal data.
`linregress`(x[, y])	Calculate a regression line
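
The correlation and regression functions can be sketched on a tiny made-up data set that roughly follows y = 2x + 1. The `linregress` result is indexed by position (slope, intercept, r-value come first).

```python
import numpy as np
from scipy import stats

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])   # made-up, roughly y = 2x + 1

# Pearson correlation coefficient and its two-sided p-value.
r, p_value = stats.pearsonr(x, y)

# Least-squares line through the same points.
fit = stats.linregress(x, y)
slope, intercept, r_fit = fit[0], fit[1], fit[2]
```

For a simple linear regression the r-value reported by `linregress` should agree with the Pearson coefficient.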

`ttest_1samp`(a, popmean[, axis])	Calculates the T-test for the mean of ONE group of scores a.
`ttest_ind`(a, b[, axis, equal_var])	Calculates the T-test for the means of TWO INDEPENDENT samples of scores.
`ttest_rel`(a, b[, axis])	Calculates the T-test on TWO RELATED samples of scores, a and b.
`kstest`(rvs, cdf[, args, N, alternative, mode])	Perform the Kolmogorov-Smirnov test for goodness of fit
`chisquare`(f_obs[, f_exp, ddof])	Calculates a one-way chi square test.
`ks_2samp`(data1, data2)	Computes the Kolmogorov-Smirnov statistic on 2 samples.
`mannwhitneyu`(x, y[, use_continuity])	Computes the Mann-Whitney rank test on samples x and y.
`tiecorrect`(rankvals)	Tie correction factor for ties in the Mann-Whitney U and Kruskal-Wallis H tests.
`ranksums`(x, y)	Compute the Wilcoxon rank-sum statistic for two samples.
`wilcoxon`(x[, y])	Calculate the Wilcoxon signed-rank test.
`kruskal`(*args)	Compute the Kruskal-Wallis H-test for independent samples
`friedmanchisquare`(*args)	Computes the Friedman test for repeated measurements
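
A hedged sketch of a few of the tests above on two small made-up samples. Each call is unpacked into a statistic and a p-value.

```python
import numpy as np
from scipy import stats

a = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
b = a + 2.0   # same spread, mean shifted by 2

# One-sample test against the sample's own mean: t = 0 and p = 1.
t1, p1 = stats.ttest_1samp(a, 3.0)

# Two independent samples (equal variances assumed by default).
t2, p2 = stats.ttest_ind(a, b)

# Rank-based alternative for independent samples.
h, p3 = stats.kruskal(a, b)
```

With only five points per group these p-values are illustrative at best.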

`ansari`(x, y)	Perform the Ansari-Bradley test for equal scale parameters
`bartlett`(*args)	Perform Bartlett’s test for equal variances
`levene`(*args, **kwds)	Perform Levene test for equal variances.
`shapiro`(x[, a, reta])	Perform the Shapiro-Wilk test for normality.
`anderson`(x[, dist])	Anderson-Darling test for data coming from a particular distribution
`binom_test`(x[, n, p])	Perform a test that the probability of success is p.
`fligner`(*args, **kwds)	Perform Fligner’s test for equal variances.
`mood`(x, y)	Perform Mood’s test for equal scale parameters.
`oneway`(*args, **kwds)	Test for equal means in two or more samples from the normal distribution.

Contingency table functions

`fisher_exact`(table[, alternative])	Performs a Fisher exact test on a 2x2 contingency table.
`chi2_contingency`(observed[, correction])	Chi-square test of independence of variables in a contingency table.
`contingency.expected_freq`(observed)	Compute the expected frequencies from a contingency table.
`contingency.margins`(a)	Return a list of the marginal sums of the array a.
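
As an illustration of the contingency table functions, the sketch below uses a made-up 2x2 table. Passing `correction=False` is a choice made here so that the statistic matches the plain chi-square formula.

```python
import numpy as np
from scipy import stats
from scipy.stats.contingency import expected_freq

observed = np.array([[10, 20],
                     [20, 10]])

# Expected counts under independence; every cell is 15 for this table.
expected = expected_freq(observed)

# correction=False turns off the Yates continuity correction for 2x2 tables.
chi2, p, dof, ex = stats.chi2_contingency(observed, correction=False)

# Exact test on the same table; the first value is the sample odds ratio.
oddsratio, p_exact = stats.fisher_exact(observed)
```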

General linear model

`glm`(data, para)	Calculates a linear model fit ...

Plot-tests

`probplot`(x[, sparams, dist, fit, plot])	Calculate quantiles for a probability plot of sample data against a specified theoretical distribution.
`ppcc_max`(x[, brack, dist])	Returns the shape parameter that maximizes the probability plot correlation coefficient for the given data to a one-parameter family of distributions.
`ppcc_plot`(x, a, b[, dist, plot, N])	Returns (shape, ppcc), and optionally plots shape vs. ppcc.
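
A minimal sketch of `probplot` used without plotting, on a synthetic normal sample (seed, size and parameters are arbitrary). For normally distributed data the fitted slope and intercept should land near the sample's scale and location.

```python
import numpy as np
from scipy import stats

np.random.seed(0)  # fixed seed so the sketch is repeatable
x = stats.norm.rvs(loc=10.0, scale=2.0, size=200)

# Without a plot argument nothing is drawn; only the numbers are returned:
# theoretical quantiles, ordered sample values, and a least-squares fit.
(osm, osr), (slope, intercept, r) = stats.probplot(x, dist="norm")
```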

Masked statistics functions

Statistical functions for masked arrays (scipy.stats.mstats)
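
A short sketch of the masked-array variants, assuming `mstats.sem` and `mstats.skew` are available in your version. The data and the choice of which value to mask are made up.

```python
import numpy as np
from scipy.stats import mstats

# The last value is treated as missing and masked out.
x = np.ma.masked_array([1.0, 2.0, 3.0, 4.0, 5.0, 999.0],
                       mask=[0, 0, 0, 0, 0, 1])

se = mstats.sem(x)    # uses only the five unmasked values
sk = mstats.skew(x)   # unmasked values are symmetric, so this is zero
```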

Univariate and multivariate kernel density estimation (`scipy.stats.kde`)

`gaussian_kde`(dataset[, bw_method])	Representation of a kernel-density estimate using Gaussian kernels.
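
As a rough sketch of univariate use, the example below builds an estimate from a synthetic sample and evaluates it on a grid. The seed, sample size and grid are arbitrary, and the default bandwidth is used.

```python
import numpy as np
from scipy import stats

np.random.seed(0)  # fixed seed so the sketch is repeatable
data = stats.norm.rvs(size=300)

kde = stats.gaussian_kde(data)      # bandwidth chosen by the default rule

grid = np.linspace(-4.0, 4.0, 81)
density = kde(grid)                 # same as kde.evaluate(grid)

# Probability mass the estimate assigns to the interval [-1, 1].
mass = kde.integrate_box_1d(-1.0, 1.0)
```

Because the kernels smooth the sample, the mass on [-1, 1] will typically come out somewhat below the value for the true normal density.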

For many more statistics-related functions, install the software R and the interface package rpy.
