Statistical functions (scipy.stats)#
This module contains a large number of probability distributions, summary and frequency statistics, correlation functions and statistical tests, masked statistics, kernel density estimation, quasi-Monte Carlo functionality, and more.
Statistics is a very large area, and there are topics that are out of scope for SciPy and are covered by other packages. Some of the most important ones are:
- statsmodels: regression, linear models, time series analysis, extensions to topics also covered by scipy.stats.
- Pandas: tabular data, time series functionality, interfaces to other statistical languages. 
- PyMC: Bayesian statistical modeling, probabilistic machine learning. 
- scikit-learn: classification, regression, model selection. 
- Seaborn: statistical data visualization. 
- rpy2: Python to R bridge. 
Probability distributions#
Each univariate distribution is an instance of a subclass of rv_continuous
(rv_discrete for discrete distributions):
| 
 | A generic continuous random variable class meant for subclassing. | 
| 
 | A generic discrete random variable class meant for subclassing. | 
| 
 | Generates a distribution given by a histogram. | 
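As a minimal sketch of how these classes are used: the first part defines a custom continuous distribution by subclassing rv_continuous and supplying only `_pdf`, and the second part builds a distribution from a histogram with rv_histogram. The class name `ramp_gen`, the density 2x on [0, 1], and the simulated sample are illustrative assumptions, not part of SciPy.

```python
import numpy as np
from scipy import stats

# The built-in distributions are instances of rv_continuous / rv_discrete subclasses.
assert isinstance(stats.norm, stats.rv_continuous)
assert isinstance(stats.poisson, stats.rv_discrete)


class ramp_gen(stats.rv_continuous):
    """Hypothetical distribution with density f(x) = 2x on [0, 1]."""

    def _pdf(self, x):
        return 2.0 * x


# a and b set the support; the generic machinery derives cdf, ppf, etc. from _pdf.
ramp = ramp_gen(a=0.0, b=1.0, name="ramp")
ramp_cdf_half = ramp.cdf(0.5)  # integral of 2x from 0 to 0.5

# rv_histogram turns the output of np.histogram into a distribution object.
rng = np.random.default_rng(0)
sample = rng.normal(size=1000)
hist_dist = stats.rv_histogram(np.histogram(sample, bins=30))
hist_median = hist_dist.ppf(0.5)
hist_total = hist_dist.cdf(sample.max())
```

Supplying only `_pdf` is the simplest route but relies on numerical integration for the other methods; overriding `_cdf` or `_ppf` as well is usually faster when closed forms are available.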
Continuous distributions#
| An alpha continuous random variable. | |
| An anglit continuous random variable. | |
| An arcsine continuous random variable. | |
| Argus distribution | |
| A beta continuous random variable. | |
| A beta prime continuous random variable. | |
| A Bradford continuous random variable. | |
| A Burr (Type III) continuous random variable. | |
| A Burr (Type XII) continuous random variable. | |
| A Cauchy continuous random variable. | |
| A chi continuous random variable. | |
| A chi-squared continuous random variable. | |
| A cosine continuous random variable. | |
| Crystalball distribution | |
| A double gamma continuous random variable. | |
| A double Pareto lognormal continuous random variable. | |
| A double Weibull continuous random variable. | |
| An Erlang continuous random variable. | |
| An exponential continuous random variable. | |
| An exponentially modified Normal continuous random variable. | |
| An exponentiated Weibull continuous random variable. | |
| An exponential power continuous random variable. | |
| An F continuous random variable. | |
| A fatigue-life (Birnbaum-Saunders) continuous random variable. | |
| A Fisk continuous random variable. | |
| A folded Cauchy continuous random variable. | |
| A folded normal continuous random variable. | |
| A generalized logistic continuous random variable. | |
| A generalized normal continuous random variable. | |
| A generalized Pareto continuous random variable. | |
| A generalized exponential continuous random variable. | |
| A generalized extreme value continuous random variable. | |
| A Gauss hypergeometric continuous random variable. | |
| A gamma continuous random variable. | |
| A generalized gamma continuous random variable. | |
| A generalized half-logistic continuous random variable. | |
| A generalized hyperbolic continuous random variable. | |
| A Generalized Inverse Gaussian continuous random variable. | |
| A Gibrat continuous random variable. | |
| A Gompertz (or truncated Gumbel) continuous random variable. | |
| A right-skewed Gumbel continuous random variable. | |
| A left-skewed Gumbel continuous random variable. | |
| A Half-Cauchy continuous random variable. | |
| A half-logistic continuous random variable. | |
| A half-normal continuous random variable. | |
| The upper half of a generalized normal continuous random variable. | |
| A hyperbolic secant continuous random variable. | |
| An inverted gamma continuous random variable. | |
| An inverse Gaussian continuous random variable. | |
| An inverted Weibull continuous random variable. | |
| An Irwin-Hall (Uniform Sum) continuous random variable. | |
| Jones and Faddy skew-t distribution. | |
| A Johnson SB continuous random variable. | |
| A Johnson SU continuous random variable. | |
| Kappa 4 parameter distribution. | |
| Kappa 3 parameter distribution. | |
| Kolmogorov-Smirnov one-sided test statistic distribution. | |
| Kolmogorov-Smirnov two-sided test statistic distribution. | |
| Limiting distribution of scaled Kolmogorov-Smirnov two-sided test statistic. | |
| A Landau continuous random variable. | |
| A Laplace continuous random variable. | |
| An asymmetric Laplace continuous random variable. | |
| A Levy continuous random variable. | |
| A left-skewed Levy continuous random variable. | |
| A Levy-stable continuous random variable. | |
| A logistic (or Sech-squared) continuous random variable. | |
| A log gamma continuous random variable. | |
| A log-Laplace continuous random variable. | |
| A lognormal continuous random variable. | |
| A loguniform or reciprocal continuous random variable. | |
| A Lomax (Pareto of the second kind) continuous random variable. | |
| A Maxwell continuous random variable. | |
| A Mielke Beta-Kappa / Dagum continuous random variable. | |
| A Moyal continuous random variable. | |
| A Nakagami continuous random variable. | |
| A non-central chi-squared continuous random variable. | |
| A non-central F distribution continuous random variable. | |
| A non-central Student's t continuous random variable. | |
| A normal continuous random variable. | |
| A Normal Inverse Gaussian continuous random variable. | |
| A Pareto continuous random variable. | |
| A Pearson type III continuous random variable. | |
| A power-function continuous random variable. | |
| A power log-normal continuous random variable. | |
| A power normal continuous random variable. | |
| An R-distributed (symmetric beta) continuous random variable. | |
| A Rayleigh continuous random variable. | |
| A relativistic Breit-Wigner random variable. | |
| A Rice continuous random variable. | |
| A reciprocal inverse Gaussian continuous random variable. | |
| A semicircular continuous random variable. | |
| A skewed Cauchy random variable. | |
| A skew-normal random variable. | |
| A studentized range continuous random variable. | |
| A Student's t continuous random variable. | |
| A trapezoidal continuous random variable. | |
| A triangular continuous random variable. | |
| A truncated exponential continuous random variable. | |
| A truncated normal continuous random variable. | |
| An upper truncated Pareto continuous random variable. | |
| A doubly truncated Weibull minimum continuous random variable. | |
| A Tukey-Lambda continuous random variable. | |
| A uniform continuous random variable. | |
| A Von Mises continuous random variable. | |
| A Von Mises continuous random variable. | |
| A Wald continuous random variable. | |
| A Weibull minimum continuous random variable. | |
| A Weibull maximum continuous random variable. | |
| A wrapped Cauchy continuous random variable. | 
The fit method of the univariate continuous distributions uses
maximum likelihood estimation to fit the distribution to a data set.
The fit method can accept regular data or censored data.
Censored data is represented with instances of the CensoredData
class.
| 
 | Instances of this class represent censored data. | 
Multivariate distributions#
| A multivariate normal random variable. | |
| A matrix normal random variable. | |
| A Dirichlet random variable. | |
| A Dirichlet multinomial random variable. | |
| A Wishart random variable. | |
| An inverse Wishart random variable. | |
| A multinomial random variable. | |
| A Special Orthogonal matrix (SO(N)) random variable. | |
| An Orthogonal matrix (O(N)) random variable. | |
| A matrix-valued U(N) random variable. | |
| A random correlation matrix. | |
| A multivariate t-distributed random variable. | |
| A multivariate hypergeometric random variable. | |
| Normal-inverse-gamma distribution. | |
| Contingency tables from independent samples with fixed marginal sums. | |
| A vector-valued uniform direction. | |
| A von Mises-Fisher variable. | 
scipy.stats.multivariate_normal methods accept instances
of the following class to represent the covariance.
| Representation of a covariance matrix | 
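A short sketch of passing a covariance object instead of a dense matrix, assuming the class referred to above is `scipy.stats.Covariance` and using its `from_diagonal` constructor; the variances and evaluation point are arbitrary illustrative values.

```python
import numpy as np
from scipy import stats

# Diagonal covariance described by its variances only (illustrative values).
variances = np.array([1.0, 4.0, 9.0])
cov_obj = stats.Covariance.from_diagonal(variances)

x = np.array([0.5, -1.0, 2.0])
mean = np.zeros(3)

# The same log-density, once via the Covariance object and once via a dense matrix.
logpdf_obj = stats.multivariate_normal.logpdf(x, mean=mean, cov=cov_obj)
logpdf_dense = stats.multivariate_normal.logpdf(x, mean=mean, cov=np.diag(variances))
```

The two results should agree to floating-point precision; the object form lets the structure of the matrix (here, diagonal) be exploited rather than rediscovered from a dense array.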
Discrete distributions#
| A Bernoulli discrete random variable. | |
| A beta-binomial discrete random variable. | |
| A beta-negative-binomial discrete random variable. | |
| A binomial discrete random variable. | |
| A Boltzmann (Truncated Discrete Exponential) random variable. | |
| A Laplacian discrete random variable. | |
| A geometric discrete random variable. | |
| A hypergeometric discrete random variable. | |
| A Logarithmic (Log-Series, Series) discrete random variable. | |
| A negative binomial discrete random variable. | |
| A Fisher's noncentral hypergeometric discrete random variable. | |
| A Wallenius' noncentral hypergeometric discrete random variable. | |
| A negative hypergeometric discrete random variable. | |
| A Planck discrete exponential random variable. | |
| A Poisson discrete random variable. | |
| A Poisson Binomial discrete random variable. | |
| A uniform discrete random variable. | |
| A Skellam discrete random variable. | |
| A Yule-Simon discrete random variable. | |
| A Zipf (Zeta) discrete random variable. | |
| A Zipfian discrete random variable. | 
An overview of statistical functions is given below.  Many of these functions
have a similar version in scipy.stats.mstats that works with masked arrays.
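As an illustration of the masked-array counterparts, the sketch below compares `scipy.stats.sem` on clean data with `scipy.stats.mstats.sem` on an array whose invalid entry is masked. The sentinel value -999.0 is a hypothetical marker for missing data.

```python
import numpy as np
from scipy import stats
from scipy.stats import mstats

# -999.0 is a hypothetical sentinel marking a missing measurement.
values = np.array([2.0, 4.0, 6.0, 8.0, -999.0])
masked = np.ma.masked_equal(values, -999.0)

# The mstats version ignores the masked entry.
sem_masked = mstats.sem(masked)

# Equivalent computation on the valid entries only.
sem_plain = stats.sem(values[:4])
```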
Summary statistics#
| 
 | Compute several descriptive statistics of the passed array. | 
| 
 | Compute the weighted geometric mean along the specified axis. | 
| 
 | Calculate the weighted harmonic mean along the specified axis. | 
| 
 | Calculate the weighted power mean along the specified axis. | 
| 
 | Compute the kurtosis (Fisher or Pearson) of a dataset. | 
| 
 | Return an array of the modal (most common) value in the passed array. | 
| 
 | Calculate the nth moment about the mean for a sample. | 
| 
 | Compute L-moments of a sample from a continuous distribution | 
| 
 | Compute the expectile at the specified level. | 
| 
 | Compute the sample skewness of a data set. | 
| 
 | Return the n th k-statistic. | 
| 
 | Return an unbiased estimator of the variance of the k-statistic. | 
| 
 | Compute the trimmed mean. | 
| 
 | Compute the trimmed variance. | 
| 
 | Compute the trimmed minimum. | 
| 
 | Compute the trimmed maximum. | 
| 
 | Compute the trimmed sample standard deviation. | 
| 
 | Compute the trimmed standard error of the mean. | 
| 
 | Compute the coefficient of variation. | 
| 
 | Find repeats and repeat counts. | 
| 
 | Assign ranks to data, dealing with ties appropriately. | 
| 
 | Tie correction factor for Mann-Whitney U and Kruskal-Wallis H tests. | 
| 
 | Return mean of array after trimming a specified fraction of extreme values | 
| 
 | Calculate the geometric standard deviation of an array. | 
| 
 | Compute the interquartile range of the data along the specified axis. | 
| 
 | Compute standard error of the mean. | 
| 
 | Bayesian confidence intervals for the mean, var, and std. | 
| 
 | 'Frozen' distributions for mean, variance, and standard deviation of data. | 
| 
 | Calculate the Shannon entropy/relative entropy of given distribution(s). | 
| 
 | Given a sample of a distribution, estimate the differential entropy. | 
| 
 | Compute the median absolute deviation of the data along the given axis. | 
Frequency statistics#
| 
 | Return a cumulative frequency histogram, using the histogram function. | 
| 
 | Compute the percentile rank of a score relative to a list of scores. | 
| 
 | Calculate the score at a given percentile of the input sequence. | 
| 
 | Return a relative frequency histogram, using the histogram function. | 
| 
 | Compute a binned statistic for one or more sets of data. | 
| 
 | Compute a bidimensional binned statistic for one or more sets of data. | 
| 
 | Compute a multidimensional binned statistic for a set of data. | 
Random Variables#
| 
 | Generate a ContinuousDistribution from an instance of  | 
| 
 | Normal distribution with prescribed mean and standard deviation. | 
| 
 | Uniform distribution. | 
| 
 | Representation of a mixture distribution. | 
| 
 | Probability distribution of an order statistic | 
| 
 | Truncate the support of a random variable. | 
| 
 | Absolute value of a random variable | 
| 
 | Natural exponential of a random variable | 
| 
 | Natural logarithm of a non-negative random variable | 
Quasi-Monte Carlo#
Contingency Tables#
Masked statistics functions#
- Statistical functions for masked arrays (scipy.stats.mstats)
  - Summary statistics
  - Frequency statistics
  - Correlation functions
  - Statistical tests
  - Transformations
  - Other
 
Other statistical functionality#
Transformations#
| 
 | Return a dataset transformed by a Box-Cox power transformation. | 
| 
 | Compute optimal Box-Cox transform parameter for input data. | 
| 
 | The boxcox log-likelihood function. | 
| 
 | Return a dataset transformed by a Yeo-Johnson power transformation. | 
| 
 | Compute optimal Yeo-Johnson transform parameter. | 
| 
 | The yeojohnson log-likelihood function. | 
| 
 | Compute the O'Brien transform on input data (any number of arrays). | 
| 
 | Perform iterative sigma-clipping of array elements. | 
| 
 | Slice off a proportion of items from both ends of an array. | 
| 
 | Slice off a proportion from ONE end of the passed array distribution. | 
| 
 | Calculate the relative z-scores. | 
| 
 | Compute the z score. | 
| 
 | Compute the geometric standard score. | 
Statistical distances#
| 
 | Compute the Wasserstein-1 distance between two 1D discrete distributions. | 
| 
 | Compute the Wasserstein-1 distance between two N-D discrete distributions. | 
| 
 | Compute the energy distance between two 1D distributions. | 
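A brief sketch of the one-dimensional distances, assuming the entries above correspond to `scipy.stats.wasserstein_distance` and `scipy.stats.energy_distance`. The two samples are illustrative: the second is the first shifted by 5.

```python
from scipy import stats

# Two small samples; v is u shifted by 5 (illustrative values).
u = [0.0, 1.0, 3.0]
v = [5.0, 6.0, 8.0]

# For a pure shift, the Wasserstein-1 distance equals the size of the shift.
w_dist = stats.wasserstein_distance(u, v)

# Both distances vanish when a sample is compared with itself.
w_self = stats.wasserstein_distance(u, u)
e_self = stats.energy_distance(u, u)
```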
Sampling#
Fitting / Survival Analysis#
| 
 | Fit a discrete or continuous distribution to data | 
| 
 | Empirical cumulative distribution function of a sample. | 
| 
 | Compare the survival distributions of two samples via the logrank test. | 
Directional statistical functions#
| 
 | Computes sample statistics for directional data. | 
| 
 | Compute the circular mean of a sample of angle observations. | 
| 
 | Compute the circular variance of a sample of angle observations. | 
| 
 | Compute the circular standard deviation of a sample of angle observations. | 
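To show why directional statistics differ from ordinary ones, this sketch assumes the circular mean entry refers to `scipy.stats.circmean` and applies it to two compass bearings that straddle north. The angles are illustrative and given in degrees, so the range is passed as `low=0`, `high=360`.

```python
import numpy as np
from scipy import stats

# Two bearings on either side of north, in degrees (illustrative values).
angles_deg = np.array([350.0, 10.0])

# The arithmetic mean points south, which is misleading for angles.
naive_mean = angles_deg.mean()

# The circular mean respects wrap-around; the result is equivalent to 0 degrees
# (it may be reported as a value very close to 0 or to 360).
circ_mean = stats.circmean(angles_deg, high=360.0, low=0.0)
```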
Sensitivity Analysis#
| 
 | Global sensitivity indices of Sobol'. | 
Plot-tests#
| 
 | Calculate the shape parameter that maximizes the PPCC. | 
| 
 | Calculate and optionally plot probability plot correlation coefficient. | 
| 
 | Calculate quantiles for a probability plot, and optionally show the plot. | 
| 
 | Compute parameters for a Box-Cox normality plot, optionally show it. | 
| 
 | Compute parameters for a Yeo-Johnson normality plot, optionally show it. | 
Univariate and multivariate kernel density estimation#
| 
 | Representation of a kernel-density estimate using Gaussian kernels. | 
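A minimal sketch of a Gaussian kernel density estimate, assuming the entry above is `scipy.stats.gaussian_kde`. The sample is simulated standard-normal data and the evaluation grid is an arbitrary choice; the bandwidth is left at its default.

```python
import numpy as np
from scipy import stats

# Simulated one-dimensional sample (illustrative).
rng = np.random.default_rng(0)
sample = rng.normal(size=500)

# Build the estimate with the default bandwidth rule.
kde = stats.gaussian_kde(sample)

# Evaluate the estimated density on a grid.
grid = np.linspace(-4.0, 4.0, 201)
density = kde(grid)

# The estimate is a proper density: it integrates to one over the real line.
total_mass = kde.integrate_box_1d(-np.inf, np.inf)
```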
Warnings / Errors used in scipy.stats#
| 
 | Warns when data is degenerate and results may not be reliable. | 
| 
 | Warns when all values in data are exactly equal. | 
| 
 | Warns when all values in data are nearly equal. | 
| 
 | Represents an error condition when fitting a distribution to data. | 
Result classes used in scipy.stats#
Warning
These classes are private, but they are included here because instances of them are returned by other statistical functions. User import and instantiation is not supported.