scipy.stats.rv_continuous
class scipy.stats.rv_continuous(momtype=1, a=None, b=None, xtol=1e-14, badvalue=None, name=None, longname=None, shapes=None, extradoc=None, seed=None)
A generic continuous random variable class meant for subclassing.
rv_continuous is a base class to construct specific distribution classes and instances for continuous random variables. It cannot be used directly as a distribution.
Parameters
momtype : int, optional
The type of generic moment calculation to use: 0 for pdf, 1 (default) for ppf.
a : float, optional
Lower bound of the support of the distribution, default is minus infinity.
b : float, optional
Upper bound of the support of the distribution, default is plus infinity.
xtol : float, optional
The tolerance for fixed point calculation for generic ppf.
badvalue : float, optional
The value placed in a result array to indicate that some argument restriction is violated; default is np.nan.
name : str, optional
The name of the instance. This string is used to construct the default example for distributions.
longname : str, optional
This string is used as part of the first line of the docstring returned when a subclass has no docstring of its own. Note: longname exists for backwards compatibility; do not use it for new subclasses.
shapes : str, optional
The shape of the distribution. For example "m, n" for a distribution that takes two integers as the two shape arguments for all its methods. If not provided, shape parameters will be inferred from the signature of the private methods, _pdf and _cdf of the instance.
extradoc : str, optional, deprecated
This string is used as the last part of the docstring returned when a subclass has no docstring of its own. Note: extradoc exists for backwards compatibility; do not use it for new subclasses.
seed : None or int or numpy.random.RandomState instance, optional
This parameter defines the RandomState object to use for drawing random variates. If None (or np.random), the global np.random state is used. If integer, it is used to seed the local RandomState instance. Default is None.
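As a minimal sketch of the seed parameter, the snippet below builds two instances of the same subclass with the same integer seed and draws variates from each. The class unit_expon_gen and its instance names are hypothetical, introduced only for this illustration; _ppf is overridden so that sampling does not rely on the slower generic inversion.

```python
import numpy as np
from scipy.stats import rv_continuous


class unit_expon_gen(rv_continuous):
    """Hypothetical example class: exponential distribution with rate 1."""

    def _pdf(self, x):
        return np.exp(-x)

    def _cdf(self, x):
        return 1.0 - np.exp(-x)

    def _ppf(self, q):
        # Closed-form inverse of _cdf, used by the default sampler.
        return -np.log1p(-q)


# Two independent instances seeded with the same integer.
first = unit_expon_gen(a=0.0, name='unit_expon', seed=1234)
second = unit_expon_gen(a=0.0, name='unit_expon', seed=1234)

draws_first = first.rvs(size=5)
draws_second = second.rvs(size=5)

# Each instance owns a local random state seeded identically,
# so the two streams should match element by element.
print(np.array_equal(draws_first, draws_second))
```

Seeding the instance this way keeps it independent of the global np.random state, which can help when reproducibility matters.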
Notes
Public methods of an instance of a distribution class (e.g., pdf, cdf) check their arguments and pass valid arguments to private, computational methods (_pdf, _cdf). For pdf(x), x is valid if it is within the support of the distribution, self.a <= x <= self.b. Whether a shape parameter is valid is decided by an _argcheck method (which defaults to checking that its arguments are strictly positive).
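The argument checking described above can be observed with a small subclass. This is a sketch under stated assumptions: halfline_gen is a hypothetical class with one shape parameter c, support starting at a=0, and the default _argcheck left in place.

```python
import numpy as np
from scipy.stats import rv_continuous


class halfline_gen(rv_continuous):
    """Hypothetical one-shape example: exponential distribution with rate c."""

    def _pdf(self, x, c):
        # Only called with x inside the support and c accepted by _argcheck.
        return c * np.exp(-c * x)


# a=0 sets the lower bound of the support; the upper bound stays at +inf.
halfline = halfline_gen(a=0.0, name='halfline')

inside = halfline.pdf(1.0, 2.0)      # valid x, valid shape: _pdf is evaluated
outside = halfline.pdf(-1.0, 2.0)    # x below the support: density is 0
bad_shape = halfline.pdf(1.0, -2.0)  # c <= 0 fails the default _argcheck

print(inside, outside, bad_shape)
```

The last call returns badvalue (np.nan by default) rather than raising, because the public method masks invalid arguments before calling _pdf.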
Subclassing
New random variables can be defined by subclassing the rv_continuous class and re-defining at least the _pdf or the _cdf method (normalized to location 0 and scale 1).
If positive argument checking is not correct for your RV then you will also need to re-define the _argcheck method.
Correct but potentially slow defaults exist for the remaining methods; for speed and/or accuracy you can override:
_logpdf, _cdf, _logcdf, _ppf, _rvs, _isf, _sf, _logsf
Rarely would you override _isf, _sf or _logsf, but you could.
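The following sketch shows a subclass for which the default positivity check is too strict, so _argcheck is redefined, and _cdf is overridden with a closed form instead of relying on numerical integration of _pdf. The class tilted_gen and its density 1 + c(2x - 1) on [0, 1] are hypothetical, chosen only because the shape parameter c is valid on [-1, 1].

```python
import numpy as np
from scipy.stats import rv_continuous


class tilted_gen(rv_continuous):
    """Hypothetical example: linear density on [0, 1] with slope parameter c."""

    def _argcheck(self, c):
        # The default would reject c <= 0; here any c in [-1, 1] is valid.
        return (c >= -1) & (c <= 1)

    def _pdf(self, x, c):
        return 1.0 + c * (2.0 * x - 1.0)

    def _cdf(self, x, c):
        # Closed-form integral of _pdf from 0 to x.
        return x + c * (x**2 - x)


tilted = tilted_gen(a=0.0, b=1.0, name='tilted')

density = tilted.pdf(0.25, -0.5)   # negative shape accepted by _argcheck
prob = tilted.cdf(0.25, -0.5)      # uses the closed-form _cdf
rejected = tilted.pdf(0.25, 2.0)   # outside [-1, 1]: badvalue (nan)
# ppf is not overridden, so the generic root-finding default inverts _cdf.
roundtrip = tilted.ppf(prob, -0.5)

print(density, prob, rejected, roundtrip)
```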
Methods that can be overwritten by subclasses
_rvs _pdf _cdf _sf _ppf _isf _stats _munp _entropy _argcheck
There are additional (internal and private) generic methods that can be useful for cross-checking and for debugging, but might not work in all cases when directly called.
A note on shapes: subclasses need not specify them explicitly. In this case, shapes will be automatically deduced from the signatures of the overridden methods (_pdf, _cdf, etc.). If, for some reason, you prefer to avoid relying on introspection, you can specify shapes explicitly as an argument to the instance constructor.
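To illustrate the note on shapes, this sketch creates one instance that relies on introspection of _pdf and one that passes shapes explicitly. The class powerlike_gen and the parameter name k are hypothetical.

```python
import numpy as np
from scipy.stats import rv_continuous


class powerlike_gen(rv_continuous):
    """Hypothetical example: density k * x**(k - 1) on [0, 1]."""

    def _pdf(self, x, k):
        return k * x**(k - 1)


# Shapes deduced from the signature of _pdf.
inferred = powerlike_gen(a=0.0, b=1.0, name='powerlike')
# Shapes stated explicitly in the constructor.
explicit = powerlike_gen(a=0.0, b=1.0, name='powerlike', shapes='k')

print(inferred.shapes, explicit.shapes, inferred.numargs)
value = explicit.pdf(0.5, 3.0)
print(value)
```

Both instances should end up with the same single shape parameter; the explicit form simply skips the signature inspection.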
Frozen Distributions
Normally, you must provide shape parameters (and, optionally, location and scale parameters) to each call of a method of a distribution.
Alternatively, the object may be called (as a function) to fix the shape, location, and scale parameters, returning a “frozen” continuous RV object:
- rv = generic(<shape(s)>, loc=0, scale=1)
- frozen RV object with the same methods but holding the given shape, location, and scale fixed
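As a short illustration of freezing, the sketch below uses the built-in scipy.stats.gamma distribution in place of the generic placeholder above; the particular parameter values are arbitrary.

```python
import numpy as np
from scipy.stats import gamma

shape, loc, scale = 2.0, 1.0, 3.0
x = 4.0

# Unfrozen: parameters are passed on every call.
unfrozen_pdf = gamma.pdf(x, shape, loc=loc, scale=scale)

# Frozen: parameters are fixed once, then omitted from method calls.
rv = gamma(shape, loc=loc, scale=scale)
frozen_pdf = rv.pdf(x)
frozen_mean = rv.mean()

print(unfrozen_pdf, frozen_pdf, frozen_mean)
```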
Statistics
Statistics are computed using numerical integration by default. For speed you can redefine this using _stats:
- take shape parameters and return mu, mu2, g1, g2
- If you can’t compute one of these, return it as None
- Can also be defined with a keyword argument moments, which is a string composed of “m”, “v”, “s”, and/or “k”. Only the components appearing in the string should be computed and returned, in the order “m”, “v”, “s”, “k”, with missing values returned as None.
Alternatively, you can override _munp, which takes n and shape parameters and returns the n-th non-central moment of the distribution.
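The two options above can be sketched for a unit-rate exponential distribution, whose mean, variance, skewness, and excess kurtosis are 1, 1, 2, and 6, and whose n-th non-central moment is n!. Both class names are hypothetical and exist only for this example.

```python
import numpy as np
from scipy.special import gamma as gamma_func
from scipy.stats import rv_continuous


class expon_stats_gen(rv_continuous):
    """Hypothetical example: unit exponential with closed-form _stats."""

    def _pdf(self, x):
        return np.exp(-x)

    def _stats(self):
        # mu, mu2, g1, g2: mean, variance, skewness, excess kurtosis.
        return 1.0, 1.0, 2.0, 6.0


class expon_munp_gen(rv_continuous):
    """Hypothetical example: unit exponential with closed-form _munp."""

    def _pdf(self, x):
        return np.exp(-x)

    def _munp(self, n):
        # n-th non-central moment of the unit exponential is n! = Gamma(n + 1).
        return gamma_func(n + 1)


expon_stats = expon_stats_gen(a=0.0, name='expon_stats')
expon_munp = expon_munp_gen(a=0.0, name='expon_munp')

mean, var, skew, kurt = expon_stats.stats(moments='mvsk')
# loc and scale are applied by the public method, not by _stats.
shifted_mean, shifted_var = expon_stats.stats(loc=2.0, scale=3.0, moments='mv')
third_moment = expon_munp.moment(3)

print(mean, var, skew, kurt, shifted_mean, shifted_var, third_moment)
```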
Examples
To create a new Gaussian distribution, we would do the following:
>>> import numpy as np
>>> from scipy.stats import rv_continuous
>>> class gaussian_gen(rv_continuous):
...     "Gaussian distribution"
...     def _pdf(self, x):
...         return np.exp(-x**2 / 2.) / np.sqrt(2.0 * np.pi)
>>> gaussian = gaussian_gen(name='gaussian')
scipy.stats distributions are instances, so here we subclass rv_continuous and create an instance. With this, we now have a fully functional distribution with all relevant methods automagically generated by the framework.
Note that above we defined a standard normal distribution, with zero mean and unit variance. Shifting and scaling of the distribution can be done by using loc and scale parameters: gaussian.pdf(x, loc, scale) essentially computes y = (x - loc) / scale and gaussian._pdf(y) / scale.
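The loc and scale relationship can be checked numerically. This sketch repeats the gaussian_gen definition so that it runs on its own, and compares the public pdf against the manual computation and against the built-in scipy.stats.norm; the values of x, loc, and scale are arbitrary.

```python
import numpy as np
from scipy.stats import norm, rv_continuous


class gaussian_gen(rv_continuous):
    "Gaussian distribution"

    def _pdf(self, x):
        return np.exp(-x**2 / 2.) / np.sqrt(2.0 * np.pi)


gaussian = gaussian_gen(name='gaussian')

x, loc, scale = 1.5, 1.0, 2.0

# Public method: handles shifting and scaling internally.
public = gaussian.pdf(x, loc=loc, scale=scale)

# Manual equivalent: standardize x, evaluate _pdf, divide by scale.
y = (x - loc) / scale
manual = gaussian._pdf(y) / scale

# Reference value from the built-in normal distribution.
reference = norm.pdf(x, loc=loc, scale=scale)

# cdf was never defined; the generic default integrates _pdf numerically.
half = gaussian.cdf(0.0)

print(public, manual, reference, half)
```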
Attributes
random_state Get or set the RandomState object for generating random variates.
Methods
rvs(*args, **kwds) Random variates of given type.
pdf(x, *args, **kwds) Probability density function at x of the given RV.
logpdf(x, *args, **kwds) Log of the probability density function at x of the given RV.
cdf(x, *args, **kwds) Cumulative distribution function of the given RV.
logcdf(x, *args, **kwds) Log of the cumulative distribution function at x of the given RV.
sf(x, *args, **kwds) Survival function (1 - cdf) at x of the given RV.
logsf(x, *args, **kwds) Log of the survival function of the given RV.
ppf(q, *args, **kwds) Percent point function (inverse of cdf) at q of the given RV.
isf(q, *args, **kwds) Inverse survival function (inverse of sf) at q of the given RV.
moment(n, *args, **kwds) n-th order non-central moment of distribution.
stats(*args, **kwds) Some statistics of the given RV.
entropy(*args, **kwds) Differential entropy of the RV.
expect([func, args, loc, scale, lb, ub, ...]) Calculate expected value of a function with respect to the distribution.
median(*args, **kwds) Median of the distribution.
mean(*args, **kwds) Mean of the distribution.
std(*args, **kwds) Standard deviation of the distribution.
var(*args, **kwds) Variance of the distribution.
interval(alpha, *args, **kwds) Confidence interval with equal areas around the median.
__call__(*args, **kwds) Freeze the distribution for the given arguments.
fit(data, *args, **kwds) Return MLEs for shape (if applicable), location, and scale parameters from data.
fit_loc_scale(data, *args) Estimate loc and scale parameters from data using 1st and 2nd moments.
nnlf(theta, x) Return negative loglikelihood function.