FastGeneratorInversion#
- class scipy.stats.sampling.FastGeneratorInversion(dist, *, domain=None, ignore_shape_range=False, random_state=None)[source]#
Fast sampling by numerical inversion of the CDF for a large class of continuous distributions in
scipy.stats
.- Parameters:
- distrv_frozen object
Frozen distribution object from
scipy.stats
. The list of supported distributions can be found in the Notes section. The shape parameters,loc
andscale
used to create the distributions must be scalars. For example, for the Gamma distribution with shape parameter p, p has to be a float, and for the beta distribution with shape parameters (a, b), both a and b have to be floats.- domaintuple of floats, optional
If one wishes to sample from a truncated/conditional distribution, the domain has to be specified. The default is None. In that case, the random variates are not truncated, and the domain is inferred from the support of the distribution.
- ignore_shape_rangeboolean, optional.
If False, shape parameters that are outside of the valid range of values to ensure that the numerical accuracy (see Notes) is high, raise a ValueError. If True, any shape parameters that are valid for the distribution are accepted. This can be useful for testing. The default is False.
- random_state{None, int,
numpy.random.Generator
, numpy.random.RandomState
}, optionalA NumPy random number generator or seed for the underlying NumPy random number generator used to generate the stream of uniform random numbers. If
random_state
is None, it usesself.random_state
. Ifrandom_state
is an int,np.random.default_rng(random_state)
is used. Ifrandom_state
is already aGenerator
orRandomState
instance then that instance is used.
- Attributes:
- locfloat
The location parameter.
- random_state{
numpy.random.Generator
,numpy.random.RandomState
} The random state used in relevant methods like
rvs
(unless anotherrandom_state
is passed as an argument to these methods).- scalefloat
The scale parameter.
Methods
evaluate_error
([size, random_state, x_error])Evaluate the numerical accuracy of the inversion (u- and x-error).
ppf
(q)Very fast PPF (inverse CDF) of the distribution which is a very close approximation of the exact PPF values.
qrvs
([size, d, qmc_engine])Quasi-random variates of the given distribution.
rvs
([size])Sample from the distribution by inversion.
support
()Support of the distribution.
cdf
Notes
The class creates an object for continuous distributions specified by dist. The method
rvs
uses a generator fromscipy.stats.sampling
that is created when the object is instantiated. In addition, the methodsqrvs
andppf
are added.qrvs
generate samples based on quasi-random numbers fromscipy.stats.qmc
.ppf
is the PPF based on the numerical inversion method in [1] (NumericalInversePolynomial
) that is used to generate random variates.Supported distributions (distname) are:
alpha
,anglit
,argus
,beta
,betaprime
,bradford
,burr
,burr12
,cauchy
,chi
,chi2
,cosine
,crystalball
,expon
,gamma
,gennorm
,geninvgauss
,gumbel_l
,gumbel_r
,hypsecant
,invgamma
,invgauss
,invweibull
,laplace
,logistic
,maxwell
,moyal
,norm
,pareto
,powerlaw
,t
,rayleigh
,semicircular
,wald
,weibull_max
,weibull_min
.rvs
relies on the accuracy of the numerical inversion. If very extreme shape parameters are used, the numerical inversion might not work. However, for all implemented distributions, the admissible shape parameters have been tested, and an error will be raised if the user supplies values outside of the allowed range. The u-error should not exceed 1e-10 for all valid parameters. Note that warnings might be raised even if parameters are within the valid range when the object is instantiated. To check numerical accuracy, the methodevaluate_error
can be used.Note that all implemented distributions are also part of
scipy.stats
, and the object created byFastGeneratorInversion
relies on methods likeppf
, cdf and pdf from rv_frozen. The main benefit of using this class can be summarized as follows: Once the generator to sample random variates is created in the setup step, sampling and evaluation of the PPF usingppf
are very fast, and performance is essentially independent of the distribution. Therefore, a substantial speed-up can be achieved for many distributions if large numbers of random variates are required. It is important to know that this fast sampling is achieved by inversion of the CDF. Thus, one uniform random variate is transformed into a non-uniform variate, which is an advantage for several simulation methods, e.g., when the variance reduction methods of common random variates or antithetic variates are be used ([2]).In addition, inversion makes it possible to - to use a QMC generator from
scipy.stats.qmc
(methodqrvs
), - to generate random variates truncated to an interval. For example, if one aims to sample standard normal random variates from the interval (2, 4), this can be easily achieved by using the parameter domain.The location and scale that are initially defined by dist can be reset without having to rerun the setup step to create the generator that is used for sampling. The relation of the distribution Y with
loc
andscale
to the standard distribution X (i.e.,loc=0
andscale=1
) is given byY = loc + scale * X
.References
[1]Derflinger, Gerhard, Wolfgang Hörmann, and Josef Leydold. “Random variate generation by numerical inversion when only the density is known.” ACM Transactions on Modeling and Computer Simulation (TOMACS) 20.4 (2010): 1-25.
[2]Hörmann, Wolfgang, Josef Leydold and Gerhard Derflinger. “Automatic nonuniform random number generation.” Springer, 2004.
Examples
>>> import numpy as np >>> from scipy import stats >>> from scipy.stats.sampling import FastGeneratorInversion
Let’s start with a simple example to illustrate the main features:
>>> gamma_frozen = stats.gamma(1.5) >>> gamma_dist = FastGeneratorInversion(gamma_frozen) >>> r = gamma_dist.rvs(size=1000)
The mean should be approximately equal to the shape parameter 1.5:
>>> r.mean() 1.52423591130436 # may vary
Similarly, we can draw a sample based on quasi-random numbers:
>>> r = gamma_dist.qrvs(size=1000) >>> r.mean() 1.4996639255942914 # may vary
Compare the PPF against approximation
ppf
.>>> q = [0.001, 0.2, 0.5, 0.8, 0.999] >>> np.max(np.abs(gamma_frozen.ppf(q) - gamma_dist.ppf(q))) 4.313394796895409e-08
To confirm that the numerical inversion is accurate, we evaluate the approximation error (u-error), which should be below 1e-10 (for more details, refer to the documentation of
evaluate_error
):>>> gamma_dist.evaluate_error() (7.446320551265581e-11, nan) # may vary
Note that the location and scale can be changed without instantiating a new generator:
>>> gamma_dist.loc = 2 >>> gamma_dist.scale = 3 >>> r = gamma_dist.rvs(size=1000)
The mean should be approximately 2 + 3*1.5 = 6.5.
>>> r.mean() 6.399549295242894 # may vary
Let us also illustrate how truncation can be applied:
>>> trunc_norm = FastGeneratorInversion(stats.norm(), domain=(3, 4)) >>> r = trunc_norm.rvs(size=1000) >>> 3 < r.min() < r.max() < 4 True
Check the mean:
>>> r.mean() 3.250433367078603 # may vary
>>> stats.norm.expect(lb=3, ub=4, conditional=True) 3.260454285589997
In this particular, case,
scipy.stats.truncnorm
could also be used to generate truncated normal random variates.