scipy.stats.NumericalInverseHermite

class scipy.stats.NumericalInverseHermite(dist, *, tol=1e-12, max_intervals=100000)[source]

A Hermite spline fast numerical inverse of a probability distribution.

The initializer of NumericalInverseHermite accepts dist, an object representing a continuous distribution, and provides an object with methods that approximate dist.ppf and dist.rvs. For most distributions, these methods are faster than those of dist itself.

Parameters
distobject

Object representing the distribution for which a fast numerical inverse is desired; for instance, a frozen instance of a scipy.stats continuous distribution. See Notes and Examples for details.

tolfloat, optional

u-error tolerance (see Notes). The default is 1e-12.

max_intervalsint, optional

Maximum number of intervals in the cubic Hermite spline used to approximate the percent point function. The default is 100000.

Notes

NumericalInverseHermite approximates the inverse of a continuous statistical distribution’s CDF with a cubic Hermite spline.

As described in [1], it begins by evaluating the distribution’s PDF and CDF at a mesh of quantiles x within the distribution’s support. It uses the results to fit a cubic Hermite spline H such that H(p) == x, where p is the array of percentiles corresponding with the quantiles x. Therefore, the spline approximates the inverse of the distribution’s CDF to machine precision at the percentiles p, but typically, the spline will not be as accurate at the midpoints between the percentile points:

p_mid = (p[:-1] + p[1:])/2

so the mesh of quantiles is refined as needed to reduce the maximum “u-error”:

u_error = np.max(np.abs(dist.cdf(H(p_mid)) - p_mid))

below the specified tolerance tol. Refinement stops when the required tolerance is achieved or when the number of mesh intervals after the next refinement could exceed the maximum allowed number max_intervals.

The object dist must have methods pdf, cdf, and ppf that behave like those of a frozen instance of scipy.stats.rv_continuous. Specifically, it must have methods pdf and cdf that accept exactly one ndarray argument x and return the probability density function and cumulative density function (respectively) at x. The object must also have a method ppf that accepts a float p and returns the percentile point function at p. The object may also have a method isf that accepts a float p and returns the inverse survival function at p; if it does not, it will be assigned an attribute isf that calculates the inverse survival function using ppf. The ppf and isf` methods will each be evaluated at a small positive float ``p (e.g. p = utol/10), and the domain over which the approximate numerical inverse is defined will be ppf(p) to isf(p). The approximation will not be accurate in the extreme tails beyond this domain.

References

1

Hörmann, Wolfgang, and Josef Leydold. “Continuous random variate generation by fast numerical inversion.” ACM Transactions on Modeling and Computer Simulation (TOMACS) 13.4 (2003): 347-362.

Examples

For some distributions, dist.ppf and dist.rvs are quite slow. For instance, consider scipy.stats.genexpon. We freeze the distribution by passing all shape parameters into its initializer and time the resulting object’s ppf and rvs functions.

>>> import numpy as np
>>> from scipy import stats
>>> from timeit import timeit
>>> time_once = lambda f: f"{timeit(f, number=1)*1000:.6} ms"
>>> dist = stats.genexpon(9, 16, 3)  # freeze the distribution
>>> p = np.linspace(0.01, 0.99, 99)  # percentiles from 1% to 99%
>>> time_once(lambda: dist.ppf(p))
'154.565 ms'  # may vary
>>> time_once(lambda: dist.rvs(size=100))
'148.979 ms'  # may vary

The NumericalInverseHermite has a method that approximates dist.ppf.

>>> from scipy.stats import NumericalInverseHermite
>>> fni = NumericalInverseHermite(dist)
>>> np.allclose(fni.ppf(p), dist.ppf(p))
True

In some cases, it is faster to both generate the fast numerical inverse and use it than to call dist.ppf.

>>> def time_me():
...     fni = NumericalInverseHermite(dist)
...     fni.ppf(p)
>>> time_once(time_me)
'11.9222 ms'  # may vary

After generating the fast numerical inverse, subsequent calls to its methods are much faster. >>> time_once(lambda: fni.ppf(p)) ‘0.0819 ms’ # may vary

The fast numerical inverse can also be used to generate random variates using inverse transform sampling.

>>> time_once(lambda: fni.rvs(size=100))
'0.0911 ms'  # may vary

Depending on the implementation of the distribution’s random sampling method, the random variates generated may be nearly identical, given the same random state.

>>> # `seed` ensures identical random streams are used by each `rvs` method
>>> seed = 500072020
>>> rvs1 = dist.rvs(size=100, random_state=np.random.default_rng(seed))
>>> rvs2 = fni.rvs(size=100, random_state=np.random.default_rng(seed))
>>> np.allclose(rvs1, rvs2)
True

To use NumericalInverseHermite with a custom distribution, users may subclass scipy.stats.rv_continuous and initialize a frozen instance or create an object with equivalent pdf, cdf, and ppf methods. For instance, the following object represents the standard normal distribution. For simplicity, we use scipy.special.ndtr and scipy.special.ndtri to compute the cdf and ppf, respectively.

>>> from scipy.special import ndtr, ndtri
>>>
>>> class MyNormal:
...
...     def pdf(self, x):
...        return 1/np.sqrt(2*np.pi) * np.exp(-x**2 / 2)
...
...     def cdf(self, x):
...        return ndtr(x)
...
...     def ppf(self, x):
...        return ndtri(x)
...
>>> dist1 = MyNormal()
>>> fni1 = NumericalInverseHermite(dist1)
>>>
>>> dist2 = stats.norm()
>>> fni2 = NumericalInverseHermite(dist2)
>>>
>>> print(fni1.rvs(random_state=seed), fni2.rvs(random_state=seed))
-1.9603810921759424 -1.9603810921747074
Attributes
intervalsint

The number of intervals of the interpolant.

midpoint_errorfloat

The maximum u-error at an interpolant interval midpoint.

Methods

ppf(q)

Approximate percent point function (inverse cdf) of the given RV.

qrvs([size, d, qmc_engine])

Quasi-random variates of the given RV.

rvs([size, random_state])

Random variates of the given RV.