scipy.stats.rvs_ratio_uniforms

scipy.stats.rvs_ratio_uniforms(pdf, umax, vmin, vmax, size=1, c=0, random_state=None)

Generate random samples from a probability density function using the ratio-of-uniforms method.

Parameters
pdf : callable

A function with signature pdf(x) that is the probability density function of the distribution.

umax : float

The upper bound of the bounding rectangle in the u-direction.

vmin : float

The lower bound of the bounding rectangle in the v-direction.

vmax : float

The upper bound of the bounding rectangle in the v-direction.

size : int or tuple of ints, optional

Number of random variates to generate (default is 1).

c : float, optional

Shift parameter of ratio-of-uniforms method, see Notes. Default is 0.

random_state : int or np.random.RandomState instance, optional

If random_state is already a RandomState instance, that instance is used. If random_state is an int, a new RandomState instance seeded with random_state is used. If None, the RandomState singleton used by np.random is used. Default is None.

Returns
rvs : ndarray

The random variates distributed according to the probability distribution defined by the pdf.

Notes

Given a univariate probability density function pdf and a constant c, define the set A = {(u, v) : 0 < u <= sqrt(pdf(v/u + c))}. If (U, V) is a random vector uniformly distributed over A, then V/U + c follows a distribution according to pdf.
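The acceptance step behind this result can be sketched directly. The following is an illustrative outline only, not the implementation used by rvs_ratio_uniforms: draw (U, V) uniformly on a rectangle containing A and return V/U + c whenever (U, V) falls in A.

>>> import numpy as np
>>> def ratio_of_uniforms_once(pdf, umax, vmin, vmax, c=0):
...     # Illustrative sketch: propose until (u, v) lands in the set A.
...     while True:
...         u = np.random.uniform(0, umax)
...         v = np.random.uniform(vmin, vmax)
...         if u > 0 and u <= np.sqrt(pdf(v / u + c)):
...             return v / u + c  # accepted: distributed according to pdf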

The above result (see [1], [2]) can be used to sample random variables using only the pdf, i.e. no inversion of the cdf is required. Typical choices of c are zero or the mode of pdf. The set A is a subset of the rectangle R = [0, umax] x [vmin, vmax] where

  • umax = sup sqrt(pdf(x))

  • vmin = inf (x - c) sqrt(pdf(x))

  • vmax = sup (x - c) sqrt(pdf(x))

In particular, these values are finite if pdf is bounded and x**2 * pdf(x) is bounded (i.e. subquadratic tails). One can generate (U, V) uniformly on R and return V/U + c if (U, V) are also in A which can be directly verified.
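If no closed-form expressions are available, the suprema above can be estimated numerically, for example with scipy.optimize.minimize_scalar on a bounded search interval. The sketch below is illustrative only; the interval (-10, 10) is an assumption and the optimizer may return local optima, so the resulting rectangle should be verified (e.g. with the stats.kstest check mentioned below) before use.

>>> import numpy as np
>>> from scipy import optimize, stats
>>> def approx_bounds(pdf, c=0, interval=(-10, 10)):
...     # Numerically maximize sqrt(pdf(x)) and +/-(x - c) * sqrt(pdf(x)).
...     neg_u = lambda x: -np.sqrt(pdf(x))
...     neg_v = lambda x: -(x - c) * np.sqrt(pdf(x))
...     pos_v = lambda x: (x - c) * np.sqrt(pdf(x))
...     opt = dict(bounds=interval, method='bounded')
...     umax = -optimize.minimize_scalar(neg_u, **opt).fun
...     vmax = -optimize.minimize_scalar(neg_v, **opt).fun
...     vmin = optimize.minimize_scalar(pos_v, **opt).fun
...     return umax, vmin, vmax
>>> u_est, v_lo, v_hi = approx_bounds(stats.norm.pdf)

For the standard normal density, these estimates are close to the exact values umax ≈ 0.63 and vmax = -vmin ≈ 0.54 used in the Examples below. Since numerical estimates may undershoot the true suprema, enlarging them slightly keeps R a superset of A.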

Intuitively, the method works well if A fills up most of the enclosing rectangle, so that the probability that (U, V) lies in A, given that it lies in R, is high; otherwise the number of required iterations becomes too large. To be more precise, the expected number of iterations needed to draw (U, V) uniformly distributed on R such that (U, V) also lies in A is given by the ratio area(R) / area(A) = 2 * umax * (vmax - vmin), using the fact that the area of A is equal to 1/2 (Theorem 7.1 in [1]). A warning is displayed if this ratio is larger than 20. Moreover, if the sampling fails to generate a single random variate after 50000 iterations (i.e. not a single draw lies in A), an exception is raised.
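For example, for the standard normal density with c = 0 and the rectangle used in the Examples below, the ratio is roughly 1.37, i.e. fewer than two proposals are needed per accepted variate on average:

>>> import numpy as np
>>> from scipy import stats
>>> f = stats.norm.pdf
>>> umax, vmax = np.sqrt(f(0)), np.sqrt(2) * np.sqrt(f(np.sqrt(2)))
>>> ratio = 2 * umax * (vmax - (-vmax))  # ~1.37, well below the threshold of 20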

If the bounding rectangle is not correctly specified (i.e. if it does not contain A), the algorithm samples from a distribution different from the one given by pdf. It is therefore recommended to perform a test such as stats.kstest as a check.

References

[1] L. Devroye, “Non-Uniform Random Variate Generation”, Springer-Verlag, 1986.

[2] W. Hoermann and J. Leydold, “Generating generalized inverse Gaussian random variates”, Statistics and Computing, 24(4), p. 547–557, 2014.

[3] A.J. Kinderman and J.F. Monahan, “Computer Generation of Random Variables Using the Ratio of Uniform Deviates”, ACM Transactions on Mathematical Software, 3(3), p. 257–260, 1977.

Examples

>>> import numpy as np
>>> from scipy import stats

Simulate normally distributed random variables. It is easy to compute the bounding rectangle explicitly in that case: sqrt(f(x)) is maximal at x = 0, and x * sqrt(f(x)) attains its extrema at x = ±sqrt(2).

>>> f = stats.norm.pdf
>>> v_bound = np.sqrt(f(np.sqrt(2))) * np.sqrt(2)
>>> umax, vmin, vmax = np.sqrt(f(0)), -v_bound, v_bound
>>> np.random.seed(12345)
>>> rvs = stats.rvs_ratio_uniforms(f, umax, vmin, vmax, size=2500)

The K-S test confirms that the random variates are indeed normally distributed (normality is not rejected at the 5% significance level):

>>> stats.kstest(rvs, 'norm')[1]
0.3420173467307603

The exponential distribution provides another example where the bounding rectangle can be determined explicitly: sqrt(exp(-x)) is maximal at x = 0 (so umax = 1), x * sqrt(exp(-x)) attains its maximum 2*exp(-1) at x = 2 (so vmax = 2*exp(-1)), and vmin = 0 since the density is supported on the nonnegative real line.

>>> np.random.seed(12345)
>>> rvs = stats.rvs_ratio_uniforms(lambda x: np.exp(-x), umax=1,
...                                vmin=0, vmax=2*np.exp(-1), size=1000)
>>> stats.kstest(rvs, 'expon')[1]
0.928454552559516

Sometimes it can be helpful to use a non-zero shift parameter c; see, e.g., [2] above for the case of the generalized inverse Gaussian distribution.
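As an illustrative sketch (not taken from [2]): for a normal density with mean 5, shifting by c = 5 (its mode) makes pdf(v/u + c) coincide with the standard normal density at v/u, so the rectangle computed in the first example above can be reused unchanged.

>>> f5 = lambda x: stats.norm.pdf(x, loc=5)
>>> np.random.seed(12345)
>>> rvs = stats.rvs_ratio_uniforms(f5, umax, vmin, vmax, size=1000, c=5)
>>> pvalue = stats.kstest(rvs, 'norm', args=(5,))[1]

With a correctly specified rectangle and shift, the resulting p-value is typically well above common significance levels.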
