scipy.stats.qmc.Sobol#

class scipy.stats.qmc.Sobol(d, *, scramble=True, bits=None, seed=None)[source]#

Engine for generating (scrambled) Sobol’ sequences.

Sobol’ sequences are low-discrepancy, quasi-random numbers. Points can be drawn using two methods:

  • random_base2: safely draw \(n=2^m\) points. This method guarantees the balance properties of the sequence.

  • random: draw an arbitrary number of points from the sequence. See warning below.

Parameters
dint

Dimensionality of the sequence. Max dimensionality is 21201.

scramblebool, optional

If True, use LMS+shift scrambling. Otherwise, no scrambling is done. Default is True.

bitsint, optional

Number of bits of the generator. Control the maximum number of points that can be generated, which is 2**bits. Maximal value is 64. It does not correspond to the return type, which is always np.float64 to prevent points from repeating themselves. Default is None, which for backward compatibility, corresponds to 30.

seed{None, int, numpy.random.Generator}, optional

If seed is None the numpy.random.Generator singleton is used. If seed is an int, a new Generator instance is used, seeded with seed. If seed is already a Generator instance then that instance is used.

Notes

Sobol’ sequences [1] provide \(n=2^m\) low discrepancy points in \([0,1)^{d}\). Scrambling them [3] makes them suitable for singular integrands, provides a means of error estimation, and can improve their rate of convergence. The scrambling strategy which is implemented is a (left) linear matrix scramble (LMS) followed by a digital random shift (LMS+shift) [2].

There are many versions of Sobol’ sequences depending on their ‘direction numbers’. This code uses direction numbers from [4]. Hence, the maximum number of dimension is 21201. The direction numbers have been precomputed with search criterion 6 and can be retrieved at https://web.maths.unsw.edu.au/~fkuo/sobol/.

Warning

Sobol’ sequences are a quadrature rule and they lose their balance properties if one uses a sample size that is not a power of 2, or skips the first point, or thins the sequence [5].

If \(n=2^m\) points are not enough then one should take \(2^M\) points for \(M>m\). When scrambling, the number R of independent replicates does not have to be a power of 2.

Sobol’ sequences are generated to some number \(B\) of bits. After \(2^B\) points have been generated, the sequence would repeat. Hence, an error is raised. The number of bits can be controlled with the parameter bits.

References

1

I. M. Sobol’, “The distribution of points in a cube and the accurate evaluation of integrals.” Zh. Vychisl. Mat. i Mat. Phys., 7:784-802, 1967.

2

J. Matousek, “On the L2-discrepancy for anchored boxes.” J. of Complexity 14, 527-556, 1998.

3

Art B. Owen, “Scrambling Sobol and Niederreiter-Xing points.” Journal of Complexity, 14(4):466-489, December 1998.

4

S. Joe and F. Y. Kuo, “Constructing sobol sequences with better two-dimensional projections.” SIAM Journal on Scientific Computing, 30(5):2635-2654, 2008.

5

Art B. Owen, “On dropping the first Sobol’ point.” arXiv:2008.08051, 2020.

Examples

Generate samples from a low discrepancy sequence of Sobol’.

>>> from scipy.stats import qmc
>>> sampler = qmc.Sobol(d=2, scramble=False)
>>> sample = sampler.random_base2(m=3)
>>> sample
array([[0.   , 0.   ],
       [0.5  , 0.5  ],
       [0.75 , 0.25 ],
       [0.25 , 0.75 ],
       [0.375, 0.375],
       [0.875, 0.875],
       [0.625, 0.125],
       [0.125, 0.625]])

Compute the quality of the sample using the discrepancy criterion.

>>> qmc.discrepancy(sample)
0.013882107204860938

To continue an existing design, extra points can be obtained by calling again random_base2. Alternatively, you can skip some points like:

>>> _ = sampler.reset()
>>> _ = sampler.fast_forward(4)
>>> sample_continued = sampler.random_base2(m=2)
>>> sample_continued
array([[0.375, 0.375],
       [0.875, 0.875],
       [0.625, 0.125],
       [0.125, 0.625]])

Finally, samples can be scaled to bounds.

>>> l_bounds = [0, 2]
>>> u_bounds = [10, 5]
>>> qmc.scale(sample_continued, l_bounds, u_bounds)
array([[3.75 , 3.125],
       [8.75 , 4.625],
       [6.25 , 2.375],
       [1.25 , 3.875]])

Methods

fast_forward(n)

Fast-forward the sequence by n positions.

integers(l_bounds, *[, u_bounds, n, ...])

Draw n integers from l_bounds (inclusive) to u_bounds (exclusive), or if endpoint=True, l_bounds (inclusive) to u_bounds (inclusive).

random([n])

Draw next point(s) in the Sobol' sequence.

random_base2(m)

Draw point(s) from the Sobol' sequence.

reset()

Reset the engine to base state.