scipy.stats.gstd

scipy.stats.gstd(a, axis=0, ddof=1)

Calculate the geometric standard deviation of an array.

The geometric standard deviation describes the spread of a set of numbers where the geometric mean is preferred. It is a multiplicative factor, and so a dimensionless quantity.

It is defined as the exponential of the standard deviation of log(a). Mathematically, the population geometric standard deviation can be evaluated as:

gstd = exp(std(log(a)))
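The formula above can be traced step by step with plain NumPy. This is a minimal sketch rather than the library's internal implementation; the input values and the intermediate names (`logs`, `log_std`, `manual`) are chosen only for illustration. It passes ddof=1 to the standard deviation so that the manual result lines up with the default of gstd.

```python
import numpy as np
from scipy.stats import gstd

# Illustrative, strictly positive data (powers of two)
a = np.array([1.0, 2.0, 4.0, 8.0, 16.0])

# Step 1: move to the log scale
logs = np.log(a)

# Step 2: standard deviation of the logs (ddof=1 matches the gstd default)
log_std = np.std(logs, ddof=1)

# Step 3: map back with the exponential
manual = np.exp(log_std)

# The manual value and gstd should agree to floating-point precision
print(np.isclose(manual, gstd(a)))
```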

New in version 1.3.0.

Parameters:

a (array_like): An array-like object containing the sample data.
axis (int, tuple or None, optional): Axis along which to operate. Default is 0. If None, compute over the whole array a.
ddof (int, optional): Degrees of freedom correction in the calculation of the geometric standard deviation. Default is 1.

Returns:

ndarray or float: An array of geometric standard deviations. If axis is None or a is a 1-D array, a float is returned.
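As a quick illustration of how the axis argument affects the shape of the result, the following sketch uses small hand-picked arrays (the names `one_d` and `two_d` are illustrative only). Each row is a geometric progression, so the expected geometric standard deviation equals the common ratio.

```python
import numpy as np
from scipy.stats import gstd

one_d = np.array([1.0, 2.0, 4.0])        # common ratio 2
two_d = np.array([[1.0, 2.0, 4.0],       # common ratio 2
                  [2.0, 6.0, 18.0]])     # common ratio 3

scalar_result = gstd(one_d)              # 1-D input gives a scalar
per_row = gstd(two_d, axis=1)            # one value per row, shape (2,)
per_column = gstd(two_d)                 # default axis=0, shape (3,)
overall = gstd(two_d, axis=None)         # whole array gives a scalar

print(np.ndim(scalar_result), per_row.shape, per_column.shape, np.ndim(overall))
```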

See also

gmean: Geometric mean
numpy.std: Standard deviation

Notes

As the calculation requires the use of logarithms, the geometric standard deviation only supports strictly positive values. Any non-positive or infinite values will raise a ValueError.

The geometric standard deviation is sometimes confused with the exponential of the standard deviation, exp(std(a)). It is instead exp(std(log(a))).

The default value of ddof (1) differs from the default value (0) used by other functions that take a ddof argument, such as np.std and np.nanstd.
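The two points in these notes can be checked numerically. The sketch below uses an illustrative three-element array with a common ratio of 10; the variable names are hypothetical. It contrasts the mistaken exp(std(a)) with gstd, and shows that np.std has to be given ddof=1 explicitly (or gstd given ddof=0) for the two to agree.

```python
import numpy as np
from scipy.stats import gstd

a = np.array([1.0, 10.0, 100.0])

# Common mistake: exponentiating the ordinary standard deviation
mistaken = np.exp(np.std(a, ddof=1))

# Geometric standard deviation: the standard deviation is taken on the log scale
correct = gstd(a)

# np.std defaults to ddof=0, so it matches gstd only when gstd is given ddof=0
numpy_default = np.exp(np.std(np.log(a)))
gstd_population = gstd(a, ddof=0)

print(mistaken, correct)
print(np.isclose(numpy_default, gstd_population))
```

For this input the mistaken expression is many orders of magnitude larger than the geometric standard deviation, which makes the difference between the two definitions easy to see.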

References

[1] Kirkwood, T. B., “Geometric means and measures of dispersion”, Biometrics, vol. 35, pp. 908-909, 1979.

Examples

Find the geometric standard deviation of a log-normally distributed sample. The underlying normal distribution has a standard deviation of one, so the geometric standard deviation of the sample should be approximately exp(1).

>>> import numpy as np
>>> from scipy.stats import gstd
>>> rng = np.random.default_rng()
>>> sample = rng.lognormal(mean=0, sigma=1, size=1000)
>>> gstd(sample)
2.810010162475324
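Because the geometric standard deviation is a multiplicative factor, one way to read it is as the width of an interval around the geometric mean: gmean / gstd to gmean * gstd. The sketch below is an illustration under assumptions, not part of the original example. The seed is an arbitrary choice made for reproducibility, and the larger sample size is chosen to reduce sampling noise. For log-normal data this interval is expected to contain roughly 68% of the sample, mirroring the one-sigma interval of the underlying normal distribution.

```python
import numpy as np
from scipy.stats import gmean, gstd

rng = np.random.default_rng(12345)   # arbitrary seed, for reproducibility only
sample = rng.lognormal(mean=0, sigma=1, size=10_000)

gm = gmean(sample)
gs = gstd(sample)

# Multiplicative "one-sigma" interval around the geometric mean
lower, upper = gm / gs, gm * gs

# Fraction of the sample that falls inside the interval
fraction = np.mean((sample >= lower) & (sample <= upper))
print(fraction)
```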

Compute the geometric standard deviation of a multidimensional array, both over the whole array and along given axes.

>>> a = np.arange(1, 25).reshape(2, 3, 4)
>>> gstd(a, axis=None)
2.2944076136018947
>>> gstd(a, axis=2)
array([[1.82424757, 1.22436866, 1.13183117],
       [1.09348306, 1.07244798, 1.05914985]])
>>> gstd(a, axis=(1,2))
array([2.12939215, 1.22120169])
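A tuple of axes reduces over all of the listed axes at once. As a sanity check, and assuming the same array as above, the result for axis=(1, 2) should match flattening the last two axes into one and reducing along it. The names `by_tuple` and `by_reshape` are illustrative.

```python
import numpy as np
from scipy.stats import gstd

a = np.arange(1, 25).reshape(2, 3, 4)

# Reduce over axes 1 and 2 together
by_tuple = gstd(a, axis=(1, 2))

# Equivalent: merge the last two axes, then reduce along the merged axis
by_reshape = gstd(a.reshape(2, -1), axis=1)

print(np.allclose(by_tuple, by_reshape))
```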

The geometric standard deviation can also be computed for masked arrays.

>>> a = np.arange(1, 25).reshape(2, 3, 4)
>>> ma = np.ma.masked_where(a > 16, a)
>>> ma
masked_array(
  data=[[[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12]],
        [[13, 14, 15, 16],
         [--, --, --, --],
         [--, --, --, --]]],
  mask=[[[False, False, False, False],
         [False, False, False, False],
         [False, False, False, False]],
        [[False, False, False, False],
         [ True,  True,  True,  True],
         [ True,  True,  True,  True]]],
  fill_value=999999)
>>> gstd(ma, axis=2)
masked_array(
  data=[[1.8242475707663655, 1.2243686572447428, 1.1318311657788478],
        [1.0934830582350938, --, --]],
  mask=[[False, False, False],
        [False,  True,  True]],
  fill_value=999999)