SciPy

scipy.stats.ppcc_max

scipy.stats.ppcc_max(x, brack=(0.0, 1.0), dist='tukeylambda')[source]

Calculate the shape parameter that maximizes the PPCC

The probability plot correlation coefficient (PPCC) plot can be used to determine the optimal shape parameter for a one-parameter family of distributions. ppcc_max returns the shape parameter that would maximize the probability plot correlation coefficient for the given data to a one-parameter family of distributions.

Parameters:

x : array_like

Input array.

brack : tuple, optional

Triple (a,b,c) where (a<b<c). If bracket consists of two numbers (a, c) then they are assumed to be a starting interval for a downhill bracket search (see scipy.optimize.brent).

dist : str or stats.distributions instance, optional

Distribution or distribution function name. Objects that look enough like a stats.distributions instance (i.e. they have a ppf method) are also accepted. The default is 'tukeylambda'.

Returns:

shape_value : float

The shape parameter at which the probability plot correlation coefficient reaches its max value.

Notes

The brack keyword serves as a starting point which is useful in corner cases. One can use a plot to obtain a rough visual estimate of the location for the maximum to start the search near it.

References

[R626]J.J. Filliben, “The Probability Plot Correlation Coefficient Test for Normality”, Technometrics, Vol. 17, pp. 111-117, 1975.
[R627]http://www.itl.nist.gov/div898/handbook/eda/section3/ppccplot.htm

Examples

First we generate some random data from a Tukey-Lambda distribution, with shape parameter -0.7:

>>> from scipy import stats
>>> x = stats.tukeylambda.rvs(-0.7, loc=2, scale=0.5, size=10000,
...                           random_state=1234567) + 1e4

Now we explore this data with a PPCC plot as well as the related probability plot and Box-Cox normplot. A red line is drawn where we expect the PPCC value to be maximal (at the shape parameter -0.7 used above):

>>> import matplotlib.pyplot as plt
>>> fig = plt.figure(figsize=(8, 6))
>>> ax = fig.add_subplot(111)
>>> res = stats.ppcc_plot(x, -5, 5, plot=ax)

We calculate the value where the shape should reach its maximum and a red line is drawn there. The line should coincide with the highest point in the ppcc_plot.

>>> max = stats.ppcc_max(x)
>>> ax.vlines(max, 0, 1, colors='r', label='Expected shape value')
>>> plt.show()

(Source code)

../_images/scipy-stats-ppcc_max-1.png