SciPy

scipy.stats.kendalltau

scipy.stats.kendalltau(x, y, initial_lexsort=None, nan_policy='propagate')[source]

Calculates Kendall’s tau, a correlation measure for ordinal data.

Kendall’s tau is a measure of the correspondence between two rankings. Values close to 1 indicate strong agreement, values close to -1 indicate strong disagreement. This is the tau-b version of Kendall’s tau which accounts for ties.

Parameters:

x, y : array_like

Arrays of rankings, of the same shape. If arrays are not 1-D, they will be flattened to 1-D.

initial_lexsort : bool, optional

Whether to use lexsort or quicksort as the sorting method for the initial sort of the inputs. Default is lexsort (True), for which kendalltau is of complexity O(n log(n)). If False, the complexity is O(n^2), but with a smaller pre-factor (so quicksort may be faster for small arrays).

nan_policy : {‘propagate’, ‘raise’, ‘omit’}, optional

Defines how to handle when input contains nan. ‘propagate’ returns nan, ‘raise’ throws an error, ‘omit’ performs the calculations ignoring nan values. Default is ‘propagate’.

Returns:

correlation : float

The tau statistic.

pvalue : float

The two-sided p-value for a hypothesis test whose null hypothesis is an absence of association, tau = 0.

See also

spearmanr
Calculates a Spearman rank-order correlation coefficient.
theilslopes
Computes the Theil-Sen estimator for a set of points (x, y).

Notes

The definition of Kendall’s tau that is used is:

tau = (P - Q) / sqrt((P + Q + T) * (P + Q + U))

where P is the number of concordant pairs, Q the number of discordant pairs, T the number of ties only in x, and U the number of ties only in y. If a tie occurs for the same pair in both x and y, it is not added to either T or U.

References

W.R. Knight, “A Computer Method for Calculating Kendall’s Tau with Ungrouped Data”, Journal of the American Statistical Association, Vol. 61, No. 314, Part 1, pp. 436-439, 1966.

Examples

>>> from scipy import stats
>>> x1 = [12, 2, 1, 12, 2]
>>> x2 = [1, 4, 7, 1, 0]
>>> tau, p_value = stats.kendalltau(x1, x2)
>>> tau
-0.47140452079103173
>>> p_value
0.24821309157521476