scipy.stats.kendalltau

scipy.stats.kendalltau(x, y, initial_lexsort=None, nan_policy='propagate', method='auto', variant='b')
Calculate Kendall’s tau, a correlation measure for ordinal data.
Kendall’s tau is a measure of the correspondence between two rankings. Values close to 1 indicate strong agreement, and values close to -1 indicate strong disagreement.
This implements two variants of Kendall’s tau: tau-b (the default) and tau-c (also known as Stuart’s tau-c). These differ only in how they are normalized to lie within the range -1 to 1; the hypothesis tests (their p-values) are identical. Kendall’s original tau-a is not implemented separately because both tau-b and tau-c reduce to tau-a in the absence of ties.
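To illustrate the last point, here is a minimal sketch using made-up, tie-free rankings (the data are an assumption chosen only for illustration). It computes tau-a directly from the pair counts and compares it with both variants returned by `kendalltau`.

```python
from scipy import stats

# Hypothetical tie-free rankings, chosen only for illustration.
x = [1, 2, 3, 4, 5, 6]
y = [2, 1, 4, 3, 6, 5]

tau_b, p_b = stats.kendalltau(x, y, variant='b')
tau_c, p_c = stats.kendalltau(x, y, variant='c')

# tau-a = (P - Q) / (n * (n - 1) / 2), counted over all pairs i < j.
n = len(x)
pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
P = sum((x[i] - x[j]) * (y[i] - y[j]) > 0 for i, j in pairs)  # concordant
Q = sum((x[i] - x[j]) * (y[i] - y[j]) < 0 for i, j in pairs)  # discordant
tau_a = (P - Q) / (n * (n - 1) / 2)

print(tau_a, tau_b, tau_c)  # all three should agree up to rounding
print(p_b, p_c)             # the p-values should agree as well
```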
 Parameters
 x, y : array_like
Arrays of rankings, of the same shape. If arrays are not 1D, they will be flattened to 1D.
 initial_lexsort : bool, optional
Unused (deprecated).
 nan_policy : {‘propagate’, ‘raise’, ‘omit’}, optional
Defines how to handle input that contains nan. The following options are available (default is ‘propagate’):
‘propagate’: returns nan
‘raise’: throws an error
‘omit’: performs the calculations ignoring nan values
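The three policies can be compared side by side. The following is a sketch on made-up data (an assumption for illustration). The comparison against manually filtered data assumes that ‘omit’ drops every pair in which either value is nan, and the `ValueError` caught below is the exception type I expect SciPy to raise for ‘raise’.

```python
import math
from scipy import stats

# Hypothetical data with a single nan in x.
x = [1.0, 2.0, 3.0, 4.0, float('nan'), 6.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0, 6.0]

# 'propagate' (the default): the result is nan.
tau_prop, p_prop = stats.kendalltau(x, y)
print(math.isnan(tau_prop))

# 'omit': nan values are ignored in the calculation.
tau_omit, p_omit = stats.kendalltau(x, y, nan_policy='omit')

# For comparison, drop the affected pair by hand (assumed equivalent).
clean = [(a, b) for a, b in zip(x, y) if not (math.isnan(a) or math.isnan(b))]
x_clean = [a for a, _ in clean]
y_clean = [b for _, b in clean]
tau_clean, p_clean = stats.kendalltau(x_clean, y_clean)
print(float(tau_omit), tau_clean)

# 'raise': an error is thrown when nan is present.
try:
    stats.kendalltau(x, y, nan_policy='raise')
    raised = False
except ValueError:
    raised = True
print(raised)
```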
 method : {‘auto’, ‘asymptotic’, ‘exact’}, optional
Defines which method is used to calculate the p-value [5]. The following options are available (default is ‘auto’):
‘auto’: selects the appropriate method based on a trade-off between speed and accuracy
‘asymptotic’: uses a normal approximation valid for large samples
‘exact’: computes the exact p-value, but can only be used if no ties are present. As the sample size increases, the ‘exact’ computation time may grow and the result may lose some precision.
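A short sketch of how the choice of method affects the result, using made-up tie-free data (an assumption for illustration, so that ‘exact’ is applicable). The statistic itself does not depend on the method; only the p-value does.

```python
from scipy import stats

# Hypothetical tie-free rankings, so that method='exact' can be used.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 1, 4, 3, 6, 5, 8, 7]

tau_exact, p_exact = stats.kendalltau(x, y, method='exact')
tau_asym, p_asym = stats.kendalltau(x, y, method='asymptotic')
tau_auto, p_auto = stats.kendalltau(x, y)  # method='auto' is the default

# Same statistic in every case; the p-values may differ for small samples.
print(tau_exact, tau_asym, tau_auto)
print(p_exact, p_asym, p_auto)
```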
 variant : {‘b’, ‘c’}, optional
Defines which variant of Kendall’s tau is returned. Default is ‘b’.
 Returns
 correlation : float
The tau statistic.
 pvalue : float
The two-sided p-value for a hypothesis test whose null hypothesis is an absence of association, tau = 0.
See also
spearmanr
Calculates a Spearman rank-order correlation coefficient.
theilslopes
Computes the Theil-Sen estimator for a set of points (x, y).
weightedtau
Computes a weighted version of Kendall’s tau.
Notes
The definition of Kendall’s tau that is used is [2]:
tau_b = (P - Q) / sqrt((P + Q + T) * (P + Q + U))
tau_c = 2 (P - Q) / (n**2 * (m - 1) / m)
where P is the number of concordant pairs, Q the number of discordant pairs, T the number of ties only in x, and U the number of ties only in y. If a tie occurs for the same pair in both x and y, it is not added to either T or U. n is the total number of samples, and m is the number of unique values in either x or y, whichever is smaller.
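These definitions can be checked by hand. The sketch below counts P, Q, T, and U with a plain O(n²) loop and evaluates both formulas. The helper `pair_counts` is hypothetical, written only for this illustration, and is not how SciPy computes the statistic internally.

```python
import math
from scipy import stats

def pair_counts(x, y):
    """Count concordant (P), discordant (Q), x-only ties (T), y-only ties (U)."""
    P = Q = T = U = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue      # tied in both x and y: counted in neither T nor U
            elif dx == 0:
                T += 1        # tie only in x
            elif dy == 0:
                U += 1        # tie only in y
            elif dx * dy > 0:
                P += 1        # concordant pair
            else:
                Q += 1        # discordant pair
    return P, Q, T, U

x1 = [12, 2, 1, 12, 2]
x2 = [1, 4, 7, 1, 0]

P, Q, T, U = pair_counts(x1, x2)
n = len(x1)
m = min(len(set(x1)), len(set(x2)))  # smaller number of unique values

tau_b = (P - Q) / math.sqrt((P + Q + T) * (P + Q + U))
tau_c = 2 * (P - Q) / (n**2 * (m - 1) / m)

# Compare with the library results for both variants.
scipy_b, _ = stats.kendalltau(x1, x2, variant='b')
scipy_c, _ = stats.kendalltau(x1, x2, variant='c')
print(tau_b, scipy_b)
print(tau_c, scipy_c)
```

For this data the loop finds P = 2, Q = 6, T = 1, and U = 0, so tau_b = -4 / sqrt(9 * 8).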
References
 1
Maurice G. Kendall, “A New Measure of Rank Correlation”, Biometrika Vol. 30, No. 1/2, pp. 81-93, 1938.
 2
Maurice G. Kendall, “The treatment of ties in ranking problems”, Biometrika Vol. 33, No. 3, pp. 239-251, 1945.
 3
Gottfried E. Noether, “Elements of Nonparametric Statistics”, John Wiley & Sons, 1967.
 4
Peter M. Fenwick, “A new data structure for cumulative frequency tables”, Software: Practice and Experience, Vol. 24, No. 3, pp. 327-336, 1994.
 5
Maurice G. Kendall, “Rank Correlation Methods” (4th Edition), Charles Griffin & Co., 1970.
Examples
>>> from scipy import stats
>>> x1 = [12, 2, 1, 12, 2]
>>> x2 = [1, 4, 7, 1, 0]
>>> tau, p_value = stats.kendalltau(x1, x2)
>>> tau
-0.47140452079103173
>>> p_value
0.2827454599327748