scipy.stats.contingency.association#

scipy.stats.contingency.association(observed, method='cramer', correction=False, lambda_=None)[source]#

Calculates degree of association between two nominal variables.

The function provides the option for computing one of three measures of association between two nominal variables from the data given in a 2d contingency table: Tschuprow’s T, Pearson’s Contingency Coefficient and Cramer’s V.

Parameters:
observedarray-like

The array of observed values

method{“cramer”, “tschuprow”, “pearson”} (default = “cramer”)

The association test statistic.

correctionbool, optional

Inherited from scipy.stats.contingency.chi2_contingency()

lambda_float or str, optional

Inherited from scipy.stats.contingency.chi2_contingency()

Returns:
statisticfloat

Value of the test statistic

Notes

Cramer’s V, Tschuprow’s T and Pearson’s Contingency Coefficient, all measure the degree to which two nominal or ordinal variables are related, or the level of their association. This differs from correlation, although many often mistakenly consider them equivalent. Correlation measures in what way two variables are related, whereas, association measures how related the variables are. As such, association does not subsume independent variables, and is rather a test of independence. A value of 1.0 indicates perfect association, and 0.0 means the variables have no association.

Both the Cramer’s V and Tschuprow’s T are extensions of the phi coefficient. Moreover, due to the close relationship between the Cramer’s V and Tschuprow’s T the returned values can often be similar or even equivalent. They are likely to diverge more as the array shape diverges from a 2x2.

References

[2]

Tschuprow, A. A. (1939) Principles of the Mathematical Theory of Correlation; translated by M. Kantorowitsch. W. Hodge & Co.

[4]

“Nominal Association: Phi and Cramer’s V”, http://www.people.vcu.edu/~pdattalo/702SuppRead/MeasAssoc/NominalAssoc.html

[5]

Gingrich, Paul, “Association Between Variables”, http://uregina.ca/~gingrich/ch11a.pdf

Examples

An example with a 4x2 contingency table:

>>> import numpy as np
>>> from scipy.stats.contingency import association
>>> obs4x2 = np.array([[100, 150], [203, 322], [420, 700], [320, 210]])

Pearson’s contingency coefficient

>>> association(obs4x2, method="pearson")
0.18303298140595667

Cramer’s V

>>> association(obs4x2, method="cramer")
0.18617813077483678

Tschuprow’s T

>>> association(obs4x2, method="tschuprow")
0.14146478765062995