scipy.stats.fisher_exact

scipy.stats.fisher_exact(table, alternative='two-sided')

Performs a Fisher exact test on a 2x2 contingency table.

Parameters :

table : array_like of ints

A 2x2 contingency table. Elements should be non-negative integers.

alternative : {‘two-sided’, ‘less’, ‘greater’}, optional

Which alternative hypothesis to the null hypothesis the test uses. Default is ‘two-sided’.
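As a minimal sketch of how the alternative parameter affects the result, the snippet below runs the test once per supported value on the whale/shark table from the Examples section. The variable names are illustrative only.

```python
from scipy import stats

table = [[8, 2], [1, 5]]

# Collect the p-value obtained under each supported alternative.
p_values = {}
for alt in ("two-sided", "less", "greater"):
    _, p = stats.fisher_exact(table, alternative=alt)
    p_values[alt] = p

for alt, p in p_values.items():
    print(f"{alt:>9}: p = {p:.4f}")
```

For this table the observed count in the top-left cell is near the top of its possible range, so 'greater' gives a small p-value and 'less' gives one close to 1.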

Returns :

oddsratio : float

The prior odds ratio, not a posterior estimate.

p_value : float

P-value, the probability of obtaining a distribution at least as extreme as the one that was actually observed, assuming that the null hypothesis is true.
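To make "at least as extreme" concrete, the following sketch recomputes the two-sided p-value by hand from the hypergeometric distribution, assuming the usual convention that a table is at least as extreme when its probability does not exceed that of the observed table. This is an illustration of the definition, not a description of the internal implementation; the small relative tolerance is an assumption added to guard against floating-point ties.

```python
import numpy as np
from scipy import stats

table = np.array([[8, 2], [1, 5]])
total = table.sum()             # all animals counted
row_total = table[0].sum()      # whales
col_total = table[:, 0].sum()   # animals seen in the Atlantic
observed = table[0, 0]          # whales seen in the Atlantic

# With the margins fixed, the top-left cell can only take these values.
lo = max(0, row_total + col_total - total)
hi = min(row_total, col_total)
support = np.arange(lo, hi + 1)

# Probability of every possible table under the null hypothesis.
pmf = stats.hypergeom.pmf(support, total, row_total, col_total)
p_obs = stats.hypergeom.pmf(observed, total, row_total, col_total)

# Sum the probabilities of all tables no more likely than the observed one.
p_manual = pmf[pmf <= p_obs * (1 + 1e-7)].sum()

_, p_scipy = stats.fisher_exact(table)
print(p_manual, p_scipy)
```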

See also

chi2_contingency
Chi-square test of independence of variables in a contingency table.

Notes

The calculated odds ratio is different from the one R uses. This implementation returns the (more common) “unconditional Maximum Likelihood Estimate”, while R uses the “conditional Maximum Likelihood Estimate”.
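For a 2x2 table [[a, b], [c, d]], the unconditional estimate is the sample odds ratio (a*d)/(b*c). The sketch below checks this against the returned value for the whale/shark table; the variable names are illustrative.

```python
from scipy import stats

table = [[8, 2], [1, 5]]
(a, b), (c, d) = table

# Sample (unconditional) odds ratio: cross-product ratio of the cells.
sample_odds_ratio = (a * d) / (b * c)

oddsratio, _ = stats.fisher_exact(table)
print(sample_odds_ratio, oddsratio)
```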

For tables with large numbers the (inexact) chi-square test implemented in the function chi2_contingency can also be used.
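As a rough illustration, the sketch below runs both tests on a table with larger counts. The counts are hypothetical (the whale/shark table multiplied by ten) and are not data from this page. Note that chi2_contingency applies Yates' continuity correction to 2x2 tables by default.

```python
from scipy import stats

# Hypothetical larger table, for illustration only.
table = [[80, 20], [10, 50]]

_, p_exact = stats.fisher_exact(table)
chi2, p_chi2, dof, expected = stats.chi2_contingency(table)

print(f"Fisher exact p-value: {p_exact:.3g}")
print(f"Chi-square p-value:   {p_chi2:.3g} (dof = {dof})")
```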

Examples

Say we spend a few days counting whales and sharks in the Atlantic and Indian oceans. In the Atlantic ocean we find 8 whales and 1 shark, in the Indian ocean 2 whales and 5 sharks. Then our contingency table is:

        Atlantic  Indian
whales     8        2
sharks     1        5

We use this table to find the p-value:

>>> from scipy import stats
>>> oddsratio, pvalue = stats.fisher_exact([[8, 2], [1, 5]])
>>> pvalue
0.0349...

The probability that we would observe this or an even more imbalanced ratio by chance is about 3.5%. A commonly used significance level is 5%. If we adopt that level, we can conclude that our observed imbalance is statistically significant: whales prefer the Atlantic while sharks prefer the Indian ocean.
