scipy.stats.fisher_exact¶
- scipy.stats.fisher_exact(table, alternative='two-sided')[source]¶
Performs a Fisher exact test on a 2x2 contingency table.
Parameters: table : array_like of ints
A 2x2 contingency table. Elements should be non-negative integers.
alternative : {‘two-sided’, ‘less’, ‘greater’}, optional
Which alternative hypothesis to the null hypothesis the test uses. Default is ‘two-sided’.
Returns: oddsratio : float
This is prior odds ratio and not a posterior estimate.
p_value : float
P-value, the probability of obtaining a distribution at least as extreme as the one that was actually observed, assuming that the null hypothesis is true.
See also
- chi2_contingency
- Chi-square test of independence of variables in a contingency table.
Notes
The calculated odds ratio is different from the one R uses. This scipy implementation returns the (more common) “unconditional Maximum Likelihood Estimate”, while R uses the “conditional Maximum Likelihood Estimate”.
For tables with large numbers, the (inexact) chi-square test implemented in the function chi2_contingency can also be used.
Examples
Say we spend a few days counting whales and sharks in the Atlantic and Indian oceans. In the Atlantic ocean we find 8 whales and 1 shark, in the Indian ocean 2 whales and 5 sharks. Then our contingency table is:
Atlantic Indian whales 8 2 sharks 1 5
We use this table to find the p-value:
>>> import scipy.stats as stats >>> oddsratio, pvalue = stats.fisher_exact([[8, 2], [1, 5]]) >>> pvalue 0.0349...
The probability that we would observe this or an even more imbalanced ratio by chance is about 3.5%. A commonly used significance level is 5%–if we adopt that, we can therefore conclude that our observed imbalance is statistically significant; whales prefer the Atlantic while sharks prefer the Indian ocean.