# scipy.stats.fisher_exact¶

scipy.stats.fisher_exact(table, alternative='two-sided')[source]

Perform a Fisher exact test on a 2x2 contingency table.

Parameters
tablearray_like of ints

A 2x2 contingency table. Elements must be non-negative integers.

alternative{‘two-sided’, ‘less’, ‘greater’}, optional

Defines the alternative hypothesis. The following options are available (default is ‘two-sided’):

• ‘two-sided’

• ‘less’: one-sided

• ‘greater’: one-sided

See the Notes for more details.

Returns
oddsratiofloat

This is prior odds ratio and not a posterior estimate.

p_valuefloat

P-value, the probability of obtaining a distribution at least as extreme as the one that was actually observed, assuming that the null hypothesis is true.

`chi2_contingency`

Chi-square test of independence of variables in a contingency table. This can be used as an alternative to `fisher_exact` when the numbers in the table are large.

`barnard_exact`

Barnard’s exact test, which is a more powerful alternative than Fisher’s exact test for 2x2 contingency tables.

`boschloo_exact`

Boschloo’s exact test, which is a more powerful alternative than Fisher’s exact test for 2x2 contingency tables.

Notes

Null hypothesis and p-values

The null hypothesis is that the input table is from the hypergeometric distribution with parameters (as used in `hypergeom`) `M = a + b + c + d`, `n = a + b` and `N = a + c`, where the input table is `[[a, b], [c, d]]`. This distribution has support `max(0, N + n - M) <= x <= min(N, n)`, or, in terms of the values in the input table, `min(0, a - d) <= x <= a + min(b, c)`. `x` can be interpreted as the upper-left element of a 2x2 table, so the tables in the distribution have form:

```[  x           n - x     ]
[N - x    M - (n + N) + x]
```

For example, if:

```table = [6  2]
[1  4]
```

then the support is `2 <= x <= 7`, and the tables in the distribution are:

```[2 6]   [3 5]   [4 4]   [5 3]   [6 2]  [7 1]
[5 0]   [4 1]   [3 2]   [2 3]   [1 4]  [0 5]
```

The probability of each table is given by the hypergeometric distribution `hypergeom.pmf(x, M, n, N)`. For this example, these are (rounded to three significant digits):

```x       2      3      4      5       6        7
p  0.0163  0.163  0.408  0.326  0.0816  0.00466
```

These can be computed with:

```>>> from scipy.stats import hypergeom
>>> table = np.array([[6, 2], [1, 4]])
>>> M = table.sum()
>>> n = table.sum()
>>> N = table[:, 0].sum()
>>> start, end = hypergeom.support(M, n, N)
>>> hypergeom.pmf(np.arange(start, end+1), M, n, N)
array([0.01631702, 0.16317016, 0.40792541, 0.32634033, 0.08158508,
0.004662  ])
```

The two-sided p-value is the probability that, under the null hypothesis, a random table would have a probability equal to or less than the probability of the input table. For our example, the probability of the input table (where `x = 6`) is 0.0816. The x values where the probability does not exceed this are 2, 6 and 7, so the two-sided p-value is `0.0163 + 0.0816 + 0.00466 ~= 0.10256`:

```>>> from scipy.stats import fisher_exact
>>> oddsr, p = fisher_exact(table, alternative='two-sided')
>>> p
0.10256410256410257
```

The one-sided p-value for `alternative='greater'` is the probability that a random table has `x >= a`, which in our example is `x >= 6`, or `0.0816 + 0.00466 ~= 0.08626`:

```>>> oddsr, p = fisher_exact(table, alternative='greater')
>>> p
0.08624708624708627
```

This is equivalent to computing the survival function of the distribution at `x = 5` (one less than `x` from the input table, because we want to include the probability of `x = 6` in the sum):

```>>> hypergeom.sf(5, M, n, N)
0.08624708624708627
```

For `alternative='less'`, the one-sided p-value is the probability that a random table has `x <= a`, (i.e. `x <= 6` in our example), or `0.0163 + 0.163 + 0.408 + 0.326 + 0.0816 ~= 0.9949`:

```>>> oddsr, p = fisher_exact(table, alternative='less')
>>> p
0.9953379953379957
```

This is equivalent to computing the cumulative distribution function of the distribution at `x = 6`:

```>>> hypergeom.cdf(6, M, n, N)
0.9953379953379957
```

Odds ratio

The calculated odds ratio is different from the one R uses. This SciPy implementation returns the (more common) “unconditional Maximum Likelihood Estimate”, while R uses the “conditional Maximum Likelihood Estimate”.

Examples

Say we spend a few days counting whales and sharks in the Atlantic and Indian oceans. In the Atlantic ocean we find 8 whales and 1 shark, in the Indian ocean 2 whales and 5 sharks. Then our contingency table is:

```        Atlantic  Indian
whales     8        2
sharks     1        5
```

We use this table to find the p-value:

```>>> from scipy.stats import fisher_exact
>>> oddsratio, pvalue = fisher_exact([[8, 2], [1, 5]])
>>> pvalue
0.0349...
```

The probability that we would observe this or an even more imbalanced ratio by chance is about 3.5%. A commonly used significance level is 5%–if we adopt that, we can therefore conclude that our observed imbalance is statistically significant; whales prefer the Atlantic while sharks prefer the Indian ocean.