pytutorial/scipy/scipy.stats.fisher_exact
David Rotermund f6497ea091
Add files via upload
Signed-off-by: David Rotermund <54365609+davrot@users.noreply.github.com>
2024-01-05 14:32:42 +01:00

Fisher Exact Test


Questions to David Rotermund

scipy.stats.fisher_exact

scipy.stats.fisher_exact(table, alternative='two-sided')

Perform a Fisher exact test on a 2x2 contingency table.

The null hypothesis is that the true odds ratio of the populations underlying the observations is one, and that the observations were sampled from these populations under one condition: the marginals of the resulting table must equal those of the observed table. The statistic returned is the unconditional maximum likelihood estimate of the odds ratio, and the p-value is the probability under the null hypothesis of obtaining a table at least as extreme as the one that was actually observed. There are other possible choices of statistic and two-sided p-value definition associated with Fisher's exact test; see the Notes section of the SciPy documentation for more information.
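
The following is a minimal sketch of how the two returned quantities relate to the table entries. It assumes the standard result that, with all marginals fixed, the top-left cell follows a hypergeometric distribution, and it only reproduces the one-sided `less` case. The variable names are illustrative, and the table values are those of the example further down in this document.

```python
from scipy.stats import fisher_exact, hypergeom

table = [[7, 17], [15, 5]]
(a, b), (c, d) = table

# Sample ("unconditional") odds ratio of the 2x2 table.
odds_ratio = (a * d) / (b * c)

# With fixed marginals, the top-left cell is hypergeometrically distributed:
# population size M, first-row total as number of "success" items,
# first-column total as number of draws.
M = a + b + c + d
n_row1 = a + b
n_col1 = a + c

# One-sided p-value for alternative="less": P(X <= a).
p_less = hypergeom.cdf(a, M, n_row1, n_col1)

res = fisher_exact(table, alternative="less")
print(odds_ratio, res.statistic)
print(p_less, res.pvalue)
```

Both printed pairs should agree up to floating-point rounding.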

Parameters:

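table : array_like of ints

A 2x2 contingency table. Elements must be non-negative integers.

alternative : {'two-sided', 'less', 'greater'}, optional

Defines the alternative hypothesis. The following options are available (default is 'two-sided'):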

  • two-sided: the odds ratio of the underlying population is not one (The two-sided p-value is the probability that, under the null hypothesis, a random table would have a probability equal to or less than the probability of the input table.)
  • less: the odds ratio of the underlying population is less than one
  • greater: the odds ratio of the underlying population is greater than one
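
As a rough illustration of how the choice of `alternative` changes the result, the sketch below runs all three options on one table. The table values are taken from the example later in this document; the dictionary `p_values` is only a helper for this illustration.

```python
from scipy.stats import fisher_exact

table = [[7, 17], [15, 5]]

# Collect the p-value for each alternative hypothesis.
p_values = {}
for alternative in ("two-sided", "less", "greater"):
    res = fisher_exact(table, alternative=alternative)
    p_values[alternative] = res.pvalue
    print(f"{alternative:>9}: p = {res.pvalue:.6f}")
```

Because the observed table counts towards both one-sided tails, the `less` and `greater` p-values add up to slightly more than one.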

Returns:

res : SignificanceResult

An object containing attributes:

statistic : float

This is the prior odds ratio, not a posterior estimate.

pvalue : float

The probability under the null hypothesis of obtaining a table at least as extreme as the one that was actually observed.
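
A short sketch of how the result object might be used in practice. The significance level `alpha = 0.05` is an assumption chosen for illustration, not something prescribed by SciPy or by this tutorial.

```python
from scipy.stats import fisher_exact

res = fisher_exact([[7, 17], [15, 5]], alternative="less")

# Attribute access, as described above.
print(res.statistic, res.pvalue)

# The result can also be unpacked like a 2-tuple.
statistic, pvalue = res

# Hypothetical significance level, chosen only for this illustration.
alpha = 0.05
reject_null = pvalue < alpha
print("reject null hypothesis:", reject_null)
```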

The input table is [[a, b], [c, d]]:

a b
c d

Here N_A = a + c is the number of elements in group A (performance values of network A, with N_A as the number of test patterns) and N_B = b + d is the number of elements in group B (performance values of network B, with N_B as the number of test patterns). Since a = N_A - c and b = N_B - d, the table can also be written as:

N_A - c   N_B - d
c d

When network architectures are compared, the same data set is typically used in both conditions, so that N = N_A = N_B:

N - c N - d
c d
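
The construction above can be sketched as a small helper. The function name `contingency_table` and the counts used below are hypothetical and made up purely for illustration; `c` and `d` are the second-row counts for network A and network B.

```python
from scipy.stats import fisher_exact


def contingency_table(n_patterns, c, d):
    """Hypothetical helper: build [[N - c, N - d], [c, d]] for a shared N."""
    if not (0 <= c <= n_patterns and 0 <= d <= n_patterns):
        raise ValueError("c and d must lie between 0 and n_patterns")
    return [[n_patterns - c, n_patterns - d], [c, d]]


# Made-up counts: both networks are evaluated on the same 1000 test patterns.
table = contingency_table(n_patterns=1000, c=40, d=25)
res = fisher_exact(table, alternative="two-sided")

print(table)
print(res.statistic, res.pvalue)
```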

Example

|     | Group A | Group B |
| --- | ------- | ------- |
| Yes | 7       | 17      |
| No  | 15      | 5       |

This translates into the table [[7, 17], [15, 5]]:

from scipy.stats import fisher_exact

res = fisher_exact([[7, 17], [15, 5]], alternative="less")
print(res.statistic) # -> 0.13725490196078433
print(res.pvalue) # -> 0.0028841933752349743