Class FisherExactTest

java.lang.Object
org.apache.commons.statistics.inference.FisherExactTest

public final class FisherExactTest extends Object
Implements Fisher's exact test.

Performs an exact test for the statistical significance of the association (contingency) between two kinds of categorical classification.

Fisher's test applies in the case that the row sums and column sums are fixed in advance and not random.

Since:
1.1
  • Constructor Details

    • FisherExactTest

      private FisherExactTest(AlternativeHypothesis alternative)
      Parameters:
      alternative - Alternative hypothesis.
  • Method Details

    • withDefaults

      public static FisherExactTest withDefaults()
      Return an instance using the default options.
      Returns:
      default instance
    • with

      public FisherExactTest with(AlternativeHypothesis v)
      Return an instance with the configured alternative hypothesis.
      Parameters:
      v - Value.
      Returns:
      an instance
    • statistic

      public double statistic(int[][] table)
      Compute the prior odds ratio for the 2-by-2 contingency table. This is the "sample" or "unconditional" maximum likelihood estimate. For a table of:

      \[ \left[ {\begin{array}{cc} a & b \\ c & d \\ \end{array} } \right] \]

      this is:

      \[ r = \frac{a d}{b c} \]

      Special cases:

      • If the denominator is zero, the value is infinity.
      • If a row or column sum is zero, the value is NaN.

      Note: This statistic is equal to the statistic computed by the SciPy function scipy.stats.fisher_exact. It differs from the conditional maximum likelihood estimate computed by the R function fisher.test.
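
      The formula and its special cases can be illustrated with a minimal, self-contained sketch. The class OddsRatioSketch and its method oddsRatio are hypothetical names used only for illustration; they are not part of this library, and the sketch omits the argument validation described under Throws. It assumes that plain IEEE 754 division is enough to reproduce the documented special cases.

```java
/** Hypothetical helper for illustration only; not part of the library. */
final class OddsRatioSketch {
    private OddsRatioSketch() {}

    /**
     * Sample odds ratio (a * d) / (b * c) for the table [[a, b], [c, d]].
     * No validation is performed: the table is assumed to be 2-by-2
     * with non-negative entries.
     */
    static double oddsRatio(int[][] table) {
        // Convert to double first so the products cannot overflow an int.
        final double a = table[0][0];
        final double b = table[0][1];
        final double c = table[1][0];
        final double d = table[1][1];
        // A zero row or column sum makes both products zero: 0 / 0 is NaN.
        // A zero denominator with a non-zero numerator gives +infinity.
        return (a * d) / (b * c);
    }
}
```

      For the table [[3, 1], [1, 3]] this sketch returns 9.0. A numerator and denominator that are both zero can only occur when some row or column sum is zero, which is why the two special cases do not conflict.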

      Parameters:
      table - 2-by-2 contingency table.
      Returns:
      odds ratio
      Throws:
      IllegalArgumentException - if the table is not a 2-by-2 table; any table entry is negative; or the sum of the table is 0 or larger than a 32-bit signed integer.
    • test

      public SignificanceResult test(int[][] table)
      Performs Fisher's exact test on the 2-by-2 contingency table.

      The test statistic is equal to the prior odds ratio. This is the "sample" or "unconditional" maximum likelihood estimate.

      The test is defined by the AlternativeHypothesis.

      For a table of [[a, b], [c, d]] the possible values of any table are conditioned with the same marginals (row and column totals). In this case the possible values x of the upper-left element a are max(0, a - d) <= x <= a + min(b, c).

      • 'two-sided': the odds ratio of the underlying population is not one; the p-value is the probability that a random table has probability equal to or less than the input table.
      • 'greater': the odds ratio of the underlying population is greater than one; the p-value is the probability that a random table has x >= a.
      • 'less': the odds ratio of the underlying population is less than one; the p-value is the probability that a random table has x <= a.
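
      The three p-value definitions above can be sketched directly from the hypergeometric probability of each possible upper-left value x, given the fixed marginals. This is a minimal illustration, not the implementation used by this class: FisherPValueSketch and all of its methods are hypothetical names, the probabilities are computed naively through logarithms of binomial coefficients, and the relative tolerance of 1e-7 used to treat near-equal probabilities as ties in the two-sided case is an assumption.

```java
/** Hypothetical helper for illustration only; not part of the library. */
final class FisherPValueSketch {
    private FisherPValueSketch() {}

    /** Smallest possible upper-left value: max(0, a - d). */
    static int lowerBound(int[][] t) {
        return Math.max(0, t[0][0] - t[1][1]);
    }

    /** Largest possible upper-left value: a + min(b, c). */
    static int upperBound(int[][] t) {
        return t[0][0] + Math.min(t[0][1], t[1][0]);
    }

    /** Natural log of the binomial coefficient C(n, k), for 0 <= k <= n. */
    static double logChoose(int n, int k) {
        double sum = 0;
        for (int i = 1; i <= k; i++) {
            sum += Math.log(n - k + i) - Math.log(i);
        }
        return sum;
    }

    /** Probability that the upper-left element equals x given the marginals. */
    static double pmf(int[][] t, int x) {
        final int row1 = t[0][0] + t[0][1];
        final int col1 = t[0][0] + t[1][0];
        final int col2 = t[0][1] + t[1][1];
        return Math.exp(logChoose(col1, x)
            + logChoose(col2, row1 - x)
            - logChoose(col1 + col2, row1));
    }

    /** 'greater': probability that a random table has x >= a. */
    static double pGreater(int[][] t) {
        double p = 0;
        for (int x = t[0][0]; x <= upperBound(t); x++) {
            p += pmf(t, x);
        }
        return Math.min(1, p);
    }

    /** 'less': probability that a random table has x <= a. */
    static double pLess(int[][] t) {
        double p = 0;
        for (int x = lowerBound(t); x <= t[0][0]; x++) {
            p += pmf(t, x);
        }
        return Math.min(1, p);
    }

    /** 'two-sided': total probability of tables no more likely than the input. */
    static double pTwoSided(int[][] t) {
        // Assumed tolerance so that floating-point ties count as equal.
        final double threshold = pmf(t, t[0][0]) * (1 + 1e-7);
        double p = 0;
        for (int x = lowerBound(t); x <= upperBound(t); x++) {
            final double px = pmf(t, x);
            if (px <= threshold) {
                p += px;
            }
        }
        return Math.min(1, p);
    }
}
```

      For the table [[3, 1], [1, 3]] the possible values are 0 <= x <= 4 with probabilities 1/70, 16/70, 36/70, 16/70 and 1/70, so the sketch gives 17/70 for 'greater', 69/70 for 'less' and 34/70 for 'two-sided'.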
      Parameters:
      table - 2-by-2 contingency table.
      Returns:
      test result
      Throws:
      IllegalArgumentException - if the table is not a 2-by-2 table; any table entry is negative; or the sum of the table is 0 or larger than a 32-bit signed integer.
    • twoSidedTest

      private static double twoSidedTest(int k, HypergeometricDistribution distribution)
      Returns the observed significance level, or p-value, associated with a two-sided test about the observed value.
      Parameters:
      k - Observed value.
      distribution - Hypergeometric distribution.
      Returns:
      p-value