Class HypergeometricDistribution

java.lang.Object
org.apache.commons.statistics.distribution.AbstractDiscreteDistribution
org.apache.commons.statistics.distribution.HypergeometricDistribution
All Implemented Interfaces:
DiscreteDistribution

public final class HypergeometricDistribution extends AbstractDiscreteDistribution
Implementation of the hypergeometric distribution.

The probability mass function of \( X \) is:

\[ f(k; N, K, n) = \frac{\binom{K}{k} \binom{N - K}{n-k}}{\binom{N}{n}} \]

for \( N \in \{0, 1, 2, \dots\} \) the population size, \( K \in \{0, 1, \dots, N\} \) the number of success states, \( n \in \{0, 1, \dots, N\} \) the number of samples, \( k \in \{\max(0, n+K-N), \dots, \min(n, K)\} \) the number of successes, and

\[ \binom{a}{b} = \frac{a!}{b! \, (a-b)!} \]

is the binomial coefficient.
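
As a minimal standalone sketch (not this class's implementation), the mass function above can be evaluated exactly with `BigInteger` binomial coefficients. The names `HypergeometricPmfSketch`, `binomial`, and `pmf` are hypothetical and exist only for this illustration, as do the example parameter values.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.MathContext;

// Hypothetical illustration class; not part of Commons Statistics.
final class HypergeometricPmfSketch {

    /** Binomial coefficient C(a, b); zero when b is outside [0, a]. */
    static BigInteger binomial(int a, int b) {
        if (b < 0 || b > a) {
            return BigInteger.ZERO;
        }
        int m = Math.min(b, a - b);
        BigInteger result = BigInteger.ONE;
        for (int i = 1; i <= m; i++) {
            // After step i the value is C(a - m + i, i), so each division is exact.
            result = result.multiply(BigInteger.valueOf(a - m + i))
                           .divide(BigInteger.valueOf(i));
        }
        return result;
    }

    /** f(k; N, K, n) from the formula above; zero outside the support. */
    static double pmf(int k, int populationSize, int numberOfSuccesses, int sampleSize) {
        BigInteger numerator = binomial(numberOfSuccesses, k)
            .multiply(binomial(populationSize - numberOfSuccesses, sampleSize - k));
        BigInteger denominator = binomial(populationSize, sampleSize);
        return new BigDecimal(numerator)
            .divide(new BigDecimal(denominator), MathContext.DECIMAL64)
            .doubleValue();
    }

    public static void main(String[] args) {
        // Population of 50 with 5 successes, sample of 10, exactly 4 successes drawn.
        System.out.println(pmf(4, 50, 5, 10));
    }
}
```

Exact integer arithmetic keeps the sketch easy to check against the formula, at the cost of speed for large parameters.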

  • Nested Class Summary

    Nested classes/interfaces inherited from interface org.apache.commons.statistics.distribution.DiscreteDistribution

    DiscreteDistribution.Sampler
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    private final int
    lowerBound
    The lower bound of the support (inclusive).
    private final int
    numberOfSuccesses
    The number of successes in the population.
    private final double
    p
    Binomial probability of success (sampleSize / populationSize).
    private final int
    populationSize
    The population size.
    private final double
    q
    Binomial probability of failure ((populationSize - sampleSize) / populationSize).
    private final int
    sampleSize
    The sample size.
    private final int
    upperBound
    The upper bound of the support (inclusive).
  • Constructor Summary

    Constructors
    Modifier
    Constructor
    Description
    private
    HypergeometricDistribution(int populationSize, int numberOfSuccesses, int sampleSize)
     
  • Method Summary

    Modifier and Type
    Method
    Description
    private double
    computeLogProbability(int x)
    Compute the log probability.
    double
    cumulativeProbability(int x)
    For a random variable X whose values are distributed according to this distribution, this method returns P(X <= x).
    private static int
    getLowerDomain(int nn, int k, int n)
    Return the lowest domain value for the given hypergeometric distribution parameters.
    double
    getMean()
    Gets the mean of this distribution.
    int
    getNumberOfSuccesses()
    Gets the number of successes parameter of this distribution.
    int
    getPopulationSize()
    Gets the population size parameter of this distribution.
    int
    getSampleSize()
    Gets the sample size parameter of this distribution.
    int
    getSupportLowerBound()
    Gets the lower bound of the support.
    int
    getSupportUpperBound()
    Gets the upper bound of the support.
    private static int
    getUpperDomain(int k, int n)
    Return the highest domain value for the given hypergeometric distribution parameters.
    double
    getVariance()
    Gets the variance of this distribution.
    private double
    innerCumulativeProbability(int x0, int x1)
    For this distribution, X, this method returns P(x0 <= X <= x1).
    double
    logProbability(int x)
    For a random variable X whose values are distributed according to this distribution, this method returns log(P(X = x)), where log is the natural logarithm.
    static HypergeometricDistribution
    of(int populationSize, int numberOfSuccesses, int sampleSize)
    Creates a hypergeometric distribution.
    double
    probability(int x)
    For a random variable X whose values are distributed according to this distribution, this method returns P(X = x).
    double
    survivalProbability(int x)
    For a random variable X whose values are distributed according to this distribution, this method returns P(X > x).

    Methods inherited from class org.apache.commons.statistics.distribution.AbstractDiscreteDistribution

    createSampler, getMedian, inverseCumulativeProbability, inverseSurvivalProbability, probability

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • numberOfSuccesses

      private final int numberOfSuccesses
      The number of successes in the population.
    • populationSize

      private final int populationSize
      The population size.
    • sampleSize

      private final int sampleSize
      The sample size.
    • lowerBound

      private final int lowerBound
      The lower bound of the support (inclusive).
    • upperBound

      private final int upperBound
      The upper bound of the support (inclusive).
    • p

      private final double p
      Binomial probability of success (sampleSize / populationSize).
    • q

      private final double q
      Binomial probability of failure ((populationSize - sampleSize) / populationSize).
  • Constructor Details

    • HypergeometricDistribution

      private HypergeometricDistribution(int populationSize, int numberOfSuccesses, int sampleSize)
      Parameters:
      populationSize - Population size.
      numberOfSuccesses - Number of successes in the population.
      sampleSize - Sample size.
  • Method Details

    • of

      public static HypergeometricDistribution of(int populationSize, int numberOfSuccesses, int sampleSize)
      Creates a hypergeometric distribution.
      Parameters:
      populationSize - Population size.
      numberOfSuccesses - Number of successes in the population.
      sampleSize - Sample size.
      Returns:
      the distribution
      Throws:
      IllegalArgumentException - if numberOfSuccesses < 0, populationSize <= 0, numberOfSuccesses > populationSize, or sampleSize > populationSize.
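
      The documented preconditions can be written out as a standalone check. This is only a sketch: `HypergeometricParameterCheck`, `check`, and `isValid` are hypothetical names, the order of the checks and the messages are assumptions, and the rejection of a negative sample size is inferred from the class description (\( n \in \{0, 1, \dots, N\} \)) rather than from the list of exceptions above.

      ```java
      // Hypothetical helper mirroring the documented preconditions; not the library's code.
      final class HypergeometricParameterCheck {

          /** Throws IllegalArgumentException when the parameters are not usable. */
          static void check(int populationSize, int numberOfSuccesses, int sampleSize) {
              if (populationSize <= 0) {
                  throw new IllegalArgumentException("populationSize must be > 0: " + populationSize);
              }
              if (numberOfSuccesses < 0 || numberOfSuccesses > populationSize) {
                  throw new IllegalArgumentException(
                      "numberOfSuccesses must be in [0, populationSize]: " + numberOfSuccesses);
              }
              // Assumption: a negative sample size is rejected as well (see lead-in).
              if (sampleSize < 0 || sampleSize > populationSize) {
                  throw new IllegalArgumentException(
                      "sampleSize must be in [0, populationSize]: " + sampleSize);
              }
          }

          /** Convenience wrapper that reports validity instead of throwing. */
          static boolean isValid(int populationSize, int numberOfSuccesses, int sampleSize) {
              try {
                  check(populationSize, numberOfSuccesses, sampleSize);
                  return true;
              } catch (IllegalArgumentException e) {
                  return false;
              }
          }

          public static void main(String[] args) {
              System.out.println(isValid(50, 5, 10));
              System.out.println(isValid(10, 11, 5));
          }
      }
      ```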
    • getLowerDomain

      private static int getLowerDomain(int nn, int k, int n)
      Return the lowest domain value for the given hypergeometric distribution parameters.
      Parameters:
      nn - Population size.
      k - Number of successes in the population.
      n - Sample size.
      Returns:
      the lowest domain value of the hypergeometric distribution.
    • getUpperDomain

      private static int getUpperDomain(int k, int n)
      Return the highest domain value for the given hypergeometric distribution parameters.
      Parameters:
      k - Number of successes in the population.
      n - Sample size.
      Returns:
      the highest domain value of the hypergeometric distribution.
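
      The class description gives the support as \( \max(0, n+K-N) \) to \( \min(n, K) \). A minimal sketch of both bounds follows; `HypergeometricSupportSketch` and its method names are hypothetical, and the rearranged subtraction is an assumption made here to avoid `int` overflow, not a statement about the private methods above.

      ```java
      // Hypothetical illustration of the support bounds; not the library's code.
      final class HypergeometricSupportSketch {

          /** max(0, n + K - N), written as K - (N - n) so the intermediate value cannot overflow. */
          static int lowerBound(int populationSize, int numberOfSuccesses, int sampleSize) {
              return Math.max(0, numberOfSuccesses - (populationSize - sampleSize));
          }

          /** min(n, K). */
          static int upperBound(int numberOfSuccesses, int sampleSize) {
              return Math.min(sampleSize, numberOfSuccesses);
          }

          public static void main(String[] args) {
              // N = 10, K = 7, n = 8: at least 5 of the 8 draws must be successes.
              System.out.println(lowerBound(10, 7, 8)); // prints 5
              System.out.println(upperBound(7, 8));     // prints 7
          }
      }
      ```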
    • getPopulationSize

      public int getPopulationSize()
      Gets the population size parameter of this distribution.
      Returns:
      the population size.
    • getNumberOfSuccesses

      public int getNumberOfSuccesses()
      Gets the number of successes parameter of this distribution.
      Returns:
      the number of successes.
    • getSampleSize

      public int getSampleSize()
      Gets the sample size parameter of this distribution.
      Returns:
      the sample size.
    • probability

      public double probability(int x)
      For a random variable X whose values are distributed according to this distribution, this method returns P(X = x). In other words, this method represents the probability mass function (PMF) for the distribution.
      Parameters:
      x - Point at which the PMF is evaluated.
      Returns:
      the value of the probability mass function at x.
    • logProbability

      public double logProbability(int x)
      For a random variable X whose values are distributed according to this distribution, this method returns log(P(X = x)), where log is the natural logarithm.
      Parameters:
      x - Point at which the PMF is evaluated.
      Returns:
      the logarithm of the value of the probability mass function at x.
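
      Working in log space matters when P(X = x) is too small for a `double`. The sketch below sums logarithms of the binomial-coefficient factors directly; it is a simple illustration under that assumption and is not the algorithm used by this class. All names are hypothetical.

      ```java
      // Hypothetical illustration of a log-space PMF; not the library's algorithm.
      final class HypergeometricLogPmfSketch {

          /** log C(a, b) as a sum of logs; assumes 0 <= b <= a. */
          static double logBinomial(int a, int b) {
              int m = Math.min(b, a - b);
              double sum = 0;
              for (int i = 1; i <= m; i++) {
                  sum += Math.log((double) (a - m + i) / i);
              }
              return sum;
          }

          /** log(P(X = x)); negative infinity outside the support. */
          static double logPmf(int x, int populationSize, int numberOfSuccesses, int sampleSize) {
              int lower = Math.max(0, numberOfSuccesses - (populationSize - sampleSize));
              int upper = Math.min(sampleSize, numberOfSuccesses);
              if (x < lower || x > upper) {
                  return Double.NEGATIVE_INFINITY;
              }
              return logBinomial(numberOfSuccesses, x)
                  + logBinomial(populationSize - numberOfSuccesses, sampleSize - x)
                  - logBinomial(populationSize, sampleSize);
          }

          public static void main(String[] args) {
              // Far tail: the log value is finite although exp() of it underflows to zero.
              double logP = logPmf(0, 2000, 1000, 1000);
              System.out.println(logP);
              System.out.println(Math.exp(logP));
          }
      }
      ```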
    • computeLogProbability

      private double computeLogProbability(int x)
      Compute the log probability.
      Parameters:
      x - Value.
      Returns:
      log(P(X = x))
    • cumulativeProbability

      public double cumulativeProbability(int x)
      For a random variable X whose values are distributed according to this distribution, this method returns P(X <= x). In other words, this method represents the (cumulative) distribution function (CDF) for this distribution.
      Parameters:
      x - Point at which the CDF is evaluated.
      Returns:
      the probability that a random variable with this distribution takes a value less than or equal to x.
    • survivalProbability

      public double survivalProbability(int x)
      For a random variable X whose values are distributed according to this distribution, this method returns P(X > x). In other words, this method represents the complementary cumulative distribution function.

      By default, this is defined as 1 - cumulativeProbability(x), but the specific implementation may be more accurate.

      Parameters:
      x - Point at which the survival function is evaluated.
      Returns:
      the probability that a random variable with this distribution takes a value greater than x.
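
      One way to see why a dedicated survival function can beat 1 - cumulativeProbability(x) is to sum the upper tail directly, which avoids subtracting two nearly equal numbers. The sketch below builds the PMF over the support from the ratio of consecutive terms and then sums each tail. It is an illustration only: the class and method names are hypothetical, and the unnormalised weights may overflow for very large parameters.

      ```java
      // Hypothetical illustration of CDF and survival sums; not the library's code.
      final class HypergeometricTailSketch {

          /** PMF values over the support; index 0 corresponds to the lower bound. */
          static double[] pmfTable(int populationSize, int numberOfSuccesses, int sampleSize) {
              int lower = Math.max(0, numberOfSuccesses - (populationSize - sampleSize));
              int upper = Math.min(sampleSize, numberOfSuccesses);
              double[] weights = new double[upper - lower + 1];
              weights[0] = 1.0;
              double total = 1.0;
              for (int k = lower; k < upper; k++) {
                  // f(k + 1) / f(k) = (K - k)(n - k) / ((k + 1)(N - K - n + k + 1))
                  double ratio = ((double) (numberOfSuccesses - k) * (sampleSize - k))
                      / ((double) (k + 1) * (populationSize - numberOfSuccesses - sampleSize + k + 1));
                  weights[k - lower + 1] = weights[k - lower] * ratio;
                  total += weights[k - lower + 1];
              }
              for (int i = 0; i < weights.length; i++) {
                  weights[i] /= total;
              }
              return weights;
          }

          /** P(X <= x), summed from the lower bound. */
          static double cdf(int x, int populationSize, int numberOfSuccesses, int sampleSize) {
              int lower = Math.max(0, numberOfSuccesses - (populationSize - sampleSize));
              double[] p = pmfTable(populationSize, numberOfSuccesses, sampleSize);
              double sum = 0;
              for (int i = 0; i < p.length && lower + i <= x; i++) {
                  sum += p[i];
              }
              return sum;
          }

          /** P(X > x), summed from the upper bound downwards. */
          static double survival(int x, int populationSize, int numberOfSuccesses, int sampleSize) {
              int lower = Math.max(0, numberOfSuccesses - (populationSize - sampleSize));
              double[] p = pmfTable(populationSize, numberOfSuccesses, sampleSize);
              double sum = 0;
              for (int i = p.length - 1; i >= 0 && lower + i > x; i--) {
                  sum += p[i];
              }
              return sum;
          }

          public static void main(String[] args) {
              System.out.println(cdf(3, 50, 5, 10));
              System.out.println(survival(3, 50, 5, 10));
          }
      }
      ```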
    • innerCumulativeProbability

      private double innerCumulativeProbability(int x0, int x1)
      For this distribution, X, this method returns P(x0 <= X <= x1). The probability is computed by summing the point probabilities for the values x0, x0 + dx, x0 + 2 * dx, ..., x1, where the direction dx is determined by comparing the input bounds. Callers should pass the domain limit as x0 and the internal value as x1. This ordering results in an initial sum of terms of increasingly larger magnitude.
      Parameters:
      x0 - Inclusive domain bound.
      x1 - Inclusive internal bound.
      Returns:
      P(x0 <= X <= x1).
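
      The directional summation described above can be sketched generically. The helper below takes any point-probability function and walks from x0 toward x1 in whichever direction is required; `DirectionalSumSketch` and `innerSum` are hypothetical names, and the body is an assumption about the general shape of such a loop, not a copy of the private method.

      ```java
      import java.util.function.IntToDoubleFunction;

      // Hypothetical illustration of a directional sum; not the library's code.
      final class DirectionalSumSketch {

          /** Sums pmf over every integer between x0 and x1 inclusive, starting at x0. */
          static double innerSum(IntToDoubleFunction pmf, int x0, int x1) {
              // The step direction follows from comparing the two bounds.
              int dx = x0 < x1 ? 1 : -1;
              double sum = pmf.applyAsDouble(x0);
              int x = x0;
              while (x != x1) {
                  x += dx;
                  sum += pmf.applyAsDouble(x);
              }
              return sum;
          }

          public static void main(String[] args) {
              double[] p = {0.1, 0.2, 0.3, 0.4};
              System.out.println(innerSum(i -> p[i], 0, 2)); // lower limit upwards
              System.out.println(innerSum(i -> p[i], 3, 2)); // upper limit downwards
          }
      }
      ```

      Starting at the domain limit means the small tail terms are accumulated before the larger central ones, which tends to lose fewer low-order bits in floating point.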
    • getMean

      public double getMean()
      Gets the mean of this distribution.

      For population size \( N \), number of successes \( K \), and sample size \( n \), the mean is:

      \[ n \frac{K}{N} \]

      Returns:
      the mean.
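
      As a worked check with illustrative values \( N = 50 \), \( K = 5 \), \( n = 10 \) (chosen here only as an example):

      \[ n \frac{K}{N} = 10 \cdot \frac{5}{50} = 1 \]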
    • getVariance

      public double getVariance()
      Gets the variance of this distribution.

      For population size \( N \), number of successes \( K \), and sample size \( n \), the variance is:

      \[ n \frac{K}{N} \frac{N-K}{N} \frac{N-n}{N-1} \]

      Returns:
      the variance.
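
      As a worked check with illustrative values \( N = 50 \), \( K = 5 \), \( n = 10 \) (chosen here only as an example):

      \[ n \frac{K}{N} \frac{N-K}{N} \frac{N-n}{N-1} = 10 \cdot \frac{5}{50} \cdot \frac{45}{50} \cdot \frac{40}{49} = \frac{36}{49} \approx 0.7347 \]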
    • getSupportLowerBound

      public int getSupportLowerBound()
      Gets the lower bound of the support. This method must return the same value as inverseCumulativeProbability(0), i.e. \( \inf \{ x \in \mathbb Z : P(X \le x) \gt 0 \} \). By convention, Integer.MIN_VALUE should be substituted for negative infinity.

      For population size \( N \), number of successes \( K \), and sample size \( n \), the lower bound of the support is \( \max \{ 0, n + K - N \} \).

      Returns:
      lower bound of the support
    • getSupportUpperBound

      public int getSupportUpperBound()
      Gets the upper bound of the support. This method must return the same value as inverseCumulativeProbability(1), i.e. \( \inf \{ x \in \mathbb Z : P(X \le x) = 1 \} \). By convention, Integer.MAX_VALUE should be substituted for positive infinity.

      For number of successes \( K \), and sample size \( n \), the upper bound of the support is \( \min \{ n, K \} \).

      Returns:
      upper bound of the support