Class TruncatedNormalDistribution

java.lang.Object
org.apache.commons.statistics.distribution.AbstractContinuousDistribution
org.apache.commons.statistics.distribution.TruncatedNormalDistribution
All Implemented Interfaces:
ContinuousDistribution

public final class TruncatedNormalDistribution extends AbstractContinuousDistribution
Implementation of the truncated normal distribution.

The probability density function of \( X \) is:

\[ f(x;\mu,\sigma,a,b) = \frac{1}{\sigma}\,\frac{\phi(\frac{x - \mu}{\sigma})}{\Phi(\frac{b - \mu}{\sigma}) - \Phi(\frac{a - \mu}{\sigma}) } \]

for \( \mu \) mean of the parent normal distribution, \( \sigma \) standard deviation of the parent normal distribution, \( -\infty \le a \lt b \le \infty \) the truncation interval, and \( x \in [a, b] \), where \( \phi \) is the probability density function of the standard normal distribution and \( \Phi \) is its cumulative distribution function.
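
As an illustration of the formula above, the following is a minimal, JDK-only sketch and not this class's implementation. The class name `TruncatedNormalPdfSketch` and its helpers are hypothetical. \( \Phi \) is approximated with the Abramowitz and Stegun formula 7.1.26 (absolute error about 1.5e-7), which is assumed to be adequate for illustration only.

```java
final class TruncatedNormalPdfSketch {
    /** Standard normal PDF, phi(z). */
    static double phi(double z) {
        return Math.exp(-0.5 * z * z) / Math.sqrt(2 * Math.PI);
    }

    /** Standard normal CDF, Phi(z), built on an approximate erf. */
    static double bigPhi(double z) {
        return 0.5 * (1 + erf(z / Math.sqrt(2)));
    }

    /** Abramowitz and Stegun 7.1.26 approximation of erf(x). */
    static double erf(double x) {
        double t = 1 / (1 + 0.3275911 * Math.abs(x));
        double poly = t * (0.254829592 + t * (-0.284496736
            + t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
        double y = 1 - poly * Math.exp(-x * x);
        return x >= 0 ? y : -y;
    }

    /** f(x; mu, sigma, a, b); returns zero outside [a, b]. */
    static double density(double x, double mu, double sigma, double a, double b) {
        if (x < a || x > b) {
            return 0;
        }
        // Probability of the parent normal inside the truncation interval.
        double z = bigPhi((b - mu) / sigma) - bigPhi((a - mu) / sigma);
        return phi((x - mu) / sigma) / (sigma * z);
    }
}
```

Dividing by the parent probability inside [a, b] rescales the parent density so that it integrates to one over the truncation interval.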

  • Nested Class Summary

    Nested classes/interfaces inherited from interface org.apache.commons.statistics.distribution.ContinuousDistribution

    ContinuousDistribution.Sampler
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    private final double
    cdfAlpha
    Stored value of parentNormal.cumulativeProbability(lower).
    private final double
    cdfDelta
    Stored value of parentNormal.probability(lower, upper).
    private final double
    logCdfDelta
    log(cdfDelta).
    private final double
    lower
    Lower bound of this distribution.
    private static final double
    MAX_X
    The max allowed value for x where (x*x) will not overflow.
    private static final double
    MIN_P
    The min allowed probability range of the parent normal distribution.
    private final NormalDistribution
    parentNormal
    Parent normal distribution.
    private static final double
    REJECTION_THRESHOLD
    The threshold to switch to a rejection sampler.
    private static final double
    ROOT_2_PI
    Normalisation constant 2 / sqrt(2 pi) = sqrt(2 / pi).
    private static final double
    ROOT_PI_2
    Normalisation constant sqrt(2 pi) / 2 = sqrt(pi / 2).
    private static final double
    ROOT2
    sqrt(2).
    private final double
    sfBeta
    Stored value of parentNormal.survivalProbability(upper).
    private final double
    upper
    Upper bound of this distribution.
  • Constructor Summary

    Constructors
    Modifier
    Constructor
    Description
    private
    TruncatedNormalDistribution(NormalDistribution parent, double z, double lower, double upper)
     
  • Method Summary

    Modifier and Type
    Method
    Description
    private static double
    clip(double x, double lower, double upper)
    Clip the value to the range [lower, upper].
    private double
    clipToRange(double x)
    Clip the value to the range [lower, upper].
    ContinuousDistribution.Sampler
    createSampler(org.apache.commons.rng.UniformRandomProvider rng)
    Creates a sampler.
    double
    cumulativeProbability(double x)
    For a random variable X whose values are distributed according to this distribution, this method returns P(X <= x).
    double
    density(double x)
    Returns the probability density function (PDF) of this distribution evaluated at the specified point x.
    double
    getMean()
    Gets the mean of this distribution.
    double
    getSupportLowerBound()
    Gets the lower bound of the support.
    double
    getSupportUpperBound()
    Gets the upper bound of the support.
    double
    getVariance()
    Gets the variance of this distribution.
    double
    inverseCumulativeProbability(double p)
    Computes the quantile function of this distribution.
    double
    inverseSurvivalProbability(double p)
    Computes the inverse survival probability function of this distribution.
    double
    logDensity(double x)
    Returns the natural logarithm of the probability density function (PDF) of this distribution evaluated at the specified point x.
    (package private) static double
    moment1(double a, double b)
    Compute the first moment (mean) of the truncated standard normal distribution.
    private static double
    moment2(double a, double b)
    Compute the second moment of the truncated standard normal distribution.
    static TruncatedNormalDistribution
    of(double mean, double sd, double lower, double upper)
    Creates a truncated normal distribution.
    double
    probability(double x0, double x1)
    For a random variable X whose values are distributed according to this distribution, this method returns P(x0 < X <= x1).
    double
    survivalProbability(double x)
    For a random variable X whose values are distributed according to this distribution, this method returns P(X > x).
    (package private) static double
    variance(double a, double b)
    Compute the variance of the truncated standard normal distribution.

    Methods inherited from class org.apache.commons.statistics.distribution.AbstractContinuousDistribution

    getMedian, isSupportConnected

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • MAX_X

      private static final double MAX_X
      The max allowed value for x where (x*x) will not overflow. This is a limit on computation of the moments of the truncated normal as some calculations assume x*x is finite. Value is sqrt(MAX_VALUE).
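
The overflow limit described above can be checked with a short, JDK-only sketch. The class name `MaxXSketch` is hypothetical, and recomputing the constant with `Math.sqrt(Double.MAX_VALUE)` is an assumption made for illustration; this page does not show how the actual field is initialised.

```java
final class MaxXSketch {
    /** Stand-in for the documented limit: sqrt(Double.MAX_VALUE). */
    static final double MAX_X = Math.sqrt(Double.MAX_VALUE);

    /** Returns true if x*x can be computed without overflowing to infinity. */
    static boolean squareIsFinite(double x) {
        return Double.isFinite(x * x);
    }
}
```

Under this assumption, squaring `MAX_X` stays finite while squaring the next larger double overflows, which is why moment calculations that form x*x need the guard.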
    • MIN_P

      private static final double MIN_P
      The min allowed probability range of the parent normal distribution. Set to 0.0. This may be too low for accurate usage. It is a signal that the truncation is invalid.
    • ROOT2

      private static final double ROOT2
      sqrt(2).
    • ROOT_2_PI

      private static final double ROOT_2_PI
      Normalisation constant 2 / sqrt(2 pi) = sqrt(2 / pi).
    • ROOT_PI_2

      private static final double ROOT_PI_2
      Normalisation constant sqrt(2 pi) / 2 = sqrt(pi / 2).
    • REJECTION_THRESHOLD

      private static final double REJECTION_THRESHOLD
      The threshold to switch to a rejection sampler. When the truncated distribution covers more than this fraction of the CDF then rejection sampling will be more efficient than inverse CDF sampling. Performance benchmarks indicate that a normalized Gaussian sampler is up to 10 times faster than inverse transform sampling using a fast random generator. See STATISTICS-55.
    • parentNormal

      private final NormalDistribution parentNormal
      Parent normal distribution.
    • lower

      private final double lower
      Lower bound of this distribution.
    • upper

      private final double upper
      Upper bound of this distribution.
    • cdfDelta

      private final double cdfDelta
      Stored value of parentNormal.probability(lower, upper). This is used to normalise the probability computations.
    • logCdfDelta

      private final double logCdfDelta
      log(cdfDelta).
    • cdfAlpha

      private final double cdfAlpha
      Stored value of parentNormal.cumulativeProbability(lower). Used to map a probability into the range of the parent normal distribution.
    • sfBeta

      private final double sfBeta
      Stored value of parentNormal.survivalProbability(upper). Used to map a probability into the range of the parent normal distribution.
  • Constructor Details

    • TruncatedNormalDistribution

      private TruncatedNormalDistribution(NormalDistribution parent, double z, double lower, double upper)
      Parameters:
      parent - Parent distribution.
      z - Probability of the parent distribution for [lower, upper].
      lower - Lower bound (inclusive) of the distribution, can be Double.NEGATIVE_INFINITY.
      upper - Upper bound (inclusive) of the distribution, can be Double.POSITIVE_INFINITY.
  • Method Details

    • of

      public static TruncatedNormalDistribution of(double mean, double sd, double lower, double upper)
      Creates a truncated normal distribution.

      Note that the mean and sd are those of the parent normal distribution, not the true mean and standard deviation of the truncated normal distribution. The lower and upper bounds define the truncation of the parent normal distribution.

      Parameters:
      mean - Mean for the parent distribution.
      sd - Standard deviation for the parent distribution.
      lower - Lower bound (inclusive) of the distribution, can be Double.NEGATIVE_INFINITY.
      upper - Upper bound (inclusive) of the distribution, can be Double.POSITIVE_INFINITY.
      Returns:
      the distribution
      Throws:
      IllegalArgumentException - if sd <= 0; if lower >= upper; or if the truncation covers no probability range in the parent distribution.
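
The three documented failure conditions can be sketched as a standalone check. This is a JDK-only illustration, not the factory's actual code: the class `TruncationCheckSketch`, the method `checkedCoverage`, and the approximate \( \Phi \) (Abramowitz and Stegun 7.1.26) are all assumptions, and the real check may retain more precision in the tails.

```java
final class TruncationCheckSketch {
    /** Abramowitz and Stegun 7.1.26 approximation of erf(x). */
    static double erf(double x) {
        double t = 1 / (1 + 0.3275911 * Math.abs(x));
        double poly = t * (0.254829592 + t * (-0.284496736
            + t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
        double y = 1 - poly * Math.exp(-x * x);
        return x >= 0 ? y : -y;
    }

    /** Standard normal CDF. */
    static double bigPhi(double z) {
        return 0.5 * (1 + erf(z / Math.sqrt(2)));
    }

    /** Returns the parent probability inside [lower, upper], or throws. */
    static double checkedCoverage(double mean, double sd, double lower, double upper) {
        if (!(sd > 0)) {
            throw new IllegalArgumentException("sd must be positive: " + sd);
        }
        if (!(lower < upper)) {
            throw new IllegalArgumentException("lower must be below upper");
        }
        double p = bigPhi((upper - mean) / sd) - bigPhi((lower - mean) / sd);
        if (!(p > 0)) {
            // The interval lies so far in a tail that no probability is left.
            throw new IllegalArgumentException("truncation covers no probability");
        }
        return p;
    }
}
```

The negated comparisons also reject NaN arguments, which a plain `sd <= 0` test would let through.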
    • density

      public double density(double x)
      Returns the probability density function (PDF) of this distribution evaluated at the specified point x. In general, the PDF is the derivative of the CDF. If the derivative does not exist at x, then an appropriate replacement should be returned, e.g. Double.POSITIVE_INFINITY, Double.NaN, or the limit inferior or limit superior of the difference quotient.
      Parameters:
      x - Point at which the PDF is evaluated.
      Returns:
      the value of the probability density function at x.
    • probability

      public double probability(double x0, double x1)
      For a random variable X whose values are distributed according to this distribution, this method returns P(x0 < X <= x1). The default implementation uses the identity P(x0 < X <= x1) = P(X <= x1) - P(X <= x0).
      Specified by:
      probability in interface ContinuousDistribution
      Overrides:
      probability in class AbstractContinuousDistribution
      Parameters:
      x0 - Lower bound (exclusive).
      x1 - Upper bound (inclusive).
      Returns:
      the probability that a random variable with this distribution takes a value between x0 and x1, excluding the lower and including the upper endpoint.
    • logDensity

      public double logDensity(double x)
      Returns the natural logarithm of the probability density function (PDF) of this distribution evaluated at the specified point x.
      Parameters:
      x - Point at which the PDF is evaluated.
      Returns:
      the logarithm of the value of the probability density function at x.
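
One reason to expose the log density separately is that it can stay finite where the density itself underflows to zero. The JDK-only sketch below illustrates this under stated assumptions: the class `LogDensitySketch` is hypothetical, \( \Phi \) is the Abramowitz and Stegun 7.1.26 approximation, and the local `logCdfDelta` only mirrors the idea behind the stored field of that name.

```java
final class LogDensitySketch {
    static final double HALF_LOG_2PI = 0.5 * Math.log(2 * Math.PI);

    /** Abramowitz and Stegun 7.1.26 approximation of erf(x). */
    static double erf(double x) {
        double t = 1 / (1 + 0.3275911 * Math.abs(x));
        double poly = t * (0.254829592 + t * (-0.284496736
            + t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
        double y = 1 - poly * Math.exp(-x * x);
        return x >= 0 ? y : -y;
    }

    /** Parent probability inside [a, b]. */
    static double cdfDelta(double mu, double sigma, double a, double b) {
        double s = sigma * Math.sqrt(2);
        return 0.5 * (erf((b - mu) / s) - erf((a - mu) / s));
    }

    /** Log density computed without exponentiating first. */
    static double logDensity(double x, double mu, double sigma, double a, double b) {
        if (x < a || x > b) {
            return Double.NEGATIVE_INFINITY;
        }
        double z = (x - mu) / sigma;
        double logCdfDelta = Math.log(cdfDelta(mu, sigma, a, b));
        return -0.5 * z * z - HALF_LOG_2PI - Math.log(sigma) - logCdfDelta;
    }

    /** Direct density, which underflows far from the parent mean. */
    static double density(double x, double mu, double sigma, double a, double b) {
        if (x < a || x > b) {
            return 0;
        }
        double z = (x - mu) / sigma;
        return Math.exp(-0.5 * z * z)
            / (Math.sqrt(2 * Math.PI) * sigma * cdfDelta(mu, sigma, a, b));
    }
}
```

For a standard normal parent truncated to [-1, 50], the direct density at x = 40 underflows to zero, while the log density remains a finite value near -800.7.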
    • cumulativeProbability

      public double cumulativeProbability(double x)
      For a random variable X whose values are distributed according to this distribution, this method returns P(X <= x). In other words, this method represents the (cumulative) distribution function (CDF) for this distribution.
      Parameters:
      x - Point at which the CDF is evaluated.
      Returns:
      the probability that a random variable with this distribution takes a value less than or equal to x.
    • survivalProbability

      public double survivalProbability(double x)
      For a random variable X whose values are distributed according to this distribution, this method returns P(X > x). In other words, this method represents the complementary cumulative distribution function.

      By default, this is defined as 1 - cumulativeProbability(x), but the specific implementation may be more accurate.

      Parameters:
      x - Point at which the survival function is evaluated.
      Returns:
      the probability that a random variable with this distribution takes a value greater than x.
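
The field descriptions for cdfAlpha, sfBeta and cdfDelta suggest how parent probabilities can be mapped onto the truncated distribution. The sketch below is one plausible JDK-only reading of that idea, not the class's actual code: `TruncatedCdfSketch` is a hypothetical name, and the parent CDF uses the Abramowitz and Stegun 7.1.26 approximation of erf.

```java
final class TruncatedCdfSketch {
    private final double mu;
    private final double sigma;
    private final double lower;
    private final double upper;
    private final double cdfAlpha; // parent P(X <= lower)
    private final double sfBeta;   // parent P(X > upper)
    private final double cdfDelta; // parent P(lower < X <= upper)

    TruncatedCdfSketch(double mu, double sigma, double lower, double upper) {
        this.mu = mu;
        this.sigma = sigma;
        this.lower = lower;
        this.upper = upper;
        cdfAlpha = parentCdf(lower);
        sfBeta = parentSf(upper);
        cdfDelta = parentCdf(upper) - cdfAlpha;
    }

    /** Abramowitz and Stegun 7.1.26 approximation of erf(x). */
    static double erf(double x) {
        double t = 1 / (1 + 0.3275911 * Math.abs(x));
        double poly = t * (0.254829592 + t * (-0.284496736
            + t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
        double y = 1 - poly * Math.exp(-x * x);
        return x >= 0 ? y : -y;
    }

    double parentCdf(double x) {
        return 0.5 * (1 + erf((x - mu) / (sigma * Math.sqrt(2))));
    }

    double parentSf(double x) {
        return 0.5 * (1 - erf((x - mu) / (sigma * Math.sqrt(2))));
    }

    /** P(X <= x) for the truncated distribution. */
    double cumulativeProbability(double x) {
        if (x <= lower) {
            return 0;
        }
        if (x >= upper) {
            return 1;
        }
        return (parentCdf(x) - cdfAlpha) / cdfDelta;
    }

    /** P(X > x) for the truncated distribution. */
    double survivalProbability(double x) {
        if (x <= lower) {
            return 1;
        }
        if (x >= upper) {
            return 0;
        }
        return (parentSf(x) - sfBeta) / cdfDelta;
    }

    /** P(x0 < X <= x1) via the CDF difference identity. */
    double probability(double x0, double x1) {
        return cumulativeProbability(x1) - cumulativeProbability(x0);
    }
}
```

Computing the survival probability from the parent survival function, rather than as one minus the CDF, is the kind of choice that can preserve accuracy in the upper tail.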
    • inverseCumulativeProbability

      public double inverseCumulativeProbability(double p)
      Computes the quantile function of this distribution. For a random variable X distributed according to this distribution, the returned value is:

      \[ x = \begin{cases} \inf \{ x \in \mathbb R : P(X \le x) \ge p\} & \text{for } 0 \lt p \le 1 \\ \inf \{ x \in \mathbb R : P(X \le x) \gt 0 \} & \text{for } p = 0 \end{cases} \]

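      The default implementation returns:

        • getSupportLowerBound() for p = 0,
        • getSupportUpperBound() for p = 1, or
        • the result of a search for a root between the lower and upper bound using cumulativeProbability(x) - p. The bounds may be bracketed for efficiency.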

      Specified by:
      inverseCumulativeProbability in interface ContinuousDistribution
      Overrides:
      inverseCumulativeProbability in class AbstractContinuousDistribution
      Parameters:
      p - Cumulative probability.
      Returns:
      the smallest p-quantile of this distribution (largest 0-quantile for p = 0).
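
The quantile definition above can be illustrated with a root search on the CDF. The sketch below uses plain bisection and assumes finite truncation bounds; it is JDK-only and is not the method this class uses. The class `QuantileBisectionSketch` is hypothetical and the CDF relies on the Abramowitz and Stegun 7.1.26 approximation of erf.

```java
final class QuantileBisectionSketch {
    /** Abramowitz and Stegun 7.1.26 approximation of erf(x). */
    static double erf(double x) {
        double t = 1 / (1 + 0.3275911 * Math.abs(x));
        double poly = t * (0.254829592 + t * (-0.284496736
            + t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
        double y = 1 - poly * Math.exp(-x * x);
        return x >= 0 ? y : -y;
    }

    /** CDF of the truncated normal on the finite interval [a, b]. */
    static double cdf(double x, double mu, double sigma, double a, double b) {
        if (x <= a) {
            return 0;
        }
        if (x >= b) {
            return 1;
        }
        double s = sigma * Math.sqrt(2);
        double pa = 0.5 * (1 + erf((a - mu) / s));
        double pb = 0.5 * (1 + erf((b - mu) / s));
        double px = 0.5 * (1 + erf((x - mu) / s));
        return (px - pa) / (pb - pa);
    }

    /** Approximates inf { x : P(X <= x) >= p } by bisection on [a, b]. */
    static double quantile(double p, double mu, double sigma, double a, double b) {
        if (!(p >= 0 && p <= 1)) {
            throw new IllegalArgumentException("p out of [0, 1]: " + p);
        }
        if (p == 0) {
            return a; // lower bound of the support
        }
        if (p == 1) {
            return b; // upper bound of the support
        }
        double lo = a;
        double hi = b;
        // Invariant: cdf(lo) < p <= cdf(hi).
        for (int i = 0; i < 200; i++) {
            double mid = lo + 0.5 * (hi - lo);
            if (cdf(mid, mu, sigma, a, b) < p) {
                lo = mid;
            } else {
                hi = mid;
            }
        }
        return hi;
    }
}
```

Returning the upper end of the bracket matches the infimum in the definition: it is the smallest point found whose CDF is at least p.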
    • inverseSurvivalProbability

      public double inverseSurvivalProbability(double p)
      Computes the inverse survival probability function of this distribution. For a random variable X distributed according to this distribution, the returned value is:

      \[ x = \begin{cases} \inf \{ x \in \mathbb R : P(X \ge x) \le p\} & \text{for } 0 \le p \lt 1 \\ \inf \{ x \in \mathbb R : P(X \ge x) \lt 1 \} & \text{for } p = 1 \end{cases} \]

      By default, this is defined as inverseCumulativeProbability(1 - p), but the specific implementation may be more accurate.

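      The default implementation returns:

        • getSupportLowerBound() for p = 1,
        • getSupportUpperBound() for p = 0, or
        • the result of a search for a root between the lower and upper bound using survivalProbability(x) - p. The bounds may be bracketed for efficiency.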

      Specified by:
      inverseSurvivalProbability in interface ContinuousDistribution
      Overrides:
      inverseSurvivalProbability in class AbstractContinuousDistribution
      Parameters:
      p - Survival probability.
      Returns:
      the smallest (1-p)-quantile of this distribution (largest 0-quantile for p = 1).
    • createSampler

      public ContinuousDistribution.Sampler createSampler(org.apache.commons.rng.UniformRandomProvider rng)
      Creates a sampler.
      Specified by:
      createSampler in interface ContinuousDistribution
      Overrides:
      createSampler in class AbstractContinuousDistribution
      Parameters:
      rng - Generator of uniformly distributed numbers.
      Returns:
      a sampler that produces random numbers according to this distribution.
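
The REJECTION_THRESHOLD field description mentions switching to rejection sampling when the truncation covers a large fraction of the parent distribution. The sketch below shows only the rejection idea, using `java.util.Random` from the JDK instead of `UniformRandomProvider`. The class `RejectionSamplerSketch` is hypothetical, and the threshold logic and the inverse transform alternative are deliberately omitted.

```java
import java.util.Random;
import java.util.function.DoubleSupplier;

final class RejectionSamplerSketch {
    /**
     * Returns a supplier that draws from the parent normal and discards
     * values outside [a, b]. The expected number of draws per sample is
     * 1 / P(a <= X <= b) under the parent distribution, so this is only
     * sensible when the interval covers a large share of the parent.
     */
    static DoubleSupplier sampler(Random rng, double mu, double sigma, double a, double b) {
        return () -> {
            while (true) {
                double x = mu + sigma * rng.nextGaussian();
                if (x >= a && x <= b) {
                    return x;
                }
            }
        };
    }
}
```

For example, a standard normal truncated to [-1, 1] keeps roughly 68% of the draws, so each sample costs about 1.5 Gaussian variates on average.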
    • getMean

      public double getMean()
      Gets the mean of this distribution.

      Represents the true mean of the truncated normal distribution rather than the parent normal distribution mean.

      For \( \mu \) mean of the parent normal distribution, \( \sigma \) standard deviation of the parent normal distribution, and \( a \lt b \) the truncation interval of the parent normal distribution, the mean is:

      \[ \mu + \frac{\phi(\alpha)-\phi(\beta)}{\Phi(\beta) - \Phi(\alpha)}\sigma \]

      where \( \alpha = \frac{a - \mu}{\sigma} \) and \( \beta = \frac{b - \mu}{\sigma} \) are the standardized bounds, \( \phi \) is the probability density function of the standard normal distribution and \( \Phi \) is its cumulative distribution function.

      Returns:
      the mean.
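
The mean formula can be evaluated directly once the bounds are standardized as (a - mu) / sigma and (b - mu) / sigma. The sketch below is a naive JDK-only evaluation and not the class's moment1 routine, which may be more careful in the tails. `TruncatedMeanSketch` is a hypothetical name and \( \Phi \) uses the Abramowitz and Stegun 7.1.26 approximation.

```java
final class TruncatedMeanSketch {
    /** Standard normal PDF. */
    static double phi(double z) {
        return Math.exp(-0.5 * z * z) / Math.sqrt(2 * Math.PI);
    }

    /** Abramowitz and Stegun 7.1.26 approximation of erf(x). */
    static double erf(double x) {
        double t = 1 / (1 + 0.3275911 * Math.abs(x));
        double poly = t * (0.254829592 + t * (-0.284496736
            + t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
        double y = 1 - poly * Math.exp(-x * x);
        return x >= 0 ? y : -y;
    }

    /** Standard normal CDF. */
    static double bigPhi(double z) {
        return 0.5 * (1 + erf(z / Math.sqrt(2)));
    }

    /** Mean of the normal(mu, sigma) truncated to [a, b]. */
    static double mean(double mu, double sigma, double a, double b) {
        double alpha = (a - mu) / sigma; // standardized lower bound
        double beta = (b - mu) / sigma;  // standardized upper bound
        double z = bigPhi(beta) - bigPhi(alpha);
        return mu + sigma * (phi(alpha) - phi(beta)) / z;
    }
}
```

With a standard normal parent truncated to [0, infinity) this reduces to the half-normal mean sqrt(2 / pi), the same value as the documented ROOT_2_PI constant.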
    • getVariance

      public double getVariance()
      Gets the variance of this distribution.

      Represents the true variance of the truncated normal distribution rather than the parent normal distribution variance.

      For \( \mu \) mean of the parent normal distribution, \( \sigma \) standard deviation of the parent normal distribution, and \( a \lt b \) the truncation interval of the parent normal distribution, the variance is:

      \[ \sigma^2 \left[1 + \frac{\alpha\phi(\alpha)-\beta\phi(\beta)}{\Phi(\beta) - \Phi(\alpha)} - \left( \frac{\phi(\alpha)-\phi(\beta)}{\Phi(\beta) - \Phi(\alpha)} \right)^2 \right] \]

      where \( \alpha = \frac{a - \mu}{\sigma} \) and \( \beta = \frac{b - \mu}{\sigma} \) are the standardized bounds, \( \phi \) is the probability density function of the standard normal distribution and \( \Phi \) is its cumulative distribution function.

      Returns:
      the variance.
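
The variance formula can be sketched in the same way, with the bounds standardized as (a - mu) / sigma and (b - mu) / sigma. This is a naive JDK-only evaluation under assumptions, not the class's variance routine: `TruncatedVarianceSketch` and the helper `zPhi` are hypothetical, and \( \Phi \) uses the Abramowitz and Stegun 7.1.26 approximation.

```java
final class TruncatedVarianceSketch {
    /** Standard normal PDF. */
    static double phi(double z) {
        return Math.exp(-0.5 * z * z) / Math.sqrt(2 * Math.PI);
    }

    /** z * phi(z), taking the limit 0 at infinite z to avoid inf * 0 = NaN. */
    static double zPhi(double z) {
        return Double.isInfinite(z) ? 0 : z * phi(z);
    }

    /** Abramowitz and Stegun 7.1.26 approximation of erf(x). */
    static double erf(double x) {
        double t = 1 / (1 + 0.3275911 * Math.abs(x));
        double poly = t * (0.254829592 + t * (-0.284496736
            + t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
        double y = 1 - poly * Math.exp(-x * x);
        return x >= 0 ? y : -y;
    }

    /** Standard normal CDF. */
    static double bigPhi(double z) {
        return 0.5 * (1 + erf(z / Math.sqrt(2)));
    }

    /** Variance of the normal(mu, sigma) truncated to [a, b]. */
    static double variance(double mu, double sigma, double a, double b) {
        double alpha = (a - mu) / sigma;
        double beta = (b - mu) / sigma;
        double z = bigPhi(beta) - bigPhi(alpha);
        double r = (phi(alpha) - phi(beta)) / z;
        return sigma * sigma * (1 + (zPhi(alpha) - zPhi(beta)) / z - r * r);
    }
}
```

The explicit handling of infinite bounds matters: evaluating beta * phi(beta) literally at an infinite bound would produce NaN instead of the limiting value of zero.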
    • getSupportLowerBound

      public double getSupportLowerBound()
      Gets the lower bound of the support. It must return the same value as inverseCumulativeProbability(0), i.e. \( \inf \{ x \in \mathbb R : P(X \le x) \gt 0 \} \).

      The lower bound of the support is equal to the lower bound parameter of the distribution.

      Returns:
      the lower bound of the support.
    • getSupportUpperBound

      public double getSupportUpperBound()
      Gets the upper bound of the support. It must return the same value as inverseCumulativeProbability(1), i.e. \( \inf \{ x \in \mathbb R : P(X \le x) = 1 \} \).

      The upper bound of the support is equal to the upper bound parameter of the distribution.

      Returns:
      the upper bound of the support.
    • clipToRange

      private double clipToRange(double x)
      Clip the value to the range [lower, upper]. This is used to handle floating-point error at the support bound.
      Parameters:
      x - Value x
      Returns:
      x clipped to the range
    • clip

      private static double clip(double x, double lower, double upper)
      Clip the value to the range [lower, upper].
      Parameters:
      x - Value x
      lower - Lower bound (inclusive)
      upper - Upper bound (inclusive)
      Returns:
      x clipped to the range
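
A clip of this kind is typically a min/max composition. The sketch below is one common JDK-only way to write it and is an assumption about the behaviour, not the class's private method; `ClipSketch` is a hypothetical name.

```java
final class ClipSketch {
    /** Clips x to [lower, upper]; assumes lower <= upper. */
    static double clip(double x, double lower, double upper) {
        // Raise to the lower bound first, then cap at the upper bound.
        return Math.min(Math.max(x, lower), upper);
    }
}
```

A value that lands one rounding step outside the support, for example just above the upper bound after a scale and shift, is pulled back onto the bound. With this formulation a NaN input propagates as NaN.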
    • moment1

      static double moment1(double a, double b)
      Compute the first moment (mean) of the truncated standard normal distribution.

      Assumes a <= b.

      Parameters:
      a - Lower bound
      b - Upper bound
      Returns:
      the first moment
    • moment2

      private static double moment2(double a, double b)
      Compute the second moment of the truncated standard normal distribution.

      Assumes a <= b.

      Parameters:
      a - Lower bound
      b - Upper bound
      Returns:
      the second moment
    • variance

      static double variance(double a, double b)
      Compute the variance of the truncated standard normal distribution.

      Assumes a <= b.

      Parameters:
      a - Lower bound
      b - Upper bound
      Returns:
      the variance