Class TruncatedNormalDistribution

  • All Implemented Interfaces:
    ContinuousDistribution

    public final class TruncatedNormalDistribution
    extends AbstractContinuousDistribution
    Implementation of the truncated normal distribution.

    The probability density function of \( X \) is:

    \[ f(x;\mu,\sigma,a,b) = \frac{1}{\sigma}\,\frac{\phi(\frac{x - \mu}{\sigma})}{\Phi(\frac{b - \mu}{\sigma}) - \Phi(\frac{a - \mu}{\sigma}) } \]

    where \( \mu \) is the mean and \( \sigma \) the standard deviation of the parent normal distribution, \( -\infty \le a \lt b \le \infty \) is the truncation interval, \( x \in [a, b] \), \( \phi \) is the probability density function of the standard normal distribution, and \( \Phi \) is its cumulative distribution function.
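
    The density formula can be sketched in plain Java as follows. This is an illustration only, not the library's implementation: the class and method names are hypothetical, and \( \Phi \) is approximated with the Abramowitz and Stegun 26.2.17 polynomial, which is far less accurate than a production-quality normal CDF.

```java
final class TruncatedNormalPdfSketch {
    /** Standard normal density phi(z). */
    static double phi(double z) {
        return Math.exp(-0.5 * z * z) / Math.sqrt(2.0 * Math.PI);
    }

    /**
     * Standard normal CDF Phi(z), using the Abramowitz and Stegun 26.2.17
     * polynomial approximation (absolute error below about 1e-7).
     */
    static double bigPhi(double z) {
        if (z < 0) {
            return 1.0 - bigPhi(-z);
        }
        double t = 1.0 / (1.0 + 0.2316419 * z);
        double poly = t * (0.319381530 + t * (-0.356563782
                + t * (1.781477937 + t * (-1.821255978 + t * 1.330274429))));
        return 1.0 - phi(z) * poly;
    }

    /** Density of N(mu, sigma^2) truncated to [a, b]; zero outside [a, b]. */
    static double density(double x, double mu, double sigma, double a, double b) {
        if (x < a || x > b) {
            return 0.0;
        }
        // Probability mass of the parent normal inside the truncation interval.
        double z = bigPhi((b - mu) / sigma) - bigPhi((a - mu) / sigma);
        return phi((x - mu) / sigma) / (sigma * z);
    }

    public static void main(String[] args) {
        // Standard normal truncated to [-1, 1], evaluated at the mode.
        System.out.println(density(0.0, 0.0, 1.0, -1.0, 1.0));
    }
}
```

    The truncated density at the mode is larger than the parent density \( \phi(0) \approx 0.399 \) because the same curve is renormalised over a smaller interval.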

    See Also:
    Truncated normal distribution (Wikipedia)
    • Field Summary

      Fields 
      Modifier and Type Field Description
      private double cdfAlpha
      Stored value of parentNormal.cumulativeProbability(lower).
      private double cdfDelta
      Stored value of parentNormal.probability(lower, upper).
      private double logCdfDelta
      log(cdfDelta).
      private double lower
      Lower bound of this distribution.
      private static double MAX_X
      The max allowed value for x where (x*x) will not overflow.
      private static double MIN_P
      The min allowed probability range of the parent normal distribution.
      private NormalDistribution parentNormal
      Parent normal distribution.
      private static double REJECTION_THRESHOLD
      The threshold to switch to a rejection sampler.
      private static double ROOT_2_PI
      Normalisation constant 2 / sqrt(2 pi) = sqrt(2 / pi).
      private static double ROOT_PI_2
      Normalisation constant sqrt(2 pi) / 2 = sqrt(pi / 2).
      private static double ROOT2
      sqrt(2).
      private double sfBeta
      Stored value of parentNormal.survivalProbability(upper).
      private double upper
      Upper bound of this distribution.
    • Method Summary

      Modifier and Type Method Description
      private static double clip​(double x, double lower, double upper)
      Clip the value to the range [lower, upper].
      private double clipToRange​(double x)
      Clip the value to the range [lower, upper].
      ContinuousDistribution.Sampler createSampler​(org.apache.commons.rng.UniformRandomProvider rng)
      Creates a sampler.
      double cumulativeProbability​(double x)
      For a random variable X whose values are distributed according to this distribution, this method returns P(X <= x).
      double density​(double x)
      Returns the probability density function (PDF) of this distribution evaluated at the specified point x.
      double getMean()
      Gets the mean of this distribution.
      double getSupportLowerBound()
      Gets the lower bound of the support.
      double getSupportUpperBound()
      Gets the upper bound of the support.
      double getVariance()
      Gets the variance of this distribution.
      double inverseCumulativeProbability​(double p)
      Computes the quantile function of this distribution.
      double inverseSurvivalProbability​(double p)
      Computes the inverse survival probability function of this distribution.
      double logDensity​(double x)
      Returns the natural logarithm of the probability density function (PDF) of this distribution evaluated at the specified point x.
      (package private) static double moment1​(double a, double b)
      Compute the first moment (mean) of the truncated standard normal distribution.
      private static double moment2​(double a, double b)
      Compute the second moment of the truncated standard normal distribution.
      static TruncatedNormalDistribution of​(double mean, double sd, double lower, double upper)
      Creates a truncated normal distribution.
      double probability​(double x0, double x1)
      For a random variable X whose values are distributed according to this distribution, this method returns P(x0 < X <= x1).
      double survivalProbability​(double x)
      For a random variable X whose values are distributed according to this distribution, this method returns P(X > x).
      (package private) static double variance​(double a, double b)
      Compute the variance of the truncated standard normal distribution.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • MAX_X

        private static final double MAX_X
        The max allowed value for x where (x*x) will not overflow. This is a limit on computation of the moments of the truncated normal as some calculations assume x*x is finite. Value is sqrt(MAX_VALUE).
        See Also:
        Constant Field Values
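
        The choice of sqrt(MAX_VALUE) as the limit can be checked directly with double arithmetic. The sketch below is an illustration with hypothetical names and does not reproduce the private constant itself.

```java
final class MaxXSketch {
    /** Returns true if x * x does not overflow to infinity. */
    static boolean squareIsFinite(double x) {
        return Double.isFinite(x * x);
    }

    public static void main(String[] args) {
        double maxX = Math.sqrt(Double.MAX_VALUE);
        // The square of sqrt(MAX_VALUE) is still representable.
        System.out.println(squareIsFinite(maxX));              // true
        // The next representable double above it overflows when squared.
        System.out.println(squareIsFinite(Math.nextUp(maxX))); // false
    }
}
```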
      • MIN_P

        private static final double MIN_P
        The min allowed probability range of the parent normal distribution. Set to 0.0. This may be too low for accurate usage. It is a signal that the truncation is invalid.
        See Also:
        Constant Field Values
      • ROOT_2_PI

        private static final double ROOT_2_PI
        Normalisation constant 2 / sqrt(2 pi) = sqrt(2 / pi).
        See Also:
        Constant Field Values
      • ROOT_PI_2

        private static final double ROOT_PI_2
        Normalisation constant sqrt(2 pi) / 2 = sqrt(pi / 2).
        See Also:
        Constant Field Values
      • REJECTION_THRESHOLD

        private static final double REJECTION_THRESHOLD
        The threshold to switch to a rejection sampler. When the truncated distribution covers more than this fraction of the CDF then rejection sampling will be more efficient than inverse CDF sampling. Performance benchmarks indicate that a normalized Gaussian sampler is up to 10 times faster than inverse transform sampling using a fast random generator. See STATISTICS-55.
        See Also:
        Constant Field Values
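
        For illustration, a rejection sampler for a truncated normal can be sketched as below. This is an assumption about the general technique, not the sampler this class creates: the names are hypothetical and java.util.Random stands in for a UniformRandomProvider. The expected number of draws per sample is the reciprocal of the fraction of the parent CDF covered by [lower, upper], which is why the approach only pays off when that fraction is large.

```java
import java.util.Random;

final class TruncatedNormalRejectionSketch {
    /** Draws from N(mean, sd^2) until the value lands inside [lower, upper]. */
    static double sample(Random rng, double mean, double sd, double lower, double upper) {
        while (true) {
            double x = mean + sd * rng.nextGaussian();
            if (x >= lower && x <= upper) {
                return x;
            }
            // Otherwise reject the draw and try again.
        }
    }

    public static void main(String[] args) {
        Random rng = new Random(42L);
        int n = 100_000;
        double sum = 0;
        for (int i = 0; i < n; i++) {
            sum += sample(rng, 0.0, 1.0, -1.0, 2.0);
        }
        System.out.println("sample mean = " + sum / n);
    }
}
```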
      • lower

        private final double lower
        Lower bound of this distribution.
      • upper

        private final double upper
        Upper bound of this distribution.
      • cdfDelta

        private final double cdfDelta
        Stored value of parentNormal.probability(lower, upper). This is used to normalise the probability computations.
      • logCdfDelta

        private final double logCdfDelta
        log(cdfDelta).
      • cdfAlpha

        private final double cdfAlpha
        Stored value of parentNormal.cumulativeProbability(lower). Used to map a probability into the range of the parent normal distribution.
      • sfBeta

        private final double sfBeta
        Stored value of parentNormal.survivalProbability(upper). Used to map a probability into the range of the parent normal distribution.
    • Constructor Detail

      • TruncatedNormalDistribution

        private TruncatedNormalDistribution​(NormalDistribution parent,
                                            double z,
                                            double lower,
                                            double upper)
        Parameters:
        parent - Parent distribution.
        z - Probability of the parent distribution for [lower, upper].
        lower - Lower bound (inclusive) of the distribution, can be Double.NEGATIVE_INFINITY.
        upper - Upper bound (inclusive) of the distribution, can be Double.POSITIVE_INFINITY.
    • Method Detail

      • of

        public static TruncatedNormalDistribution of​(double mean,
                                                     double sd,
                                                     double lower,
                                                     double upper)
        Creates a truncated normal distribution.

        Note that mean and sd are the parameters of the parent normal distribution, not the true mean and standard deviation of the truncated normal distribution. The lower and upper bounds define the truncation of the parent normal distribution.

        Parameters:
        mean - Mean for the parent distribution.
        sd - Standard deviation for the parent distribution.
        lower - Lower bound (inclusive) of the distribution, can be Double.NEGATIVE_INFINITY.
        upper - Upper bound (inclusive) of the distribution, can be Double.POSITIVE_INFINITY.
        Returns:
        the distribution
        Throws:
        java.lang.IllegalArgumentException - if sd <= 0; if lower >= upper; or if the truncation covers no probability range in the parent distribution.
      • density

        public double density​(double x)
        Returns the probability density function (PDF) of this distribution evaluated at the specified point x. In general, the PDF is the derivative of the CDF. If the derivative does not exist at x, then an appropriate replacement should be returned, e.g. Double.POSITIVE_INFINITY, Double.NaN, or the limit inferior or limit superior of the difference quotient.
        Parameters:
        x - Point at which the PDF is evaluated.
        Returns:
        the value of the probability density function at x.
      • probability

        public double probability​(double x0,
                                  double x1)
        For a random variable X whose values are distributed according to this distribution, this method returns P(x0 < X <= x1). The default implementation uses the identity P(x0 < X <= x1) = P(X <= x1) - P(X <= x0).
        Specified by:
        probability in interface ContinuousDistribution
        Overrides:
        probability in class AbstractContinuousDistribution
        Parameters:
        x0 - Lower bound (exclusive).
        x1 - Upper bound (inclusive).
        Returns:
        the probability that a random variable with this distribution takes a value between x0 and x1, excluding the lower and including the upper endpoint.
      • logDensity

        public double logDensity​(double x)
        Returns the natural logarithm of the probability density function (PDF) of this distribution evaluated at the specified point x.
        Parameters:
        x - Point at which the PDF is evaluated.
        Returns:
        the logarithm of the value of the probability density function at x.
      • cumulativeProbability

        public double cumulativeProbability​(double x)
        For a random variable X whose values are distributed according to this distribution, this method returns P(X <= x). In other words, this method represents the (cumulative) distribution function (CDF) for this distribution.
        Parameters:
        x - Point at which the CDF is evaluated.
        Returns:
        the probability that a random variable with this distribution takes a value less than or equal to x.
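
        A minimal sketch of how a truncated CDF can be obtained from the parent normal CDF is given below. The local names cdfAlpha and cdfDelta mirror the fields listed in the field summary, but how the class actually uses them is an assumption here; all method names are hypothetical, and \( \Phi \) is a low-accuracy polynomial approximation (Abramowitz and Stegun 26.2.17).

```java
final class TruncatedNormalCdfSketch {
    static double phi(double z) {
        return Math.exp(-0.5 * z * z) / Math.sqrt(2.0 * Math.PI);
    }

    /** Approximate standard normal CDF (absolute error below about 1e-7). */
    static double bigPhi(double z) {
        if (z < 0) {
            return 1.0 - bigPhi(-z);
        }
        double t = 1.0 / (1.0 + 0.2316419 * z);
        double poly = t * (0.319381530 + t * (-0.356563782
                + t * (1.781477937 + t * (-1.821255978 + t * 1.330274429))));
        return 1.0 - phi(z) * poly;
    }

    /** P(X <= x) for N(mu, sigma^2) truncated to [a, b]. */
    static double cdf(double x, double mu, double sigma, double a, double b) {
        if (x <= a) {
            return 0.0;
        }
        if (x >= b) {
            return 1.0;
        }
        double cdfAlpha = bigPhi((a - mu) / sigma);            // parent CDF at the lower bound
        double cdfDelta = bigPhi((b - mu) / sigma) - cdfAlpha; // parent mass inside [a, b]
        return (bigPhi((x - mu) / sigma) - cdfAlpha) / cdfDelta;
    }

    /** P(x0 < X <= x1) via the identity P(X <= x1) - P(X <= x0). */
    static double probability(double x0, double x1, double mu, double sigma, double a, double b) {
        return cdf(x1, mu, sigma, a, b) - cdf(x0, mu, sigma, a, b);
    }

    public static void main(String[] args) {
        System.out.println(cdf(0.0, 0.0, 1.0, -1.0, 1.0));
        System.out.println(probability(-0.5, 0.5, 0.0, 1.0, -1.0, 1.0));
    }
}
```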
      • survivalProbability

        public double survivalProbability​(double x)
        For a random variable X whose values are distributed according to this distribution, this method returns P(X > x). In other words, this method represents the complementary cumulative distribution function.

        By default, this is defined as 1 - cumulativeProbability(x), but the specific implementation may be more accurate.

        Parameters:
        x - Point at which the survival function is evaluated.
        Returns:
        the probability that a random variable with this distribution takes a value greater than x.
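
        The accuracy remark can be illustrated with a simpler stand-in. The sketch below deliberately uses a unit-rate exponential distribution, not the truncated normal, because its survival function has a closed form; the names are hypothetical. Far in the upper tail, 1 - cumulativeProbability(x) cancels to zero while a direct computation keeps the small probability.

```java
final class SurvivalPrecisionSketch {
    /** CDF of the unit-rate exponential distribution. */
    static double cdf(double x) {
        return 1.0 - Math.exp(-x);
    }

    /** Survival function computed as the complement of the CDF. */
    static double sfComplement(double x) {
        return 1.0 - cdf(x);
    }

    /** Survival function computed directly. */
    static double sfDirect(double x) {
        return Math.exp(-x);
    }

    public static void main(String[] args) {
        double x = 50.0;
        // The CDF rounds to exactly 1.0, so the complement is 0.0.
        System.out.println(sfComplement(x));
        // The direct form keeps a tiny but non-zero tail probability.
        System.out.println(sfDirect(x));
    }
}
```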
      • inverseSurvivalProbability

        public double inverseSurvivalProbability​(double p)
        Computes the inverse survival probability function of this distribution. For a random variable X distributed according to this distribution, the returned value is:

        \[ x = \begin{cases} \inf \{ x \in \mathbb R : P(X \gt x) \le p\} & \text{for } 0 \le p \lt 1 \\ \inf \{ x \in \mathbb R : P(X \gt x) \lt 1 \} & \text{for } p = 1 \end{cases} \]

        By default, this is defined as inverseCumulativeProbability(1 - p), but the specific implementation may be more accurate.

        The default implementation returns:

        Specified by:
        inverseSurvivalProbability in interface ContinuousDistribution
        Overrides:
        inverseSurvivalProbability in class AbstractContinuousDistribution
        Parameters:
        p - Survival probability.
        Returns:
        the smallest (1-p)-quantile of this distribution (largest 0-quantile for p = 1).
      • getMean

        public double getMean()
        Gets the mean of this distribution.

        Represents the true mean of the truncated normal distribution rather than the parent normal distribution mean.

        For \( \mu \) the mean and \( \sigma \) the standard deviation of the parent normal distribution, \( a \lt b \) the truncation interval, and standardised bounds \( \alpha = \frac{a - \mu}{\sigma} \) and \( \beta = \frac{b - \mu}{\sigma} \), the mean is:

        \[ \mu + \frac{\phi(\alpha)-\phi(\beta)}{\Phi(\beta) - \Phi(\alpha)}\sigma \]

        where \( \phi \) is the probability density function of the standard normal distribution and \( \Phi \) is its cumulative distribution function.

        Returns:
        the mean.
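
        The mean formula can be evaluated as in the following sketch, where the bounds are first standardised as (a - mu) / sigma and (b - mu) / sigma. This is an illustration under stated assumptions, not the library's method: names are hypothetical, \( \Phi \) is a low-accuracy polynomial approximation (Abramowitz and Stegun 26.2.17), and no care is taken over cancellation for intervals far in the tails.

```java
final class TruncatedNormalMeanSketch {
    static double phi(double z) {
        return Math.exp(-0.5 * z * z) / Math.sqrt(2.0 * Math.PI);
    }

    /** Approximate standard normal CDF (absolute error below about 1e-7). */
    static double bigPhi(double z) {
        if (z < 0) {
            return 1.0 - bigPhi(-z);
        }
        double t = 1.0 / (1.0 + 0.2316419 * z);
        double poly = t * (0.319381530 + t * (-0.356563782
                + t * (1.781477937 + t * (-1.821255978 + t * 1.330274429))));
        return 1.0 - phi(z) * poly;
    }

    /** Mean of N(mu, sigma^2) truncated to [a, b]; infinite bounds are allowed. */
    static double mean(double mu, double sigma, double a, double b) {
        double alpha = (a - mu) / sigma;
        double beta = (b - mu) / sigma;
        double z = bigPhi(beta) - bigPhi(alpha);
        return mu + sigma * (phi(alpha) - phi(beta)) / z;
    }

    public static void main(String[] args) {
        // Half-normal: standard normal truncated to [0, +inf); the mean is sqrt(2 / pi).
        System.out.println(mean(0.0, 1.0, 0.0, Double.POSITIVE_INFINITY));
        // Truncation symmetric about the parent mean leaves the mean unchanged.
        System.out.println(mean(0.5, 2.0, -1.5, 2.5));
    }
}
```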
      • getVariance

        public double getVariance()
        Gets the variance of this distribution.

        Represents the true variance of the truncated normal distribution rather than the parent normal distribution variance.

        For \( \mu \) the mean and \( \sigma \) the standard deviation of the parent normal distribution, \( a \lt b \) the truncation interval, and standardised bounds \( \alpha = \frac{a - \mu}{\sigma} \) and \( \beta = \frac{b - \mu}{\sigma} \), the variance is:

        \[ \sigma^2 \left[1 + \frac{\alpha\phi(\alpha)-\beta\phi(\beta)}{\Phi(\beta) - \Phi(\alpha)} - \left( \frac{\phi(\alpha)-\phi(\beta)}{\Phi(\beta) - \Phi(\alpha)} \right)^2 \right] \]

        where \( \phi \) is the probability density function of the standard normal distribution and \( \Phi \) is its cumulative distribution function.

        Returns:
        the variance.
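
        A direct evaluation of the variance formula is sketched below, with the bounds standardised as (a - mu) / sigma and (b - mu) / sigma. It is an illustration only: names are hypothetical, \( \Phi \) is a low-accuracy polynomial approximation (Abramowitz and Stegun 26.2.17), and the product of a bound with its density is taken as zero for an infinite bound, which is its limiting value.

```java
final class TruncatedNormalVarianceSketch {
    static double phi(double z) {
        return Math.exp(-0.5 * z * z) / Math.sqrt(2.0 * Math.PI);
    }

    /** Approximate standard normal CDF (absolute error below about 1e-7). */
    static double bigPhi(double z) {
        if (z < 0) {
            return 1.0 - bigPhi(-z);
        }
        double t = 1.0 / (1.0 + 0.2316419 * z);
        double poly = t * (0.319381530 + t * (-0.356563782
                + t * (1.781477937 + t * (-1.821255978 + t * 1.330274429))));
        return 1.0 - phi(z) * poly;
    }

    /** z * phi(z), using the limit 0 for infinite z to avoid infinity * 0 = NaN. */
    static double zPhi(double z) {
        return Double.isInfinite(z) ? 0.0 : z * phi(z);
    }

    /** Variance of N(mu, sigma^2) truncated to [a, b]; infinite bounds are allowed. */
    static double variance(double mu, double sigma, double a, double b) {
        double alpha = (a - mu) / sigma;
        double beta = (b - mu) / sigma;
        double z = bigPhi(beta) - bigPhi(alpha);
        double shift = (phi(alpha) - phi(beta)) / z;
        return sigma * sigma * (1.0 + (zPhi(alpha) - zPhi(beta)) / z - shift * shift);
    }

    public static void main(String[] args) {
        // Half-normal: standard normal truncated to [0, +inf); the variance is 1 - 2 / pi.
        System.out.println(variance(0.0, 1.0, 0.0, Double.POSITIVE_INFINITY));
        // No truncation recovers the parent variance sigma^2.
        System.out.println(variance(0.0, 3.0, Double.NEGATIVE_INFINITY, Double.POSITIVE_INFINITY));
    }
}
```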
      • getSupportLowerBound

        public double getSupportLowerBound()
        Gets the lower bound of the support. It must return the same value as inverseCumulativeProbability(0), i.e. \( \inf \{ x \in \mathbb R : P(X \le x) \gt 0 \} \).

        The lower bound of the support is equal to the lower bound parameter of the distribution.

        Returns:
        the lower bound of the support.
      • getSupportUpperBound

        public double getSupportUpperBound()
        Gets the upper bound of the support. It must return the same value as inverseCumulativeProbability(1), i.e. \( \inf \{ x \in \mathbb R : P(X \le x) = 1 \} \).

        The upper bound of the support is equal to the upper bound parameter of the distribution.

        Returns:
        the upper bound of the support.
      • clipToRange

        private double clipToRange​(double x)
        Clip the value to the range [lower, upper]. This is used to handle floating-point error at the support bound.
        Parameters:
        x - Value x
        Returns:
        x clipped to the range
      • clip

        private static double clip​(double x,
                                   double lower,
                                   double upper)
        Clip the value to the range [lower, upper].
        Parameters:
        x - Value x
        lower - Lower bound (inclusive)
        upper - Upper bound (inclusive)
        Returns:
        x clipped to the range
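
        Clipping of this kind is commonly written with Math.min and Math.max. The sketch below is an assumption about the general technique, with hypothetical names, and is not the private method documented here. It shows a value pushed marginally past the upper bound by rounding being pulled back onto the bound.

```java
final class ClipSketch {
    /** Returns x limited to [lower, upper]; assumes lower <= upper. NaN is returned unchanged. */
    static double clip(double x, double lower, double upper) {
        return Math.min(Math.max(x, lower), upper);
    }

    public static void main(String[] args) {
        // One ulp above the upper bound, as might result from rounding error.
        double x = Math.nextUp(1.0);
        System.out.println(clip(x, -1.0, 1.0));    // 1.0
        System.out.println(clip(0.25, -1.0, 1.0)); // 0.25
    }
}
```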
      • moment1

        static double moment1​(double a,
                              double b)
        Compute the first moment (mean) of the truncated standard normal distribution.

        Assumes a <= b.

        Parameters:
        a - Lower bound
        b - Upper bound
        Returns:
        the first moment
      • moment2

        private static double moment2​(double a,
                                      double b)
        Compute the second moment of the truncated standard normal distribution.

        Assumes a <= b.

        Parameters:
        a - Lower bound
        b - Upper bound
        Returns:
        the second moment
      • variance

        static double variance​(double a,
                               double b)
        Compute the variance of the truncated standard normal distribution.

        Assumes a <= b.

        Parameters:
        a - Lower bound
        b - Upper bound
        Returns:
        the variance