Class KolmogorovSmirnovDistribution.Two

  • Enclosing class:
    KolmogorovSmirnovDistribution

    static final class KolmogorovSmirnovDistribution.Two
    extends java.lang.Object
    Computes the complementary probability P[D_n >= x], or survival function (SF), for the two-sided one-sample Kolmogorov-Smirnov distribution.
     D_n = sup_x |F(x) - CDF_n(x)|
     

    where n is the sample size; CDF_n(x) is an empirical cumulative distribution function; and F(x) is the expected distribution.
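
As an illustration of the definition above, the statistic D_n can be evaluated at the jump points of the empirical CDF. The following is a minimal sketch, not part of this class; the class name `KsStatisticSketch` and method `statistic` are hypothetical, and F is assumed to be continuous.

```java
import java.util.Arrays;
import java.util.function.DoubleUnaryOperator;

/** Hypothetical helper (not part of the library) that evaluates D_n for a sample. */
final class KsStatisticSketch {
    private KsStatisticSketch() {}

    /**
     * Computes D_n = sup_x |F(x) - CDF_n(x)|.
     * The supremum is attained at a sample point, just before or just after the jump.
     */
    static double statistic(double[] sample, DoubleUnaryOperator cdf) {
        final double[] x = sample.clone();
        Arrays.sort(x);
        final int n = x.length;
        double d = 0;
        for (int i = 0; i < n; i++) {
            final double f = cdf.applyAsDouble(x[i]);
            // CDF_n is i/n just below x[i] and (i+1)/n at x[i].
            final double below = f - (double) i / n;
            final double above = (double) (i + 1) / n - f;
            d = Math.max(d, Math.max(below, above));
        }
        return d;
    }
}
```

For a sample {0.1, 0.4, 0.7} tested against the uniform distribution on [0, 1], the largest gap is at the last point, where CDF_n reaches 1 and F is 0.7.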

    References:

    1. Simard, R., & L’Ecuyer, P. (2011). Computing the Two-Sided Kolmogorov-Smirnov Distribution. Journal of Statistical Software, 39(11), 1–18.
    2. Marsaglia, G., Tsang, W. W., & Wang, J. (2003). Evaluating Kolmogorov's Distribution. Journal of Statistical Software, 8(18), 1–4.

    Note that [2] contains an error in computing h; refer to MATH-437 for details.

    Since:
    1.1
    • Field Summary

      Fields 
      Modifier and Type Field Description
      private static double FOUR_A
      Factor 4a in the quadratic equation to solve max k: log(2^-52) * 8.
      private static double HALF
      1/2.
      private static int LOG_MIN_NORMAL
      Approximate threshold for ln(MIN_NORMAL).
      private static double LOG_PG_MIN
      Threshold for the Pelz-Good approximation below which 1 - CDF == 1.
      private static int MAX_FACTORIAL
      Maximum finite factorial.
      private static double MTW_SCALE_THRESHOLD
      The scaling threshold in the MTW algorithm.
      private static double MTW_UP_SCALE
      The up-scaling factor in the MTW algorithm.
      private static int MTW_UP_SCALE_POWER
      The exponent of the up-scaling factor in the MTW algorithm: n when the up-scale factor is 2^n.
      private static int N_100000
      100000, n threshold for large n Durbin matrix sf computation.
      private static int N140
      140, n threshold for small n for the sf computation.
      private static double NX32_1_4
      1.4, nx^(3/2) threshold for large n Durbin matrix sf computation.
      private static double NXX_0_754693
      0.754693, nxx threshold for small n Durbin matrix sf computation.
      private static double NXX_2_2
      2.2, nxx threshold for large n Miller approximation sf computation.
      private static int NXX_4
      4, nxx threshold for small n Pomeranz sf computation.
      private static double P_DOWN_SCALE
      The scaling threshold in the Pomeranz algorithm.
      private static int P_SCALE_POWER
      The exponent of the up-scaling factor in the Pomeranz algorithm: n when the up-scale factor is 2^n.
      private static double P_UP_SCALE
      The up-scaling factor in the Pomeranz algorithm.
      private static double PI2
      pi^2.
      private static double PI4
      pi^4.
      private static double PI6
      pi^6.
      private static double ROOT_HALF_PI
      sqrt(pi/2).
      private static double ROOT_TWO_PI
      sqrt(2*pi).
    • Constructor Summary

      Constructors 
      Modifier Constructor Description
      private Two()
      No instances.
    • Method Summary

      All Methods Static Methods Concrete Methods 
      Modifier and Type Method Description
      (package private) static void computeA​(int n, double t, int[] amt, int[] apt)
      Compute the factors floor(A-t) and ceil(A+t).
      private static void computeAP​(double[] p, double z)
      Compute the power factors.
      private static SquareMatrixSupport.RealSquareMatrix createH​(double x, int n)
      Creates H of size m x m as described in [1].
      private static double durbinMTW​(double x, int n)
      Computes the Durbin matrix approximation for P(D_n < x) using the method of Marsaglia, Tsang and Wang (2003).
      (package private) static double pelzGood​(double x, int n)
      Computes the Pelz-Good approximation for P(D_n >= x) as described in Simard and L’Ecuyer (2011).
      private static double pomeranz​(double x, int n)
      Computes the Pomeranz approximation for P(D_n < x) using the method described in Simard and L’Ecuyer (2011).
      (package private) static double sf​(double x, int n)
      Calculates the complementary probability P[D_n >= x] for the two-sided one-sample Kolmogorov-Smirnov distribution.
      private static double sfExact​(double x, int n)
      Calculates exact cases for the complementary probability P[D_n >= x] for the two-sided one-sample Kolmogorov-Smirnov distribution.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • LOG_PG_MIN

        private static final double LOG_PG_MIN
        Threshold for the Pelz-Good approximation below which 1 - CDF == 1. This occurs when sqrt(2pi)/z * exp(-pi^2 / (8 z^2)) is far below 2^-53. The threshold is set at exp(-pi^2 / (8 z^2)) = 2^-80.
        See Also:
        Constant Field Values
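
The threshold described for LOG_PG_MIN can be worked through numerically. This is a small sketch under the assumption that the constant stores the natural log of 2^-80 and is compared against -pi^2 / (8 z^2); the class and method names are hypothetical and the comparison used by the library may differ.

```java
/** Hypothetical illustration of the Pelz-Good lower threshold (not the library code). */
final class PelzGoodThresholdSketch {
    /** ln(2^-80), assumed here to be the value the threshold represents. */
    static final double LOG_THRESHOLD = -80 * Math.log(2);

    private PelzGoodThresholdSketch() {}

    /** Solves exp(-pi^2 / (8 z^2)) = 2^-80 for z. */
    static double zAtThreshold() {
        return Math.PI / Math.sqrt(-8 * LOG_THRESHOLD);
    }

    /** True when the leading exponent is so small that the CDF is negligible and the SF is 1. */
    static boolean sfIsOne(double z) {
        return -Math.PI * Math.PI / (8 * z * z) < LOG_THRESHOLD;
    }
}
```

Under these assumptions the threshold corresponds to z of roughly 0.149, so any smaller z can return 1 without evaluating the series.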
      • FOUR_A

        private static final double FOUR_A
        Factor 4a in the quadratic equation to solve max k: log(2^-52) * 8.
        See Also:
        Constant Field Values
      • MTW_SCALE_THRESHOLD

        private static final double MTW_SCALE_THRESHOLD
        The scaling threshold in the MTW algorithm. Marsaglia used 1e-140. This uses 2^-400 ~ 3.87e-121.
        See Also:
        Constant Field Values
      • MTW_UP_SCALE

        private static final double MTW_UP_SCALE
        The up-scaling factor in the MTW algorithm. Marsaglia used 1e140. This uses 2^400 ~ 2.58e120.
        See Also:
        Constant Field Values
      • MTW_UP_SCALE_POWER

        private static final int MTW_UP_SCALE_POWER
        The exponent of the up-scaling factor in the MTW algorithm: n when the up-scale factor is 2^n.
        See Also:
        Constant Field Values
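
The three MTW scaling constants work together: when an intermediate value drops below the threshold it is multiplied by the up-scale factor, and the power of 2 is recorded so the true magnitude can be recovered later. The sketch below illustrates the idea on a plain running product rather than on the matrix power; the class and method names are hypothetical, and the values 2^-400, 2^400 and 400 are taken from the field descriptions above.

```java
/** Hypothetical illustration of power-of-2 rescaling (not the library code). */
final class ScaledProductSketch {
    static final double SCALE_THRESHOLD = 0x1.0p-400;
    static final double UP_SCALE = 0x1.0p400;
    static final int UP_SCALE_POWER = 400;

    private ScaledProductSketch() {}

    /**
     * Multiplies {@code factor} together {@code count} times.
     * Returns {value, exponent} where the true product is value * 2^exponent.
     */
    static double[] scaledProduct(double factor, int count) {
        double value = 1;
        int exponent = 0;
        for (int i = 0; i < count; i++) {
            value *= factor;
            if (value < SCALE_THRESHOLD) {
                // Multiplying by a power of 2 is exact, so no precision is lost.
                value *= UP_SCALE;
                exponent -= UP_SCALE_POWER;
            }
        }
        return new double[] {value, exponent};
    }
}
```

Using a power of 2 rather than Marsaglia's 1e140 means the rescaling itself introduces no rounding error. In this sketch, a product such as (1e-100)^5 underflows to zero when computed directly but keeps its full precision in scaled form.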
      • P_DOWN_SCALE

        private static final double P_DOWN_SCALE
        The scaling threshold in the Pomeranz algorithm.
        See Also:
        Constant Field Values
      • P_UP_SCALE

        private static final double P_UP_SCALE
        The up-scaling factor in the Pomeranz algorithm.
        See Also:
        Constant Field Values
      • P_SCALE_POWER

        private static final int P_SCALE_POWER
        The exponent of the up-scaling factor in the Pomeranz algorithm: n when the up-scale factor is 2^n.
        See Also:
        Constant Field Values
      • MAX_FACTORIAL

        private static final int MAX_FACTORIAL
        Maximum finite factorial.
        See Also:
        Constant Field Values
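
The largest n for which n! is finite as a double can be found by direct multiplication. This is an illustrative sketch with hypothetical names; it does not show how the library defines the constant.

```java
/** Hypothetical illustration: finds the largest n with a finite double n!. */
final class MaxFactorialSketch {
    private MaxFactorialSketch() {}

    static int maxFiniteFactorial() {
        double f = 1;
        int n = 0;
        // Stop before the next multiplication overflows to infinity.
        while (Double.isFinite(f * (n + 1))) {
            n++;
            f *= n;
        }
        return n; // 170 for IEEE 754 double precision
    }
}
```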
      • LOG_MIN_NORMAL

        private static final int LOG_MIN_NORMAL
        Approximate threshold for ln(MIN_NORMAL).
        See Also:
        Constant Field Values
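
For reference, ln(MIN_NORMAL) can be evaluated directly. The sketch below uses hypothetical names and only shows the value that the integer constant approximates; how the library rounds it is not stated here.

```java
/** Hypothetical illustration of the natural log of the smallest normal double. */
final class LogMinNormalSketch {
    private LogMinNormalSketch() {}

    /** ln(2^-1022), approximately -708.4. */
    static double logMinNormal() {
        return Math.log(Double.MIN_NORMAL);
    }
}
```

Calling exp on an argument below this value gives a sub-normal or zero result, which is why such a threshold is useful as a guard.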
      • N140

        private static final int N140
        140, n threshold for small n for the sf computation.
        See Also:
        Constant Field Values
      • NXX_0_754693

        private static final double NXX_0_754693
        0.754693, nxx threshold for small n Durbin matrix sf computation.
        See Also:
        Constant Field Values
      • NXX_4

        private static final int NXX_4
        4, nxx threshold for small n Pomeranz sf computation.
        See Also:
        Constant Field Values
      • NXX_2_2

        private static final double NXX_2_2
        2.2, nxx threshold for large n Miller approximation sf computation.
        See Also:
        Constant Field Values
      • N_100000

        private static final int N_100000
        100000, n threshold for large n Durbin matrix sf computation.
        See Also:
        Constant Field Values
      • NX32_1_4

        private static final double NX32_1_4
        1.4, nx^(3/2) threshold for large n Durbin matrix sf computation.
        See Also:
        Constant Field Values
    • Constructor Detail

      • Two

        private Two()
        No instances.
    • Method Detail

      • sf

        static double sf​(double x,
                         int n)
        Calculates the complementary probability P[D_n >= x] for the two-sided one-sample Kolmogorov-Smirnov distribution.
        Parameters:
        x - Statistic.
        n - Sample size (assumed to be positive).
        Returns:
        \(P(D_n ≥ x)\)
      • sfExact

        private static double sfExact​(double x,
                                      int n)
        Calculates exact cases for the complementary probability P[D_n >= x] for the two-sided one-sample Kolmogorov-Smirnov distribution.

        Exact cases handle x not in [0, 1]. It is assumed n is positive.

        Parameters:
        x - Statistic.
        n - Sample size (assumed to be positive).
        Returns:
        \(P(D_n ≥ x)\)
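
The closed-form cases given in Simard and L’Ecuyer (2011) can be sketched as follows. This is not the library implementation: the class name, the use of -1 as a "no exact case" marker, and the order of the branches are assumptions made for illustration.

```java
/** Hypothetical sketch of the closed-form cases for P(D_n >= x). */
final class SfExactSketch {
    private SfExactSketch() {}

    /** Returns the exact SF, or -1 when no closed form applies (assumed convention). */
    static double sfExact(double x, int n) {
        if (n * x <= 0.5) {
            // x <= 1/(2n), including x <= 0: D_n always exceeds x.
            return 1;
        }
        if (x >= 1) {
            return 0;
        }
        if (n * x <= 1) {
            // 1/(2n) < x <= 1/n: CDF = n! (2x - 1/n)^n
            final double f = 2 * x - 1.0 / n;
            double cdf = 1;
            for (int i = 1; i <= n; i++) {
                cdf *= i * f;
            }
            return 1 - cdf;
        }
        if (x >= 1 - 1.0 / n) {
            // 1 - 1/n <= x < 1: SF = 2 (1 - x)^n
            return 2 * Math.pow(1 - x, n);
        }
        return -1;
    }
}
```

For n = 1 the statistic is max(U, 1 - U) for a uniform U, so P(D_1 >= 0.75) = 0.5, which both the lower and upper closed forms reproduce.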
      • durbinMTW

        private static double durbinMTW​(double x,
                                        int n)
        Computes the Durbin matrix approximation for P(D_n < x) using the method of Marsaglia, Tsang and Wang (2003).
        Parameters:
        x - Statistic.
        n - Sample size (assumed to be positive).
        Returns:
        \(P(D_n < x)\)
      • createH

        private static SquareMatrixSupport.RealSquareMatrix createH​(double x,
                                                                    int n)
        Creates H of size m x m as described in [1].
        Parameters:
        x - Statistic.
        n - Sample size (assumed to be positive).
        Returns:
        H matrix
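
To make the Durbin matrix method concrete, the sketch below follows the C program published with Marsaglia, Tsang and Wang (2003): k = floor(nx) + 1, m = 2k - 1, h = k - nx, and P(D_n < x) = n!/n^n times the (k, k) element of H^n. It is a simplified illustration with hypothetical names, using plain arrays, naive repeated multiplication and no rescaling, so it is only suitable for small n and 0 < x < 1. It is not the implementation used by this class.

```java
/** Hypothetical small-n sketch of the Durbin matrix method (no scaling). */
final class DurbinMatrixSketch {
    private DurbinMatrixSketch() {}

    /** Builds the m x m matrix H with m = 2k - 1, k = floor(n x) + 1. */
    static double[][] createH(double x, int n) {
        final int k = (int) (n * x) + 1;
        final int m = 2 * k - 1;
        final double h = k - n * x;
        final double[][] mat = new double[m][m];
        for (int i = 0; i < m; i++) {
            for (int j = 0; j < m; j++) {
                mat[i][j] = i - j + 1 >= 0 ? 1 : 0;
            }
        }
        for (int i = 0; i < m; i++) {
            mat[i][0] -= Math.pow(h, i + 1);      // first column
            mat[m - 1][i] -= Math.pow(h, m - i);  // last row
        }
        if (2 * h - 1 > 0) {
            mat[m - 1][0] += Math.pow(2 * h - 1, m);
        }
        // Divide element (i, j) by (i - j + 1)!
        for (int i = 0; i < m; i++) {
            for (int j = 0; j < m; j++) {
                for (int g = 2; g <= i - j + 1; g++) {
                    mat[i][j] /= g;
                }
            }
        }
        return mat;
    }

    /** Computes P(D_n < x) = n!/n^n * (H^n)[k-1][k-1]. */
    static double cdf(double x, int n) {
        final double[][] mat = createH(x, n);
        final int m = mat.length;
        double[][] pow = new double[m][m];
        for (int i = 0; i < m; i++) {
            pow[i][i] = 1;
        }
        for (int r = 0; r < n; r++) {
            pow = multiply(pow, mat);
        }
        final int k = (m + 1) / 2;
        double p = pow[k - 1][k - 1];
        for (int i = 1; i <= n; i++) {
            p = p * i / n;
        }
        return p;
    }

    private static double[][] multiply(double[][] a, double[][] b) {
        final int m = a.length;
        final double[][] c = new double[m][m];
        for (int i = 0; i < m; i++) {
            for (int j = 0; j < m; j++) {
                double s = 0;
                for (int l = 0; l < m; l++) {
                    s += a[i][l] * b[l][j];
                }
                c[i][j] = s;
            }
        }
        return c;
    }
}
```

For larger n, the elements of H^n can underflow, which is the reason the class defines the MTW scaling constants; a production version would also use fast matrix exponentiation instead of n successive products.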
      • pomeranz

        private static double pomeranz​(double x,
                                       int n)
        Computes the Pomeranz approximation for P(D_n < x) using the method described in Simard and L’Ecuyer (2011).

        Modifications have been made to the scaling of the intermediate values.

        Parameters:
        x - Statistic.
        n - Sample size (assumed to be positive).
        Returns:
        \(P(D_n < x)\)
      • computeAP

        private static void computeAP​(double[] p,
                                      double z)
        Compute the power factors.
         factor[j] = z^j / j!
         
        Parameters:
        p - Power factors.
        z - (A[i] - A[i-1]) / n
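
The power factors z^j / j! can be filled in with a simple recurrence, since each term is the previous one multiplied by z / j. A minimal sketch with a hypothetical class name; the library method may differ in how it handles the array bounds.

```java
/** Hypothetical sketch of filling p[j] = z^j / j!. */
final class PowerFactorsSketch {
    private PowerFactorsSketch() {}

    static void computeAP(double[] p, double z) {
        p[0] = 1;
        for (int j = 1; j < p.length; j++) {
            // z^j / j! = (z^(j-1) / (j-1)!) * z / j
            p[j] = p[j - 1] * z / j;
        }
    }
}
```

The recurrence avoids computing z^j and j! separately, either of which could overflow or underflow before the quotient is formed.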
      • computeA

        static void computeA​(int n,
                             double t,
                             int[] amt,
                             int[] apt)
        Compute the factors floor(A-t) and ceil(A+t). Arrays should have length 2n+3.
        Parameters:
        n - Sample size.
        t - Statistic x multiplied by n.
        amt - floor(A-t)
        apt - ceil(A+t)
      • pelzGood

        static double pelzGood​(double x,
                               int n)
        Computes the Pelz-Good approximation for P(D_n >= x) as described in Simard and L’Ecuyer (2011).

        This has been modified to compute the complementary CDF by subtracting the terms k0, k1, k2, k3 from 1. For use in computing the CDF the method should be updated to return the sum of k0 ... k3.

        Parameters:
        x - Statistic.
        n - Sample size (assumed to be positive).
        Returns:
        \(P(D_n ≥ x)\)
        Throws:
        java.lang.ArithmeticException - if the series does not converge
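
As a rough illustration of the "subtract from 1" structure, the sketch below evaluates only the leading term k0, which is the limiting Kolmogorov CDF at z = sqrt(n) x, and returns 1 - k0. The correction terms k1, k2 and k3 are omitted, the series is truncated at a fixed 20 terms rather than tested for convergence, and all names are hypothetical. It is not the library implementation and is less accurate than the full approximation.

```java
/** Hypothetical sketch of the leading Pelz-Good term only (k1..k3 omitted). */
final class KolmogorovLimitSketch {
    private KolmogorovLimitSketch() {}

    /** k0(z) = sqrt(2 pi)/z * sum_{k>=1} exp(-(2k-1)^2 pi^2 / (8 z^2)). */
    static double k0(double z) {
        double sum = 0;
        // Fixed truncation: adequate for moderate z, where terms decay rapidly.
        for (int k = 1; k <= 20; k++) {
            final double t = (2 * k - 1) * Math.PI / z;
            sum += Math.exp(-t * t / 8);
        }
        return Math.sqrt(2 * Math.PI) / z * sum;
    }

    /** Leading-order approximation to P(D_n >= x). */
    static double sfLeading(double x, int n) {
        return 1 - k0(Math.sqrt(n) * x);
    }
}
```

At z = 1 the leading term gives a CDF of about 0.73, in agreement with the equivalent alternating series 1 - 2 sum (-1)^(k-1) exp(-2 k^2 z^2).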