Class KolmogorovSmirnovDistribution.Two

java.lang.Object
org.apache.commons.statistics.inference.KolmogorovSmirnovDistribution.Two
Enclosing class:
KolmogorovSmirnovDistribution

static final class KolmogorovSmirnovDistribution.Two extends Object
Computes the complementary probability P[D_n >= x], or survival function (SF), for the two-sided one-sample Kolmogorov-Smirnov distribution.
 D_n = sup_x |F(x) - CDF_n(x)|
 

where n is the sample size; CDF_n(x) is an empirical cumulative distribution function; and F(x) is the expected distribution.
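
As an illustration of the statistic defined above, the following is a minimal sketch of computing D_n for a sample against a continuous reference CDF. It is not part of this class; the class name `KsStatisticSketch` and the method `statistic` are hypothetical names chosen for this example. Because the empirical CDF is a step function, the supremum is reached at a sample point, just before or just after a jump.

```java
import java.util.Arrays;
import java.util.function.DoubleUnaryOperator;

/** Hypothetical helper, not part of the library: computes D_n = sup_x |F(x) - CDF_n(x)|. */
final class KsStatisticSketch {
    private KsStatisticSketch() {}

    /**
     * @param sample Observed values (not modified).
     * @param cdf Reference distribution F(x), assumed continuous.
     * @return the two-sided one-sample statistic D_n
     */
    static double statistic(double[] sample, DoubleUnaryOperator cdf) {
        final double[] sorted = sample.clone();
        Arrays.sort(sorted);
        final int n = sorted.length;
        double d = 0;
        for (int i = 0; i < n; i++) {
            final double f = cdf.applyAsDouble(sorted[i]);
            // The empirical CDF jumps from i/n to (i+1)/n at sorted[i];
            // check the distance to F on both sides of the jump.
            final double below = f - (double) i / n;
            final double above = (double) (i + 1) / n - f;
            d = Math.max(d, Math.max(below, above));
        }
        return d;
    }

    public static void main(String[] args) {
        // Compare three points against the uniform distribution on [0, 1].
        final double[] sample = {0.1, 0.4, 0.7};
        System.out.println(statistic(sample, u -> u));
    }
}
```

The value returned by such a helper is the x argument whose survival probability P[D_n >= x] this class computes.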

References:

  1. Simard, R., & L’Ecuyer, P. (2011). Computing the Two-Sided Kolmogorov-Smirnov Distribution. Journal of Statistical Software, 39(11), 1–18.
  2. Marsaglia, G., Tsang, W. W., & Wang, J. (2003). Evaluating Kolmogorov's Distribution. Journal of Statistical Software, 8(18), 1–4.

Note that [2] contains an error in computing h; refer to MATH-437 for details.

Since:
1.1
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    private static final double
    FOUR_A
    Factor 4a in the quadratic equation to solve max k: log(2^-52) * 8.
    private static final double
    HALF
    1/2.
    private static final int
    LOG_MIN_NORMAL
    Approximate threshold for ln(MIN_NORMAL).
    private static final double
    LOG_PG_MIN
    Threshold for Pelz-Good where 1 - CDF == 1.
    private static final int
    MAX_FACTORIAL
    Maximum finite factorial.
    private static final double
    MTW_SCALE_THRESHOLD
    The scaling threshold in the MTW algorithm.
    private static final double
    MTW_UP_SCALE
    The up-scaling factor in the MTW algorithm.
    private static final int
    MTW_UP_SCALE_POWER
    The power-of-2 of the up-scaling factor in the MTW algorithm, n if the up-scale factor is 2^n.
    private static final int
    N_100000
    100000, n threshold for large n Durbin matrix sf computation.
    private static final int
    N140
    140, n threshold for small n for the sf computation.
    private static final double
    NX32_1_4
    1.4, nx^(3/2) threshold for large n Durbin matrix sf computation.
    private static final double
    NXX_0_754693
    0.754693, nxx threshold for small n Durbin matrix sf computation.
    private static final double
    NXX_2_2
    2.2, nxx threshold for large n Miller approximation sf computation.
    private static final int
    NXX_4
    4, nxx threshold for small n Pomeranz sf computation.
    private static final double
    P_DOWN_SCALE
    The scaling threshold in the Pomeranz algorithm.
    private static final int
    P_SCALE_POWER
    The power-of-2 of the up-scaling factor in the Pomeranz algorithm, n if the up-scale factor is 2^n.
    private static final double
    P_UP_SCALE
    The up-scaling factor in the Pomeranz algorithm.
    private static final double
    PI2
    pi^2.
    private static final double
    PI4
    pi^4.
    private static final double
    PI6
    pi^6.
    private static final double
    ROOT_HALF_PI
    sqrt(pi/2).
    private static final double
    ROOT_TWO_PI
    sqrt(2*pi).
  • Constructor Summary

    Constructors
    Modifier
    Constructor
    Description
    private
    Two()
    No instances.
  • Method Summary

    Modifier and Type
    Method
    Description
    (package private) static void
    computeA(int n, double t, int[] amt, int[] apt)
    Compute the factors floor(A-t) and ceil(A+t).
    private static void
    computeAP(double[] p, double z)
    Compute the power factors.
    private static SquareMatrixSupport.RealSquareMatrix
    createH(double x, int n)
    Creates H of size m x m as described in [1].
    private static double
    durbinMTW(double x, int n)
    Computes the Durbin matrix approximation for P(D_n < x) using the method of Marsaglia, Tsang and Wang (2003).
    (package private) static double
    pelzGood(double x, int n)
    Computes the Pelz-Good approximation for P(D_n >= x) as described in Simard and L’Ecuyer (2011).
    private static double
    pomeranz(double x, int n)
    Computes the Pomeranz approximation for P(D_n < x) using the method described in Simard and L’Ecuyer (2011).
    (package private) static double
    sf(double x, int n)
    Calculates complementary probability P[D_n >= x] for the two-sided one-sample Kolmogorov-Smirnov distribution.
    private static double
    sfExact(double x, int n)
    Calculates exact cases for the complementary probability P[D_n >= x] for the two-sided one-sample Kolmogorov-Smirnov distribution.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • PI2

      private static final double PI2
      pi^2.
    • PI4

      private static final double PI4
      pi^4.
    • PI6

      private static final double PI6
      pi^6.
    • ROOT_TWO_PI

      private static final double ROOT_TWO_PI
      sqrt(2*pi).
    • ROOT_HALF_PI

      private static final double ROOT_HALF_PI
      sqrt(pi/2).
    • LOG_PG_MIN

      private static final double LOG_PG_MIN
      Threshold for Pelz-Good where 1 - CDF == 1. This occurs when sqrt(2pi/z) exp(-pi^2 / (8 z^2)) is far below 2^-53. The threshold is set at exp(-pi^2 / (8 z^2)) = 2^-80.
    • FOUR_A

      private static final double FOUR_A
      Factor 4a in the quadratic equation to solve max k: log(2^-52) * 8.
    • MTW_SCALE_THRESHOLD

      private static final double MTW_SCALE_THRESHOLD
      The scaling threshold in the MTW algorithm. Marsaglia used 1e-140. This uses 2^-400 ~ 3.87e-121.
    • MTW_UP_SCALE

      private static final double MTW_UP_SCALE
      The up-scaling factor in the MTW algorithm. Marsaglia used 1e140. This uses 2^400 ~ 2.58e120.
    • MTW_UP_SCALE_POWER

      private static final int MTW_UP_SCALE_POWER
      The power-of-2 of the up-scaling factor in the MTW algorithm, n if the up-scale factor is 2^n.
    • P_DOWN_SCALE

      private static final double P_DOWN_SCALE
      The scaling threshold in the Pomeranz algorithm.
    • P_UP_SCALE

      private static final double P_UP_SCALE
      The up-scaling factor in the Pomeranz algorithm.
    • P_SCALE_POWER

      private static final int P_SCALE_POWER
      The power-of-2 of the up-scaling factor in the Pomeranz algorithm, n if the up-scale factor is 2^n.
    • MAX_FACTORIAL

      private static final int MAX_FACTORIAL
      Maximum finite factorial.
    • LOG_MIN_NORMAL

      private static final int LOG_MIN_NORMAL
      Approximate threshold for ln(MIN_NORMAL).
    • N140

      private static final int N140
      140, n threshold for small n for the sf computation.
    • NXX_0_754693

      private static final double NXX_0_754693
      0.754693, nxx threshold for small n Durbin matrix sf computation.
    • NXX_4

      private static final int NXX_4
      4, nxx threshold for small n Pomeranz sf computation.
    • NXX_2_2

      private static final double NXX_2_2
      2.2, nxx threshold for large n Miller approximation sf computation.
    • N_100000

      private static final int N_100000
      100000, n threshold for large n Durbin matrix sf computation.
    • NX32_1_4

      private static final double NX32_1_4
      1.4, nx^(3/2) threshold for large n Durbin matrix sf computation.
    • HALF

      private static final double HALF
      1/2.
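
      The MTW scaling constants above (MTW_SCALE_THRESHOLD, MTW_UP_SCALE, MTW_UP_SCALE_POWER) describe a rescaling scheme: when an intermediate value drops below 2^-400 it is multiplied by 2^400 and the power of 2 is recorded. The following is a minimal sketch of that idea applied to a plain running product. It is an assumption about how such constants are typically used, not the code of durbinMTW; the class `ScaledProductSketch` and its members are hypothetical.

```java
/** Hypothetical illustration of power-of-2 rescaling to avoid underflow. */
final class ScaledProductSketch {
    /** Power of 2 of the up-scale factor (value taken from the field description). */
    private static final int UP_SCALE_POWER = 400;
    /** Up-scale factor 2^400. */
    private static final double UP_SCALE = 0x1.0p400;
    /** Scaling threshold 2^-400. */
    private static final double SCALE_THRESHOLD = 0x1.0p-400;

    private ScaledProductSketch() {}

    /**
     * Multiplies positive values while tracking a separate power of 2.
     *
     * @param values Positive factors.
     * @return {m, power} such that the product equals m * 2^power
     */
    static double[] product(double[] values) {
        double m = 1;
        int power = 0;
        for (final double v : values) {
            m *= v;
            if (m < SCALE_THRESHOLD) {
                // Multiplying by a power of 2 is exact; record it in the exponent.
                m *= UP_SCALE;
                power -= UP_SCALE_POWER;
            }
        }
        return new double[] {m, power};
    }

    public static void main(String[] args) {
        final double[] values = new double[10];
        java.util.Arrays.fill(values, 1e-50);
        final double[] r = product(values);
        // The natural log of the product is recovered without underflow.
        System.out.println(Math.log(r[0]) + r[1] * Math.log(2));
    }
}
```

      Using a power of 2 rather than a power of 10 (Marsaglia's 1e140) means the rescaling multiplication introduces no rounding error, and the final result can be restored with a single exponent adjustment.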
  • Constructor Details

    • Two

      private Two()
      No instances.
  • Method Details

    • sf

      static double sf(double x, int n)
      Calculates complementary probability P[D_n >= x] for the two-sided one-sample Kolmogorov-Smirnov distribution.
      Parameters:
      x - Statistic.
      n - Sample size (assumed to be positive).
      Returns:
      \(P(D_n ≥ x)\)
    • sfExact

      private static double sfExact(double x, int n)
      Calculates exact cases for the complementary probability P[D_n >= x] for the two-sided one-sample Kolmogorov-Smirnov distribution.

      Exact cases handle x not in [0, 1]. It is assumed n is positive.

      Parameters:
      x - Statistic.
      n - Sample size (assumed to be positive).
      Returns:
      \(P(D_n ≥ x)\)
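
      For orientation, the closed-form cases of the two-sided distribution listed in reference [1] can be sketched as below. Whether sfExact handles exactly these branches, and how it signals that no exact case applies, is an assumption; the class `KsExactCasesSketch` and the method `exactSf` are hypothetical and return an empty OptionalDouble when no closed form applies.

```java
import java.util.OptionalDouble;

/** Hypothetical sketch of the closed-form cases of P(D_n >= x). */
final class KsExactCasesSketch {
    private KsExactCasesSketch() {}

    /**
     * @param x Statistic.
     * @param n Sample size (assumed to be positive).
     * @return P(D_n >= x) if a closed form applies, otherwise empty
     */
    static OptionalDouble exactSf(double x, int n) {
        if (2 * n * x <= 1) {
            // x <= 1/(2n): the CDF is 0.
            return OptionalDouble.of(1.0);
        }
        if (x >= 1) {
            // The statistic cannot exceed 1: the CDF is 1.
            return OptionalDouble.of(0.0);
        }
        if (n * x <= 1) {
            // 1/(2n) < x <= 1/n: CDF = n! (2x - 1/n)^n.
            final double f = 2 * x - 1.0 / n;
            double cdf = 1;
            for (int i = 1; i <= n; i++) {
                // Interleave the factorial and the power to limit overflow and underflow.
                cdf *= i * f;
            }
            return OptionalDouble.of(1 - cdf);
        }
        if (x >= 1 - 1.0 / n) {
            // 1 - 1/n <= x < 1: CDF = 1 - 2 (1 - x)^n.
            return OptionalDouble.of(2 * Math.pow(1 - x, n));
        }
        return OptionalDouble.empty();
    }

    public static void main(String[] args) {
        System.out.println(exactSf(0.75, 1));
        System.out.println(exactSf(0.3, 10));
    }
}
```

      Outside these ranges an approximation such as Durbin, Pomeranz or Pelz-Good is required, which is what the thresholds listed in the field details select between.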
    • durbinMTW

      private static double durbinMTW(double x, int n)
      Computes the Durbin matrix approximation for P(D_n < x) using the method of Marsaglia, Tsang and Wang (2003).
      Parameters:
      x - Statistic.
      n - Sample size (assumed to be positive).
      Returns:
      \(P(D_n < x)\)
    • createH

      private static SquareMatrixSupport.RealSquareMatrix createH(double x, int n)
      Creates H of size m x m as described in [1].
      Parameters:
      x - Statistic.
      n - Sample size (assumed to be positive).
      Returns:
      H matrix
    • pomeranz

      private static double pomeranz(double x, int n)
      Computes the Pomeranz approximation for P(D_n < x) using the method described in Simard and L’Ecuyer (2011).

      Modifications have been made to the scaling of the intermediate values.

      Parameters:
      x - Statistic.
      n - Sample size (assumed to be positive).
      Returns:
      \(P(D_n < x)\)
    • computeAP

      private static void computeAP(double[] p, double z)
      Compute the power factors.
       factor[j] = z^j / j!
       
      Parameters:
      p - Power factors.
      z - (A[i] - A[i-1]) / n
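
      The power factors z^j / j! can be filled with a simple recurrence, each term being the previous one multiplied by z / j. The following is a minimal sketch of that recurrence; the class `PowerFactorsSketch` and the method `fill` are hypothetical, and the indexing (p[0] = 1) is an assumption that may differ from computeAP.

```java
/** Hypothetical sketch of filling an array with z^j / j!. */
final class PowerFactorsSketch {
    private PowerFactorsSketch() {}

    /**
     * Sets p[j] = z^j / j! for every index j of the array.
     *
     * @param p Power factors (output, length at least 1).
     * @param z Base of the power.
     */
    static void fill(double[] p, double z) {
        p[0] = 1;
        for (int j = 1; j < p.length; j++) {
            // z^j / j! = (z^(j-1) / (j-1)!) * z / j
            p[j] = p[j - 1] * z / j;
        }
    }

    public static void main(String[] args) {
        final double[] p = new double[4];
        fill(p, 2.0);
        System.out.println(java.util.Arrays.toString(p));
    }
}
```

      The recurrence avoids computing z^j and j! separately, either of which could overflow or underflow long before their ratio does.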
    • computeA

      static void computeA(int n, double t, int[] amt, int[] apt)
      Compute the factors floor(A-t) and ceil(A+t). Arrays should have length 2n+3.
      Parameters:
      n - Sample size.
      t - Statistic x multiplied by n.
      amt - floor(A-t)
      apt - ceil(A+t)
    • pelzGood

      static double pelzGood(double x, int n)
      Computes the Pelz-Good approximation for P(D_n >= x) as described in Simard and L’Ecuyer (2011).

      This has been modified to compute the complementary CDF by subtracting the terms k0, k1, k2, k3 from 1. For use in computing the CDF the method should be updated to return the sum of k0 ... k3.

      Parameters:
      x - Statistic.
      n - Sample size (assumed to be positive).
      Returns:
      \(P(D_n ≥ x)\)
      Throws:
      ArithmeticException - if the series does not converge