Class KolmogorovSmirnovDistribution.One

  • Enclosing class:
    KolmogorovSmirnovDistribution

    static final class KolmogorovSmirnovDistribution.One
    extends java.lang.Object
    Computes the complementary probability P[D_n^+ >= x] for the one-sided one-sample Kolmogorov-Smirnov distribution.
     D_n^+ = sup_x {CDF_n(x) - F(x)}
     

    where n is the sample size; CDF_n(x) is an empirical cumulative distribution function; and F(x) is the expected distribution. The computation uses Smirnov's stable formula:

                       floor(n(1-x)) (n) ( j     ) (j-1)  (         j ) (n-j)
     P[D_n^+ >= x] = x     Sum       ( ) ( - + x )        ( 1 - x - - )
                           j=0       (j) ( n     )        (         n )
     

    Computing using logs is not as accurate as direct multiplication when n is large. However the terms are very large and small. Multiplication uses a scaled representation with a separate exponent term to support the extreme range. Extended precision representation of the numbers reduces the error in the power terms. Details in van Mulbregt (2018).

    References:

    1. van Mulbregt, P. (2018). Computing the Cumulative Distribution Function and Quantiles of the One-sided Kolmogorov-Smirnov Statistic arxiv:1802.06966.
    2. Magg & Dicaire (1971). On Kolmogorov-Smirnov Type One-Sample Statistics Biometrika 58.3 pp. 653–656.
    Since:
    1.1
    • Constructor Summary

      Constructors 
      Modifier Constructor Description
      private One()
      No instances.
    • Method Summary

      All Methods Static Methods Concrete Methods 
      Modifier and Type Method Description
      private static int log2​(int n)
      Returns floor(log2(n)).
      (package private) static double sf​(double x, int n)
      Calculates complementary probability P[D_n^+ >= x], or survival function (SF), for the one-sided one-sample Kolmogorov-Smirnov distribution.
      (package private) static double sf​(double x, int n, KolmogorovSmirnovDistribution.One.ScaledPower power)
      Calculates complementary probability P[D_n^+ >= x], or survival function (SF), for the one-sided one-sample Kolmogorov-Smirnov distribution.
      private static double sfAsymptotic​(double x, int n)
      Calculates complementary probability P[D_n^+ >= x], or survival function (SF), for the one-sided one-sample Kolmogorov-Smirnov distribution.
      private static double sfExact​(double x, int n)
      Calculates exact cases for the complementary probability P[D_n^+ >= x] the one-sided one-sample Kolmogorov-Smirnov distribution.
      (package private) static int splitX​(int n, double x, double[] alpha)
      Compute exactly x = (k + alpha) / n with k an integer and alpha in [0, 1).
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • VERY_LARGE_N

        private static final int VERY_LARGE_N
        "Very large" n to use a asymptotic limiting form. [1] suggests 1e12 but this is reduced to avoid excess computation time.
        See Also:
        Constant Field Values
      • SD_MAX_TERMS

        private static final int SD_MAX_TERMS
        Maximum number of term for the Smirnov-Dwass algorithm.
        See Also:
        Constant Field Values
      • SD_MIN_N

        private static final int SD_MIN_N
        Minimum sample size for the Smirnov-Dwass algorithm.
        See Also:
        Constant Field Values
      • SUM_PRECISION_BITS

        private static final int SUM_PRECISION_BITS
        Number of bits of precision in the sum of terms Aj. This does not have to be the full 106 bits of a double-double as the final result is used as a double. The terms are represented as fractions with an exponent:
          Aj = 2^b * f
          f of sum(A) in [0.5, 1)
          f of Aj in [0.25, 2]
         

        The terms can be added if their exponents overlap. The bits of precision must account for the extra range of the fractional part of Aj by 1 bit. Note that additional bits are added to this dynamically based on the number of terms.

        See Also:
        Constant Field Values
      • SD_SUM_PRECISION_BITS

        private static final int SD_SUM_PRECISION_BITS
        Number of bits of precision in the sum of terms Aj. For Smirnov-Dwass we use the full 106 bits of a double-double due to the summation of terms that cancel. Account for the extra range of the fractional part of Aj by 1 bit.
        See Also:
        Constant Field Values
    • Constructor Detail

      • One

        private One()
        No instances.
    • Method Detail

      • sf

        static double sf​(double x,
                         int n)
        Calculates complementary probability P[D_n^+ >= x], or survival function (SF), for the one-sided one-sample Kolmogorov-Smirnov distribution.
        Parameters:
        x - Statistic.
        n - Sample size (assumed to be positive).
        Returns:
        \(P(D_n^+ ≥ x)\)
      • sfExact

        private static double sfExact​(double x,
                                      int n)
        Calculates exact cases for the complementary probability P[D_n^+ >= x] the one-sided one-sample Kolmogorov-Smirnov distribution.

        Exact cases handle x not in [0, 1]. It is assumed n is positive.

        Parameters:
        x - Statistic.
        n - Sample size (assumed to be positive).
        Returns:
        \(P(D_n^+ ≥ x)\)
      • sfAsymptotic

        private static double sfAsymptotic​(double x,
                                           int n)
        Calculates complementary probability P[D_n^+ >= x], or survival function (SF), for the one-sided one-sample Kolmogorov-Smirnov distribution.

        Computes the result using the asymptotic formula Eq 5 in [1].

        Parameters:
        x - Statistic.
        n - Sample size (assumed to be positive).
        Returns:
        \(P(D_n^+ ≥ x)\)
      • sf

        static double sf​(double x,
                         int n,
                         KolmogorovSmirnovDistribution.One.ScaledPower power)
        Calculates complementary probability P[D_n^+ >= x], or survival function (SF), for the one-sided one-sample Kolmogorov-Smirnov distribution.

        Computes the result using double-double arithmetic. The power function can use a fast approximation or a full power computation.

        This function is safe for x > 1/n. When x approaches sub-normal then division or multiplication by x can under/overflow. The case of x < 1/n can be computed in sfExact.

        Parameters:
        x - Statistic (typically in (1/n, 1 - 1/n)).
        n - Sample size (assumed to be positive).
        power - Function to compute the scaled power (can be null).
        Returns:
        \(P(D_n^+ ≥ x)\)
        See Also:
        DD.pow(int, long[]), DDMath.pow(DD, int, long[])
      • splitX

        static int splitX​(int n,
                          double x,
                          double[] alpha)
        Compute exactly x = (k + alpha) / n with k an integer and alpha in [0, 1). Note that k ~ floor(nx) but may be rounded up if alpha -> 1 within working precision.

        This computation is a significant source of increased error if performed in 64-bit arithmetic. Although the value alpha is only used for the PDF computation a value of alpha == 0 indicates the final term of the SF summation can be dropped due to the cancellation of a power term (x + j/n) to zero with x = (n-j)/n. That is if alpha == 0 then x is the fraction k/n and one Aj term is zero.

        Parameters:
        n - Sample size.
        x - Statistic.
        alpha - Output alpha.
        Returns:
        k
      • log2

        private static int log2​(int n)
        Returns floor(log2(n)).
        Parameters:
        n - Value.
        Returns:
        approximate log2(n)