Class KolmogorovSmirnovDistribution.One
- Enclosing class:
KolmogorovSmirnovDistribution
P[D_n^+ >= x]
for the one-sided
one-sample Kolmogorov-Smirnov distribution.
D_n^+ = sup_x {CDF_n(x) - F(x)}
where n
is the sample size; CDF_n(x)
is an empirical
cumulative distribution function; and F(x)
is the expected
distribution. The computation uses Smirnov's stable formula:
floor(n(1-x)) (n) ( j ) (j-1) ( j ) (n-j) P[D_n^+ >= x] = x Sum ( ) ( - + x ) ( 1 - x - - ) j=0 (j) ( n ) ( n )
Computing using logs is not as accurate as direct multiplication when n is large. However the terms are very large and small. Multiplication uses a scaled representation with a separate exponent term to support the extreme range. Extended precision representation of the numbers reduces the error in the power terms. Details in van Mulbregt (2018).
References:
- van Mulbregt, P. (2018). Computing the Cumulative Distribution Function and Quantiles of the One-sided Kolmogorov-Smirnov Statistic arxiv:1802.06966.
- Magg & Dicaire (1971). On Kolmogorov-Smirnov Type One-Sample Statistics Biometrika 58.3 pp. 653–656.
- Since:
- 1.1
-
Nested Class Summary
Nested ClassesModifier and TypeClassDescription(package private) static interface
Defines a scaled power function. -
Field Summary
FieldsModifier and TypeFieldDescriptionprivate static final KolmogorovSmirnovDistribution.One.ScaledPower
Proxy for the default choice of the scaled power function.private static final int
Maximum number of term for the Smirnov-Dwass algorithm.private static final int
Minimum sample size for the Smirnov-Dwass algorithm.private static final int
Number of bits of precision in the sum of terms Aj.private static final int
Number of bits of precision in the sum of terms Aj.private static final int
"Very large" n to use a asymptotic limiting form. -
Constructor Summary
Constructors -
Method Summary
Modifier and TypeMethodDescriptionprivate static int
log2
(int n) Returnsfloor(log2(n))
.(package private) static double
sf
(double x, int n) Calculates complementary probabilityP[D_n^+ >= x]
, or survival function (SF), for the one-sided one-sample Kolmogorov-Smirnov distribution.(package private) static double
sf
(double x, int n, KolmogorovSmirnovDistribution.One.ScaledPower power) Calculates complementary probabilityP[D_n^+ >= x]
, or survival function (SF), for the one-sided one-sample Kolmogorov-Smirnov distribution.private static double
sfAsymptotic
(double x, int n) Calculates complementary probabilityP[D_n^+ >= x]
, or survival function (SF), for the one-sided one-sample Kolmogorov-Smirnov distribution.private static double
sfExact
(double x, int n) Calculates exact cases for the complementary probabilityP[D_n^+ >= x]
the one-sided one-sample Kolmogorov-Smirnov distribution.(package private) static int
splitX
(int n, double x, double[] alpha) Compute exactlyx = (k + alpha) / n
withk
an integer andalpha in [0, 1)
.
-
Field Details
-
VERY_LARGE_N
private static final int VERY_LARGE_N"Very large" n to use a asymptotic limiting form. [1] suggests 1e12 but this is reduced to avoid excess computation time.- See Also:
-
SD_MAX_TERMS
private static final int SD_MAX_TERMSMaximum number of term for the Smirnov-Dwass algorithm.- See Also:
-
SD_MIN_N
private static final int SD_MIN_NMinimum sample size for the Smirnov-Dwass algorithm.- See Also:
-
SUM_PRECISION_BITS
private static final int SUM_PRECISION_BITSNumber of bits of precision in the sum of terms Aj. This does not have to be the full 106 bits of a double-double as the final result is used as a double. The terms are represented as fractions with an exponent:Aj = 2^b * f f of sum(A) in [0.5, 1) f of Aj in [0.25, 2]
The terms can be added if their exponents overlap. The bits of precision must account for the extra range of the fractional part of Aj by 1 bit. Note that additional bits are added to this dynamically based on the number of terms.
- See Also:
-
SD_SUM_PRECISION_BITS
private static final int SD_SUM_PRECISION_BITSNumber of bits of precision in the sum of terms Aj. For Smirnov-Dwass we use the full 106 bits of a double-double due to the summation of terms that cancel. Account for the extra range of the fractional part of Aj by 1 bit.- See Also:
-
POWER_DEFAULT
Proxy for the default choice of the scaled power function. The actual choice is based on the chosen algorithm.
-
-
Constructor Details
-
One
private One()No instances.
-
-
Method Details
-
sf
static double sf(double x, int n) Calculates complementary probabilityP[D_n^+ >= x]
, or survival function (SF), for the one-sided one-sample Kolmogorov-Smirnov distribution.- Parameters:
x
- Statistic.n
- Sample size (assumed to be positive).- Returns:
- \(P(D_n^+ ≥ x)\)
-
sfExact
private static double sfExact(double x, int n) Calculates exact cases for the complementary probabilityP[D_n^+ >= x]
the one-sided one-sample Kolmogorov-Smirnov distribution.Exact cases handle x not in [0, 1]. It is assumed n is positive.
- Parameters:
x
- Statistic.n
- Sample size (assumed to be positive).- Returns:
- \(P(D_n^+ ≥ x)\)
-
sfAsymptotic
private static double sfAsymptotic(double x, int n) Calculates complementary probabilityP[D_n^+ >= x]
, or survival function (SF), for the one-sided one-sample Kolmogorov-Smirnov distribution.Computes the result using the asymptotic formula Eq 5 in [1].
- Parameters:
x
- Statistic.n
- Sample size (assumed to be positive).- Returns:
- \(P(D_n^+ ≥ x)\)
-
sf
Calculates complementary probabilityP[D_n^+ >= x]
, or survival function (SF), for the one-sided one-sample Kolmogorov-Smirnov distribution.Computes the result using double-double arithmetic. The power function can use a fast approximation or a full power computation.
This function is safe for
x > 1/n
. Whenx
approaches sub-normal then division or multiplication by x can under/overflow. The case ofx < 1/n
can be computed insfExact
.- Parameters:
x
- Statistic (typically in (1/n, 1 - 1/n)).n
- Sample size (assumed to be positive).power
- Function to compute the scaled power (can be null).- Returns:
- \(P(D_n^+ ≥ x)\)
- See Also:
-
splitX
static int splitX(int n, double x, double[] alpha) Compute exactlyx = (k + alpha) / n
withk
an integer andalpha in [0, 1)
. Note thatk ~ floor(nx)
but may be rounded up ifalpha -> 1
within working precision.This computation is a significant source of increased error if performed in 64-bit arithmetic. Although the value alpha is only used for the PDF computation a value of
alpha == 0
indicates the final term of the SF summation can be dropped due to the cancellation of a power term(x + j/n)
to zero withx = (n-j)/n
. That is ifalpha == 0
then x is the fractionk/n
and one Aj term is zero.- Parameters:
n
- Sample size.x
- Statistic.alpha
- Output alpha.- Returns:
- k
-
log2
private static int log2(int n) Returnsfloor(log2(n))
.- Parameters:
n
- Value.- Returns:
- approximate log2(n)
-