Class KolmogorovSmirnovDistribution.One
- java.lang.Object
-
- org.apache.commons.statistics.inference.KolmogorovSmirnovDistribution.One
-
- Enclosing class:
- KolmogorovSmirnovDistribution
static final class KolmogorovSmirnovDistribution.One extends java.lang.Object
Computes the complementary probabilityP[D_n^+ >= x]
for the one-sided one-sample Kolmogorov-Smirnov distribution.D_n^+ = sup_x {CDF_n(x) - F(x)}
where
n
is the sample size;CDF_n(x)
is an empirical cumulative distribution function; andF(x)
is the expected distribution. The computation uses Smirnov's stable formula:floor(n(1-x)) (n) ( j ) (j-1) ( j ) (n-j) P[D_n^+ >= x] = x Sum ( ) ( - + x ) ( 1 - x - - ) j=0 (j) ( n ) ( n )
Computing using logs is not as accurate as direct multiplication when n is large. However the terms are very large and small. Multiplication uses a scaled representation with a separate exponent term to support the extreme range. Extended precision representation of the numbers reduces the error in the power terms. Details in van Mulbregt (2018).
References:
- van Mulbregt, P. (2018). Computing the Cumulative Distribution Function and Quantiles of the One-sided Kolmogorov-Smirnov Statistic arxiv:1802.06966.
- Magg & Dicaire (1971). On Kolmogorov-Smirnov Type One-Sample Statistics Biometrika 58.3 pp. 653–656.
- Since:
- 1.1
-
-
Nested Class Summary
Nested Classes Modifier and Type Class Description (package private) static interface
KolmogorovSmirnovDistribution.One.ScaledPower
Defines a scaled power function.
-
Field Summary
Fields Modifier and Type Field Description private static KolmogorovSmirnovDistribution.One.ScaledPower
POWER_DEFAULT
Proxy for the default choice of the scaled power function.private static int
SD_MAX_TERMS
Maximum number of term for the Smirnov-Dwass algorithm.private static int
SD_MIN_N
Minimum sample size for the Smirnov-Dwass algorithm.private static int
SD_SUM_PRECISION_BITS
Number of bits of precision in the sum of terms Aj.private static int
SUM_PRECISION_BITS
Number of bits of precision in the sum of terms Aj.private static int
VERY_LARGE_N
"Very large" n to use a asymptotic limiting form.
-
Constructor Summary
Constructors Modifier Constructor Description private
One()
No instances.
-
Method Summary
All Methods Static Methods Concrete Methods Modifier and Type Method Description private static int
log2(int n)
Returnsfloor(log2(n))
.(package private) static double
sf(double x, int n)
Calculates complementary probabilityP[D_n^+ >= x]
, or survival function (SF), for the one-sided one-sample Kolmogorov-Smirnov distribution.(package private) static double
sf(double x, int n, KolmogorovSmirnovDistribution.One.ScaledPower power)
Calculates complementary probabilityP[D_n^+ >= x]
, or survival function (SF), for the one-sided one-sample Kolmogorov-Smirnov distribution.private static double
sfAsymptotic(double x, int n)
Calculates complementary probabilityP[D_n^+ >= x]
, or survival function (SF), for the one-sided one-sample Kolmogorov-Smirnov distribution.private static double
sfExact(double x, int n)
Calculates exact cases for the complementary probabilityP[D_n^+ >= x]
the one-sided one-sample Kolmogorov-Smirnov distribution.(package private) static int
splitX(int n, double x, double[] alpha)
Compute exactlyx = (k + alpha) / n
withk
an integer andalpha in [0, 1)
.
-
-
-
Field Detail
-
VERY_LARGE_N
private static final int VERY_LARGE_N
"Very large" n to use a asymptotic limiting form. [1] suggests 1e12 but this is reduced to avoid excess computation time.- See Also:
- Constant Field Values
-
SD_MAX_TERMS
private static final int SD_MAX_TERMS
Maximum number of term for the Smirnov-Dwass algorithm.- See Also:
- Constant Field Values
-
SD_MIN_N
private static final int SD_MIN_N
Minimum sample size for the Smirnov-Dwass algorithm.- See Also:
- Constant Field Values
-
SUM_PRECISION_BITS
private static final int SUM_PRECISION_BITS
Number of bits of precision in the sum of terms Aj. This does not have to be the full 106 bits of a double-double as the final result is used as a double. The terms are represented as fractions with an exponent:Aj = 2^b * f f of sum(A) in [0.5, 1) f of Aj in [0.25, 2]
The terms can be added if their exponents overlap. The bits of precision must account for the extra range of the fractional part of Aj by 1 bit. Note that additional bits are added to this dynamically based on the number of terms.
- See Also:
- Constant Field Values
-
SD_SUM_PRECISION_BITS
private static final int SD_SUM_PRECISION_BITS
Number of bits of precision in the sum of terms Aj. For Smirnov-Dwass we use the full 106 bits of a double-double due to the summation of terms that cancel. Account for the extra range of the fractional part of Aj by 1 bit.- See Also:
- Constant Field Values
-
POWER_DEFAULT
private static final KolmogorovSmirnovDistribution.One.ScaledPower POWER_DEFAULT
Proxy for the default choice of the scaled power function. The actual choice is based on the chosen algorithm.
-
-
Method Detail
-
sf
static double sf(double x, int n)
Calculates complementary probabilityP[D_n^+ >= x]
, or survival function (SF), for the one-sided one-sample Kolmogorov-Smirnov distribution.- Parameters:
x
- Statistic.n
- Sample size (assumed to be positive).- Returns:
- \(P(D_n^+ ≥ x)\)
-
sfExact
private static double sfExact(double x, int n)
Calculates exact cases for the complementary probabilityP[D_n^+ >= x]
the one-sided one-sample Kolmogorov-Smirnov distribution.Exact cases handle x not in [0, 1]. It is assumed n is positive.
- Parameters:
x
- Statistic.n
- Sample size (assumed to be positive).- Returns:
- \(P(D_n^+ ≥ x)\)
-
sfAsymptotic
private static double sfAsymptotic(double x, int n)
Calculates complementary probabilityP[D_n^+ >= x]
, or survival function (SF), for the one-sided one-sample Kolmogorov-Smirnov distribution.Computes the result using the asymptotic formula Eq 5 in [1].
- Parameters:
x
- Statistic.n
- Sample size (assumed to be positive).- Returns:
- \(P(D_n^+ ≥ x)\)
-
sf
static double sf(double x, int n, KolmogorovSmirnovDistribution.One.ScaledPower power)
Calculates complementary probabilityP[D_n^+ >= x]
, or survival function (SF), for the one-sided one-sample Kolmogorov-Smirnov distribution.Computes the result using double-double arithmetic. The power function can use a fast approximation or a full power computation.
This function is safe for
x > 1/n
. Whenx
approaches sub-normal then division or multiplication by x can under/overflow. The case ofx < 1/n
can be computed insfExact
.- Parameters:
x
- Statistic (typically in (1/n, 1 - 1/n)).n
- Sample size (assumed to be positive).power
- Function to compute the scaled power (can be null).- Returns:
- \(P(D_n^+ ≥ x)\)
- See Also:
DD.pow(int, long[])
,DDMath.pow(DD, int, long[])
-
splitX
static int splitX(int n, double x, double[] alpha)
Compute exactlyx = (k + alpha) / n
withk
an integer andalpha in [0, 1)
. Note thatk ~ floor(nx)
but may be rounded up ifalpha -> 1
within working precision.This computation is a significant source of increased error if performed in 64-bit arithmetic. Although the value alpha is only used for the PDF computation a value of
alpha == 0
indicates the final term of the SF summation can be dropped due to the cancellation of a power term(x + j/n)
to zero withx = (n-j)/n
. That is ifalpha == 0
then x is the fractionk/n
and one Aj term is zero.- Parameters:
n
- Sample size.x
- Statistic.alpha
- Output alpha.- Returns:
- k
-
log2
private static int log2(int n)
Returnsfloor(log2(n))
.- Parameters:
n
- Value.- Returns:
- approximate log2(n)
-
-