Enum Quantile.EstimationMethod
- All Implemented Interfaces:
Serializable, Comparable<Quantile.EstimationMethod>
- Enclosing class:
Quantile
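Estimation methods for a quantile: the nine sample quantile types of Hyndman and Fan (1996), provided as HF1 - HF9.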
Sample quantiles are defined by:
\[ Q(p) = (1 - \gamma) x_j + \gamma x_{j+1} \]
where \( \frac{j-m}{n} \leq p < \frac{j-m+1}{n} \), \( x_j \) is the \( j \)th order statistic, \( n \) is the sample size, the value of \( \gamma \) is a function of \( j = \lfloor np+m \rfloor \) and \( g = np + m - j \), and \( m \) is a constant determined by the sample quantile type.
Note that the real-valued position \( np + m \) is a 1-based index and \( j \in [1, n] \). If the real-valued position falls below the lowest or above the highest value in the sample, this implementation returns the minimum or maximum observation respectively.
Types 1, 2, and 3 are discontinuous functions of \( p \); types 4 to 9 are continuous functions of \( p \).
For the continuous functions, the probability \( p_k \) is provided for the \( k \)-th order statistic in a sample of size \( n \). Sample quantiles equivalent to \( Q(p) \) are obtained by linear interpolation between the points \( (p_k, x_k) \) and \( (p_{k+1}, x_{k+1}) \) for any \( p_k \leq p \leq p_{k+1} \).
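The definition of \( Q(p) \) above can be sketched directly in code. The following is a minimal illustration for the continuous types only, where \( \gamma = g \); the class and method names are hypothetical and are not part of the library, and the caller is assumed to pass \( m \) already evaluated for the chosen type and \( p \) (for example \( 1 - p \) for type 7).

```java
import java.util.Arrays;

// Hypothetical sketch, not the library implementation.
final class SampleQuantileSketch {
    private SampleQuantileSketch() {}

    /**
     * Computes Q(p) = (1 - gamma) x_j + gamma x_{j+1} with gamma = g,
     * which holds for the continuous types 4 to 9.
     *
     * @param sample non-empty sample (need not be sorted)
     * @param p probability in [0, 1]
     * @param m type constant evaluated at p (assumption: supplied by the caller)
     */
    static double quantile(double[] sample, double p, double m) {
        double[] x = sample.clone();
        Arrays.sort(x);
        int n = x.length;
        double pos = n * p + m;            // 1-based real-valued position
        if (pos <= 1) {
            return x[0];                   // at or below the first order statistic
        }
        if (pos >= n) {
            return x[n - 1];               // at or above the last order statistic
        }
        int j = (int) Math.floor(pos);     // j in [1, n - 1]
        double g = pos - j;                // fractional part
        // x_j is stored at x[j - 1] because arrays are 0-based
        return (1 - g) * x[j - 1] + g * x[j];
    }
}
```

For the sample {1, 2, 3, 4, 5} and type 7, \( p = 0.1 \) gives the position \( 5 \times 0.1 + 0.9 = 1.4 \), so the result lies 40% of the way from the first to the second order statistic.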
- Hyndman and Fan (1996) Sample Quantiles in Statistical Packages. The American Statistician, 50, 361-365. doi.org/10.2307/2684934
- Quantile (Wikipedia)
-
Enum Constant Summary
Enum Constants
Enum Constant: Description
- HF1: Inverse of the empirical distribution function.
- HF2: Similar to HF1 with averaging at discontinuities.
- HF3: The observation closest to \( np \).
- HF4: Linear interpolation of the inverse of the empirical CDF.
- HF5: A piecewise linear function where the knots are the values midway through the steps of the empirical CDF.
- HF6: Linear interpolation of the expectations for the order statistics for the uniform distribution on [0,1].
- HF7: Linear interpolation of the modes for the order statistics for the uniform distribution on [0,1].
- HF8: Linear interpolation of the approximate medians for order statistics.
- HF9: Quantile estimates are approximately unbiased for the expected order statistics if \( x \) is normally distributed.
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type, Method: Description
- (package private) final double index(double p, int n): Finds the index i and fractional part g of a real-valued position to interpolate the quantile.
- (package private) abstract double position0(double p, int n): Finds the real-valued position for calculation of the quantile.
- static Quantile.EstimationMethod valueOf(String name): Returns the enum constant of this type with the specified name.
- static Quantile.EstimationMethod[] values(): Returns an array containing the constants of this enum type, in the order they are declared.
-
Enum Constant Details
-
HF1
Inverse of the empirical distribution function.
\( m = 0 \). \( \gamma = 0 \) if \( g = 0 \), and 1 otherwise.
-
HF2
Similar to HF1 with averaging at discontinuities.
\( m = 0 \). \( \gamma = 0.5 \) if \( g = 0 \), and 1 otherwise.
-
HF3
The observation closest to \( np \). Ties are resolved to the nearest even order statistic.
\( m = -1/2 \). \( \gamma = 0 \) if \( g = 0 \) and \( j \) is even, and 1 otherwise.
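The \( \gamma \) rules for the three discontinuous types can be compared side by side. This is a hedged sketch using hypothetical names and an integer `type` argument in place of the enum; it assumes the input array is already sorted and non-empty.

```java
// Hypothetical sketch of the gamma rules for types 1 to 3; not the library code.
final class DiscontinuousGammaSketch {
    private DiscontinuousGammaSketch() {}

    /** Returns gamma for type 1, 2 or 3 given j and the fractional part g. */
    static double gamma(int type, int j, double g) {
        switch (type) {
            case 1: return g == 0 ? 0 : 1;
            case 2: return g == 0 ? 0.5 : 1;
            case 3: return (g == 0 && j % 2 == 0) ? 0 : 1;
            default: throw new IllegalArgumentException("type must be 1, 2 or 3: " + type);
        }
    }

    /** Computes Q(p) = (1 - gamma) x_j + gamma x_{j+1} from sorted data. */
    static double quantile(double[] sorted, double p, int type) {
        int n = sorted.length;
        double m = type == 3 ? -0.5 : 0;   // m = -1/2 for type 3, otherwise 0
        double pos = n * p + m;            // 1-based real-valued position
        int j = (int) Math.floor(pos);
        double g = pos - j;
        if (j < 1) {
            return sorted[0];              // position before the first observation
        }
        if (j >= n) {
            return sorted[n - 1];          // position at or beyond the last observation
        }
        double gam = gamma(type, j, g);
        return (1 - gam) * sorted[j - 1] + gam * sorted[j];
    }
}
```

With the sorted data {10, 20, 30, 40} and \( p = 0.5 \), type 1 selects 20 while type 2 averages across the discontinuity to give 25. For type 3, both \( p = 0.375 \) and \( p = 0.625 \) are ties that resolve to the second (even) order statistic.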
-
HF4
Linear interpolation of the inverse of the empirical CDF.
\( m = 0 \). \( p_k = \frac{k}{n} \).
-
HF5
A piecewise linear function where the knots are the values midway through the steps of the empirical CDF. Proposed by Hazen (1914) and popular amongst hydrologists.
\( m = 1/2 \). \( p_k = \frac{k - 1/2}{n} \).
-
HF6
Linear interpolation of the expectations for the order statistics for the uniform distribution on [0,1]. Proposed by Weibull (1939).
\( m = p \). \( p_k = \frac{k}{n + 1} \).
This method computes the quantile as per the Apache Commons Math Percentile legacy implementation.
-
HF7
Linear interpolation of the modes for the order statistics for the uniform distribution on [0,1]. Proposed by Gumbel (1939).
\( m = 1 - p \). \( p_k = \frac{k - 1}{n - 1} \).
-
HF8
Linear interpolation of the approximate medians for order statistics.
\( m = (p + 1)/3 \). \( p_k = \frac{k - 1/3}{n + 1/3} \).
As per Hyndman and Fan (1996) this approach is most recommended as it provides an approximate median-unbiased estimate regardless of distribution.
-
HF9
Quantile estimates are approximately unbiased for the expected order statistics if \( x \) is normally distributed.
\( m = p/4 + 3/8 \). \( p_k = \frac{k - 3/8}{n + 1/4} \).
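For the continuous types, the \( p_k \) values listed for HF4 to HF9 can be used as interpolation knots. The sketch below takes that route rather than the \( m \)-based position; the names and the integer `type` argument are hypothetical, the data are assumed sorted and non-empty, and values of \( p \) outside \( [p_1, p_n] \) are assumed to map to the minimum or maximum.

```java
// Hypothetical sketch of interpolation between the points (p_k, x_k); not the library code.
final class PlottingPositionSketch {
    private PlottingPositionSketch() {}

    /** Returns p_k for types 4 to 9, where k is a 1-based order statistic index. */
    static double pk(int type, int k, int n) {
        switch (type) {
            case 4: return (double) k / n;
            case 5: return (k - 0.5) / n;
            case 6: return (double) k / (n + 1);
            case 7: return (k - 1.0) / (n - 1);
            case 8: return (k - 1.0 / 3) / (n + 1.0 / 3);
            case 9: return (k - 0.375) / (n + 0.25);
            default: throw new IllegalArgumentException("type must be 4 to 9: " + type);
        }
    }

    /** Interpolates linearly between (p_k, x_k) and (p_{k+1}, x_{k+1}). */
    static double quantile(double[] sorted, double p, int type) {
        int n = sorted.length;
        if (n == 1 || p <= pk(type, 1, n)) {
            return sorted[0];              // below the first knot
        }
        if (p >= pk(type, n, n)) {
            return sorted[n - 1];          // above the last knot
        }
        int k = 1;
        while (pk(type, k + 1, n) < p) {   // find k with p_k <= p <= p_{k+1}
            k++;
        }
        double lo = pk(type, k, n);
        double hi = pk(type, k + 1, n);
        double t = (p - lo) / (hi - lo);
        // x_k is stored at sorted[k - 1]
        return sorted[k - 1] + t * (sorted[k] - sorted[k - 1]);
    }
}
```

For the sorted data {1, 2, 3, 4, 5}, types 5 to 9 all place the median knot at \( p_3 = 0.5 \), whereas type 4 has \( p_2 = 0.4 \) and \( p_3 = 0.6 \) and so interpolates to 2.5.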
-
-
Constructor Details
-
EstimationMethod
private EstimationMethod()
-
-
Method Details
-
values
Returns an array containing the constants of this enum type, in the order they are declared.
- Returns:
an array containing the constants of this enum type, in the order they are declared
-
valueOf
Returns the enum constant of this type with the specified name. The string must match exactly an identifier used to declare an enum constant in this type. (Extraneous whitespace characters are not permitted.)
- Parameters:
name - the name of the enum constant to be returned.
- Returns:
the enum constant with the specified name
- Throws:
IllegalArgumentException - if this enum type has no constant with the specified name
NullPointerException - if the argument is null
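The exact-match behaviour of `valueOf` is standard for Java enums and can be illustrated with a stand-in type. The nested `Method` enum below is hypothetical and only mirrors the constant names; the fallback helper is likewise an illustration, not a library method.

```java
// Hypothetical stand-in enum used to illustrate valueOf and values.
final class EnumLookupSketch {
    private EnumLookupSketch() {}

    /** Stand-in for the nine estimation methods (assumption: names only). */
    enum Method { HF1, HF2, HF3, HF4, HF5, HF6, HF7, HF8, HF9 }

    /**
     * Looks up a constant by name, returning the fallback when the name does
     * not match exactly. A null name still throws NullPointerException.
     */
    static Method parse(String name, Method fallback) {
        try {
            return Method.valueOf(name);
        } catch (IllegalArgumentException e) {
            // Wrong case or extra whitespace is not accepted by valueOf
            return fallback;
        }
    }
}
```

Here `parse("HF7", Method.HF8)` returns `HF7`, while `parse("hf7", Method.HF8)` and `parse(" HF7", Method.HF8)` both return the fallback.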
-
position0
abstract double position0(double p, int n)
Finds the real-valued position for calculation of the quantile.
Return i + g such that the quantile value from sorted data is:
value = data[i] + g * (data[i+1] - data[i])
Warning: Interpolation should not use data[i+1] unless g != 0.
Note: In contrast to the definition of Hyndman and Fan in the class header, which uses a 1-based position, this is a zero-based index. This change is for convenience when addressing array positions.
- Parameters:
p - pth quantile.
n - Size.
- Returns:
a real-valued position (0-based) into the range [0, n)
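The shift from the 1-based position \( np + m \) to a 0-based position is a subtraction of one. The sketch below shows only that conversion under the assumption that \( m \) has already been evaluated for the chosen type; the names are hypothetical, and any handling of positions outside the sample is left out.

```java
// Hypothetical sketch of the 1-based to 0-based conversion; not the library code.
final class ZeroBasedPositionSketch {
    private ZeroBasedPositionSketch() {}

    /**
     * Converts the 1-based position np + m into a 0-based position.
     *
     * @param p probability in [0, 1]
     * @param n sample size
     * @param m type constant evaluated at p (assumption: supplied by the caller)
     */
    static double position0(double p, int n, double m) {
        return n * p + m - 1;
    }

    /** Type 7 simplifies: np + (1 - p) - 1 = (n - 1) p. */
    static double position0Type7(double p, int n) {
        return (n - 1) * p;
    }
}
```

For type 7 with \( n = 5 \) and \( p = 0.5 \), both forms give the 0-based position 2, which addresses the middle element of the sorted array.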
-
index
final double index(double p, int n)
Finds the index i and fractional part g of a real-valued position to interpolate the quantile.
Return i + g such that the quantile value from sorted data is:
value = data[i] + g * (data[i+1] - data[i])
Note: Interpolation should not use data[i+1] unless g != 0.
- Parameters:
p - pth quantile.
n - Size.
- Returns:
index (in [0, n-1])
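The note about not reading `data[i+1]` when `g` is zero matters at the last index, where `i + 1` would be out of bounds. A minimal sketch of the interpolation step follows; the names are hypothetical, the clamping to [0, n-1] is an assumption based on the stated return range, and the data are assumed sorted, non-empty and paired with a finite position.

```java
// Hypothetical sketch of splitting a position into i and g and interpolating.
final class InterpolationSketch {
    private InterpolationSketch() {}

    /**
     * Interpolates sorted data at a 0-based real-valued position.
     * Positions outside [0, n - 1] are clamped to the nearest end.
     */
    static double interpolate(double[] data, double pos) {
        double clamped = Math.min(Math.max(pos, 0), data.length - 1);
        int i = (int) clamped;             // integer part: index i
        double g = clamped - i;            // fractional part: g in [0, 1)
        if (g == 0) {
            return data[i];                // never touch data[i + 1] here
        }
        return data[i] + g * (data[i + 1] - data[i]);
    }
}
```

With the data {1, 2, 3, 4, 5}, a position of 4.0 returns the last element without reading past the array, and a position of 0.5 returns 1.5.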
-