Class UnknownDoubleQuantileEstimator

All Implemented Interfaces:
DoubleQuantileFinder, Serializable, Cloneable

class UnknownDoubleQuantileEstimator extends DoubleQuantileEstimator
Approximate quantile finding algorithm for unknown N requiring only one pass and little main memory; computes quantiles over a sequence of double elements. This algorithm requires at most two times the memory of a corresponding approx. quantile finder knowing N.

Needs as input the following parameters:

1. quantiles - the number of quantiles to be computed.
2. epsilon - the allowed approximation error on quantiles. The approximation guarantee of this algorithm is explicit.

It is also possible to couple the approximation algorithm with random sampling to further reduce memory requirements. With sampling, the approximation guarantees are explicit but probabilistic, i.e. they apply with respect to a (user controlled) confidence parameter "delta".

3. delta - the probability allowed that the approximation error fails to be smaller than epsilon. Set delta to zero for explicit non probabilistic guarantees. You usually don't instantiate quantile finders by using the constructor. Instead use the factory QuantileFinderFactor to do so. It will set up the right parametrization for you.

After Gurmeet Singh Manku, Sridhar Rajagopalan and Bruce G. Lindsay, Random Sampling Techniques for Space Efficient Online Computation of Order Statistics of Large Datasets. Accepted for Proc. of the 1999 ACM SIGMOD Int. Conf. on Management of Data, Paper (soon) available here.

Version:
1.0, 09/24/99
See Also:
  • Field Details

    • currentTreeHeight

      protected int currentTreeHeight
    • treeHeightStartingSampling

      protected final int treeHeightStartingSampling
    • sampler

      protected WeightedRandomSampler sampler
    • precomputeEpsilon

      protected double precomputeEpsilon
  • Constructor Details

    • UnknownDoubleQuantileEstimator

      public UnknownDoubleQuantileEstimator(int b, int k, int h, double precomputeEpsilon, RandomEngine generator)
      Constructs an approximate quantile finder with b buffers, each having k elements.
      Parameters:
      b - the number of buffers
      k - the number of elements per buffer
      h - the tree height at which sampling shall start.
      precomputeEpsilon - the epsilon for which quantiles shall be precomputed; set this value invalid input: '<'=0.0 if nothing shall be precomputed.
      generator - a uniform random number generator.
  • Method Details