Package cern.jet.stat.quantile
Class UnknownDoubleQuantileEstimator
- java.lang.Object
-
- cern.colt.PersistentObject
-
- cern.jet.stat.quantile.DoubleQuantileEstimator
-
- cern.jet.stat.quantile.UnknownDoubleQuantileEstimator
-
- All Implemented Interfaces:
DoubleQuantileFinder
,java.io.Serializable
,java.lang.Cloneable
class UnknownDoubleQuantileEstimator extends DoubleQuantileEstimator
Approximate quantile finding algorithm for unknown N requiring only one pass and little main memory; computes quantiles over a sequence of double elements. This algorithm requires at most two times the memory of a corresponding approx. quantile finder knowing N.Needs as input the following parameters:
- 1. quantiles - the number of quantiles to be computed.
- 2. epsilon - the allowed approximation error on quantiles. The approximation guarantee of this algorithm is explicit.
It is also possible to couple the approximation algorithm with random sampling to further reduce memory requirements. With sampling, the approximation guarantees are explicit but probabilistic, i.e. they apply with respect to a (user controlled) confidence parameter "delta".
- 3. delta - the probability allowed that the approximation error fails to be smaller than epsilon. Set delta to zero for explicit non probabilistic guarantees. You usually don't instantiate quantile finders by using the constructor. Instead use the factory QuantileFinderFactor to do so. It will set up the right parametrization for you.
After Gurmeet Singh Manku, Sridhar Rajagopalan and Bruce G. Lindsay, Random Sampling Techniques for Space Efficient Online Computation of Order Statistics of Large Datasets. Accepted for Proc. of the 1999 ACM SIGMOD Int. Conf. on Management of Data, Paper (soon) available here.
- Version:
- 1.0, 09/24/99
- See Also:
QuantileFinderFactory
,KnownApproximateDoubleQuantileFinder
-
-
Field Summary
Fields Modifier and Type Field Description protected int
currentTreeHeight
protected double
precomputeEpsilon
protected WeightedRandomSampler
sampler
protected int
treeHeightStartingSampling
-
Fields inherited from class cern.jet.stat.quantile.DoubleQuantileEstimator
bufferSet, currentBufferToFill, totalElementsFilled
-
Fields inherited from class cern.colt.PersistentObject
serialVersionUID
-
-
Constructor Summary
Constructors Constructor Description UnknownDoubleQuantileEstimator(int b, int k, int h, double precomputeEpsilon, RandomEngine generator)
Constructs an approximate quantile finder with b buffers, each having k elements.
-
Method Summary
All Methods Static Methods Instance Methods Concrete Methods Modifier and Type Method Description protected DoubleBuffer[]
buffersToCollapse()
Not yet commented.void
clear()
Removes all elements from the receiver.java.lang.Object
clone()
Returns a deep copy of the receiver.protected void
newBuffer()
Not yet commented.protected void
postCollapse(DoubleBuffer[] toCollapse)
Not yet commented.DoubleArrayList
quantileElements(DoubleArrayList phis)
Computes the specified quantile elements over the values previously added.protected boolean
sampleNextElement()
Not yet commented.protected static void
sortAscendingByLevel(DoubleBuffer[] fullBuffers)
To do.java.lang.String
toString()
Returns a String representation of the receiver.-
Methods inherited from class cern.jet.stat.quantile.DoubleQuantileEstimator
add, addAllOf, addAllOfFromTo, collapse, contains, forEach, memory, phi, preProcessPhis, setUp, size, totalMemory
-
-
-
-
Field Detail
-
currentTreeHeight
protected int currentTreeHeight
-
treeHeightStartingSampling
protected final int treeHeightStartingSampling
-
sampler
protected WeightedRandomSampler sampler
-
precomputeEpsilon
protected double precomputeEpsilon
-
-
Constructor Detail
-
UnknownDoubleQuantileEstimator
public UnknownDoubleQuantileEstimator(int b, int k, int h, double precomputeEpsilon, RandomEngine generator)
Constructs an approximate quantile finder with b buffers, each having k elements.- Parameters:
b
- the number of buffersk
- the number of elements per bufferh
- the tree height at which sampling shall start.precomputeEpsilon
- the epsilon for which quantiles shall be precomputed; set this value <=0.0 if nothing shall be precomputed.generator
- a uniform random number generator.
-
-
Method Detail
-
buffersToCollapse
protected DoubleBuffer[] buffersToCollapse()
Not yet commented.- Overrides:
buffersToCollapse
in classDoubleQuantileEstimator
-
clear
public void clear()
Removes all elements from the receiver. The receiver will be empty after this call returns, and its memory requirements will be close to zero.- Specified by:
clear
in interfaceDoubleQuantileFinder
- Overrides:
clear
in classDoubleQuantileEstimator
-
clone
public java.lang.Object clone()
Returns a deep copy of the receiver.- Specified by:
clone
in interfaceDoubleQuantileFinder
- Overrides:
clone
in classDoubleQuantileEstimator
- Returns:
- a deep copy of the receiver.
-
newBuffer
protected void newBuffer()
Not yet commented.- Specified by:
newBuffer
in classDoubleQuantileEstimator
-
postCollapse
protected void postCollapse(DoubleBuffer[] toCollapse)
Not yet commented.- Specified by:
postCollapse
in classDoubleQuantileEstimator
-
quantileElements
public DoubleArrayList quantileElements(DoubleArrayList phis)
Computes the specified quantile elements over the values previously added.- Specified by:
quantileElements
in interfaceDoubleQuantileFinder
- Overrides:
quantileElements
in classDoubleQuantileEstimator
- Parameters:
phis
- the quantiles for which elements are to be computed. Each phi must be in the interval (0.0,1.0]. phis must be sorted ascending.- Returns:
- the approximate quantile elements.
-
sampleNextElement
protected boolean sampleNextElement()
Not yet commented.- Specified by:
sampleNextElement
in classDoubleQuantileEstimator
-
sortAscendingByLevel
protected static void sortAscendingByLevel(DoubleBuffer[] fullBuffers)
To do. This could faster be done without sorting (min and second min).
-
toString
public java.lang.String toString()
Returns a String representation of the receiver.- Overrides:
toString
in classDoubleQuantileEstimator
-
-