Class QuantileCalc


  • class QuantileCalc
    extends java.lang.Object
    Computes b and k vor various parameters.
    • Constructor Summary

      Constructors 
      Constructor Description
      QuantileCalc()  
    • Method Summary

      All Methods Static Methods Concrete Methods 
      Modifier and Type Method Description
      static double binomial​(long n, long k)
      Efficiently computes the binomial coefficient, often also referred to as "n over k" or "n choose k".
      static long ceiling​(double value)
      Returns the smallest long >= value.
      static long[] known_N_compute_B_and_K​(long N, double epsilon, double delta, int quantiles, double[] returnSamplingRate)
      Computes the number of buffers and number of values per buffer such that quantiles can be determined with an approximation error no more than epsilon with a certain probability.
      protected static long[] known_N_compute_B_and_K_quick​(long N, double epsilon)
      Computes the number of buffers and number of values per buffer such that quantiles can be determined with a guaranteed approximation error no more than epsilon.
      protected static long[] known_N_compute_B_and_K_slow​(long N, double epsilon, double delta, int quantiles, double[] returnSamplingRate)
      Computes the number of buffers and number of values per buffer such that quantiles can be determined with an approximation error no more than epsilon with a certain probability.
      static void main​(java.lang.String[] args)  
      static void test_B_and_K_Calculation​(java.lang.String[] args)
      Computes b and k for different parameters.
      static long[] unknown_N_compute_B_and_K​(double epsilon, double delta, int quantiles)
      Computes the number of buffers and number of values per buffer such that quantiles can be determined with an approximation error no more than epsilon with a certain probability.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • QuantileCalc

        QuantileCalc()
    • Method Detail

      • binomial

        public static double binomial​(long n,
                                      long k)
        Efficiently computes the binomial coefficient, often also referred to as "n over k" or "n choose k". The binomial coefficient is defined as n!/((n-k)!*k!). Tries to avoid numeric overflows.
        Returns:
        the binomial coefficient.
      • ceiling

        public static long ceiling​(double value)
        Returns the smallest long >= value.
        Examples: 1.0 -> 1, 1.2 -> 2, 1.9 -> 2. This method is safer than using (long) Math.ceil(value), because of possible rounding error.
      • known_N_compute_B_and_K

        public static long[] known_N_compute_B_and_K​(long N,
                                                     double epsilon,
                                                     double delta,
                                                     int quantiles,
                                                     double[] returnSamplingRate)
        Computes the number of buffers and number of values per buffer such that quantiles can be determined with an approximation error no more than epsilon with a certain probability. Assumes that quantiles are to be computed over N values. The required sampling rate is computed and stored in the first element of the provided returnSamplingRate array, which, therefore must be at least of length 1.
        Parameters:
        N - the number of values over which quantiles shall be computed (e.g 10^6).
        epsilon - the approximation error which is guaranteed not to be exceeded (e.g. 0.001) (0 <= epsilon <= 1). To get exact result, set epsilon=0.0;
        delta - the probability that the approximation error is more than than epsilon (e.g. 0.0001) (0 <= delta <= 1). To avoid probabilistic answers, set delta=0.0.
        quantiles - the number of quantiles to be computed (e.g. 100) (quantiles >= 1). If unknown in advance, set this number large, e.g. quantiles >= 10000.
        samplingRate - a double[1] where the sampling rate is to be filled in.
        Returns:
        long[2] - long[0]=the number of buffers, long[1]=the number of elements per buffer, returnSamplingRate[0]=the required sampling rate.
      • known_N_compute_B_and_K_quick

        protected static long[] known_N_compute_B_and_K_quick​(long N,
                                                              double epsilon)
        Computes the number of buffers and number of values per buffer such that quantiles can be determined with a guaranteed approximation error no more than epsilon. Assumes that quantiles are to be computed over N values.
        Parameters:
        N - the anticipated number of values over which quantiles shall be determined.
        epsilon - the approximation error which is guaranteed not to be exceeded (e.g. 0.001) (0 <= epsilon <= 1). To get exact result, set epsilon=0.0;
        Returns:
        long[2] - long[0]=the number of buffers, long[1]=the number of elements per buffer.
      • known_N_compute_B_and_K_slow

        protected static long[] known_N_compute_B_and_K_slow​(long N,
                                                             double epsilon,
                                                             double delta,
                                                             int quantiles,
                                                             double[] returnSamplingRate)
        Computes the number of buffers and number of values per buffer such that quantiles can be determined with an approximation error no more than epsilon with a certain probability. Assumes that quantiles are to be computed over N values. The required sampling rate is computed and stored in the first element of the provided returnSamplingRate array, which, therefore must be at least of length 1.
        Parameters:
        N - the anticipated number of values over which quantiles shall be computed (e.g 10^6).
        epsilon - the approximation error which is guaranteed not to be exceeded (e.g. 0.001) (0 <= epsilon <= 1). To get exact result, set epsilon=0.0;
        delta - the probability that the approximation error is more than than epsilon (e.g. 0.0001) (0 <= delta <= 1). To avoid probabilistic answers, set delta=0.0.
        quantiles - the number of quantiles to be computed (e.g. 100) (quantiles >= 1). If unknown in advance, set this number large, e.g. quantiles >= 10000.
        samplingRate - a double[1] where the sampling rate is to be filled in.
        Returns:
        long[2] - long[0]=the number of buffers, long[1]=the number of elements per buffer, returnSamplingRate[0]=the required sampling rate.
      • main

        public static void main​(java.lang.String[] args)
      • test_B_and_K_Calculation

        public static void test_B_and_K_Calculation​(java.lang.String[] args)
        Computes b and k for different parameters.
      • unknown_N_compute_B_and_K

        public static long[] unknown_N_compute_B_and_K​(double epsilon,
                                                       double delta,
                                                       int quantiles)
        Computes the number of buffers and number of values per buffer such that quantiles can be determined with an approximation error no more than epsilon with a certain probability.
        Parameters:
        epsilon - the approximation error which is guaranteed not to be exceeded (e.g. 0.001) (0 <= epsilon <= 1). To get exact results, set epsilon=0.0;
        delta - the probability that the approximation error is more than than epsilon (e.g. 0.0001) (0 <= delta <= 1). To get exact results, set delta=0.0.
        quantiles - the number of quantiles to be computed (e.g. 100) (quantiles >= 1). If unknown in advance, set this number large, e.g. quantiles >= 10000.
        Returns:
        long[3] - long[0]=the number of buffers, long[1]=the number of elements per buffer, long[2]=the tree height where sampling shall start.