Package org.uncommons.maths.statistics
Class DataSet
- java.lang.Object
-
- org.uncommons.maths.statistics.DataSet
-
public class DataSet extends java.lang.Object
Utility class for calculating statistics for a finite data set.
- See Also:
- How To Analyze Data Using the Average
-
-
Field Summary
Fields
Modifier and Type          Field
private double[]           dataSet
private int                dataSetSize
private static int         DEFAULT_CAPACITY
private static double      GROWTH_RATE
private double             maximum
private double             minimum
private double             product
private double             reciprocalSum
private double             total
-
Constructor Summary
Constructors
Constructor, Description
DataSet()
    Creates an empty data set with a default initial capacity.
DataSet(double[] dataSet)
    Creates a data set and populates it with the specified values.
DataSet(int capacity)
    Creates an empty data set with the specified initial capacity.
-
Method Summary
Modifier and Type, Method, Description
void addValue(double value)
    Adds a single value to the data set and updates any statistics that are calculated cumulatively.
private void assertNotEmpty()
double getAggregate()
double getArithmeticMean()
    The arithmetic mean of an n-element set is the sum of all the elements divided by n.
double getGeometricMean()
    The geometric mean of an n-element set is the nth root of the product of all the elements.
double getHarmonicMean()
    The harmonic mean of an n-element set is n divided by the sum of the reciprocals of the values (where the reciprocal of a value x is 1/x).
double getMaximum()
double getMeanDeviation()
    Calculates the mean absolute deviation of the data set.
double getMedian()
    Determines the median value of the data set.
double getMinimum()
double getProduct()
double getSampleStandardDeviation()
    The sample standard deviation is the square root of the sample variance.
double getSampleVariance()
    Calculates the variance (a measure of statistical dispersion) of the data set.
int getSize()
    Returns the number of values in this data set.
double getStandardDeviation()
    The standard deviation is the square root of the variance.
double getVariance()
    Calculates the variance (a measure of statistical dispersion) of the data set.
private double sumSquaredDiffs()
    Helper method for variance calculations.
private void updateStatsWithNewValue(double value)
-
-
-
Field Detail
-
DEFAULT_CAPACITY
private static final int DEFAULT_CAPACITY
- See Also:
- Constant Field Values
-
GROWTH_RATE
private static final double GROWTH_RATE
- See Also:
- Constant Field Values
-
dataSet
private double[] dataSet
-
dataSetSize
private int dataSetSize
-
total
private double total
-
product
private double product
-
reciprocalSum
private double reciprocalSum
-
minimum
private double minimum
-
maximum
private double maximum
-
-
Constructor Detail
-
DataSet
public DataSet()
Creates an empty data set with a default initial capacity.
-
DataSet
public DataSet(int capacity)
Creates an empty data set with the specified initial capacity.
- Parameters:
capacity
- The initial capacity for the data set (this many values can be added before the internal data storage needs to be resized).
-
DataSet
public DataSet(double[] dataSet)
Creates a data set and populates it with the specified values.
- Parameters:
dataSet
- The values to add to this data set.
-
-
Method Detail
-
addValue
public void addValue(double value)
Adds a single value to the data set and updates any statistics that are calculated cumulatively.
- Parameters:
value
- The value to add.
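The cumulative update described for addValue can be sketched as follows. This is a minimal stand-in written for illustration, not the library's source: the class name RunningStatsSketch is hypothetical, and the values chosen for DEFAULT_CAPACITY and GROWTH_RATE are assumptions, since this page lists the constants without their values.

```java
/** Hypothetical stand-in for illustration; not the DataSet source code. */
final class RunningStatsSketch {
    // Assumed values: the page names these constants but does not state them.
    private static final int DEFAULT_CAPACITY = 16;
    private static final double GROWTH_RATE = 1.5;

    private double[] dataSet = new double[DEFAULT_CAPACITY];
    private int dataSetSize = 0;
    private double total = 0;
    private double product = 1;
    private double reciprocalSum = 0;
    private double minimum = Double.POSITIVE_INFINITY;
    private double maximum = Double.NEGATIVE_INFINITY;

    void addValue(double value) {
        if (dataSetSize == dataSet.length) {
            // The backing array is full, so grow it before storing the value.
            int newLength = (int) Math.ceil(dataSet.length * GROWTH_RATE);
            dataSet = java.util.Arrays.copyOf(dataSet, newLength);
        }
        dataSet[dataSetSize++] = value;
        // Statistics that can be maintained incrementally are updated here,
        // so the corresponding getters do not need to rescan the array.
        total += value;
        product *= value;
        reciprocalSum += 1 / value;
        minimum = Math.min(minimum, value);
        maximum = Math.max(maximum, value);
    }

    int getSize() { return dataSetSize; }
    double getAggregate() { return total; }
    double getProduct() { return product; }
    double getMinimum() { return minimum; }
    double getMaximum() { return maximum; }

    public static void main(String[] args) {
        RunningStatsSketch stats = new RunningStatsSketch();
        for (int i = 1; i <= 20; i++) {
            stats.addValue(i); // the 17th value forces a resize in this sketch
        }
        System.out.println(stats.getSize() + " values, sum " + stats.getAggregate());
    }
}
```

Keeping the sum, product and reciprocal sum up to date on every insertion means the arithmetic, geometric and harmonic means can each be derived from a single stored field plus the size.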
-
updateStatsWithNewValue
private void updateStatsWithNewValue(double value)
-
assertNotEmpty
private void assertNotEmpty()
-
getSize
public final int getSize()
Returns the number of values in this data set.
- Returns:
- The size of the data set.
-
getMinimum
public final double getMinimum()
- Returns:
- The smallest value in the data set.
- Throws:
EmptyDataSetException
- If the data set is empty.
- Since:
- 1.0.1
-
getMaximum
public final double getMaximum()
- Returns:
- The largest value in the data set.
- Throws:
EmptyDataSetException
- If the data set is empty.
- Since:
- 1.0.1
-
getMedian
public final double getMedian()
Determines the median value of the data set.
- Returns:
- If the number of elements is odd, returns the middle element. If the number of elements is even, returns the midpoint of the two middle elements.
- Since:
- 1.0.1
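The odd/even rule in the return description can be illustrated with a small standalone method. This is a sketch under assumptions, not the DataSet implementation: the class MedianSketch is hypothetical, it sorts a copy of the input, and it rejects empty input with IllegalArgumentException because this page does not document what getMedian does for an empty data set.

```java
/** Hypothetical helper for illustration; not the DataSet source code. */
final class MedianSketch {
    static double median(double[] values) {
        if (values.length == 0) {
            // Assumption: empty input is treated as an error in this sketch.
            throw new IllegalArgumentException("values must not be empty");
        }
        double[] sorted = values.clone(); // leave the caller's array untouched
        java.util.Arrays.sort(sorted);
        int mid = sorted.length / 2;
        if (sorted.length % 2 != 0) {
            return sorted[mid]; // odd count: the single middle element
        }
        // Even count: midpoint of the two middle elements.
        return sorted[mid - 1] + (sorted[mid] - sorted[mid - 1]) / 2;
    }

    public static void main(String[] args) {
        System.out.println(median(new double[] {5, 1, 3}));    // prints 3.0
        System.out.println(median(new double[] {4, 1, 3, 2})); // prints 2.5
    }
}
```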
-
getAggregate
public final double getAggregate()
- Returns:
- The sum of all values.
- Throws:
EmptyDataSetException
- If the data set is empty.
-
getProduct
public final double getProduct()
- Returns:
- The product of all values.
- Throws:
EmptyDataSetException
- If the data set is empty.
-
getArithmeticMean
public final double getArithmeticMean()
The arithmetic mean of an n-element set is the sum of all the elements divided by n. The arithmetic mean is often referred to simply as the "mean" or "average" of a data set.
- Returns:
- The arithmetic mean of all elements in the data set.
- Throws:
EmptyDataSetException
- If the data set is empty.
- See Also:
getGeometricMean()
-
getGeometricMean
public final double getGeometricMean()
The geometric mean of an n-element set is the nth root of the product of all the elements. The geometric mean is used for finding the average factor (e.g. an average interest rate).
- Returns:
- The geometric mean of all elements in the data set.
- Throws:
EmptyDataSetException
- If the data set is empty.
- See Also:
getArithmeticMean(), getHarmonicMean()
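The definition above translates directly into code. The following is a minimal sketch, assuming plain double arithmetic; GeometricMeanSketch is a hypothetical class, not part of the library, and the IllegalArgumentException for empty input is this sketch's own choice.

```java
/** Hypothetical helper for illustration; not the DataSet source code. */
final class GeometricMeanSketch {
    static double geometricMean(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("values must not be empty");
        }
        double product = 1;
        for (double v : values) {
            product *= v;
        }
        // The nth root is taken as the product raised to the power 1/n.
        return Math.pow(product, 1.0 / values.length);
    }

    public static void main(String[] args) {
        // Growth factors for two periods: +5% then +20%.
        double average = geometricMean(new double[] {1.05, 1.20});
        // Applying the average factor twice gives the same overall growth.
        System.out.println(average * average + " vs " + 1.05 * 1.20);
    }
}
```

For growth factors the geometric mean is the single factor that, applied once per period, reproduces the overall growth; the arithmetic mean of the same factors would overstate it. Note that a running product can overflow or underflow for long inputs, which this sketch does not guard against.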
-
getHarmonicMean
public final double getHarmonicMean()
The harmonic mean of an n-element set is n divided by the sum of the reciprocals of the values (where the reciprocal of a value x is 1/x). The harmonic mean is used to calculate an average rate (e.g. an average speed).
- Returns:
- The harmonic mean of all the elements in the data set.
- Throws:
EmptyDataSetException
- If the data set is empty.
- Since:
- 1.1
- See Also:
getArithmeticMean(), getGeometricMean()
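As a worked illustration of the average-speed case, the sketch below computes n divided by the sum of reciprocals. HarmonicMeanSketch is a hypothetical class written for this example, not library code, and the speeds used are made-up sample inputs.

```java
/** Hypothetical helper for illustration; not the DataSet source code. */
final class HarmonicMeanSketch {
    static double harmonicMean(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("values must not be empty");
        }
        double reciprocalSum = 0;
        for (double v : values) {
            reciprocalSum += 1 / v;
        }
        return values.length / reciprocalSum;
    }

    public static void main(String[] args) {
        // The same distance driven once at 60 km/h and once at 40 km/h.
        double[] speeds = {60, 40};
        System.out.println("Average speed: " + harmonicMean(speeds));
    }
}
```

For these inputs the result is approximately 48 km/h rather than the arithmetic mean of 50 km/h, because more time is spent travelling at the slower speed.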
-
getMeanDeviation
public final double getMeanDeviation()
Calculates the mean absolute deviation of the data set. This is the average (absolute) amount that a single value deviates from the arithmetic mean.
- Returns:
- The mean absolute deviation of the data set.
- Throws:
EmptyDataSetException
- If the data set is empty.
- See Also:
getArithmeticMean(), getVariance(), getStandardDeviation()
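The calculation takes two passes: one to find the arithmetic mean and one to average the absolute differences from it. The following sketch assumes that reading of the description; MeanDeviationSketch is a hypothetical class, not the library's code.

```java
/** Hypothetical helper for illustration; not the DataSet source code. */
final class MeanDeviationSketch {
    static double meanDeviation(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("values must not be empty");
        }
        // First pass: arithmetic mean.
        double total = 0;
        for (double v : values) {
            total += v;
        }
        double mean = total / values.length;
        // Second pass: average absolute distance from the mean.
        double absoluteDiffs = 0;
        for (double v : values) {
            absoluteDiffs += Math.abs(v - mean);
        }
        return absoluteDiffs / values.length;
    }

    public static void main(String[] args) {
        double[] data = {2, 4, 4, 4, 5, 5, 7, 9}; // mean is 5
        System.out.println(meanDeviation(data));  // prints 1.5
    }
}
```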
-
getVariance
public final double getVariance()
Calculates the variance (a measure of statistical dispersion) of the data set. There are different measures of variance depending on whether the data set is itself a finite population or is a sample from some larger population. For large data sets the difference is negligible. This method calculates the population variance.
- Returns:
- The population variance of the data set.
- Throws:
EmptyDataSetException
- If the data set is empty.
- See Also:
getSampleVariance(), getStandardDeviation(), getMeanDeviation()
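Assuming the usual definition of population variance (the sum of squared differences from the arithmetic mean, divided by n), the calculation can be sketched as below. PopulationVarianceSketch is a hypothetical class; its sumSquaredDiffs method mirrors the name of the private helper listed on this page but is not its actual code.

```java
/** Hypothetical helper for illustration; not the DataSet source code. */
final class PopulationVarianceSketch {
    /** Sum of the squares of the differences between each value and the mean. */
    static double sumSquaredDiffs(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("values must not be empty");
        }
        double total = 0;
        for (double v : values) {
            total += v;
        }
        double mean = total / values.length;
        double squaredDiffs = 0;
        for (double v : values) {
            double diff = v - mean;
            squaredDiffs += diff * diff;
        }
        return squaredDiffs;
    }

    /** Population variance: divide by n. */
    static double variance(double[] values) {
        return sumSquaredDiffs(values) / values.length;
    }

    /** Population standard deviation: square root of the variance. */
    static double standardDeviation(double[] values) {
        return Math.sqrt(variance(values));
    }

    public static void main(String[] args) {
        double[] data = {2, 4, 4, 4, 5, 5, 7, 9};    // mean is 5
        System.out.println(variance(data));          // prints 4.0
        System.out.println(standardDeviation(data)); // prints 2.0
    }
}
```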
-
sumSquaredDiffs
private double sumSquaredDiffs()
Helper method for variance calculations.
- Returns:
- The sum of the squares of the differences between each value and the arithmetic mean.
- Throws:
EmptyDataSetException
- If the data set is empty.
-
getStandardDeviation
public final double getStandardDeviation()
The standard deviation is the square root of the variance. This method calculates the population standard deviation as opposed to the sample standard deviation. For large data sets the difference is negligible.
- Returns:
- The standard deviation of the population.
- Throws:
EmptyDataSetException
- If the data set is empty.
- See Also:
getSampleStandardDeviation(), getVariance(), getMeanDeviation()
-
getSampleVariance
public final double getSampleVariance()
Calculates the variance (a measure of statistical dispersion) of the data set. There are different measures of variance depending on whether the data set is itself a finite population or is a sample from some larger population. For large data sets the difference is negligible. This method calculates the sample variance.
- Returns:
- The sample variance of the data set.
- Throws:
EmptyDataSetException
- If the data set is empty.
- See Also:
getVariance(), getSampleStandardDeviation(), getMeanDeviation()
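Under the conventional definition, the sample variance divides the sum of squared differences by n - 1 instead of n. The sketch below assumes that definition; SampleVarianceSketch is a hypothetical class, and its requirement of at least two values is this sketch's own choice, since this page only documents the empty case.

```java
/** Hypothetical helper for illustration; not the DataSet source code. */
final class SampleVarianceSketch {
    /** Sample variance: sum of squared differences divided by (n - 1). */
    static double sampleVariance(double[] values) {
        if (values.length < 2) {
            // Assumption: with fewer than two values the divisor would be
            // zero or negative, so this sketch rejects such input.
            throw new IllegalArgumentException("at least two values are required");
        }
        double total = 0;
        for (double v : values) {
            total += v;
        }
        double mean = total / values.length;
        double squaredDiffs = 0;
        for (double v : values) {
            double diff = v - mean;
            squaredDiffs += diff * diff;
        }
        return squaredDiffs / (values.length - 1);
    }

    /** Sample standard deviation: square root of the sample variance. */
    static double sampleStandardDeviation(double[] values) {
        return Math.sqrt(sampleVariance(values));
    }

    public static void main(String[] args) {
        double[] data = {2, 4, 4, 4, 5, 5, 7, 9};
        // The sum of squared differences is 32, so this is 32 / 7.
        System.out.println(sampleVariance(data));
    }
}
```

With these definitions the population variance equals the sample variance multiplied by (n - 1) / n, which is why the two converge as n grows.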
-
getSampleStandardDeviation
public final double getSampleStandardDeviation()
The sample standard deviation is the square root of the sample variance. For large data sets the difference between sample standard deviation and population standard deviation is negligible.
- Returns:
- The sample standard deviation of the data set.
- Throws:
EmptyDataSetException
- If the data set is empty.
- See Also:
getStandardDeviation(), getSampleVariance(), getMeanDeviation()
-
-