Class GreedyClustering<T>

  • All Implemented Interfaces:
    ClusteringAlgorithm<T>

    public final class GreedyClustering<T>
    extends java.lang.Object
    implements ClusteringAlgorithm<T>
    Greedy clustering algorithm. Assigns each item to the nearest centroid, creating new centroids as needed. Makes a single pass over the data. The centroids are recalculated repeatedly as the clusters grow during that pass, though not after every single assignment.
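The single-pass behaviour described above can be sketched roughly as follows. This is an illustrative stand-in, not the source of GreedyClustering: the class name GreedySketch is hypothetical, and it makes two simplifying assumptions that the documentation does not confirm. It recomputes a centroid after every assignment (the documented class does so less often), and it treats a distance equal to the threshold as close enough to join a cluster.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;
import java.util.function.ToDoubleBiFunction;

// Hypothetical sketch of a greedy, single-pass clustering; NOT the GreedyClustering source.
final class GreedySketch<T> {
    private final List<T> centroids = new ArrayList<>();
    private final Function<Collection<T>, T> centroidUpdater;
    private final ToDoubleBiFunction<T, T> distanceCalculator;
    private final double distanceThreshold;

    GreedySketch(Function<Collection<T>, T> centroidUpdater,
                 ToDoubleBiFunction<T, T> distanceCalculator,
                 double distanceThreshold) {
        this.centroidUpdater = centroidUpdater;
        this.distanceCalculator = distanceCalculator;
        this.distanceThreshold = distanceThreshold;
    }

    List<Set<T>> cluster(Collection<T> input) {
        centroids.clear();
        List<Set<T>> clusters = new ArrayList<>();
        for (T item : input) {
            // Find the nearest existing centroid, if any.
            int nearest = -1;
            double best = Double.POSITIVE_INFINITY;
            for (int i = 0; i < centroids.size(); i++) {
                double d = distanceCalculator.applyAsDouble(item, centroids.get(i));
                if (d < best) {
                    best = d;
                    nearest = i;
                }
            }
            if (nearest >= 0 && best <= distanceThreshold) {
                // Close enough: join the cluster and refresh its centroid.
                // Assumption: refresh on every assignment, for simplicity.
                clusters.get(nearest).add(item);
                centroids.set(nearest, centroidUpdater.apply(clusters.get(nearest)));
            } else {
                // Too far from every centroid: the item seeds a new cluster.
                Set<T> fresh = new LinkedHashSet<>();
                fresh.add(item);
                clusters.add(fresh);
                centroids.add(item);
            }
        }
        return clusters;
    }

    List<T> getCentroids() {
        return new ArrayList<>(centroids);
    }
}

final class GreedySketchDemo {
    public static void main(String[] args) {
        Function<Collection<Double>, Double> mean =
            pts -> pts.stream().mapToDouble(Double::doubleValue).average().orElse(0.0);
        GreedySketch<Double> sketch =
            new GreedySketch<>(mean, (a, b) -> Math.abs(a - b), 1.0);
        // prints [[1.0, 1.2, 0.9], [5.0, 5.3]]
        System.out.println(sketch.cluster(List.of(1.0, 1.2, 5.0, 5.3, 0.9)));
    }
}
```

Because the pass is greedy, the result depends on the iteration order of the input: an item seen early can seed a cluster that a different ordering would never create.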
    • Constructor Summary

      Constructors 
      Constructor Description
      GreedyClustering(java.util.function.Function<java.util.Collection<T>,T> centroidUpdater, java.util.function.ToDoubleBiFunction<T,T> distanceCalculator, double distanceThreshold)
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      java.util.List<java.util.Set<T>> cluster(java.util.Collection<T> input)
      (package private) java.util.List<T> getCentroids()  
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • myCentroids

        private final java.util.List<T> myCentroids
      • myUpdates

        private final java.util.List<java.util.concurrent.atomic.AtomicInteger> myUpdates
      • myCentroidUpdater

        private final java.util.function.Function<java.util.Collection<T>,T> myCentroidUpdater
      • myDistanceCalculator

        private final java.util.function.ToDoubleBiFunction<T,T> myDistanceCalculator
      • myDistanceThreshold

        private final double myDistanceThreshold
    • Constructor Detail

      • GreedyClustering

        public GreedyClustering(java.util.function.Function<java.util.Collection<T>,T> centroidUpdater,
                                java.util.function.ToDoubleBiFunction<T,T> distanceCalculator,
                                double distanceThreshold)
        Parameters:
        centroidUpdater - A function that returns a new centroid for a collection of points (the set of items in a cluster).
        distanceCalculator - A function that calculates the distance between two points.
        distanceThreshold - The maximum distance between a point and a centroid for the point to be assigned to that cluster. Each point is compared against the nearest centroid among the already existing clusters: if that centroid is within the threshold, the point joins its cluster; otherwise a new cluster is created.
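As an illustration of the first two constructor arguments, the sketch below defines a centroid updater and a distance calculator for 2-D points stored as double[] {x, y}. The PointFunctions class, its field names, and the threshold value in the comment are hypothetical and chosen only for this example. It also assumes the updater is never called with an empty collection.

```java
import java.util.Collection;
import java.util.List;
import java.util.function.Function;
import java.util.function.ToDoubleBiFunction;

// Hypothetical helper functions for 2-D points represented as double[] {x, y}.
final class PointFunctions {
    // Centroid updater: the coordinate-wise mean of the points in a cluster.
    // Assumes a non-empty collection; an empty one would yield NaN coordinates.
    static final Function<Collection<double[]>, double[]> MEAN = points -> {
        double sumX = 0.0;
        double sumY = 0.0;
        for (double[] p : points) {
            sumX += p[0];
            sumY += p[1];
        }
        int n = points.size();
        return new double[] { sumX / n, sumY / n };
    };

    // Distance calculator: Euclidean distance between two points.
    static final ToDoubleBiFunction<double[], double[]> EUCLIDEAN =
        (a, b) -> Math.hypot(a[0] - b[0], a[1] - b[1]);

    public static void main(String[] args) {
        double[] centroid = MEAN.apply(List.of(new double[] {0, 0}, new double[] {2, 4}));
        System.out.println(centroid[0] + ", " + centroid[1]);
        System.out.println(EUCLIDEAN.applyAsDouble(new double[] {0, 0}, new double[] {3, 4}));

        // With GreedyClustering on the classpath, these would be passed as:
        //   new GreedyClustering<>(PointFunctions.MEAN, PointFunctions.EUCLIDEAN, 2.5);
        // where 2.5 is an arbitrary example threshold.
    }
}
```

Note that arrays use identity-based equals and hashCode, so two double[] instances with the same coordinates count as distinct members of the Set<T> returned by cluster.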
    • Method Detail

      • cluster

        public java.util.List<java.util.Set<T>> cluster(java.util.Collection<T> input)
        Specified by:
        cluster in interface ClusteringAlgorithm<T>
      • getCentroids

        java.util.List<T> getCentroids()