Class KMeansPlusPlusClusterer<T extends Clusterable<T>>

  • Type Parameters:
    T - type of the points to cluster

    @Deprecated
    public class KMeansPlusPlusClusterer<T extends Clusterable<T>>
    extends java.lang.Object
    Deprecated.
    As of 3.2 (to be removed in 4.0), use the KMeansPlusPlusClusterer in the org.apache.commons.math3.ml.clustering package instead.
    Clustering algorithm based on the k-means++ algorithm of David Arthur and Sergei Vassilvitskii.
    Since:
    2.0
    See Also:
    K-means++ (wikipedia)
    • Field Detail

      • random

        private final java.util.Random random
        Deprecated.
        Random generator for choosing initial centers.
    • Constructor Detail

      • KMeansPlusPlusClusterer

        public KMeansPlusPlusClusterer(java.util.Random random)
        Deprecated.
        Build a clusterer.

        The default strategy for handling empty clusters that may appear during algorithm iterations is to split the cluster with the largest distance variance.

        Parameters:
        random - random generator to use for choosing initial centers
      • KMeansPlusPlusClusterer

        public KMeansPlusPlusClusterer(java.util.Random random,
                                       KMeansPlusPlusClusterer.EmptyClusterStrategy emptyStrategy)
        Deprecated.
        Build a clusterer.
        Parameters:
        random - random generator to use for choosing initial centers
        emptyStrategy - strategy to use for handling empty clusters that may appear during algorithm iterations
        Since:
        2.2
    • Method Detail

      • cluster

        public java.util.List<Cluster<T>> cluster(java.util.Collection<T> points,
                                                  int k,
                                                  int numTrials,
                                                  int maxIterationsPerTrial)
                                           throws MathIllegalArgumentException,
                                                  ConvergenceException
        Deprecated.
        Runs the K-means++ clustering algorithm.
        Parameters:
        points - the points to cluster
        k - the number of clusters to split the data into
        numTrials - number of trial runs
        maxIterationsPerTrial - the maximum number of iterations to run the algorithm for in each trial run. If negative, no maximum is used.
        Returns:
        a list of clusters containing the points
        Throws:
        MathIllegalArgumentException - if the data points are null or the number of clusters is larger than the number of data points
        ConvergenceException - if an empty cluster is encountered and the emptyStrategy is set to ERROR
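
The page does not state how the trial runs are compared. A minimal sketch, assuming each trial yields a result that can be scored and that the lowest score wins (for example, a summed within-cluster distance variance); `bestOf`, `trial`, and `score` are hypothetical names, not part of this class:

```java
import java.util.function.Supplier;
import java.util.function.ToDoubleFunction;

public class BestOfTrialsSketch {

    /**
     * Runs the trial numTrials times and keeps the result with the lowest score.
     * Returns null when numTrials is zero or negative.
     */
    static <R> R bestOf(int numTrials, Supplier<R> trial, ToDoubleFunction<R> score) {
        R best = null;
        double bestScore = Double.POSITIVE_INFINITY;
        for (int i = 0; i < numTrials; i++) {
            R candidate = trial.get();
            double s = score.applyAsDouble(candidate);
            // Assumption: lower is better; ties go to the later trial.
            if (s <= bestScore) {
                best = candidate;
                bestScore = s;
            }
        }
        return best;
    }
}
```

Because k-means++ seeding is randomized, repeated trials can end in different local optima, which is why keeping the best of several runs can help.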
      • cluster

        public java.util.List<Cluster<T>> cluster(java.util.Collection<T> points,
                                                  int k,
                                                  int maxIterations)
                                           throws MathIllegalArgumentException,
                                                  ConvergenceException
        Deprecated.
        Runs the K-means++ clustering algorithm.
        Parameters:
        points - the points to cluster
        k - the number of clusters to split the data into
        maxIterations - the maximum number of iterations to run the algorithm for. If negative, no maximum is used.
        Returns:
        a list of clusters containing the points
        Throws:
        MathIllegalArgumentException - if the data points are null or the number of clusters is larger than the number of data points
        ConvergenceException - if an empty cluster is encountered and the emptyStrategy is set to ERROR
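
The iteration cap described above can be sketched as follows. This is an illustration on plain double[] points, not this class's implementation: `run` and `assign` are hypothetical helpers, the initial centers are passed in by the caller, and an empty cluster simply keeps its previous center instead of applying an EmptyClusterStrategy.

```java
import java.util.Arrays;

public class LloydLoopSketch {

    /** Squared Euclidean distance between two points of equal dimension. */
    static double distSq(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Assigns each point to its nearest center; returns how many assignments changed. */
    static int assign(double[][] points, double[][] centers, int[] assignments) {
        int changed = 0;
        for (int i = 0; i < points.length; i++) {
            int best = 0;
            for (int c = 1; c < centers.length; c++) {
                if (distSq(points[i], centers[c]) < distSq(points[i], centers[best])) {
                    best = c;
                }
            }
            if (best != assignments[i]) {
                assignments[i] = best;
                changed++;
            }
        }
        return changed;
    }

    /** Runs the loop, updating centers in place; returns the final assignments. */
    static int[] run(double[][] points, double[][] centers, int maxIterations) {
        // A negative cap means "no maximum".
        int max = (maxIterations < 0) ? Integer.MAX_VALUE : maxIterations;
        int dim = points[0].length;
        int[] assignments = new int[points.length];
        Arrays.fill(assignments, -1);
        assign(points, centers, assignments);

        for (int iter = 0; iter < max; iter++) {
            // Update step: move each non-empty cluster's center to the mean of its points.
            double[][] sums = new double[centers.length][dim];
            int[] counts = new int[centers.length];
            for (int i = 0; i < points.length; i++) {
                counts[assignments[i]]++;
                for (int d = 0; d < dim; d++) {
                    sums[assignments[i]][d] += points[i][d];
                }
            }
            for (int c = 0; c < centers.length; c++) {
                if (counts[c] > 0) {
                    for (int d = 0; d < dim; d++) {
                        centers[c][d] = sums[c][d] / counts[c];
                    }
                }
            }
            // Stop as soon as no point changes cluster.
            if (assign(points, centers, assignments) == 0) {
                break;
            }
        }
        return assignments;
    }
}
```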
      • assignPointsToClusters

        private static <T extends Clusterable<T>> int assignPointsToClusters(java.util.List<Cluster<T>> clusters,
                                                                             java.util.Collection<T> points,
                                                                             int[] assignments)
        Deprecated.
        Adds the given points to the closest Cluster.
        Type Parameters:
        T - type of the points to cluster
        Parameters:
        clusters - the Clusters to add the points to
        points - the points to add to the given Clusters
        assignments - point assignments to clusters
        Returns:
        the number of points assigned to a different cluster than in the previous iteration
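
The return value described above can be illustrated with a small sketch. It assumes points and centers are plain double[] arrays and only tracks the assignments array; unlike the method above, it does not add points to Cluster objects. `AssignmentStepSketch` and its methods are hypothetical names.

```java
public class AssignmentStepSketch {

    /** Index of the center closest to p (squared Euclidean distance, lowest index wins ties). */
    static int nearest(double[][] centers, double[] p) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int c = 0; c < centers.length; c++) {
            double sum = 0.0;
            for (int d = 0; d < p.length; d++) {
                double diff = p[d] - centers[c][d];
                sum += diff * diff;
            }
            if (sum < bestDist) {
                bestDist = sum;
                best = c;
            }
        }
        return best;
    }

    /** Updates assignments in place and returns how many points changed cluster. */
    static int assign(double[][] centers, double[][] points, int[] assignments) {
        int changed = 0;
        for (int i = 0; i < points.length; i++) {
            int c = nearest(centers, points[i]);
            if (c != assignments[i]) {
                changed++;
            }
            assignments[i] = c;
        }
        return changed;
    }
}
```

A return value of zero means the assignments are stable, which is a natural stopping condition for the iteration.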
      • chooseInitialCenters

        private static <T extends Clusterable<T>> java.util.List<Cluster<T>> chooseInitialCenters(java.util.Collection<T> points,
                                                                                                  int k,
                                                                                                  java.util.Random random)
        Deprecated.
        Use K-means++ to choose the initial centers.
        Type Parameters:
        T - type of the points to cluster
        Parameters:
        points - the points to choose the initial centers from
        k - the number of centers to choose
        random - random generator to use
        Returns:
        the initial centers
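
As a rough illustration of k-means++ seeding: the first center is drawn uniformly at random, and each further center is drawn with probability proportional to its squared distance from the nearest center chosen so far. The sketch below works on plain double[] points and is not this class's code; `SeedingSketch`, `distSq`, and the `taken` bookkeeping are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class SeedingSketch {

    /** Squared Euclidean distance between two points of equal dimension. */
    static double distSq(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    static List<double[]> chooseInitialCenters(double[][] points, int k, Random random) {
        int n = points.length;
        boolean[] taken = new boolean[n];
        List<double[]> centers = new ArrayList<>();

        // First center: uniform random choice.
        int first = random.nextInt(n);
        centers.add(points[first]);
        taken[first] = true;

        // Squared distance from each point to its nearest chosen center.
        double[] minDistSq = new double[n];
        for (int i = 0; i < n; i++) {
            minDistSq[i] = distSq(points[i], points[first]);
        }

        while (centers.size() < k) {
            double total = 0.0;
            for (int i = 0; i < n; i++) {
                if (!taken[i]) {
                    total += minDistSq[i];
                }
            }
            // Draw a threshold in [0, total) and walk the cumulative sum to it.
            double r = random.nextDouble() * total;
            int next = -1;
            double cumulative = 0.0;
            for (int i = 0; i < n; i++) {
                if (taken[i]) {
                    continue;
                }
                cumulative += minDistSq[i];
                if (cumulative >= r) {
                    next = i;
                    break;
                }
            }
            if (next == -1) {
                // Rounding fallback: take the last point not yet used, if any.
                for (int i = n - 1; i >= 0; i--) {
                    if (!taken[i]) {
                        next = i;
                        break;
                    }
                }
            }
            if (next == -1) {
                break; // fewer points than requested centers
            }
            centers.add(points[next]);
            taken[next] = true;
            for (int i = 0; i < n; i++) {
                if (!taken[i]) {
                    minDistSq[i] = Math.min(minDistSq[i], distSq(points[i], points[next]));
                }
            }
        }
        return centers;
    }
}
```

Passing a seeded `java.util.Random` makes the choice of centers reproducible between runs.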
      • getPointFromLargestVarianceCluster

        private T getPointFromLargestVarianceCluster(java.util.Collection<Cluster<T>> clusters)
                                              throws ConvergenceException
        Deprecated.
        Get a random point from the Cluster with the largest distance variance.
        Parameters:
        clusters - the Clusters to search
        Returns:
        a random point from the selected cluster
        Throws:
        ConvergenceException - if clusters are all empty
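
One way to read "largest distance variance" is the variance of the distances between a cluster's points and its center. The sketch below follows that reading on plain double[] points, which is an assumption. It also assumes the sample (n - 1) variance, removes the chosen point from its cluster, and throws IllegalStateException as a stand-in for ConvergenceException; all names are hypothetical.

```java
import java.util.List;
import java.util.Random;

public class LargestVarianceSketch {

    /** Euclidean distance between two points of equal dimension. */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Sample variance of the member-to-center distances; 0 for fewer than two members. */
    static double distanceVariance(double[] center, List<double[]> members) {
        int n = members.size();
        if (n < 2) {
            return 0.0;
        }
        double mean = 0.0;
        for (double[] p : members) {
            mean += distance(p, center);
        }
        mean /= n;
        double sumSq = 0.0;
        for (double[] p : members) {
            double d = distance(p, center) - mean;
            sumSq += d * d;
        }
        return sumSq / (n - 1);
    }

    /** Removes and returns a random point from the non-empty cluster with the largest variance. */
    static double[] takePoint(List<double[]> centers, List<List<double[]>> members, Random random) {
        double maxVariance = Double.NEGATIVE_INFINITY;
        List<double[]> selected = null;
        for (int i = 0; i < centers.size(); i++) {
            List<double[]> m = members.get(i);
            if (m.isEmpty()) {
                continue;
            }
            double v = distanceVariance(centers.get(i), m);
            if (v > maxVariance) {
                maxVariance = v;
                selected = m;
            }
        }
        if (selected == null) {
            throw new IllegalStateException("all clusters are empty");
        }
        return selected.remove(random.nextInt(selected.size()));
    }
}
```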
      • getPointFromLargestNumberCluster

        private T getPointFromLargestNumberCluster(java.util.Collection<Cluster<T>> clusters)
                                            throws ConvergenceException
        Deprecated.
        Get a random point from the Cluster with the largest number of points.
        Parameters:
        clusters - the Clusters to search
        Returns:
        a random point from the selected cluster
        Throws:
        ConvergenceException - if clusters are all empty
      • getFarthestPoint

        private T getFarthestPoint(java.util.Collection<Cluster<T>> clusters)
                            throws ConvergenceException
        Deprecated.
        Get the point farthest from its cluster center.
        Parameters:
        clusters - the Clusters to search
        Returns:
        the point farthest from its cluster center
        Throws:
        ConvergenceException - if clusters are all empty
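
A minimal sketch of a farthest-point search, assuming plain double[] points, Euclidean distance, and that the selected point is removed from its cluster so it can seed a new one. IllegalStateException stands in for ConvergenceException, and `takeFarthest` is a hypothetical name.

```java
import java.util.List;

public class FarthestPointSketch {

    /** Euclidean distance between two points of equal dimension. */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Removes and returns the point that lies farthest from its own cluster center. */
    static double[] takeFarthest(List<double[]> centers, List<List<double[]>> members) {
        double maxDistance = Double.NEGATIVE_INFINITY;
        List<double[]> selectedCluster = null;
        int selectedIndex = -1;
        for (int c = 0; c < centers.size(); c++) {
            List<double[]> m = members.get(c);
            for (int i = 0; i < m.size(); i++) {
                double d = distance(m.get(i), centers.get(c));
                if (d > maxDistance) {
                    maxDistance = d;
                    selectedCluster = m;
                    selectedIndex = i;
                }
            }
        }
        if (selectedCluster == null) {
            throw new IllegalStateException("all clusters are empty");
        }
        return selectedCluster.remove(selectedIndex);
    }
}
```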
      • getNearestCluster

        private static <T extends Clusterable<T>> int getNearestCluster(java.util.Collection<Cluster<T>> clusters,
                                                                        T point)
        Deprecated.
        Returns the nearest Cluster to the given point.
        Type Parameters:
        T - type of the points to cluster
        Parameters:
        clusters - the Clusters to search
        point - the point to find the nearest Cluster for
        Returns:
        the index of the nearest Cluster to the given point
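
The nearest-cluster lookup amounts to a linear scan over the centers. A sketch under the assumption that centers and points are plain double[] arrays compared by Euclidean distance; `nearestCenter` is a hypothetical name and ties go to the lowest index:

```java
import java.util.List;

public class NearestCenterSketch {

    /** Euclidean distance between two points of equal dimension. */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Returns the index of the center closest to the given point. */
    static int nearestCenter(List<double[]> centers, double[] point) {
        double minDistance = Double.MAX_VALUE;
        int minIndex = 0;
        for (int i = 0; i < centers.size(); i++) {
            double d = distance(point, centers.get(i));
            if (d < minDistance) {
                minDistance = d;
                minIndex = i;
            }
        }
        return minIndex;
    }
}
```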