Class KMeansClusterer<T>

java.lang.Object
edu.uci.ics.jung.algorithms.util.KMeansClusterer<T>

public class KMeansClusterer<T> extends Object
Groups items into a specified number of clusters, based on their proximity in d-dimensional space, using the k-means algorithm. Calls to cluster will terminate when either of the two following conditions is true:
  • the number of iterations exceeds max_iterations
  • none of the centroids has moved as much as convergence_threshold since the previous iteration
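The two stopping conditions above can be sketched as a loop guard. This is a minimal illustration, not the library's source: the class TerminationSketch, the method iterationsRun, and the movements array (the largest centroid movement observed in each iteration) are hypothetical, and the exact boundary behavior at max_iterations is an assumption.

```java
final class TerminationSketch {
    /**
     * Hypothetical helper: replays a sequence of per-iteration maximum
     * centroid movements and returns how many iterations would run.
     */
    static int iterationsRun(double[] movements, int maxIterations, double convergenceThreshold) {
        int iterations = 0;
        // Start at infinity so the first iteration always runs.
        double maxMovement = Double.POSITIVE_INFINITY;
        // Continue while the iteration budget is not used up (assumed boundary)
        // and at least one centroid moved as much as the threshold.
        while (iterations < maxIterations
                && maxMovement >= convergenceThreshold
                && iterations < movements.length) {
            maxMovement = movements[iterations];
            iterations++;
        }
        return iterations;
    }
}
```

With the default settings described for the no-argument constructor (100 iterations, threshold 0.001), a run whose largest centroid movement drops below 0.001 stops well before the iteration limit.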
  • Nested Class Summary

    Nested Classes
    Modifier and Type
    Class
    Description
    static class 
    KMeansClusterer.NotEnoughClustersException
    An exception that indicates that the specified data points cannot be clustered into the number of clusters requested by the user.
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    protected double
    convergence_threshold
     
    protected int
    max_iterations
     
    protected Random
    rand
     
  • Constructor Summary

    Constructors
    Constructor
    Description
    KMeansClusterer()
    Creates an instance with max iterations of 100 and convergence threshold of 0.001.
    KMeansClusterer(int max_iterations, double convergence_threshold)
    Creates an instance which will terminate when either the maximum number of iterations has been reached, or all changes are smaller than the convergence threshold.
  • Method Summary

    Modifier and Type
    Method
    Description
    protected Map<double[],Map<T,double[]>>
    assignToClusters(Map<T,double[]> object_locations, Set<double[]> centroids)
    Assigns each object to the cluster whose centroid is closest to the object.
    Collection<Map<T,double[]>>
    cluster(Map<T,double[]> object_locations, int num_clusters)
    Returns a Collection of clusters, where each cluster is represented as a Map of Objects to locations in d-dimensional space.
    double
    getConvergenceThreshold()
     
    int
    getMaxIterations()
     
    void
    setConvergenceThreshold(double convergence_threshold)
     
    void
    setMaxIterations(int max_iterations)
     
    void
    setSeed(int random_seed)
    Sets the seed used by the internal random number generator.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • max_iterations

      protected int max_iterations
    • convergence_threshold

      protected double convergence_threshold
    • rand

      protected Random rand
  • Constructor Details

    • KMeansClusterer

      public KMeansClusterer(int max_iterations, double convergence_threshold)
      Creates an instance which will terminate when either the maximum number of iterations has been reached, or all changes are smaller than the convergence threshold.
      Parameters:
      max_iterations - the maximum number of iterations to employ
      convergence_threshold - the centroid movement below which the clustering is considered to have converged
    • KMeansClusterer

      public KMeansClusterer()
      Creates an instance with max iterations of 100 and convergence threshold of 0.001.
  • Method Details

    • getMaxIterations

      public int getMaxIterations()
      Returns:
      the maximum number of iterations
    • setMaxIterations

      public void setMaxIterations(int max_iterations)
      Parameters:
      max_iterations - the maximum number of iterations
    • getConvergenceThreshold

      public double getConvergenceThreshold()
      Returns:
      the convergence threshold
    • setConvergenceThreshold

      public void setConvergenceThreshold(double convergence_threshold)
      Parameters:
      convergence_threshold - the convergence threshold
    • cluster

      public Collection<Map<T,double[]>> cluster(Map<T,double[]> object_locations, int num_clusters)
      Returns a Collection of clusters, where each cluster is represented as a Map of Objects to locations in d-dimensional space.
      Parameters:
      object_locations - a map of the items to cluster, to double arrays that specify their locations in d-dimensional space.
      num_clusters - the number of clusters to create
      Returns:
      a clustering of the input objects in d-dimensional space
      Throws:
      KMeansClusterer.NotEnoughClustersException - if num_clusters is larger than the number of distinct points in object_locations
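A caller can check the documented exception condition before calling cluster. The sketch below counts distinct locations by value, because double[] arrays compare by identity in Java. The class DistinctPointsSketch and its methods are hypothetical, and whether the library itself compares points by value or by identity is an assumption.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class DistinctPointsSketch {
    /** Counts locations that differ in at least one coordinate. */
    static <T> int countDistinctLocations(Map<T, double[]> objectLocations) {
        Set<List<Double>> distinct = new HashSet<>();
        for (double[] location : objectLocations.values()) {
            // Box the coordinates into a List so equality is by value.
            List<Double> key = new ArrayList<>(location.length);
            for (double coordinate : location) {
                key.add(coordinate);
            }
            distinct.add(key);
        }
        return distinct.size();
    }

    /** True if numClusters does not exceed the number of distinct points. */
    static <T> boolean hasEnoughDistinctPoints(Map<T, double[]> objectLocations, int numClusters) {
        return numClusters <= countDistinctLocations(objectLocations);
    }
}
```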
    • assignToClusters

      protected Map<double[],Map<T,double[]>> assignToClusters(Map<T,double[]> object_locations, Set<double[]> centroids)
      Assigns each object to the cluster whose centroid is closest to the object.
      Parameters:
      object_locations - a map of objects to locations
      centroids - the centroids of the clusters to be formed
      Returns:
      a map from each centroid to the cluster (a map of objects to their locations) assigned to it
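The assignment step can be sketched as a nearest-centroid search that mirrors the signature above. This is an illustration under stated assumptions, not the library's implementation: the class AssignmentSketch is hypothetical, squared Euclidean distance is assumed as the proximity measure, and every centroid is assumed to receive an entry even if its cluster is empty.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

final class AssignmentSketch {
    /** Squared Euclidean distance between two points of equal dimension. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    /** Maps each centroid to the objects (and their locations) nearest to it. */
    static <T> Map<double[], Map<T, double[]>> assign(
            Map<T, double[]> objectLocations, Set<double[]> centroids) {
        if (centroids.isEmpty()) {
            throw new IllegalArgumentException("at least one centroid is required");
        }
        // double[] keys hash by identity, so lookups must use the same array instances.
        Map<double[], Map<T, double[]>> clusters = new HashMap<>();
        for (double[] centroid : centroids) {
            clusters.put(centroid, new HashMap<>());
        }
        for (Map.Entry<T, double[]> entry : objectLocations.entrySet()) {
            double[] nearest = null;
            double best = Double.POSITIVE_INFINITY;
            for (double[] centroid : centroids) {
                double distance = squaredDistance(entry.getValue(), centroid);
                if (distance < best) {
                    best = distance;
                    nearest = centroid;
                }
            }
            clusters.get(nearest).put(entry.getKey(), entry.getValue());
        }
        return clusters;
    }
}
```

Comparing squared distances avoids a square root per comparison and selects the same nearest centroid as the unsquared distance would.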
    • setSeed

      public void setSeed(int random_seed)
      Sets the seed used by the internal random number generator. Setting a fixed seed makes results reproducible across runs on the same input.
      Parameters:
      random_seed - the random seed to use
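The effect of a fixed seed can be illustrated with java.util.Random directly. The sketch assumes, without confirmation from this page, that the randomness is used to choose starting centroids. The class SeedSketch and the method pickInitialIndices are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

final class SeedSketch {
    /** Picks numClusters distinct point indices using a seeded generator. */
    static List<Integer> pickInitialIndices(int numPoints, int numClusters, int seed) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < numPoints; i++) {
            indices.add(i);
        }
        // A Random built from the same seed yields the same shuffle every time.
        Collections.shuffle(indices, new Random(seed));
        return new ArrayList<>(indices.subList(0, numClusters));
    }
}
```

Two calls with the same arguments return the same indices, which is what makes a seeded run repeatable.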