Package edu.uci.ics.jung.algorithms.util
Class KMeansClusterer<T>
java.lang.Object
edu.uci.ics.jung.algorithms.util.KMeansClusterer<T>
Groups items into a specified number of clusters, based on their proximity in
d-dimensional space, using the k-means algorithm. Calls to cluster will
terminate when either of the two following conditions is true:
- the number of iterations is > max_iterations
- none of the centroids has moved as much as convergence_threshold since the previous iteration
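The two stopping conditions can be illustrated with a minimal, self-contained sketch. This is not the library's implementation: the class KMeansTerminationSketch, its run method, the caller-supplied initial centroids, and the use of Euclidean distance are all assumptions made for illustration, and the library's exact boundary handling (for example > versus >=) may differ.

```java
/** Illustrative only: a bare k-means loop showing the two stopping conditions. */
class KMeansTerminationSketch {

    /** Euclidean distance between two points of equal dimension (assumed metric). */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    /** Refines the centroids in place and returns the number of iterations run. */
    static int run(double[][] points, double[][] centroids,
                   int maxIterations, double convergenceThreshold) {
        int d = points[0].length;
        int iterations = 0;
        double maxMove = Double.POSITIVE_INFINITY;
        // Stop when the iteration budget is spent OR no centroid moved
        // as much as the threshold in the previous pass.
        while (iterations < maxIterations && maxMove >= convergenceThreshold) {
            iterations++;
            double[][] sums = new double[centroids.length][d];
            int[] counts = new int[centroids.length];
            // Assignment step: each point goes to its nearest centroid.
            for (double[] p : points) {
                int best = 0;
                for (int c = 1; c < centroids.length; c++) {
                    if (distance(p, centroids[c]) < distance(p, centroids[best])) {
                        best = c;
                    }
                }
                counts[best]++;
                for (int i = 0; i < d; i++) {
                    sums[best][i] += p[i];
                }
            }
            // Update step: move each centroid to the mean of its points
            // and record the largest movement.
            maxMove = 0.0;
            for (int c = 0; c < centroids.length; c++) {
                if (counts[c] == 0) {
                    continue; // an empty cluster keeps its old centroid
                }
                double[] updated = new double[d];
                for (int i = 0; i < d; i++) {
                    updated[i] = sums[c][i] / counts[c];
                }
                maxMove = Math.max(maxMove, distance(updated, centroids[c]));
                centroids[c] = updated;
            }
        }
        return iterations;
    }

    public static void main(String[] args) {
        double[][] points = {{0, 0}, {0, 1}, {10, 10}, {10, 11}};
        double[][] centroids = {{0, 0}, {10, 10}};
        int used = run(points, centroids, 100, 0.001);
        System.out.println("iterations used: " + used);
    }
}
```

With a small iteration budget the loop exits on the first condition; with well-separated data it usually exits on the second, once the centroids stop moving.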
Nested Class Summary
Nested Classes
static class KMeansClusterer.NotEnoughClustersException - An exception that indicates that the specified data points cannot be clustered into the number of clusters requested by the user.
Field Summary
Fields
protected int max_iterations
protected double convergence_threshold
rand
Constructor Summary
Constructors
KMeansClusterer() - Creates an instance with max iterations of 100 and convergence threshold of 0.001.
KMeansClusterer(int max_iterations, double convergence_threshold) - Creates an instance which will terminate when either the maximum number of iterations has been reached, or all changes are smaller than the convergence threshold.
Method Summary
protected Map<double[], Map<T, double[]>> assignToClusters(Map<T, double[]> object_locations, Set<double[]> centroids) - Assigns each object to the cluster whose centroid is closest to the object.
Collection<Map<T, double[]>> cluster(Map<T, double[]> object_locations, int num_clusters) - Returns a Collection of clusters, where each cluster is represented as a Map of Objects to locations in d-dimensional space.
double getConvergenceThreshold()
int getMaxIterations()
void setConvergenceThreshold(double convergence_threshold)
void setMaxIterations(int max_iterations)
void setSeed(int random_seed) - Sets the seed used by the internal random number generator.
Field Details
max_iterations
protected int max_iterations
convergence_threshold
protected double convergence_threshold
rand
Constructor Details
KMeansClusterer
public KMeansClusterer(int max_iterations, double convergence_threshold)
Creates an instance which will terminate when either the maximum number of iterations has been reached, or all changes are smaller than the convergence threshold.
Parameters:
max_iterations - the maximum number of iterations to employ
convergence_threshold - the smallest change we want to track
KMeansClusterer
public KMeansClusterer()
Creates an instance with max iterations of 100 and convergence threshold of 0.001.
Method Details
getMaxIterations
public int getMaxIterations()
Returns:
the maximum number of iterations
setMaxIterations
public void setMaxIterations(int max_iterations)
Parameters:
max_iterations - the maximum number of iterations
getConvergenceThreshold
public double getConvergenceThreshold()
Returns:
the convergence threshold
setConvergenceThreshold
public void setConvergenceThreshold(double convergence_threshold)
Parameters:
convergence_threshold - the convergence threshold
cluster
public Collection<Map<T, double[]>> cluster(Map<T, double[]> object_locations, int num_clusters)
Returns a Collection of clusters, where each cluster is represented as a Map of Objects to locations in d-dimensional space.
Parameters:
object_locations - a map of the items to cluster, to double arrays that specify their locations in d-dimensional space
num_clusters - the number of clusters to create
Returns:
a clustering of the input objects in d-dimensional space
Throws:
KMeansClusterer.NotEnoughClustersException - if num_clusters is larger than the number of distinct points in object_locations
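Locations are double[] arrays, which Java compares by identity rather than by content, so counting distinct points requires a content-based comparison. A minimal sketch of such a check follows. DistinctPointsSketch and its methods are hypothetical helpers, not part of the library, and the library may detect this condition differently.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative only: checks whether enough distinct locations exist. */
class DistinctPointsSketch {

    /** Counts locations that differ by content, not by array identity. */
    static <T> int countDistinct(Map<T, double[]> objectLocations) {
        Set<List<Double>> seen = new HashSet<>();
        for (double[] location : objectLocations.values()) {
            // Box the coordinates into a List so equals/hashCode use the values.
            List<Double> key = new ArrayList<>(location.length);
            for (double v : location) {
                key.add(v);
            }
            seen.add(key);
        }
        return seen.size();
    }

    /** True if numClusters does not exceed the number of distinct points. */
    static <T> boolean canCluster(Map<T, double[]> objectLocations, int numClusters) {
        return numClusters <= countDistinct(objectLocations);
    }

    public static void main(String[] args) {
        Map<String, double[]> locations = new HashMap<>();
        locations.put("a", new double[] {1.0, 2.0});
        locations.put("b", new double[] {1.0, 2.0}); // same point as "a"
        locations.put("c", new double[] {5.0, 5.0});
        System.out.println("distinct points: " + countDistinct(locations));
        System.out.println("can form 3 clusters: " + canCluster(locations, 3));
    }
}
```

A caller could run a check like this before calling cluster, or simply catch KMeansClusterer.NotEnoughClustersException.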
assignToClusters
protected Map<double[], Map<T, double[]>> assignToClusters(Map<T, double[]> object_locations, Set<double[]> centroids)
Assigns each object to the cluster whose centroid is closest to the object.
Parameters:
object_locations - a map of objects to locations
centroids - the centroids of the clusters to be formed
Returns:
a map from each centroid to its cluster, where each cluster maps the assigned objects to their locations
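The assignment step can be sketched as a nearest-centroid search. The following is an illustration under stated assumptions, not the library's code: NearestCentroidSketch and its assign method are hypothetical, squared Euclidean distance is assumed as the proximity measure, and the centroid set is assumed to be non-empty.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative only: assigns each object to its nearest centroid. */
class NearestCentroidSketch {

    /** Squared Euclidean distance; sufficient for comparing proximity. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    /** Returns a map from each chosen centroid to the objects assigned to it. */
    static <T> Map<double[], Map<T, double[]>> assign(
            Map<T, double[]> objectLocations, Set<double[]> centroids) {
        // Keys are double[] instances, so lookups work by array identity:
        // callers must use the same array objects that were passed in.
        Map<double[], Map<T, double[]>> clusters = new HashMap<>();
        for (Map.Entry<T, double[]> entry : objectLocations.entrySet()) {
            double[] nearest = null;
            double best = Double.POSITIVE_INFINITY;
            for (double[] centroid : centroids) {
                double dist = squaredDistance(entry.getValue(), centroid);
                if (dist < best) {
                    best = dist;
                    nearest = centroid;
                }
            }
            clusters.computeIfAbsent(nearest, k -> new HashMap<>())
                    .put(entry.getKey(), entry.getValue());
        }
        return clusters;
    }

    public static void main(String[] args) {
        double[] c0 = {0.0, 0.0};
        double[] c1 = {10.0, 10.0};
        Set<double[]> centroids = new LinkedHashSet<>();
        centroids.add(c0);
        centroids.add(c1);
        Map<String, double[]> locations = new HashMap<>();
        locations.put("a", new double[] {1.0, 1.0});
        locations.put("b", new double[] {9.0, 9.0});
        Map<double[], Map<String, double[]>> result = assign(locations, centroids);
        System.out.println("near c0: " + result.get(c0).keySet());
        System.out.println("near c1: " + result.get(c1).keySet());
    }
}
```

In this sketch a centroid that attracts no objects does not appear in the returned map; whether the library behaves the same way is not stated in this documentation.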
setSeed
public void setSeed(int random_seed)
Sets the seed used by the internal random number generator. Enables consistent outputs.
Parameters:
random_seed - the random seed to use
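Why a fixed seed gives consistent outputs can be shown with java.util.Random alone: two generators created with the same seed produce the same sequence. The sketch below assumes, for illustration only, that random numbers are used to pick candidate indices for the initial centroids. SeededSelectionSketch and pickIndices are hypothetical names, and this documentation does not describe how the clusterer actually uses its generator.

```java
import java.util.Arrays;
import java.util.Random;

/** Illustrative only: the same seed yields the same random choices. */
class SeededSelectionSketch {

    /** Draws count indices in [0, size) from a generator seeded with seed. */
    static int[] pickIndices(int seed, int size, int count) {
        Random rand = new Random(seed);
        int[] picks = new int[count];
        for (int i = 0; i < count; i++) {
            picks[i] = rand.nextInt(size);
        }
        return picks;
    }

    public static void main(String[] args) {
        // Both calls use the same seed, so both print the same sequence.
        System.out.println(Arrays.toString(pickIndices(42, 10, 5)));
        System.out.println(Arrays.toString(pickIndices(42, 10, 5)));
    }
}
```

Calling setSeed with a fixed value before cluster is therefore a reasonable way to make test runs repeatable.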