Package edu.uci.ics.jung.algorithms.util
Class KMeansClusterer<T>
- java.lang.Object
  - edu.uci.ics.jung.algorithms.util.KMeansClusterer<T>
public class KMeansClusterer<T> extends java.lang.Object
Groups items into a specified number of clusters, based on their proximity in d-dimensional space, using the k-means algorithm. Calls to cluster will terminate when either of the following two conditions is true:
- the number of iterations is > max_iterations
- none of the centroids has moved as much as convergence_threshold since the previous iteration
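The two termination conditions can be sketched as a single stopping test. This is a minimal illustration, not the library's code: the class TerminationSketch and its methods are hypothetical names, and measuring centroid movement as Euclidean distance is an assumption, since the page does not say which distance is used.

```java
// Hypothetical helper, not part of JUNG: illustrates the documented stopping rule.
final class TerminationSketch {

    // Euclidean distance between two points of equal dimension (assumed metric).
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    // Largest distance any centroid moved between two consecutive iterations.
    // before[i] and after[i] are the old and new positions of centroid i.
    static double maxMovement(double[][] before, double[][] after) {
        double max = 0.0;
        for (int i = 0; i < before.length; i++) {
            max = Math.max(max, distance(before[i], after[i]));
        }
        return max;
    }

    // Stop when the iteration count exceeds the cap, or when no centroid
    // has moved as much as the convergence threshold.
    static boolean shouldStop(int iterations, double maxMovement,
                              int maxIterations, double convergenceThreshold) {
        return iterations > maxIterations || maxMovement < convergenceThreshold;
    }
}
```

With the default settings described on this page (100 iterations, threshold 0.001), a call such as TerminationSketch.shouldStop(5, 0.0005, 100, 0.001) reports that the loop may stop because the centroids have effectively settled.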
-
Nested Class Summary
Nested Classes
- static class KMeansClusterer.NotEnoughClustersException
  An exception that indicates that the specified data points cannot be clustered into the number of clusters requested by the user.
-
Field Summary
Fields
- protected double convergence_threshold
- protected int max_iterations
- protected java.util.Random rand
-
Constructor Summary
Constructors
- KMeansClusterer()
  Creates an instance with max iterations of 100 and convergence threshold of 0.001.
- KMeansClusterer(int max_iterations, double convergence_threshold)
  Creates an instance which will terminate when either the maximum number of iterations has been reached, or all changes are smaller than the convergence threshold.
-
Method Summary
Methods
- protected java.util.Map<double[],java.util.Map<T,double[]>> assignToClusters(java.util.Map<T,double[]> object_locations, java.util.Set<double[]> centroids)
  Assigns each object to the cluster whose centroid is closest to the object.
- java.util.Collection<java.util.Map<T,double[]>> cluster(java.util.Map<T,double[]> object_locations, int num_clusters)
  Returns a Collection of clusters, where each cluster is represented as a Map of Objects to locations in d-dimensional space.
- double getConvergenceThreshold()
- int getMaxIterations()
- void setConvergenceThreshold(double convergence_threshold)
- void setMaxIterations(int max_iterations)
- void setSeed(int random_seed)
  Sets the seed used by the internal random number generator.
-
Constructor Detail
-
KMeansClusterer
public KMeansClusterer(int max_iterations, double convergence_threshold)
Creates an instance which will terminate when either the maximum number of iterations has been reached, or all changes are smaller than the convergence threshold.
- Parameters:
max_iterations - the maximum number of iterations to employ
convergence_threshold - the smallest change in centroid position we want to track
-
KMeansClusterer
public KMeansClusterer()
Creates an instance with max iterations of 100 and convergence threshold of 0.001.
-
-
Method Detail
-
getMaxIterations
public int getMaxIterations()
- Returns:
- the maximum number of iterations
-
setMaxIterations
public void setMaxIterations(int max_iterations)
- Parameters:
max_iterations - the maximum number of iterations
-
getConvergenceThreshold
public double getConvergenceThreshold()
- Returns:
- the convergence threshold
-
setConvergenceThreshold
public void setConvergenceThreshold(double convergence_threshold)
- Parameters:
convergence_threshold - the convergence threshold
-
cluster
public java.util.Collection<java.util.Map<T,double[]>> cluster(java.util.Map<T,double[]> object_locations, int num_clusters)
Returns a Collection of clusters, where each cluster is represented as a Map of Objects to locations in d-dimensional space.
- Parameters:
object_locations - a map of the items to cluster, to double arrays that specify their locations in d-dimensional space.
num_clusters - the number of clusters to create
- Returns:
- a clustering of the input objects in d-dimensional space
- Throws:
KMeansClusterer.NotEnoughClustersException - if num_clusters is larger than the number of distinct points in object_locations
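Because the exception depends on the number of distinct points rather than the number of objects, it can help to count distinct locations before choosing num_clusters. The sketch below is one way to do that in plain Java. DistinctPointsSketch and countDistinctLocations are hypothetical names, and how the library itself decides that two points are the same is not stated here, so exact coordinate equality is an assumption.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of JUNG.
final class DistinctPointsSketch {

    // Counts locations that differ in at least one coordinate.
    // double[] uses identity equality, so each array is copied into a
    // List<Double>, which compares element by element.
    static <T> int countDistinctLocations(Map<T, double[]> objectLocations) {
        Set<List<Double>> seen = new HashSet<>();
        for (double[] location : objectLocations.values()) {
            List<Double> key = new ArrayList<>(location.length);
            for (double coordinate : location) {
                key.add(coordinate);
            }
            seen.add(key);
        }
        return seen.size();
    }
}
```

A caller could compare the returned count with the intended num_clusters and pick a smaller value when the data contains many duplicate locations.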
-
assignToClusters
protected java.util.Map<double[],java.util.Map<T,double[]>> assignToClusters(java.util.Map<T,double[]> object_locations, java.util.Set<double[]> centroids)
Assigns each object to the cluster whose centroid is closest to the object.
- Parameters:
object_locations - a map of objects to locations
centroids - the centroids of the clusters to be formed
- Returns:
- a map from each centroid to its assigned cluster, where a cluster is a map of objects to locations
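The assignment step can be illustrated with a self-contained nearest-centroid sketch that uses the same parameter and return types as the signature above. AssignSketch and its methods are hypothetical names, and the use of squared Euclidean distance is an assumption; this is not the library's implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of JUNG.
final class AssignSketch {

    // Squared Euclidean distance; enough for comparing which centroid is closer.
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    // Groups every object under the centroid nearest to its location.
    // Centroid arrays are used as map keys by identity, since double[]
    // does not override equals or hashCode.
    static <T> Map<double[], Map<T, double[]>> assign(
            Map<T, double[]> objectLocations, Set<double[]> centroids) {
        if (centroids.isEmpty()) {
            throw new IllegalArgumentException("at least one centroid is required");
        }
        Map<double[], Map<T, double[]>> clusters = new HashMap<>();
        for (Map.Entry<T, double[]> entry : objectLocations.entrySet()) {
            double[] closest = null;
            double best = Double.POSITIVE_INFINITY;
            for (double[] centroid : centroids) {
                double d = squaredDistance(entry.getValue(), centroid);
                if (closest == null || d < best) {
                    best = d;
                    closest = centroid;
                }
            }
            clusters.computeIfAbsent(closest, k -> new HashMap<>())
                    .put(entry.getKey(), entry.getValue());
        }
        return clusters;
    }
}
```

In this sketch a centroid that attracts no objects simply does not appear as a key in the result; whether the library behaves the same way is not documented here.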
-
setSeed
public void setSeed(int random_seed)
Sets the seed used by the internal random number generator. Enables consistent outputs.
- Parameters:
random_seed - the random seed to use
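Why a fixed seed gives consistent outputs can be shown with java.util.Random alone: two generators created with the same seed produce the same sequence, so any random choice driven by them repeats exactly. The sketch below picks k starting indices out of n points. SeedSketch and pickIndices are hypothetical names, and the idea that the clusterer uses its generator to choose initial centroids is an assumption not confirmed on this page.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of JUNG.
final class SeedSketch {

    // Picks k distinct indices from 0..n-1 using a generator seeded with 'seed'.
    // The same seed, n, and k always yield the same indices in the same order.
    static int[] pickIndices(long seed, int n, int k) {
        if (k > n) {
            throw new IllegalArgumentException("k must not exceed n");
        }
        Random rand = new Random(seed);
        List<Integer> indices = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            indices.add(i);
        }
        Collections.shuffle(indices, rand);
        int[] picked = new int[k];
        for (int i = 0; i < k; i++) {
            picked[i] = indices.get(i);
        }
        return picked;
    }
}
```

Calling SeedSketch.pickIndices(42L, 10, 3) twice returns equal arrays, which is the property that makes seeded clustering runs repeatable.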
-
-