Click or drag to resize
Accord.NET (logo)

KMeans Class

Lloyd's k-Means clustering algorithm.
Inheritance Hierarchy
SystemObject
  Accord.MachineLearningParallelLearningBase
    Accord.MachineLearningKMeans
      Accord.MachineLearningBalancedKMeans
      Accord.MachineLearningBinarySplit

Namespace:  Accord.MachineLearning
Assembly:  Accord.MachineLearning (in Accord.MachineLearning.dll) Version: 3.5.0
Syntax
[SerializableAttribute]
public class KMeans : ParallelLearningBase, IClusteringAlgorithm<double[], double>, 
	IClusteringAlgorithm<double[]>, IUnsupervisedLearning<IClusterCollection<double[]>, double[], int>, 
	IUnsupervisedLearning<KMeansClusterCollection, double[], int>
Request Example View Source

The KMeans type exposes the following members.

Constructors
Properties
  NameDescription
Public propertyCentroids
Gets or sets the cluster centroids. Setting this property is equivalent to setting KMeans.Clusters.Centroids.
Public propertyClusters
Gets the clusters found by K-means.
Public propertyComputeCovariances
Gets or sets whether covariance matrices for the clusters should be computed at the end of an iteration. Default is true.
Public propertyComputeError
Gets or sets whether the clustering distortion error (the average distance between all data points and the cluster centroids) should be computed at the end of the algorithm. The result will be stored in Error. Default is true.
Public propertyDimension
Gets the dimensionality of the data space.
Public propertyDistance
Gets or sets the distance function used as a distance metric between data points.
Public propertyError
Gets the cluster distortion error after the last call to this class' Compute methods.
Public propertyIterations
Gets the number of iterations performed in the last call to this class' Compute methods.
Public propertyK
Gets the number of clusters.
Public propertyMaxIterations
Gets or sets the maximum number of iterations to be performed by the method. If set to zero, no iteration limit will be imposed. Default is 0.
Public propertyParallelOptions
Gets or sets the parallelization options for this algorithm.
(Inherited from ParallelLearningBase.)
Public propertyToken
Gets or sets a cancellation token that can be used to cancel the algorithm while it is running.
(Inherited from ParallelLearningBase.)
Public propertyTolerance
Gets or sets the relative convergence threshold for stopping the algorithm. Default is 1e-5.
Public propertyUseSeeding
Gets or sets the strategy used to initialize the centroids of the clustering algorithm. Default is KMeansPlusPlus.
Top
Methods
  NameDescription
Public methodCompute(Double) Obsolete.
Divides the input data into K clusters.
Public methodCompute(Double, Double) Obsolete.
Divides the input data into K clusters.
Public methodCompute(Double, Double) Obsolete.
Divides the input data into K clusters.
Protected methodComputeInformation
Computes the information about each cluster (covariance, proportions and error).
Protected methodconverged
Determines if the algorithm has converged by comparing the centroids between two consecutive iterations.
Public methodEquals
Determines whether the specified object is equal to the current object.
(Inherited from Object.)
Protected methodFinalize
Allows an object to try to free resources and perform other cleanup operations before it is reclaimed by garbage collection.
(Inherited from Object.)
Public methodGetHashCode
Serves as the default hash function.
(Inherited from Object.)
Public methodGetType
Gets the Type of the current instance.
(Inherited from Object.)
Public methodLearn
Learns a model that can map the given inputs to the desired outputs.
Protected methodMemberwiseClone
Creates a shallow copy of the current Object.
(Inherited from Object.)
Public methodRandomize
Randomizes the clusters inside a dataset.
Public methodToString
Returns a string that represents the current object.
(Inherited from Object.)
Top
Extension Methods
  NameDescription
Public Extension MethodHasMethod
Checks whether an object implements a method with the given name.
(Defined by ExtensionMethods.)
Public Extension MethodIsEqual
Compares two objects for equality, performing an elementwise comparison if the elements are vectors or matrices.
(Defined by Matrix.)
Public Extension MethodToTOverloaded.
Converts an object into another type, irrespective of whether the conversion can be done at compile time or not. This can be used to convert generic types to numeric types during runtime.
(Defined by ExtensionMethods.)
Public Extension MethodToTOverloaded.
Converts an object into another type, irrespective of whether the conversion can be done at compile time or not. This can be used to convert generic types to numeric types during runtime.
(Defined by Matrix.)
Top
Remarks

In statistics and machine learning, k-means clustering is a method of cluster analysis which aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean.

It is similar to the expectation-maximization algorithm for mixtures of Gaussians in that they both attempt to find the centers of natural clusters in the data as well as in the iterative refinement approach employed by both algorithms.

The algorithm is composed of the following steps:

  1. Place K points into the space represented by the objects that are being clustered. These points represent initial group centroids.
  2. Assign each object to the group that has the closest centroid.
  3. When all objects have been assigned, recalculate the positions of the K centroids.
  4. Repeat Steps 2 and 3 until the centroids no longer move. This produces a separation of the objects into groups from which the metric to be minimized can be calculated.

This particular implementation uses the squared Euclidean distance as a similarity measure in order to form clusters.

References:

  • Wikipedia, The Free Encyclopedia. K-means clustering. Available on: http://en.wikipedia.org/wiki/K-means_clustering
  • Matteo Matteucci. A Tutorial on Clustering Algorithms. Available on: http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/kmeans.html

Examples
How to perform clustering with K-Means.
Accord.Math.Random.Generator.Seed = 0;

// Declare some observations
double[][] observations =
{
    new double[] { -5, -2, -1 },
    new double[] { -5, -5, -6 },
    new double[] {  2,  1,  1 },
    new double[] {  1,  1,  2 },
    new double[] {  1,  2,  2 },
    new double[] {  3,  1,  2 },
    new double[] { 11,  5,  4 },
    new double[] { 15,  5,  6 },
    new double[] { 10,  5,  6 },
};

double[][] orig = observations.MemberwiseClone();

// Create a new K-Means algorithm with 3 clusters 
KMeans kmeans = new KMeans(3);

// Compute the algorithm, retrieving an integer array
//  containing the labels for each of the observations
KMeansClusterCollection clusters = kmeans.Learn(observations);

// As a result, the first two observations should belong to the
//  same cluster (thus having the same label). The same should
//  happen to the next four observations and to the last three.
int[] labels = clusters.Decide(observations);

The following example demonstrates how to use the K-Means algorithm for color clustering. It is the same code which can be found in the color clustering sample application.

 int k = 5; 

 // Load a test image (shown below)
 Bitmap image = ...

 // Create converters
 ImageToArray imageToArray = new ImageToArray(min: -1, max: +1);
 ArrayToImage arrayToImage = new ArrayToImage(image.Width, image.Height, min: -1, max: +1);

 // Transform the image into an array of pixel values
 double[][] pixels; imageToArray.Convert(image, out pixels);


 // Create a K-Means algorithm using given k and a
 //  square Euclidean distance as distance metric.
 KMeans kmeans = new KMeans(k, Distance.SquareEuclidean);

 // We will compute the K-Means algorithm until cluster centroids
 // change less than 0.5 between two iterations of the algorithm
 kmeans.Tolerance = 0.05;

// Learn the clusters in the data
 var clustering = kmeans.Learn(pixels);

 // Use clusters to decide class labels
 int[] idx = clustering.Decide(pixels);

 // Replace every pixel with its corresponding centroid
 pixels.ApplyInPlace((x, i) => kmeans.Clusters.Centroids[idx[i]]);

 // Retrieve the resulting image in a picture box
 Bitmap result; arrayToImage.Convert(pixels, out result);

The original image is shown below:

The resulting image will be:

See Also