This is documentation for an old release of SciPy (version 0.10.1). The latest stable release is version 1.15.1.

scipy.cluster.vq.kmeans2

scipy.cluster.vq.kmeans2(data, k, iter=10, thresh=1e-05, minit='random', missing='warn')

Classify a set of observations into k clusters using the k-means algorithm.

The algorithm attempts to minimize the Euclidean distance between observations and centroids. Several initialization methods are included.
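The assign-then-update loop behind k-means can be sketched in plain NumPy. This is only an illustration of the general technique, not SciPy's implementation; the helper name `lloyd_step` and the toy data are assumptions made for this example.

```python
import numpy as np

def lloyd_step(data, centroids):
    """One k-means step (hypothetical helper, not part of SciPy)."""
    # Squared Euclidean distance from every observation to every centroid,
    # giving an array of shape (M, k).
    diff = data[:, np.newaxis, :] - centroids[np.newaxis, :, :]
    dist2 = (diff ** 2).sum(axis=2)
    # Assignment: each observation takes the index of its nearest centroid.
    labels = dist2.argmin(axis=1)
    # Update: each centroid moves to the mean of its members.
    new_centroids = centroids.copy()
    for j in range(centroids.shape[0]):
        members = data[labels == j]
        if len(members) > 0:  # leave an empty cluster's centroid unchanged
            new_centroids[j] = members.mean(axis=0)
    return new_centroids, labels

# Toy data: two well-separated pairs of points in two dimensions.
data = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
centroids = np.array([[1.0, 1.0], [9.0, 9.0]])

# A fixed number of iterations, mirroring the role of the iter parameter.
for _ in range(10):
    centroids, labels = lloyd_step(data, centroids)

print(centroids)
print(labels)
```

On this toy data the loop settles after the first step, with each centroid at the mean of one pair of points.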

Parameters :

data : ndarray

An ‘M’ by ‘N’ array of ‘M’ observations in ‘N’ dimensions, or a length ‘M’ array of ‘M’ one-dimensional observations.

k : int or ndarray

The number of clusters to form, which is also the number of centroids to generate. If minit is ‘matrix’, or if an ndarray is given instead of an int, k is interpreted as the initial centroids to use.

iter : int

Number of iterations of the k-means algorithm to run. Note that this differs in meaning from the iter parameter of the kmeans function, which sets how many times k-means is run, keeping the codebook with the lowest distortion.

thresh : float

(not used yet)

minit : string

Method for initialization. Available methods are ‘random’, ‘points’, ‘uniform’, and ‘matrix’:

‘random’: generate k centroids from a Gaussian with mean and variance estimated from the data.

‘points’: choose k observations (rows) at random from data for the initial centroids.

‘uniform’: generate k observations from a uniform distribution defined by the data set (unsupported).

‘matrix’: interpret the k parameter as a k by N array (or a length k array for one-dimensional data) of initial centroids.
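As a rough illustration of the parameters above, the following sketch calls kmeans2 once with explicit initial centroids (‘matrix’) and once with centroids drawn from the observations (‘points’). The toy data and variable names are assumptions made for this example. The ‘points’ run depends on the random state, so its exact result may vary between runs.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

# Six observations in two dimensions (M = 6, N = 2), forming two groups.
data = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                 [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])

# 'matrix': k is a k by N array of initial centroids, so no random
# initialization is involved.
init = np.array([[1.0, 1.0], [9.0, 9.0]])
centroid_m, label_m = kmeans2(data, init, minit='matrix')

# 'points': k is an int; the initial centroids are k rows chosen at
# random from data. iter sets the number of k-means iterations.
centroid_p, label_p = kmeans2(data, 2, iter=20, minit='points')

print(centroid_m)
print(label_m)
print(centroid_p.shape, label_p.shape)
```

Passing the initial centroids explicitly is a convenient way to get reproducible results when comparing runs.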

Returns :

centroid : ndarray

A ‘k’ by ‘N’ array of centroids found at the last iteration of k-means.

label : ndarray

label[i] is the code or index of the centroid the i’th observation is closest to.
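Because label holds row indices into centroid, the two return values can be combined with ordinary NumPy indexing. The sketch below is one possible way to use them; the toy data and the names `assigned`, `residual`, and `sizes` are assumptions made for this example.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

data = np.array([[1.0, 1.0], [1.0, 2.0], [2.0, 1.0],
                 [8.0, 8.0], [9.0, 8.0]])

# Explicit initial centroids keep the example deterministic.
init = np.array([[0.0, 0.0], [10.0, 10.0]])
centroid, label = kmeans2(data, init, minit='matrix')

# centroid[label] looks up, for each observation, the centroid it was
# assigned to; the result has the same shape as data.
assigned = centroid[label]

# Euclidean distance from each observation to its assigned centroid.
residual = np.sqrt(((data - assigned) ** 2).sum(axis=1))

# Number of observations assigned to each centroid.
sizes = np.bincount(label, minlength=len(centroid))

print(sizes)
print(residual)
```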
