
scipy.cluster.vq.vq

scipy.cluster.vq.vq(obs, code_book, check_finite=True)

Assign codes from a code book to observations.

Each observation vector in the ‘M’ by ‘N’ obs array is compared with the centroids in the code book and assigned the code (the row index) of the closest centroid, measured by Euclidean distance.
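The nearest-centroid rule described above can be sketched with plain NumPy broadcasting. This is an illustrative brute-force version for comparison, not SciPy's actual implementation; the names diff, all_dist, manual_code and manual_dist are hypothetical.

```python
import numpy as np
from scipy.cluster.vq import vq

obs = np.array([[1.9, 2.3, 1.7],
                [1.5, 2.5, 2.2],
                [0.8, 0.6, 1.7]])
code_book = np.array([[1.0, 1.0, 1.0],
                      [2.0, 2.0, 2.0]])

# Pairwise differences via broadcasting: shape (M, K, N),
# where K is the number of codes in the code book.
diff = obs[:, np.newaxis, :] - code_book[np.newaxis, :, :]

# Euclidean distance from every observation to every centroid: shape (M, K).
all_dist = np.sqrt((diff ** 2).sum(axis=-1))

manual_code = all_dist.argmin(axis=1)  # index of the closest centroid
manual_dist = all_dist.min(axis=1)     # distance to that centroid

# The library call should agree with the brute-force result.
code, dist = vq(obs, code_book)
```

The brute-force version materializes an M x K x N intermediate array, so it is only practical for small inputs.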

The features in obs should have unit variance, which can be achieved by passing them through the whiten function. The code book can be created with the k-means algorithm or a different encoding algorithm.
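A minimal sketch of that workflow, assuming a small made-up data set (raw_obs and initial_guess are hypothetical): whiten the observations, build a code book with kmeans, then assign codes with vq. Passing explicit initial centroids to kmeans instead of an integer avoids random initialization, so the run is repeatable.

```python
import numpy as np
from scipy.cluster.vq import whiten, kmeans, vq

# Hypothetical toy data: two well-separated groups, two features each.
raw_obs = np.array([[1.0, 10.0],
                    [1.2, 11.0],
                    [0.9,  9.5],
                    [5.0, 30.0],
                    [5.2, 31.0],
                    [4.8, 29.0]])

# Scale each feature (column) to unit variance.
obs = whiten(raw_obs)

# Build a 2-code code book with k-means, starting from the first and
# last observation as initial centroids.
initial_guess = obs[[0, -1]]
code_book, mean_distortion = kmeans(obs, initial_guess)

# Assign each whitened observation to its nearest code.
code, dist = vq(obs, code_book)
```

Any new observations should be scaled with the same per-feature factors before they are passed to vq with this code book.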

Parameters:

obs : ndarray

Each row of the ‘M’ x ‘N’ array is an observation. The columns are the “features” seen during each observation. The features must be whitened first using the whiten function or something equivalent.

code_book : ndarray

The code book is usually generated using the k-means algorithm. Each row of the array holds a different code, and the columns are the features of the code.

>>> #              f0    f1    f2   f3
>>> code_book = [
...             [  1.,   2.,   3.,   4.],  #c0
...             [  5.,   6.,   7.,   8.],  #c1
...             [  9.,  10.,  11.,  12.]]  #c2

check_finite : bool, optional

Whether to check that the input matrices contain only finite numbers. Disabling may give a performance gain, but may result in problems (crashes, non-termination) if the inputs do contain infinities or NaNs. Default: True
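As a hedged illustration of check_finite, the sketch below passes an observation array containing a NaN and records whether the default check rejects it. The arrays and the rejected flag are made up for the example. It deliberately does not call vq on non-finite data with check_finite=False, since the behavior in that case is not guaranteed.

```python
import numpy as np
from scipy.cluster.vq import vq

code_book = np.array([[1.0, 1.0],
                      [2.0, 2.0]])
clean_obs = np.array([[1.1, 0.9],
                      [1.8, 2.0]])
bad_obs = np.array([[1.1, 0.9],
                    [np.nan, 2.0]])

# With the default check_finite=True, non-finite input is rejected.
try:
    vq(bad_obs, code_book)
    rejected = False
except ValueError:
    rejected = True

# On input known to be finite, skipping the check does not change the result.
code_checked, dist_checked = vq(clean_obs, code_book)
code_unchecked, dist_unchecked = vq(clean_obs, code_book, check_finite=False)
```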

Returns:

code : ndarray

A length M array holding the code book index for each observation.

dist : ndarray

A length M array holding the distortion (Euclidean distance) between each observation and its nearest code.

Examples

>>> from numpy import array
>>> from scipy.cluster.vq import vq
>>> code_book = array([[1.,1.,1.],
...                    [2.,2.,2.]])
>>> features  = array([[  1.9,2.3,1.7],
...                    [  1.5,2.5,2.2],
...                    [  0.8,0.6,1.7]])
>>> vq(features, code_book)
(array([1, 1, 0], dtype=int32), array([0.43588989, 0.73484692, 0.83066239]))
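The two returned arrays can be used together, for instance to replace each observation by its assigned centroid and to summarize the quantization error. The following is a sketch built on the example above; quantized, counts and mean_distortion are illustrative names, not part of the SciPy API.

```python
import numpy as np
from scipy.cluster.vq import vq

code_book = np.array([[1.0, 1.0, 1.0],
                      [2.0, 2.0, 2.0]])
features = np.array([[1.9, 2.3, 1.7],
                     [1.5, 2.5, 2.2],
                     [0.8, 0.6, 1.7]])

code, dist = vq(features, code_book)

# Replace every observation by the centroid it was assigned to.
quantized = code_book[code]

# Number of observations assigned to each code.
counts = np.bincount(code, minlength=len(code_book))

# Average distance between observations and their assigned centroids.
mean_distortion = dist.mean()
```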