scipy.cluster.vq.whiten

scipy.cluster.vq.whiten(obs)

Normalize a group of observations on a per feature basis.

Before running k-means, it is beneficial to rescale each feature dimension of the observation set with whitening. Each feature is divided by its standard deviation across all observations to give it unit variance. The mean is not subtracted, so features are rescaled but not centered.
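The rescaling described above can be sketched by hand. This is a minimal illustration, not the library's source: it assumes the standard deviation is NumPy's default population form (ddof=0), and the names std_dev, manual, and unit_variance are chosen here for illustration only.

```python
import numpy as np
from scipy.cluster.vq import whiten

# Three observations (rows) of three features (columns).
obs = np.array([[1.9, 2.3, 1.7],
                [1.5, 2.5, 2.2],
                [0.8, 0.6, 1.7]])

# Per-feature standard deviation, taken down each column (assumed ddof=0).
std_dev = obs.std(axis=0)

# Manual whitening: divide every column by its own standard deviation.
manual = obs / std_dev

# Library result, plus the per-feature variance after whitening.
whitened = whiten(obs)
unit_variance = whitened.var(axis=0)
```

Under that assumption, manual and whitened should agree to floating-point precision, and every entry of unit_variance should be close to 1.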

Parameters :

obs : ndarray

Each row of the array is an observation. The columns are the features seen during each observation.

>>> #         f0    f1    f2
>>> obs = [[  1.,   1.,   1.],  #o0
...        [  2.,   2.,   2.],  #o1
...        [  3.,   3.,   3.],  #o2
...        [  4.,   4.,   4.]]  #o3
Returns :

result : ndarray

Contains the values in obs, with each column divided by its standard deviation.

Examples

>>> import numpy as np
>>> from scipy.cluster.vq import whiten
>>> features = np.array([[1.9, 2.3, 1.7],
...                      [1.5, 2.5, 2.2],
...                      [0.8, 0.6, 1.7]])
>>> whiten(features)
array([[ 4.17944278,  2.69811351,  7.21248917],
       [ 3.29956009,  2.93273208,  9.33380951],
       [ 1.75976538,  0.7038557 ,  7.21248917]])
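Since whitening is meant as a preparation step for k-means, the sketch below shows one way the whitened array might be passed on to kmeans and vq from the same module. Using the first two whitened rows as the initial centroids is an assumption made here only to keep the run deterministic; it is not a recommended initialization.

```python
import numpy as np
from scipy.cluster.vq import whiten, kmeans, vq

features = np.array([[1.9, 2.3, 1.7],
                     [1.5, 2.5, 2.2],
                     [0.8, 0.6, 1.7]])

# Rescale each feature to unit variance before clustering.
whitened = whiten(features)

# Illustrative choice: seed k-means with the first two observations
# so that the result does not depend on random initialization.
initial_guess = whitened[:2]
codebook, distortion = kmeans(whitened, initial_guess)

# Assign each whitened observation to its nearest centroid.
labels, distances = vq(whitened, codebook)
```

Here codebook holds one centroid per row in whitened feature space, and labels gives the index of the centroid assigned to each observation.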
