scipy.cluster.vq.whiten

scipy.cluster.vq.whiten(obs)

Normalize a group of observations on a per-feature basis.

Before running k-means, it is beneficial to rescale each feature dimension of the observation set by whitening, so that features measured on larger scales do not dominate the distance computation. Each feature is divided by its standard deviation across all observations, which gives it unit variance.
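The rescaling described above can be sketched in plain Python. This is a minimal illustration, not SciPy's implementation: `whiten_sketch` and its `sample` flag are hypothetical names, and the choice between the population (divide by n) and sample (divide by n - 1) standard deviation is an assumption the caller makes explicitly.

```python
import statistics


def whiten_sketch(obs, sample=False):
    """Divide each column of `obs` (a list of rows) by that column's std.

    `sample=False` uses the population standard deviation (divide by n);
    `sample=True` uses the sample standard deviation (divide by n - 1).
    """
    std_fn = statistics.stdev if sample else statistics.pstdev
    # zip(*obs) transposes rows into columns, one tuple per feature.
    stds = [std_fn(column) for column in zip(*obs)]
    # A constant column has std 0 and would raise ZeroDivisionError here.
    return [[value / std for value, std in zip(row, stds)] for row in obs]


# Two features on very different scales (hypothetical data).
obs = [[1.0, 10.0],
       [2.0, 20.0],
       [3.0, 30.0]]
scaled = whiten_sketch(obs)

# After scaling, every column has unit standard deviation.
column_stds = [statistics.pstdev(column) for column in zip(*scaled)]
```

Because the second feature is exactly ten times the first, both columns of `scaled` end up identical, which is the point of whitening: the original units no longer matter.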

Parameters :

obs : ndarray

Each row of the array is an observation. The columns are the features seen during each observation.

>>> #         f0    f1    f2
>>> obs = [[  1.,   1.,   1.],  #o0
...        [  2.,   2.,   2.],  #o1
...        [  3.,   3.,   3.],  #o2
...        [  4.,   4.,   4.]]  #o3
Returns :

result : ndarray

Contains the values in obs scaled by the standard deviation of each column.

Examples

>>> from numpy import array
>>> from scipy.cluster.vq import whiten
>>> features = array([[1.9, 2.3, 1.7],
...                   [1.5, 2.5, 2.2],
...                   [0.8, 0.6, 1.7]])
>>> whiten(features)
array([[ 3.41250074,  2.20300046,  5.88897275],
       [ 2.69407953,  2.39456571,  7.62102355],
       [ 1.43684242,  0.57469577,  5.88897275]])
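
The output above can be cross-checked without SciPy. The sketch below uses only the standard library and assumes the sample standard deviation (n - 1 denominator). That assumption is inferred from the printed numbers, not stated on this page, and a SciPy release that divides by the population standard deviation would print different values.

```python
import statistics

features = [[1.9, 2.3, 1.7],
            [1.5, 2.5, 2.2],
            [0.8, 0.6, 1.7]]

# Values copied from the output printed above.
documented = [[3.41250074, 2.20300046, 5.88897275],
              [2.69407953, 2.39456571, 7.62102355],
              [1.43684242, 0.57469577, 5.88897275]]

# Sample standard deviation (n - 1 denominator) of each column.
stds = [statistics.stdev(column) for column in zip(*features)]
recomputed = [[value / std for value, std in zip(row, stds)]
              for row in features]

# Largest absolute difference between recomputed and documented values.
max_diff = max(abs(r - d)
               for r_row, d_row in zip(recomputed, documented)
               for r, d in zip(r_row, d_row))
```

Here `max_diff` stays within the rounding of the printed digits, so the documented array matches a division by the per-column sample standard deviation.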
