scipy.cluster.hierarchy.is_valid_im
scipy.cluster.hierarchy.is_valid_im(R, warning=False, throw=False, name=None)
Return True if the inconsistency matrix passed is valid.
It must be an \(n\) by 4 array of doubles. The standard deviations R[:,1] must be nonnegative. The link counts R[:,2] must be positive and no greater than \(n-1\).
Parameters:
- R : ndarray
The inconsistency matrix to check for validity.
- warning : bool, optional
When True, issues a Python warning if the inconsistency matrix passed is invalid.
- throw : bool, optional
When True, throws a Python exception if the inconsistency matrix passed is invalid.
- name : str, optional
This string refers to the variable name of the invalid inconsistency matrix.
Returns:
- b : bool
True if the inconsistency matrix is valid.
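The three ways of reporting an invalid matrix (return value only, warning, exception) can be compared side by side. The following is a minimal sketch, not part of the reference text: the small data set and the names `R_bad`, `quiet_result`, `warned_result`, `caught`, and `raised` are illustrative assumptions, and the exact warning and exception classes are deliberately not relied upon.

```python
import warnings

import numpy as np
from scipy.cluster.hierarchy import inconsistent, is_valid_im, ward
from scipy.spatial.distance import pdist

# Small illustrative data set (an assumption, not from the reference text).
X = np.array([[0, 0], [0, 1], [1, 0], [4, 4], [3, 4], [4, 3]], dtype=float)
R = inconsistent(ward(pdist(X)))

# Break a copy on purpose: a negative standard deviation is invalid.
R_bad = R.copy()
R_bad[-1, 1] = -1.0

# Default: the problem is reported only through the return value.
quiet_result = is_valid_im(R_bad)

# warning=True: a warning is issued and the function still returns.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warned_result = is_valid_im(R_bad, warning=True, name="R_bad")

# throw=True: the failed check is raised as an exception instead.
raised = None
try:
    is_valid_im(R_bad, throw=True, name="R_bad")
except Exception as exc:  # broad on purpose; the class is not assumed here
    raised = exc

print(quiet_result, warned_result, len(caught), type(raised).__name__)
```

Passing `name` only affects the text of the message, so it is mostly useful when several matrices are validated in the same routine.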
See also
- linkage : for a description of what a linkage matrix is.
- inconsistent : for the creation of an inconsistency matrix.
Examples
>>> from scipy.cluster.hierarchy import ward, inconsistent, is_valid_im
>>> from scipy.spatial.distance import pdist
Given a data set X, we can apply a clustering method to obtain a linkage matrix Z. scipy.cluster.hierarchy.inconsistent can also be used to obtain the inconsistency matrix R associated with this clustering process:
>>> X = [[0, 0], [0, 1], [1, 0],
...      [0, 4], [0, 3], [1, 4],
...      [4, 0], [3, 0], [4, 1],
...      [4, 4], [3, 4], [4, 3]]
>>> Z = ward(pdist(X))
>>> R = inconsistent(Z)
>>> Z
array([[ 0.        ,  1.        ,  1.        ,  2.        ],
       [ 3.        ,  4.        ,  1.        ,  2.        ],
       [ 6.        ,  7.        ,  1.        ,  2.        ],
       [ 9.        , 10.        ,  1.        ,  2.        ],
       [ 2.        , 12.        ,  1.29099445,  3.        ],
       [ 5.        , 13.        ,  1.29099445,  3.        ],
       [ 8.        , 14.        ,  1.29099445,  3.        ],
       [11.        , 15.        ,  1.29099445,  3.        ],
       [16.        , 17.        ,  5.77350269,  6.        ],
       [18.        , 19.        ,  5.77350269,  6.        ],
       [20.        , 21.        ,  8.16496581, 12.        ]])
>>> R
array([[1.        , 0.        , 1.        , 0.        ],
       [1.        , 0.        , 1.        , 0.        ],
       [1.        , 0.        , 1.        , 0.        ],
       [1.        , 0.        , 1.        , 0.        ],
       [1.14549722, 0.20576415, 2.        , 0.70710678],
       [1.14549722, 0.20576415, 2.        , 0.70710678],
       [1.14549722, 0.20576415, 2.        , 0.70710678],
       [1.14549722, 0.20576415, 2.        , 0.70710678],
       [2.78516386, 2.58797734, 3.        , 1.15470054],
       [2.78516386, 2.58797734, 3.        , 1.15470054],
       [6.57065706, 1.38071187, 3.        , 1.15470054]])
Now we can use scipy.cluster.hierarchy.is_valid_im to verify that R is correct:
>>> is_valid_im(R)
True
However, if R is wrongly constructed (e.g., one of the standard deviations is set to a negative value), then the check will fail:
>>> R[-1,1] = R[-1,1] * -1
>>> is_valid_im(R)
False
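The shape and dtype requirements from the description (an \(n\) by 4 array of doubles) can be exercised in the same way. This is a small sketch with made-up values; the arrays `good`, `wrong_shape`, and `wrong_dtype` are hypothetical and do not come from an actual clustering.

```python
import numpy as np
from scipy.cluster.hierarchy import is_valid_im

# A well-formed 3-by-4 array of doubles. Columns: link height mean,
# standard deviation, link count, inconsistency coefficient.
# The values are invented for illustration only.
good = np.array([[1.0, 0.0, 1.0, 0.0],
                 [1.0, 0.0, 1.0, 0.0],
                 [1.5, 0.5, 2.0, 0.7]])

# Same data with one column dropped: no longer n by 4.
wrong_shape = good[:, :3].copy()

# Correct shape, but integer entries instead of doubles.
wrong_dtype = np.ones((3, 4), dtype=np.int64)

results = {
    "good": is_valid_im(good),
    "wrong_shape": is_valid_im(wrong_shape),
    "wrong_dtype": is_valid_im(wrong_dtype),
}
print(results)
```

If integer data needs to pass the check, converting it first with `astype(float)` is one option, assuming the values themselves satisfy the remaining conditions.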