scipy.cluster.hierarchy.is_valid_linkage
- scipy.cluster.hierarchy.is_valid_linkage(Z, warning=False, throw=False, name=None)
Check the validity of a linkage matrix.
A linkage matrix is valid if it is a 2-D array (type double) with \(n-1\) rows and 4 columns, where \(n\) is the number of original observations. The first two columns must contain cluster indices between 0 and \(2n-3\). For a given row i, the following two expressions have to hold:
\[0 \leq \mathtt{Z[i,0]} \leq i+n-1\]
\[0 \leq \mathtt{Z[i,1]} \leq i+n-1\]
I.e., a cluster cannot join another cluster unless the cluster being joined has already been generated.
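The index condition can be sketched directly in NumPy. This is only an illustration of the two inequalities, not the library's implementation: the helper name `indices_respect_merge_order` is hypothetical, and it assumes that n denotes the number of original observations (one more than the number of rows of Z). It does not reproduce any other check that is_valid_linkage may perform.

```python
import numpy as np

def indices_respect_merge_order(Z):
    """Hypothetical helper: test only 0 <= Z[i, 0], Z[i, 1] <= i + n - 1."""
    Z = np.asarray(Z, dtype=float)
    if Z.ndim != 2 or Z.shape[1] != 4:
        return False
    n = Z.shape[0] + 1                     # assumed: number of original observations
    upper = np.arange(Z.shape[0]) + n - 1  # i + n - 1 for every row i
    pair = Z[:, :2]                        # the two cluster indices joined in each row
    return bool(np.all(pair >= 0) and np.all(pair <= upper[:, None]))

# Illustrative data for 3 observations (clusters 0, 1, 2 exist at the start).
Z_ok = np.array([[0.0, 1.0, 1.0, 2.0],    # row 0 forms cluster 3
                 [2.0, 3.0, 2.0, 3.0]])   # row 1 may now refer to cluster 3
Z_bad = np.array([[0.0, 3.0, 1.0, 2.0],   # row 0 refers to cluster 3 too early
                  [2.0, 1.0, 2.0, 3.0]])

ok_result = indices_respect_merge_order(Z_ok)
bad_result = indices_respect_merge_order(Z_bad)
```

Row i creates cluster n + i, so the largest index a row may reference is i + n - 1, which is what the `upper` array encodes.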
- Parameters:
- Z : array_like
Linkage matrix.
- warning : bool, optional
When True, issues a Python warning if the linkage matrix passed is invalid.
- throw : bool, optional
When True, raises a Python exception if the linkage matrix passed is invalid.
- name : str, optional
This string refers to the variable name of the invalid linkage matrix.
- Returns:
- b : bool
True if the linkage matrix is valid.
See also
linkage
for a description of what a linkage matrix is.
Examples
>>> from scipy.cluster.hierarchy import ward, is_valid_linkage
>>> from scipy.spatial.distance import pdist
All linkage matrices generated by the clustering methods in this module will be valid (i.e., they will have the appropriate dimensions and the two required expressions will hold for all the rows).
We can check this using scipy.cluster.hierarchy.is_valid_linkage:
>>> X = [[0, 0], [0, 1], [1, 0],
...      [0, 4], [0, 3], [1, 4],
...      [4, 0], [3, 0], [4, 1],
...      [4, 4], [3, 4], [4, 3]]
>>> Z = ward(pdist(X))
>>> Z
array([[ 0.        ,  1.        ,  1.        ,  2.        ],
       [ 3.        ,  4.        ,  1.        ,  2.        ],
       [ 6.        ,  7.        ,  1.        ,  2.        ],
       [ 9.        , 10.        ,  1.        ,  2.        ],
       [ 2.        , 12.        ,  1.29099445,  3.        ],
       [ 5.        , 13.        ,  1.29099445,  3.        ],
       [ 8.        , 14.        ,  1.29099445,  3.        ],
       [11.        , 15.        ,  1.29099445,  3.        ],
       [16.        , 17.        ,  5.77350269,  6.        ],
       [18.        , 19.        ,  5.77350269,  6.        ],
       [20.        , 21.        ,  8.16496581, 12.        ]])
>>> is_valid_linkage(Z)
True
However, if we create a linkage matrix incorrectly, or modify a valid one so that any of the required expressions no longer holds, the check will fail:
>>> Z[3][1] = 20    # the cluster number 20 is not defined at this point
>>> is_valid_linkage(Z)
False
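The warning, throw and name parameters change how an invalid matrix is reported. The following is a minimal sketch of the three modes on a small hand-written matrix (the data and the variable names are illustrative). The text above does not state which exception class is raised, so the sketch catches the generic `Exception` rather than assuming a specific type.

```python
import warnings
import numpy as np
from scipy.cluster.hierarchy import is_valid_linkage

# Illustrative linkage matrix for 3 observations; row 0 refers to
# cluster 3 before any row has created it, so the matrix is invalid.
Z_bad = np.array([[0.0, 3.0, 1.0, 2.0],
                  [2.0, 1.0, 2.0, 3.0]])

# Default mode: the function only reports the result.
quiet_result = is_valid_linkage(Z_bad)

# warning=True: a Python warning is issued in addition to the return value.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warned_result = is_valid_linkage(Z_bad, warning=True, name="Z_bad")

# throw=True: an exception is raised instead of returning.
try:
    is_valid_linkage(Z_bad, throw=True, name="Z_bad")
    raised = False
except Exception as exc:  # exact exception type is not assumed here
    raised = True
    message = str(exc)
```

Passing name is optional; it only supplies the variable name that the description of name above associates with the invalid matrix.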