scipy.cluster.hierarchy.leaders
- scipy.cluster.hierarchy.leaders(Z, T)
- Return the root nodes in a hierarchical clustering.
- Returns the root nodes in a hierarchical clustering corresponding to a cut defined by a flat cluster assignment vector T. See the fcluster function for more information on the format of T.
- For each flat cluster \(j\) of the \(k\) flat clusters represented in the n-sized flat cluster assignment vector T, this function finds the lowest cluster node \(i\) in the linkage tree Z, such that:
- leaf descendants belong only to flat cluster \(j\) (i.e., T[p] == j for all \(p\) in \(S(i)\), where \(S(i)\) is the set of leaf ids of the leaf nodes descending from cluster node \(i\));
- there does not exist a leaf that is not a descendant of \(i\) that also belongs to cluster \(j\) (i.e., T[q] != j for all \(q\) not in \(S(i)\)). If this condition is violated, T is not a valid cluster assignment vector, and an exception will be thrown.
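The two conditions above amount to saying that the leaf set \(S(i)\) of a leader is exactly the set of observations assigned to flat cluster \(j\). A minimal sketch of that check follows, assuming the 12-point dataset from the Examples section; the helper name `check_leaders` is hypothetical and not part of SciPy, and `to_tree` is used only as a convenient way to enumerate leaf descendants.

```python
import numpy as np
from scipy.cluster.hierarchy import ward, fcluster, leaders, to_tree
from scipy.spatial.distance import pdist

X = [[0, 0], [0, 1], [1, 0],
     [0, 4], [0, 3], [1, 4],
     [4, 0], [3, 0], [4, 1],
     [4, 4], [3, 4], [4, 3]]
Z = ward(pdist(X))
T = fcluster(Z, 3, criterion='distance')
L, M = leaders(Z, T)

# nodes[i] is the ClusterNode whose id is i (leaves first, then merges).
_, nodes = to_tree(Z, rd=True)

def check_leaders(nodes, T, L, M):
    """Hypothetical helper: True if S(i) equals the members of cluster j
    for every (leader i, flat cluster id j) pair."""
    for i, j in zip(L, M):
        # S(i): ids of the leaves that descend from linkage node i.
        S = set(nodes[i].pre_order(lambda node: node.id))
        # Observations that T assigns to flat cluster j.
        members = set(np.flatnonzero(T == j).tolist())
        # S <= members is the first condition, members <= S the second.
        if S != members:
            return False
    return True

all_ok = check_leaders(nodes, T, L, M)
print(all_ok)
```

Set equality covers both conditions at once: no leaf under the leader belongs to another flat cluster, and no member of the flat cluster lies outside the leader's subtree.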
 - Parameters
- Z : ndarray
- The hierarchical clustering encoded as a matrix. See linkage for more information.
- T : ndarray
- The flat cluster assignment vector. 
 
- Returns
- L : ndarray
- The leader linkage node ids stored as a k-element 1-D array, where k is the number of flat clusters found in T. L[j] = i is the linkage cluster node id that is the leader of the flat cluster with id M[j]. If i < n, where n is the number of original observations, i corresponds to an original observation; otherwise it corresponds to a non-singleton cluster.
- M : ndarray
- The flat cluster ids stored as a k-element 1-D array, where k is the number of flat clusters found in T. This allows the set of flat cluster ids to be any arbitrary set of k integers. For example: if L[3] = 2 and M[3] = 8, the leader of the flat cluster with id 8 is linkage node 2.
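Because L and M are parallel arrays, one convenient way to use them is to pair them into a lookup table keyed by flat cluster id. The sketch below assumes the same 12-point dataset as the Examples section; the names `leader_of`, `is_singleton`, and `n` are illustrative and not part of the SciPy API.

```python
from scipy.cluster.hierarchy import ward, fcluster, leaders
from scipy.spatial.distance import pdist

X = [[0, 0], [0, 1], [1, 0],
     [0, 4], [0, 3], [1, 4],
     [4, 0], [3, 0], [4, 1],
     [4, 4], [3, 4], [4, 3]]
Z = ward(pdist(X))
T = fcluster(Z, 3, criterion='distance')
L, M = leaders(Z, T)

# Number of original observations: a linkage matrix has n - 1 rows.
n = Z.shape[0] + 1

# Map each flat cluster id M[j] to its leader node id L[j].
leader_of = {int(m): int(l) for l, m in zip(L, M)}

# A leader id below n is an original observation (a singleton cluster);
# an id of n or more refers to a merge recorded in row (id - n) of Z.
is_singleton = {j: i < n for j, i in leader_of.items()}

for j, i in sorted(leader_of.items()):
    kind = "observation" if is_singleton[j] else f"row {i - n} of Z"
    print(f"flat cluster {j}: leader node {i} ({kind})")
```

Keying on the values in M rather than on array positions keeps the lookup valid when the flat cluster ids are an arbitrary set of integers.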
 
 - See also
- fcluster: for the creation of flat cluster assignments.
 - Examples
>>> from scipy.cluster.hierarchy import ward, fcluster, leaders
>>> from scipy.spatial.distance import pdist
- Given a linkage matrix Z, obtained after applying a clustering method to a dataset X, and a flat cluster assignment array T:
>>> X = [[0, 0], [0, 1], [1, 0],
...      [0, 4], [0, 3], [1, 4],
...      [4, 0], [3, 0], [4, 1],
...      [4, 4], [3, 4], [4, 3]]
>>> Z = ward(pdist(X))
>>> Z
array([[ 0.        ,  1.        ,  1.        ,  2.        ],
       [ 3.        ,  4.        ,  1.        ,  2.        ],
       [ 6.        ,  7.        ,  1.        ,  2.        ],
       [ 9.        , 10.        ,  1.        ,  2.        ],
       [ 2.        , 12.        ,  1.29099445,  3.        ],
       [ 5.        , 13.        ,  1.29099445,  3.        ],
       [ 8.        , 14.        ,  1.29099445,  3.        ],
       [11.        , 15.        ,  1.29099445,  3.        ],
       [16.        , 17.        ,  5.77350269,  6.        ],
       [18.        , 19.        ,  5.77350269,  6.        ],
       [20.        , 21.        ,  8.16496581, 12.        ]])
>>> T = fcluster(Z, 3, criterion='distance')
>>> T
array([1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4], dtype=int32)
- scipy.cluster.hierarchy.leaders returns the indices of the nodes in the dendrogram that are the leaders of each flat cluster:
>>> L, M = leaders(Z, T)
>>> L
array([16, 17, 18, 19], dtype=int32)
- (remember that indices 0-11 point to the 12 data points in X, whereas indices 12-22 point to the 11 rows of Z)
- scipy.cluster.hierarchy.leaders also returns the ids of the flat clusters in T:
>>> M
array([1, 2, 3, 4], dtype=int32)
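The description notes that an exception is thrown when T is not a valid cluster assignment vector. The sketch below illustrates one way this could be triggered, by deliberately corrupting a valid assignment so that one flat cluster no longer corresponds to a single subtree. The name `T_bad` is hypothetical, and catching `ValueError` is an assumption based on our reading of recent SciPy releases; the text above only promises that some exception is raised.

```python
from scipy.cluster.hierarchy import ward, fcluster, leaders
from scipy.spatial.distance import pdist

X = [[0, 0], [0, 1], [1, 0],
     [0, 4], [0, 3], [1, 4],
     [4, 0], [3, 0], [4, 1],
     [4, 4], [3, 4], [4, 3]]
Z = ward(pdist(X))
T = fcluster(Z, 3, criterion='distance')

# Corrupt the assignment: move observation 0 into the flat cluster of
# observation 3, which sits in a different branch of the tree. The copy
# keeps the integer dtype that fcluster returned.
T_bad = T.copy()
T_bad[0] = T[3]

try:
    leaders(Z, T_bad)
    raised = False
except ValueError as exc:  # assumed exception type, see the note above
    raised = True
    print("invalid assignment rejected:", exc)
```

After the change, the members of that flat cluster are no longer the leaves of any single linkage node, so no node can satisfy both conditions in the description.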