Hierarchical clustering (scipy.cluster.hierarchy)
These functions cut hierarchical clusterings into flat clusterings or find the roots of the forest formed by a cut by providing the flat cluster ids of each observation.
fcluster | Form flat clusters from the hierarchical clustering defined by the given linkage matrix. |
fclusterdata | Cluster observation data using a given metric. |
leaders | Return the root nodes in a hierarchical clustering. |
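As a minimal sketch of how the flat-clustering routines fit together, the snippet below builds a hierarchy with linkage, cuts it with fcluster, repeats both steps in one call with fclusterdata, and looks up the cluster roots with leaders. The data array X and the choice of the "maxclust" criterion are illustrative assumptions, not part of the reference text.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, fclusterdata, leaders, linkage

# Illustrative data (an assumption, not from the documentation):
# two well-separated groups of three 2-D points each.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])

# Build the hierarchy, then cut it into at most two flat clusters.
Z = linkage(X, method="single")
T = fcluster(Z, t=2, criterion="maxclust")

# fclusterdata performs both steps in one call, starting from raw observations.
T_direct = fclusterdata(X, t=2, criterion="maxclust", method="single")

# leaders returns, for each flat cluster, the id of the root node of its
# subtree (L) and the matching flat cluster id (M).
L, M = leaders(Z, T)
print(T, T_direct, L, M)
```

Node ids below the number of observations refer to original points, so the root ids in L for these two three-point clusters are ids of merged (non-singleton) nodes.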
These are routines for agglomerative clustering.
linkage | Perform hierarchical/agglomerative clustering. |
single | Perform single/min/nearest linkage on a condensed distance matrix. |
complete | Perform complete/max/farthest point linkage on a condensed distance matrix. |
average | Perform average/UPGMA linkage on a condensed distance matrix. |
weighted | Perform weighted/WPGMA linkage on a condensed distance matrix. |
centroid | Perform centroid/UPGMC linkage. |
median | Perform median/WPGMC linkage. |
ward | Perform Ward’s linkage on a condensed distance matrix. |
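The following sketch shows one way to call the agglomerative routines on a condensed distance matrix. The 1-D sample points are an illustrative assumption, and pdist from scipy.spatial.distance is used here only to produce the condensed matrix.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, single, ward
from scipy.spatial.distance import pdist

# Illustrative 1-D observations (an assumption, not from the documentation).
X = np.array([[0.0], [1.0], [5.0], [6.0], [20.0]])

# Condensed distance matrix: the n*(n-1)/2 pairwise distances as a flat vector.
y = pdist(X)

# The method-specific helper and the general routine give the same hierarchy.
Z_single = single(y)
Z_same = linkage(y, method="single")

# A different merge criterion applied to the same distances.
Z_ward = ward(y)

# Each row of a linkage matrix is one merge:
# [cluster id a, cluster id b, merge distance, size of the new cluster].
print(Z_single)
```

Each linkage matrix has n - 1 rows, one per merge, and the last row always describes the cluster containing all n observations.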
These routines compute statistics on hierarchies.
cophenet | Calculate the cophenetic distances between each observation in the hierarchical clustering defined by the linkage Z. |
from_mlab_linkage | Convert a linkage matrix generated by MATLAB(TM) to a new linkage matrix compatible with this module. |
inconsistent | Calculate inconsistency statistics on a linkage matrix. |
maxinconsts | Return the maximum inconsistency coefficient for each non-singleton cluster and its children. |
maxdists | Return the maximum distance between any non-singleton cluster. |
maxRstat | Return the maximum statistic for each non-singleton cluster and its children. |
to_mlab_linkage | Convert a linkage matrix to a MATLAB(TM) compatible one. |
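A short sketch of the statistics routines, assuming a small illustrative data set and average linkage (both are assumptions chosen for the example):

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, inconsistent, linkage, maxdists
from scipy.spatial.distance import pdist

# Illustrative data (an assumption, not from the documentation).
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
y = pdist(X)
Z = linkage(y, method="average")

# With the original distances supplied, cophenet returns the cophenetic
# correlation coefficient and the condensed cophenetic distance matrix.
c, coph_dists = cophenet(Z, y)

# One row per merge: mean and standard deviation of the link heights,
# number of links considered, and the inconsistency coefficient.
R = inconsistent(Z, d=2)

# For each non-singleton cluster, the largest merge distance in its subtree.
MD = maxdists(Z)
print(c, R.shape, MD)
```

For a monotonic hierarchy such as this one, the last entry of MD equals the height of the final merge.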
Routines for visualizing flat clusters.
dendrogram | Plot the hierarchical clustering as a dendrogram. |
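The sketch below calls dendrogram with no_plot=True, which computes the layout without rendering, so it can be tried without a display. The data and the choice of Ward linkage are illustrative assumptions.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage

# Illustrative data (an assumption, not from the documentation).
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
Z = linkage(X, method="ward")

# no_plot=True skips the rendering step and only returns the computed layout.
info = dendrogram(Z, no_plot=True)

# 'leaves' lists the observation ids in left-to-right plotting order;
# 'icoord' and 'dcoord' hold the x and y coordinates of each drawn link.
print(info["leaves"])
print(info["ivl"])
```

Dropping no_plot=True (with matplotlib installed) draws the same layout on the current axes.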
These are data structures and routines for representing hierarchies as tree objects.
ClusterNode | A tree node class for representing a cluster. |
leaves_list | Return a list of leaf node ids. |
to_tree | Convert a linkage matrix into an easy-to-use tree object. |
cut_tree | Given a linkage matrix Z, return the cut tree. |
optimal_leaf_ordering | Given a linkage matrix Z and distance, reorder the cut tree. |
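As a rough illustration of the tree-object routines, the snippet below converts a linkage matrix into a ClusterNode tree, walks its leaves, and cuts the tree into two groups. The data and the use of average linkage are assumptions made for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import cut_tree, leaves_list, linkage, to_tree

# Illustrative data (an assumption, not from the documentation).
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
n = X.shape[0]
Z = linkage(X, method="average")

# to_tree returns the root ClusterNode; the root of n observations has
# id 2n - 2 and contains all n leaves.
root = to_tree(Z)
print(root.get_id(), root.get_count(), root.is_leaf())

# Leaf ids reached by a pre-order traversal of the tree object.
tree_leaves = root.pre_order()

# The same information taken directly from the linkage matrix.
flat_leaves = leaves_list(Z)

# Group membership of every observation when the tree is cut into 2 clusters.
labels = cut_tree(Z, n_clusters=[2]).ravel()
print(tree_leaves, flat_leaves, labels)
```

Unlike fcluster, which numbers flat clusters from 1, the labels returned by cut_tree start at 0.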
These are predicates for checking the validity of linkage and inconsistency matrices as well as for checking isomorphism of two flat cluster assignments.
is_valid_im | Return True if the inconsistency matrix passed is valid. |
is_valid_linkage | Check the validity of a linkage matrix. |
is_isomorphic | Determine if two different cluster assignments are equivalent. |
is_monotonic | Return True if the linkage passed is monotonic. |
correspond | Check for correspondence between linkage and condensed distance matrices. |
num_obs_linkage | Return the number of original observations of the linkage matrix passed. |
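The predicates can be exercised on a freshly computed hierarchy, as in the sketch below. The data, the use of complete linkage, and the two hand-written label vectors are illustrative assumptions.

```python
import numpy as np
from scipy.cluster.hierarchy import (correspond, inconsistent, is_isomorphic,
                                     is_monotonic, is_valid_im,
                                     is_valid_linkage, linkage,
                                     num_obs_linkage)
from scipy.spatial.distance import pdist

# Illustrative data (an assumption, not from the documentation).
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
y = pdist(X)
Z = linkage(y, method="complete")

valid_linkage = is_valid_linkage(Z)       # structural check of Z
monotonic = is_monotonic(Z)               # merge heights never decrease
valid_im = is_valid_im(inconsistent(Z))   # structural check of R
matches = correspond(Z, y)                # Z and y describe the same n
n_obs = num_obs_linkage(Z)                # n recovered from Z alone

# Two labelings are isomorphic when they differ only by renaming cluster ids.
a = np.array([1, 1, 2, 2])
same_partition = is_isomorphic(a, np.array([2, 2, 1, 1]))
different_partition = is_isomorphic(a, np.array([1, 2, 1, 2]))
print(valid_linkage, monotonic, valid_im, matches, n_obs)
print(same_partition, different_partition)
```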
Utility routines for plotting:
set_link_color_palette | Set list of matplotlib color codes for use by dendrogram. |
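A small sketch of changing the palette and then restoring the default. The data, the threshold of 5.0, and the color codes are assumptions chosen for illustration; no_plot=True is used so that only the computed colors are inspected.

```python
import numpy as np
from scipy.cluster.hierarchy import (dendrogram, linkage,
                                     set_link_color_palette)

# Illustrative data (an assumption, not from the documentation).
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
Z = linkage(X, method="single")

# Links below the color threshold cycle through this palette.
set_link_color_palette(["m", "c"])

# Links at or above the threshold use above_threshold_color instead.
info = dendrogram(Z, no_plot=True, color_threshold=5.0,
                  above_threshold_color="k")
print(info["color_list"])

# Passing None restores the default palette for later calls.
set_link_color_palette(None)
```

The palette is module-level state, so resetting it after use avoids surprising colors in later dendrogram calls.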
MATLAB and MathWorks are registered trademarks of The MathWorks, Inc.
Mathematica is a registered trademark of The Wolfram Research, Inc.