scipy.cluster.hierarchy.fcluster

scipy.cluster.hierarchy.fcluster(Z, t, criterion='inconsistent', depth=2, R=None, monocrit=None)

Forms flat clusters from the hierarchical clustering defined by the linkage matrix Z.

Parameters:

Z : ndarray

The hierarchical clustering encoded with the matrix returned by the linkage function.

t : float

The threshold to apply when forming flat clusters.

criterion : str, optional

The criterion to use in forming flat clusters. This can be any of the following values:

inconsistent : If a cluster node and all its descendants have an inconsistent value less than or equal to t, then all its leaf descendants belong to the same flat cluster. When no non-singleton cluster meets this criterion, every node is assigned to its own cluster. (Default)

distance : Forms flat clusters so that the original observations in each flat cluster have a cophenetic distance no greater than t.

maxclust : Finds a minimum threshold r so that the cophenetic distance between any two original observations in the same flat cluster is no more than r and no more than t flat clusters are formed.

monocrit : Forms a flat cluster from a cluster node c with index i when monocrit[i] <= t.

For example, to threshold on the maximum mean distance as computed in the inconsistency matrix R with a threshold of 0.8 do:

MR = maxRstat(Z, R, 3)
fcluster(Z, t=0.8, criterion='monocrit', monocrit=MR)

maxclust_monocrit : Forms a flat cluster from a non-singleton cluster node c when monocrit[i] <= r for all cluster indices i below and including c. r is minimized such that no more than t flat clusters are formed. monocrit must be monotonic. For example, to minimize the threshold r on maximum inconsistency values so that no more than 3 flat clusters are formed, do:

MI = maxinconsts(Z, R)
fcluster(Z, t=3, criterion='maxclust_monocrit', monocrit=MI)
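
The 'distance' and 'maxclust' criteria can be compared on a small example. The following is a minimal sketch, not taken from the original page: the toy data X and the choice of single linkage are assumptions made only for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Toy data (assumed for illustration): two tight groups on a line.
X = np.array([[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]])
Z = linkage(X, method='single')

# 'distance': observations in one flat cluster have cophenetic distance <= t.
by_distance = fcluster(Z, t=1.0, criterion='distance')

# 'maxclust': t is the maximum number of flat clusters, not a distance.
by_count = fcluster(Z, t=2, criterion='maxclust')

print(by_distance)
print(by_count)
```

Note that t changes meaning with the criterion: it is a distance in the first call and a cluster count in the second.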

depth : int, optional

The maximum depth to perform the inconsistency calculation. It has no meaning for the other criteria. Default is 2.

R : ndarray, optional

The inconsistency matrix to use for the ‘inconsistent’ criterion. This matrix is computed if not provided.
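
As a sketch of how R might be supplied, the inconsistency matrix can be computed once with scipy.cluster.hierarchy.inconsistent and passed in. The toy data and the average linkage method below are assumptions for illustration only; the depth given to inconsistent should match the depth that fcluster would otherwise use.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, inconsistent, fcluster

# Toy data (assumed for illustration).
X = np.array([[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]])
Z = linkage(X, method='average')

# One row of inconsistency statistics per non-singleton cluster.
R = inconsistent(Z, d=2)

# Passing a precomputed R should give the same labels as letting
# fcluster compute it internally at the same depth.
with_R = fcluster(Z, t=1.0, criterion='inconsistent', depth=2, R=R)
without_R = fcluster(Z, t=1.0, criterion='inconsistent', depth=2)
```

Precomputing R is mainly useful when the same matrix is reused, for example with several values of t.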

monocrit : ndarray, optional

An array of length n-1. monocrit[i] is the statistic upon which non-singleton cluster i is thresholded. The monocrit vector must be monotonic, i.e. given a node c with index i, for all node indices j corresponding to nodes below c, monocrit[i] >= monocrit[j].
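
The monotonicity requirement can be checked directly against the linkage matrix. The sketch below is illustrative only: is_monotonic is a hypothetical helper, not part of SciPy, and the random data is an assumption. It relies on the linkage convention that a child index k >= n refers to the cluster formed in row k - n of Z.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, inconsistent, maxRstat, fcluster

rng = np.random.default_rng(0)
X = rng.normal(size=(12, 2))      # assumed toy data
Z = linkage(X, method='average')
R = inconsistent(Z)

# Maximum of column 3 of R over each node and all of its descendants.
MR = maxRstat(Z, R, 3)

def is_monotonic(Z, crit):
    """Hypothetical helper: True if crit[i] >= crit[j] for every child j of i."""
    n = Z.shape[0] + 1
    for i, (a, b) in enumerate(Z[:, :2].astype(int)):
        for child in (a, b):
            # Children with index < n are original observations (leaves).
            if child >= n and crit[child - n] > crit[i]:
                return False
    return True

print(is_monotonic(Z, MR))
labels = fcluster(Z, t=0.8, criterion='monocrit', monocrit=MR)
```

Checking only direct children is enough here, because the inequality then carries over to all deeper descendants.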

Returns:

fcluster : ndarray

An array of length n, referred to here as T. T[i] is the flat cluster number to which original observation i belongs.
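
One way to work with the returned array is to group observation indices by their flat cluster number. This is a minimal sketch under assumed toy data; the members dictionary is an illustrative name, not something fcluster returns.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Toy data (assumed for illustration).
X = np.array([[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]])
Z = linkage(X, method='single')
T = fcluster(Z, t=2, criterion='maxclust')

# Map each flat cluster number to the indices of its observations.
members = {int(label): np.flatnonzero(T == label).tolist()
           for label in np.unique(T)}
print(members)
```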