scipy.cluster.hierarchy.fcluster
scipy.cluster.hierarchy.fcluster(Z, t, criterion='inconsistent', depth=2, R=None, monocrit=None)
Form flat clusters from the hierarchical clustering defined by the given linkage matrix.
Parameters:
Z : ndarray
The hierarchical clustering encoded with the matrix returned by the linkage function.
t : float
The threshold to apply when forming flat clusters.
criterion : str, optional
The criterion to use in forming flat clusters. This can be any of the following values:
inconsistent : If a cluster node and all its descendants have an inconsistent value less than or equal to t, then all its leaf descendants belong to the same flat cluster. When no non-singleton cluster meets this criterion, every node is assigned to its own cluster. (Default)
distance : Forms flat clusters so that the original observations in each flat cluster have a cophenetic distance of no more than t.
maxclust : Finds a minimum threshold r so that the cophenetic distance between any two original observations in the same flat cluster is no more than r and no more than t flat clusters are formed.
monocrit : Forms a flat cluster from a cluster node c with index i when monocrit[i] <= t. For example, to threshold on the maximum mean distance as computed in the inconsistency matrix R with a threshold of 0.8, do:
    MR = maxRstat(Z, R, 3)
    fcluster(Z, t=0.8, criterion='monocrit', monocrit=MR)
maxclust_monocrit : Forms a flat cluster from a non-singleton cluster node c when monocrit[i] <= r for all cluster indices i below and including c. r is minimized such that no more than t flat clusters are formed. monocrit must be monotonic. For example, to minimize the threshold r on maximum inconsistency values so that no more than 3 flat clusters are formed, do:
    MI = maxinconsts(Z, R)
    fcluster(Z, t=3, criterion='maxclust_monocrit', monocrit=MI)
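The distance and maxclust criteria are the simplest to try out. The following is a minimal sketch, assuming a small made-up data set X and single linkage (both chosen only for illustration); it cuts the same hierarchy once by a cophenetic distance threshold and once by a maximum cluster count.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical toy data: two well-separated groups of 1-D observations.
X = np.array([[0.0], [0.1], [0.3], [5.0], [5.1], [5.3]])

# Build the hierarchy; Z is the (n - 1) x 4 linkage matrix.
Z = linkage(X, method='single')

# 'distance': observations in one flat cluster have cophenetic
# distance no greater than t.
by_distance = fcluster(Z, t=1.0, criterion='distance')

# 'maxclust': here t is the maximum number of flat clusters.
by_maxclust = fcluster(Z, t=2, criterion='maxclust')

print(by_distance)
print(by_maxclust)
```

Note that t changes meaning with the criterion: it is a distance in the first call and a cluster count in the second.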
depth : int, optional
The maximum depth to perform the inconsistency calculation. It has no meaning for the other criteria. Default is 2.
R : ndarray, optional
The inconsistency matrix to use for the ‘inconsistent’ criterion. This matrix is computed if not provided.
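As a rough illustration of how depth and R relate, the sketch below computes the inconsistency matrix explicitly with scipy.cluster.hierarchy.inconsistent and passes it in, then compares the result with letting fcluster compute it. The data X, the single-linkage method, and the threshold of 1.0 are assumptions made for the example only.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, inconsistent, fcluster

# Hypothetical toy data, for illustration only.
X = np.array([[0.0], [0.1], [0.3], [5.0], [5.1], [5.3]])
Z = linkage(X, method='single')

# Precompute the inconsistency matrix to the chosen depth.
depth = 2
R = inconsistent(Z, d=depth)  # one row per non-singleton cluster

# Let fcluster compute R internally from depth ...
labels_auto = fcluster(Z, t=1.0, criterion='inconsistent', depth=depth)

# ... or supply the precomputed matrix.
labels_given = fcluster(Z, t=1.0, criterion='inconsistent', R=R)

print(np.array_equal(labels_auto, labels_given))
```

Supplying R is mainly useful when the same inconsistency matrix is reused across several thresholds.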
monocrit : ndarray, optional
An array of length n-1. monocrit[i] is the statistic upon which non-singleton cluster i is thresholded. The monocrit vector must be monotonic, i.e. given a node c with index i, for all node indices j corresponding to nodes below c, monocrit[i] >= monocrit[j].
Returns:
fcluster : ndarray
An array of length n. T[i] is the flat cluster number to which original observation i belongs.
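To show how the returned array can be read, here is a minimal sketch that uses the monocrit criterion with scipy.cluster.hierarchy.maxdists as the monotonic statistic, then groups observation indices by flat cluster number. The data X, the linkage method, the threshold, and the members dictionary are illustrative assumptions, not part of the documented API.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, maxdists, fcluster

# Hypothetical toy data, for illustration only.
X = np.array([[0.0], [0.1], [0.3], [5.0], [5.1], [5.3]])
Z = linkage(X, method='single')

# maxdists(Z)[i] is the largest merge distance at or below
# non-singleton cluster i, so it is monotonic by construction.
MD = maxdists(Z)

# Threshold on that statistic.
T = fcluster(Z, t=1.0, criterion='monocrit', monocrit=MD)

# T has one entry per original observation; cluster numbers start at 1.
# Collect the observation indices that belong to each flat cluster.
members = {int(k): np.flatnonzero(T == k).tolist() for k in np.unique(T)}
print(members)
```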