SciPy

This is documentation for an old release of SciPy (version 0.19.0). Read this page in the documentation of the latest stable release (version 1.15.1).

Distance computations (scipy.spatial.distance)

Function Reference

Distance matrix computation from a collection of raw observation vectors stored in a rectangular array.

pdist(X[, metric, p, w, V, VI]) Pairwise distances between observations in n-dimensional space.
cdist(XA, XB[, metric, p, V, VI, w]) Computes distance between each pair of the two collections of inputs.
squareform(X[, force, checks]) Converts a vector-form distance vector to a square-form distance matrix, and vice-versa.
directed_hausdorff(u, v[, seed]) Computes the directed Hausdorff distance between two N-D arrays.

Predicates for checking the validity of distance matrices, both condensed and redundant. Also contained in this module are functions for computing the number of observations in a distance matrix.

is_valid_dm(D[, tol, throw, name, warning]) Returns True if input array is a valid distance matrix.
is_valid_y(y[, warning, throw, name]) Returns True if the input array is a valid condensed distance matrix.
num_obs_dm(d) Returns the number of original observations that correspond to a square, redundant distance matrix.
num_obs_y(Y) Returns the number of original observations that correspond to a condensed distance matrix.

Distance functions between two numeric vectors u and v. Computing distances over a large collection of vectors is inefficient for these functions. Use pdist for this purpose.

braycurtis(u, v) Computes the Bray-Curtis distance between two 1-D arrays.
canberra(u, v) Computes the Canberra distance between two 1-D arrays.
chebyshev(u, v) Computes the Chebyshev distance.
cityblock(u, v) Computes the City Block (Manhattan) distance.
correlation(u, v) Computes the correlation distance between two 1-D arrays.
cosine(u, v) Computes the Cosine distance between 1-D arrays.
euclidean(u, v) Computes the Euclidean distance between two 1-D arrays.
mahalanobis(u, v, VI) Computes the Mahalanobis distance between two 1-D arrays.
minkowski(u, v, p) Computes the Minkowski distance between two 1-D arrays.
seuclidean(u, v, V) Returns the standardized Euclidean distance between two 1-D arrays.
sqeuclidean(u, v) Computes the squared Euclidean distance between two 1-D arrays.
wminkowski(u, v, p, w) Computes the weighted Minkowski distance between two 1-D arrays.

Distance functions between two boolean vectors (representing sets) u and v. As in the case of numerical vectors, pdist is more efficient for computing the distances between all pairs.

dice(u, v) Computes the Dice dissimilarity between two boolean 1-D arrays.
hamming(u, v) Computes the Hamming distance between two 1-D arrays.
jaccard(u, v) Computes the Jaccard-Needham dissimilarity between two boolean 1-D arrays.
kulsinski(u, v) Computes the Kulsinski dissimilarity between two boolean 1-D arrays.
matching(u, v) Computes the Hamming distance between two boolean 1-D arrays.
rogerstanimoto(u, v) Computes the Rogers-Tanimoto dissimilarity between two boolean 1-D arrays.
russellrao(u, v) Computes the Russell-Rao dissimilarity between two boolean 1-D arrays.
sokalmichener(u, v) Computes the Sokal-Michener dissimilarity between two boolean 1-D arrays.
sokalsneath(u, v) Computes the Sokal-Sneath dissimilarity between two boolean 1-D arrays.
yule(u, v) Computes the Yule dissimilarity between two boolean 1-D arrays.

hamming also operates over discrete numerical vectors.