This is documentation for an old release of SciPy (version 0.10.1).
Function Reference
Distance matrix computation from a collection of raw observation vectors
stored in a rectangular array.
pdist(X[, metric, p, w, V, VI]) | Computes the pairwise distances between m original observations in n-dimensional space. |
cdist(XA, XB[, metric, p, V, VI, w]) | Computes the distance between each pair of observation vectors drawn from the two collections XA and XB. |
squareform(X[, force, checks]) | Converts a vector-form distance vector to a square-form distance matrix, and vice versa. |
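The relationship between the three functions above can be sketched as follows. This is a minimal illustration, not part of the original reference; the arrays X, XA, and XB hold made-up values chosen so the distances are easy to check by hand.

```python
import numpy as np
from scipy.spatial.distance import pdist, cdist, squareform

# Three observations (rows) in 2-dimensional space; values are illustrative.
X = np.array([[0.0, 0.0],
              [3.0, 4.0],
              [6.0, 8.0]])

# Condensed form: one entry per unordered pair (0,1), (0,2), (1,2).
y = pdist(X, metric='euclidean')      # [5., 10., 5.]

# Square (redundant) form: symmetric m-by-m matrix with a zero diagonal.
D = squareform(y)

# squareform also converts a square matrix back to condensed form.
y_back = squareform(D)

# cdist compares every row of XA with every row of XB.
XA = X[:2]
XB = X[1:]
C = cdist(XA, XB, metric='cityblock')  # [[7., 14.], [0., 7.]]
```

For m observations the condensed vector has m(m-1)/2 entries, which is why it is usually the cheaper form to store.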
Predicates for checking the validity of distance matrices, both
condensed and redundant. Also contained in this module are functions
for computing the number of observations in a distance matrix.
is_valid_dm(D[, tol, throw, name, warning]) | Returns True if the variable D passed is a valid distance matrix. |
is_valid_y(y[, warning, throw, name]) | Returns True if the variable y passed is a valid condensed distance matrix. |
num_obs_dm(d) | Returns the number of original observations that correspond to a square, redundant distance matrix. |
num_obs_y(Y) | Returns the number of original observations that correspond to a condensed distance matrix. |
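A short sketch of how these predicates and counting functions might be used together. The input points are illustrative, and the variable names (ok_y, bad, and so on) are chosen here for the example only.

```python
import numpy as np
from scipy.spatial.distance import (pdist, squareform, is_valid_dm,
                                    is_valid_y, num_obs_dm, num_obs_y)

# Four observations in the plane (corners of the unit square).
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

y = pdist(X)        # condensed form, length 4 * 3 / 2 = 6
D = squareform(y)   # redundant form, shape (4, 4)

ok_y = is_valid_y(y)      # True: the length is a valid number of pairs
ok_D = is_valid_dm(D)     # True: square, symmetric, zero diagonal

n_from_y = num_obs_y(y)   # 4
n_from_D = num_obs_dm(D)  # 4

# Breaking symmetry makes the square matrix invalid.
bad = D.copy()
bad[0, 1] += 1.0
ok_bad = is_valid_dm(bad)           # False

# A vector of length 5 cannot be a condensed distance matrix,
# because 5 is not of the form m * (m - 1) / 2.
ok_short = is_valid_y(np.zeros(5))  # False
```

With the default throw=False the predicates return False for invalid input rather than raising an exception.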
Distance functions between two vectors u and v. Computing
distances over a large collection of vectors is inefficient for these
functions. Use pdist for this purpose.
braycurtis(u, v) | Computes the Bray-Curtis distance between two n-vectors u and v. |
canberra(u, v) | Computes the Canberra distance between two n-vectors u and v. |
chebyshev(u, v) | Computes the Chebyshev distance between two n-vectors u and v. |
cityblock(u, v) | Computes the Manhattan distance between two n-vectors u and v. |
correlation(u, v) | Computes the correlation distance between two n-vectors u and v. |
cosine(u, v) | Computes the cosine distance between two n-vectors u and v. |
dice(u, v) | Computes the Dice dissimilarity between two boolean n-vectors u and v. |
euclidean(u, v) | Computes the Euclidean distance between two n-vectors u and v. |
hamming(u, v) | Computes the Hamming distance between two n-vectors u and v. |
jaccard(u, v) | Computes the Jaccard-Needham dissimilarity between two boolean n-vectors u and v. |
kulsinski(u, v) | Computes the Kulsinski dissimilarity between two boolean n-vectors u and v. |
mahalanobis(u, v, VI) | Computes the Mahalanobis distance between two n-vectors u and v. |
matching(u, v) | Computes the Matching dissimilarity between two boolean n-vectors u and v. |
minkowski(u, v, p) | Computes the Minkowski distance between two vectors u and v. |
rogerstanimoto(u, v) | Computes the Rogers-Tanimoto dissimilarity between two boolean n-vectors u and v. |
russellrao(u, v) | Computes the Russell-Rao dissimilarity between two boolean n-vectors u and v. |
seuclidean(u, v, V) | Returns the standardized Euclidean distance between two n-vectors u and v. |
sokalmichener(u, v) | Computes the Sokal-Michener dissimilarity between two boolean vectors u and v. |
sokalsneath(u, v) | Computes the Sokal-Sneath dissimilarity between two boolean vectors u and v. |
sqeuclidean(u, v) | Computes the squared Euclidean distance between two n-vectors u and v. |
yule(u, v) | Computes the Yule dissimilarity between two boolean n-vectors u and v. |
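As a rough illustration, the sketch below calls a handful of the two-vector functions directly and then checks that looping over pairs reproduces what pdist computes in one call. The vectors and the parameter values (V, VI, p) are made up for the example.

```python
import numpy as np
from scipy.spatial.distance import (euclidean, sqeuclidean, cityblock,
                                    chebyshev, minkowski, seuclidean,
                                    mahalanobis, hamming, jaccard, pdist)

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 6.0, 3.0])   # u - v = (-3, -4, 0)

d_euc = euclidean(u, v)          # sqrt(9 + 16 + 0) = 5
d_sq = sqeuclidean(u, v)         # 25
d_city = cityblock(u, v)         # 3 + 4 + 0 = 7
d_cheb = chebyshev(u, v)         # max(3, 4, 0) = 4
d_mink = minkowski(u, v, p=1)    # p = 1 coincides with cityblock: 7

# seuclidean divides each squared difference by a per-component variance V.
V = np.array([1.0, 4.0, 1.0])    # illustrative variances
d_seu = seuclidean(u, v, V)      # sqrt(9/1 + 16/4 + 0/1) = sqrt(13)

# mahalanobis takes the inverse covariance matrix VI; with the identity
# matrix it reduces to the Euclidean distance.
d_mah = mahalanobis(u, v, np.eye(3))   # 5

# Boolean vectors for the dissimilarity measures.
a = np.array([True, False, True, True])
b = np.array([True, True, False, True])
d_ham = hamming(a, b)            # 2 of 4 positions differ: 0.5
d_jac = jaccard(a, b)            # 2 mismatches / 4 positions with any True: 0.5

# Pairwise loop over a small collection versus a single pdist call.
X = np.vstack([u, v, u + v])
loop = [euclidean(X[i], X[j])
        for i in range(len(X)) for j in range(i + 1, len(X))]
same = np.allclose(loop, pdist(X, metric='euclidean'))   # True
```

The loop and pdist give the same numbers, but pdist avoids the per-call Python overhead, which matters once the collection grows.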
Copyright Notice
Copyright (C) Damian Eads, 2007-2008. New BSD License.