Distance matrix computation from a collection of raw observation vectors stored in a rectangular array.

pdist(X[, metric, p, w, V, VI]) |
Computes the pairwise distances between m original observations in n-dimensional space. |

cdist(XA, XB[, metric, p, V, VI, w]) |
Computes the distance between each pair of observation vectors in the two collections XA and XB. |

squareform(X[, force, checks]) |
Converts a vector-form distance vector to a square-form distance matrix, and vice-versa. |
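
As a rough illustration of how these three functions relate, the sketch below builds a condensed distance vector, expands it to square form, and compares one collection against another. The array values and the variable names (`X`, `XB`, `y`, `D`, `C`) are made up for the example, and it assumes a reasonably recent NumPy and SciPy are installed.

```python
import numpy as np
from scipy.spatial.distance import pdist, cdist, squareform

# Three observations (rows) in 2-dimensional space; values are illustrative.
X = np.array([[0.0, 0.0],
              [3.0, 4.0],
              [6.0, 8.0]])

# Condensed form: one entry per unordered pair, ordered (0,1), (0,2), (1,2).
y = pdist(X, metric="euclidean")

# Square, redundant form: symmetric with a zero diagonal.
D = squareform(y)

# squareform also converts back: a square matrix yields the condensed vector.
y_back = squareform(D)

# cdist compares every row of the first array against every row of the second.
XB = np.array([[0.0, 0.0]])
C = cdist(X, XB, metric="euclidean")

print(y)        # condensed distances
print(D.shape)  # (3, 3)
print(C.shape)  # (3, 1)
```

The condensed vector stores m(m-1)/2 entries for m observations, so it is usually the cheaper form to keep in memory; the square form is convenient for indexing by observation pair.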

Predicates for checking the validity of distance matrices, both condensed and redundant. This module also contains functions for computing the number of observations in a distance matrix.

is_valid_dm(D[, tol, throw, name, warning]) |
Returns True if the variable D passed is a valid distance matrix. |

is_valid_y(y[, warning, throw, name]) |
Returns True if the variable y passed is a valid condensed distance matrix. |

num_obs_dm(d) |
Returns the number of original observations that correspond to a square, redundant distance matrix. |

num_obs_y(Y) |
Returns the number of original observations that correspond to a condensed distance matrix. |
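
A minimal sketch of the predicates and observation counters follows, assuming default arguments throughout (so the validity checks return False rather than raising). The input points and the deliberately broken matrix `bad` are invented for the example.

```python
import numpy as np
from scipy.spatial.distance import (
    pdist, squareform, is_valid_dm, is_valid_y, num_obs_dm, num_obs_y,
)

# Four illustrative observations in the plane.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

y = pdist(X)        # condensed form: 4 * 3 / 2 = 6 entries
D = squareform(y)   # redundant form: 4 x 4

valid_y = is_valid_y(y)
valid_D = is_valid_dm(D)

# Breaking symmetry makes the matrix fail validation.
bad = D.copy()
bad[0, 1] += 1.0
valid_bad = is_valid_dm(bad)

# Both counters recover the number of original observations.
n_from_D = num_obs_dm(D)
n_from_y = num_obs_y(y)

print(valid_y, valid_D, valid_bad, n_from_D, n_from_y)
```

Running the predicates before handing a matrix to downstream code is a cheap way to catch a matrix that was filled in by hand and is not symmetric or has a nonzero diagonal.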

Distance functions between two vectors `u` and `v`. Computing
distances over a large collection of vectors is inefficient for these
functions. Use `pdist` for this purpose.

braycurtis(u, v) |
Computes the Bray-Curtis distance between two n-vectors u and v. |

canberra(u, v) |
Computes the Canberra distance between two n-vectors u and v. |

chebyshev(u, v) |
Computes the Chebyshev distance between two n-vectors u and v. |

cityblock(u, v) |
Computes the Manhattan (city block) distance between two n-vectors u and v. |

correlation(u, v) |
Computes the correlation distance between two n-vectors u and v. |

cosine(u, v) |
Computes the cosine distance between two n-vectors u and v. |

dice(u, v) |
Computes the Dice dissimilarity between two boolean n-vectors u and v. |

euclidean(u, v) |
Computes the Euclidean distance between two n-vectors u and v. |

hamming(u, v) |
Computes the Hamming distance between two n-vectors u and v. |

jaccard(u, v) |
Computes the Jaccard-Needham dissimilarity between two boolean n-vectors u and v. |

kulsinski(u, v) |
Computes the Kulsinski dissimilarity between two boolean n-vectors u and v. |

mahalanobis(u, v, VI) |
Computes the Mahalanobis distance between two n-vectors u and v, where VI is the inverse of the covariance matrix. |

matching(u, v) |
Computes the Matching dissimilarity between two boolean n-vectors u and v. |

minkowski(u, v, p) |
Computes the Minkowski distance between two vectors u and v. |

rogerstanimoto(u, v) |
Computes the Rogers-Tanimoto dissimilarity between two boolean n-vectors u and v. |

russellrao(u, v) |
Computes the Russell-Rao dissimilarity between two boolean n-vectors u and v. |

seuclidean(u, v, V) |
Returns the standardized Euclidean distance between two n-vectors u and v. |

sokalmichener(u, v) |
Computes the Sokal-Michener dissimilarity between two boolean vectors u and v. |

sokalsneath(u, v) |
Computes the Sokal-Sneath dissimilarity between two boolean vectors u and v. |

sqeuclidean(u, v) |
Computes the squared Euclidean distance between two n-vectors u and v. |

yule(u, v) |
Computes the Yule dissimilarity between two boolean n-vectors u and v. |
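
The sketch below calls a handful of the two-vector functions directly and then shows why `pdist` is preferred for collections: a Python loop over pairs produces the same values as a single `pdist` call, only more slowly. The vectors and the stacked array `X` are illustrative, and only functions that are, to the best of our knowledge, still available in current SciPy releases are used.

```python
import numpy as np
from scipy.spatial.distance import (
    euclidean, cityblock, chebyshev, hamming, jaccard, pdist,
)

# Two illustrative real-valued vectors.
u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 6.0, 3.0])

d_euclid = euclidean(u, v)   # sqrt(3**2 + 4**2 + 0**2)
d_city = cityblock(u, v)     # |3| + |4| + |0|
d_cheb = chebyshev(u, v)     # max(|3|, |4|, |0|)

# Two illustrative boolean vectors.
a = np.array([True, False, True, False])
b = np.array([True, True, False, False])

d_hamming = hamming(a, b)    # fraction of positions that disagree
d_jaccard = jaccard(a, b)    # disagreements over positions where either is True

# Pairwise distances over a collection: explicit loop versus pdist.
X = np.vstack([u, v, np.zeros(3)])
m = len(X)
looped = np.array([euclidean(X[i], X[j])
                   for i in range(m) for j in range(i + 1, m)])
vectorized = pdist(X, metric="euclidean")

print(d_euclid, d_city, d_cheb, d_hamming, d_jaccard)
print(np.allclose(looped, vectorized))
```

The loop visits pairs in the same order that `pdist` uses for its condensed output, which is why the two arrays can be compared element by element.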

Copyright (C) Damian Eads, 2007-2008. New BSD License.