scipy.stats.mstats.spearmanr

scipy.stats.mstats.spearmanr(x, y=None, use_ties=True, axis=None, nan_policy='propagate', alternative='two-sided')

Calculates a Spearman rank-order correlation coefficient and the p-value to test for non-correlation.

The Spearman correlation is a nonparametric measure of the monotonicity of the relationship between two datasets. Unlike the Pearson correlation, the Spearman correlation does not assume that both datasets are normally distributed. Like other correlation coefficients, this one varies between -1 and +1, with 0 implying no correlation. Correlations of -1 or +1 imply an exact monotonic relationship. Positive correlations imply that as x increases, so does y. Negative correlations imply that as x increases, y decreases.
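As a minimal sketch of the difference from the Pearson correlation, the following uses made-up data in which y is a strictly increasing but nonlinear function of x. The arrays and variable names are illustrative assumptions, not part of the API.

```python
import numpy as np
from scipy import stats
from scipy.stats import mstats

# Illustrative data: y increases strictly with x, but not linearly.
x = np.arange(1, 11, dtype=float)
y = x ** 3

# Rank-based coefficient: depends only on the ordering of the values.
rho = mstats.spearmanr(x, y).statistic

# Pearson coefficient for comparison: sensitive to the curvature.
r = stats.pearsonr(x, y)[0]

print(f"Spearman rho = {rho:.4f}")
print(f"Pearson r    = {r:.4f}")
```

Because the ranks of x and y coincide, the Spearman coefficient reaches its maximum, while the Pearson coefficient stays below 1.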

Missing values are discarded pair-wise: if a value is missing in x, the corresponding value in y is masked.
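The pair-wise discarding can be sketched with two small masked arrays. The data and the manual comparison below are illustrative assumptions; the point is only that a pair is dropped whenever either member is masked.

```python
import numpy as np
import numpy.ma as ma
from scipy.stats import mstats

# Illustrative data: one value masked in x, a different one masked in y.
x = ma.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], mask=[0, 0, 1, 0, 0, 0])
y = ma.array([2.0, 1.0, 9.0, 4.0, 3.0, 6.0], mask=[0, 0, 0, 0, 1, 0])

res_masked = mstats.spearmanr(x, y)

# Manual equivalent: keep only the pairs in which neither value is masked.
keep = ~(ma.getmaskarray(x) | ma.getmaskarray(y))
res_manual = mstats.spearmanr(x.data[keep], y.data[keep])

print(int(keep.sum()), "pairs used")
print(res_masked.statistic, res_manual.statistic)
```

Both calls should report the same coefficient, since the masked call effectively sees only the four complete pairs.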

The p-value roughly indicates the probability of an uncorrelated system producing datasets that have a Spearman correlation at least as extreme as the one computed from these datasets. The p-values are approximate: they are not entirely reliable for small samples, but are probably reasonable for datasets with more than about 500 observations.
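One way to make this interpretation concrete is a small permutation experiment: shuffling y breaks any association with x, so the fraction of shuffles whose coefficient is at least as extreme as the observed one estimates the same probability. This is a sketch under stated assumptions (synthetic data, 500 permutations, a two-sided comparison, an add-one correction); it is not how the function itself computes its p-value.

```python
import numpy as np
from scipy.stats import mstats

rng = np.random.default_rng(0)

# Synthetic, illustrative data with a moderate positive association.
x = rng.normal(size=30)
y = 0.5 * x + rng.normal(size=30)

observed = mstats.spearmanr(x, y)

# Shuffle y repeatedly to simulate an uncorrelated system.
n_perm = 500
count = 0
for _ in range(n_perm):
    rho_perm = mstats.spearmanr(x, rng.permutation(y)).statistic
    if abs(rho_perm) >= abs(observed.statistic):
        count += 1

# Add-one correction keeps the estimate strictly positive.
p_perm = (count + 1) / (n_perm + 1)

print("reported p-value:   ", observed.pvalue)
print("permutation p-value:", p_perm)
```

The two values need not agree closely for a sample this small, which is the caveat the paragraph above describes.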

Parameters:

x, y : 1-D or 2-D array_like, y is optional

One or two 1-D or 2-D arrays containing multiple variables and observations. When these are 1-D, each represents a vector of observations of a single variable. For the behavior in the 2-D case, see under axis, below.

use_ties : bool, optional

DO NOT USE. This keyword does nothing; it is kept only for backwards compatibility.

axis : int or None, optional

If axis=0, then each column represents a variable, with observations in the rows. If axis=1, the relationship is transposed: each row represents a variable, while the columns contain observations. If axis=None (the default in the signature above), then both arrays will be raveled.
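The three axis settings can be compared on a pair of 2-D inputs. The sketch below uses synthetic data and passes axis explicitly in every call; the array shapes and names are illustrative assumptions.

```python
import numpy as np
from scipy.stats import mstats

rng = np.random.default_rng(1)

# Two illustrative blocks, each with 8 observations (rows) of 2 variables (columns).
x = rng.normal(size=(8, 2))
y = rng.normal(size=(8, 2))

# axis=0: columns are variables, so x and y contribute 4 variables in total.
by_columns = mstats.spearmanr(x, y, axis=0)

# axis=1: rows are variables, so the transposed inputs describe the same data.
by_rows = mstats.spearmanr(x.T, y.T, axis=1)

# axis=None: each input is flattened and treated as a single variable.
raveled = mstats.spearmanr(x, y, axis=None)

print(by_columns.statistic.shape)
print(np.allclose(by_columns.statistic, by_rows.statistic))
print(np.ndim(raveled.statistic))
```

With four variables the statistic is a 4 x 4 matrix, whereas the raveled call compares just two flattened vectors and returns a scalar.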

nan_policy : {‘propagate’, ‘raise’, ‘omit’}, optional

Defines how to handle input that contains nan. ‘propagate’ returns nan, ‘raise’ throws an error, and ‘omit’ performs the calculations ignoring nan values. Default is ‘propagate’.
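As a sketch of the ‘omit’ option only, the following compares it with masking the nan values by hand through numpy.ma.masked_invalid. The data are illustrative, and the comparison is an assumption about equivalent usage rather than a statement about the implementation.

```python
import numpy as np
import numpy.ma as ma
from scipy.stats import mstats

# Illustrative data with a nan at a different position in each array.
x = np.array([1.0, 2.0, np.nan, 4.0, 5.0, 6.0, 7.0])
y = np.array([1.5, 0.5, 3.0, np.nan, 4.5, 6.5, 5.5])

# Let the function ignore nan values.
res_omit = mstats.spearmanr(x, y, nan_policy='omit')

# Mask the nan values explicitly before the call.
res_mask = mstats.spearmanr(ma.masked_invalid(x), ma.masked_invalid(y))

print(res_omit.statistic, res_mask.statistic)
```

Either route leaves five complete pairs, so the two coefficients should match.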

alternative : {‘two-sided’, ‘less’, ‘greater’}, optional

Defines the alternative hypothesis. Default is ‘two-sided’. The following options are available:

‘two-sided’: the correlation is nonzero
‘less’: the correlation is negative (less than zero)
‘greater’: the correlation is positive (greater than zero)

Added in version 1.7.0.
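The three alternatives can be compared on a small dataset with a clear positive trend. The data below are illustrative; the relationships printed at the end are what one would expect from a symmetric reference distribution and are offered as a sanity check, not as a guarantee of the API.

```python
import numpy as np
from scipy.stats import mstats

# Illustrative data: neighbouring values are swapped, but the trend is upward.
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([2, 1, 4, 3, 6, 5, 8, 7], dtype=float)

p_two = mstats.spearmanr(x, y, alternative='two-sided').pvalue
p_greater = mstats.spearmanr(x, y, alternative='greater').pvalue
p_less = mstats.spearmanr(x, y, alternative='less').pvalue

print("two-sided:", p_two)
print("greater:  ", p_greater)  # small: the data support a positive correlation
print("less:     ", p_less)     # large: the data do not support a negative one

# Sanity checks on how the three p-values relate to one another.
print(np.isclose(p_greater + p_less, 1.0))
print(np.isclose(p_two, 2 * min(p_greater, p_less)))
```

Choosing ‘greater’ or ‘less’ is appropriate only when the direction of the effect was hypothesized before looking at the data.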

Returns:

res : SignificanceResult

An object containing attributes:

statistic : float or ndarray (2-D square). Spearman correlation matrix, or correlation coefficient if only 2 variables are given as parameters. The correlation matrix is square, with length equal to the total number of variables (columns or rows) in x and y combined.
pvalue : float or ndarray. The p-value for a hypothesis test whose null hypothesis is that the two sets of data have no rank correlation. See alternative above for alternative hypotheses. pvalue has the same shape as statistic.

References

[CRCProbStat2000] section 14.7