sen_seasonal_slopes#
- scipy.stats.mstats.sen_seasonal_slopes(x)[source]#
Computes seasonal Theil-Sen and Kendall slope estimators.
The seasonal generalization of Sen’s slope computes the slopes between all pairs of values within a “season” (column) of a 2D array. It returns an array containing the median of these “within-season” slopes for each season (the Theil-Sen slope estimator of each season), and it returns the median of the within-season slopes across all seasons (the seasonal Kendall slope estimator).
- Parameters:
- x2D array_like
Each column of x contains measurements of the dependent variable within a season. The independent variable (usually time) of each season is assumed to be
np.arange(x.shape[0])
.
- Returns:
- result
SenSeasonalSlopesResult
instance The return value is an object with the following attributes:
- intra_slopendarray
For each season, the Theil-Sen slope estimator: the median of within-season slopes.
- inter_slopefloat
The seasonal Kendall slope estimateor: the median of within-season slopes across all seasons.
- result
See also
theilslopes
the analogous function for non-seasonal data
scipy.stats.theilslopes
non-seasonal slopes for non-masked arrays
Notes
The slopes \(d_{ijk}\) within season \(i\) are:
\[d_{ijk} = \frac{x_{ij} - x_{ik}} {j - k}\]for pairs of distinct integer indices \(j, k\) of \(x\).
Element \(i\) of the returned intra_slope array is the median of the \(d_{ijk}\) over all \(j < k\); this is the Theil-Sen slope estimator of season \(i\). The returned inter_slope value, better known as the seasonal Kendall slope estimator, is the median of the \(d_{ijk}\) over all \(i, j, k\).
References
[1]Hirsch, Robert M., James R. Slack, and Richard A. Smith. “Techniques of trend analysis for monthly water quality data.” Water Resources Research 18.1 (1982): 107-121.
Examples
Suppose we have 100 observations of a dependent variable for each of four seasons:
>>> import numpy as np >>> rng = np.random.default_rng() >>> x = rng.random(size=(100, 4))
We compute the seasonal slopes as:
>>> from scipy import stats >>> intra_slope, inter_slope = stats.mstats.sen_seasonal_slopes(x)
If we define a function to compute all slopes between observations within a season:
>>> def dijk(yi): ... n = len(yi) ... x = np.arange(n) ... dy = yi - yi[:, np.newaxis] ... dx = x - x[:, np.newaxis] ... # we only want unique pairs of distinct indices ... mask = np.triu(np.ones((n, n), dtype=bool), k=1) ... return dy[mask]/dx[mask]
then element
i
ofintra_slope
is the median ofdijk[x[:, i]]
:>>> i = 2 >>> np.allclose(np.median(dijk(x[:, i])), intra_slope[i]) True
and
inter_slope
is the median of the values returned bydijk
for all seasons:>>> all_slopes = np.concatenate([dijk(x[:, i]) for i in range(x.shape[1])]) >>> np.allclose(np.median(all_slopes), inter_slope) True
Because the data are randomly generated, we would expect the median slopes to be nearly zero both within and across all seasons, and indeed they are:
>>> intra_slope.data array([ 0.00124504, -0.00277761, -0.00221245, -0.00036338]) >>> inter_slope -0.0010511779872922058