nrt package

Subpackages

Submodules

nrt.fit_methods module

Model fitting

Functions defined in this module always use a 2D array containing the dependent variables (y) and return both the coefficient (beta) and residual matrices. These functions are meant to be called in nrt.BaseNrt._fit().

The RIRLS fit is derived from Chris Holden’s yatsm package. See the copyright statement below.

nrt.fit_methods.ols(X, y)

Fit simple OLS model

Parameters

X ((M, N) np.ndarray) – Matrix of independent variables
y ({(M,), (M, K)} np.ndarray) – Matrix of dependent variables

Returns

beta (numpy.ndarray): The array of regression estimators
residuals (numpy.ndarray): The array of residuals

Return type

tuple
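As a rough illustration of the inputs and outputs described above, the following sketch fits an OLS model with numpy.linalg.lstsq and returns both beta and the residuals. The function name ols_sketch and the test data are hypothetical; this is not the nrt implementation, only a minimal stand-in with the same call shape.

```python
import numpy as np

def ols_sketch(X, y):
    """Hypothetical stand-in for an OLS fit; not the nrt implementation."""
    # Solve min ||X @ beta - y||^2 for every column of y at once
    beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    # Residuals are observed minus predicted values
    residuals = y - X @ beta
    return beta, residuals

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), np.linspace(0, 1, 50)])  # intercept + slope
true_beta = np.array([[1.0, -2.0], [3.0, 0.5]])            # shape (N, K) = (2, 2)
y = X @ true_beta + rng.normal(scale=0.01, size=(50, 2))
beta, residuals = ols_sketch(X, y)
```

With an (M, N) design matrix and an (M, K) matrix of dependent variables, beta has shape (N, K) and the residuals have the same shape as y.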

nrt.fit_methods.weighted_ols(X, y, w)

Apply a weighted OLS fit to 1D data

Parameters

X (np.ndarray) – independent variables
y (np.ndarray) – dependent variable
w (np.ndarray) – observation weights

Returns

coefficients and residual vector

Return type

tuple
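One common way to express a weighted OLS fit is to scale each row of X and y by the square root of its weight and then solve an ordinary least-squares problem. The sketch below assumes that approach and assumes the returned residuals are unweighted; weighted_ols_sketch is a hypothetical name and may differ from what nrt does internally.

```python
import numpy as np

def weighted_ols_sketch(X, y, w):
    """Hypothetical weighted OLS for 1D y; not the nrt implementation."""
    # Scaling rows by sqrt(w) turns weighted least squares into plain OLS
    sw = np.sqrt(w)
    beta, _, _, _ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    # Residuals reported on the original (unweighted) scale: an assumption
    residuals = y - X @ beta
    return beta, residuals

X = np.column_stack([np.ones(6), np.arange(6.0)])
y = 2.0 + 0.5 * np.arange(6.0)
y[3] += 10.0          # one gross outlier
w = np.ones(6)
w[3] = 0.0            # a zero weight removes its influence on the fit
beta, residuals = weighted_ols_sketch(X, y, w)
```

Because the outlier carries zero weight, the fit recovers the underlying line while the outlier still shows up as a large residual.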

nrt.log module

nrt.outliers module

Removing outliers

Functions defined in this module always use a 2D array containing the dependent variables (y) and return y with outliers set to np.nan. These functions are meant to be called in nrt.BaseNrt._fit().

Citations:

Brooks, E.B., Wynne, R.H., Thomas, V.A., Blinn, C.E. and Coulston, J.W., 2013. On-the-fly massively multitemporal change detection using statistical quality control charts and Landsat data. IEEE Transactions on Geoscience and Remote Sensing, 52(6), pp.3316-3332.
Zhu, Zhe, and Curtis E. Woodcock. 2014. “Continuous Change Detection and Classification of Land Cover Using All Available Landsat Data.” Remote Sensing of Environment 144 (March): 152–71. https://doi.org/10.1016/j.rse.2014.01.011.

nrt.outliers.ccdc_rirls(X, y, green, swir, scaling_factor=1, **kwargs)

Screen for missed clouds and other outliers using the green and SWIR bands

Parameters

X ((M, N) np.ndarray) – Matrix of independent variables
y ((M, K) np.ndarray) – Matrix of dependent variables
green (np.ndarray) – 2D array containing spectral values
swir (np.ndarray) – 2D array containing spectral values (~1.55-1.75 µm)
scaling_factor (int) – Scaling factor to bring green and swir values to reflectance values between 0 and 1

Returns

y with outliers set to np.nan

Return type

np.ndarray

nrt.outliers.shewhart(X, y, L=5, **kwargs)

Remove outliers using a Shewhart control chart

As described in Brooks et al. 2014, following an initial OLS fit, outliers are identified using a Shewhart control chart and removed.

Parameters

X ((M, N) np.ndarray) – Matrix of independent variables
y ({(M,), (M, K)} np.ndarray) – Matrix of dependent variables
L (float) – Control limit used for outlier filtering. Must be a positive float. Lower values indicate stricter filtering. Residuals larger than L*sigma are screened out.
**kwargs – not used

Returns

Dependent variables with outliers set to np.nan

Return type

y (np.ndarray)
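The screening logic described above can be sketched as follows. The function name shewhart_sketch is hypothetical, and two details are assumptions rather than documented behaviour: sigma is taken as the residual standard error with M - N degrees of freedom, and the comparison uses the absolute value of the residuals.

```python
import numpy as np

def shewhart_sketch(X, y, L=5):
    """Hypothetical Shewhart-style screening; not the nrt implementation."""
    # Initial OLS fit on all observations
    beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    # Assumed sigma: residual standard error with M - N degrees of freedom
    dof = X.shape[0] - X.shape[1]
    sigma = np.sqrt(np.sum(resid ** 2, axis=0) / dof)
    # Blank out observations whose residual exceeds the control limit
    y_clean = y.astype(float).copy()
    y_clean[np.abs(resid) > L * sigma] = np.nan
    return y_clean

M = 100
X = np.column_stack([np.ones(M), np.linspace(0.0, 1.0, M)])
noise = 0.01 * np.sin(np.arange(M))           # small deterministic noise
y = np.column_stack([1.0 + 2.0 * X[:, 1] + noise,
                     0.5 - 1.0 * X[:, 1] + noise])
y[40, 0] += 5.0                               # inject a single spike
y_clean = shewhart_sketch(X, y, L=5)
```

Only the injected spike is replaced by np.nan here; lowering L would screen out progressively smaller residuals.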

nrt.stats module

nrt.stats.bisquare(resid, c=4.685)

Weight residuals using bisquare weight function

Parameters

resid (np.ndarray) – residuals to be weighted
c (float) – tuning constant for Tukey’s Biweight (default: 4.685)

Returns

weights for residuals

Return type

weight (ndarray)

Reference: http://statsmodels.sourceforge.net/stable/generated/statsmodels.robust.norms.TukeyBiweight.html
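For reference, Tukey's biweight assigns the weight (1 - (r/c)^2)^2 to a residual r with |r| < c and zero otherwise. The sketch below writes that textbook formula with NumPy; bisquare_sketch is a hypothetical name and the code is not taken from nrt.

```python
import numpy as np

def bisquare_sketch(resid, c=4.685):
    """Textbook Tukey biweight weights; hypothetical, not the nrt code."""
    resid = np.asarray(resid, dtype=float)
    # (1 - (r/c)^2)^2 inside (-c, c), zero weight outside
    return np.where(np.abs(resid) < c, (1 - (resid / c) ** 2) ** 2, 0.0)

weights = bisquare_sketch(np.array([0.0, 1.0, 4.685, 10.0]))
```

A zero residual receives full weight, weights shrink smoothly as the residual grows, and anything at or beyond c is ignored entirely.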

nrt.stats.erfcc(x): Complementary error function.

nrt.stats.mad(resid, c=0.6745)

Returns Median-Absolute-Deviation (MAD) for residuals

Parameters

resid (np.ndarray) – residuals
c (float) – scale factor to get to ~standard normal (default: 0.6745). The value 0.6745 is the 0.75 quantile (inverse CDF at 0.75) of the standard normal distribution, so 1 / 0.6745 ≈ 1.4826

Returns

MAD ‘robust’ variance estimate

Return type

float

Reference: http://en.wikipedia.org/wiki/Median_absolute_deviation
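To make the role of the scale factor concrete, the sketch below computes the plain median absolute deviation from the median and divides it by c. The name mad_sketch is hypothetical, and any extra handling the nrt function may apply (for example to zero residuals) is not reproduced here.

```python
import numpy as np

def mad_sketch(resid, c=0.6745):
    """Plain MAD scaled by c; hypothetical, not the nrt implementation."""
    resid = np.asarray(resid, dtype=float)
    # Median absolute deviation from the median, rescaled by c
    return np.median(np.abs(resid - np.median(resid))) / c

rng = np.random.default_rng(42)
sample = rng.normal(loc=0.0, scale=2.0, size=100_000)
sample[:100] = 1e6            # a handful of gross outliers
estimate = mad_sketch(sample)
naive = np.std(sample)
```

For normally distributed residuals the scaled MAD approximates the standard deviation, and unlike np.std it is barely affected by the injected outliers.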

nrt.stats.nan_percentile_axis0(arr, percentiles)

Faster implementation of np.nanpercentile

This implementation always takes the percentile along axis 0. Uses numba to speed up the calculation by more than 7x.

Function is equivalent to np.nanpercentile(arr, <percentiles>, axis=0)

Parameters

arr (np.ndarray) – 2D array to calculate percentiles for
percentiles (np.ndarray) – 1D array of percentiles to calculate

Returns

Array with first dimension corresponding to values passed in percentiles

Return type

np.ndarray
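Since the function is documented as equivalent to np.nanpercentile(arr, <percentiles>, axis=0), the expected output layout can be illustrated with NumPy alone. The array below is made-up example data.

```python
import numpy as np

arr = np.array([[1.0, np.nan, 3.0],
                [2.0, 5.0, np.nan],
                [3.0, 7.0, 9.0],
                [np.nan, 9.0, 6.0]])
percentiles = np.array([25, 50, 75])

# One row per requested percentile, one column per column of arr
result = np.nanpercentile(arr, percentiles, axis=0)
```

The first dimension of the result indexes the requested percentiles, so result[1] holds the per-column medians computed with NaN values ignored.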

nrt.stats.nanlstsq(X, y)

Return the least-squares solution to a linear matrix equation

Analogous to numpy.linalg.lstsq, for dependent variables containing NaN

Parameters

X ((M, N) np.ndarray) – Matrix of independent variables
y ({(M,), (M, K)} np.ndarray) – Matrix of dependent variables

Examples

>>> import numpy as np
>>> from sklearn.datasets import make_regression
>>> from nrt.stats import nanlstsq
>>> # Generate random data
>>> n_targets = 1000
>>> n_features = 2
>>> X, y = make_regression(n_samples=200, n_features=n_features,
...                        n_targets=n_targets)
>>> # Add random nan to y array
>>> y.ravel()[np.random.choice(y.size, 5*n_targets, replace=False)] = np.nan
>>> # Run the regression
>>> beta = nanlstsq(X, y)
>>> assert beta.shape == (n_features, n_targets)

Returns

Least-squares solution, ignoring NaN

Return type

np.ndarray
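One simple way to ignore NaN values is to solve each column of y separately, using only the rows where that column is observed. The sketch below assumes this per-column strategy; nanlstsq_sketch is a hypothetical name, and the nrt function may be implemented differently (for instance for speed).

```python
import numpy as np

def nanlstsq_sketch(X, y):
    """Hypothetical per-column NaN-aware least squares; not the nrt code."""
    beta = np.empty((X.shape[1], y.shape[1]))
    for k in range(y.shape[1]):
        # Keep only the rows where this column of y is observed
        valid = ~np.isnan(y[:, k])
        beta[:, k] = np.linalg.lstsq(X[valid], y[valid, k], rcond=None)[0]
    return beta

X = np.column_stack([np.ones(20), np.arange(20.0)])
true_beta = np.array([[1.0, 2.0, 3.0],
                      [0.5, -1.0, 0.0]])
y = X @ true_beta
# Knock out a few observations in different columns
y[0, 0] = y[5, 1] = y[3, 2] = y[19, 2] = np.nan
beta = nanlstsq_sketch(X, y)
```

Each column keeps its own set of valid rows, so a missing value in one series does not discard that date for the other series.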

nrt.stats.ncdf(x): Normal cumulative distribution function. Source: Stack Overflow (author unknown), https://stackoverflow.com/a/809402/12819237
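The normal CDF and the complementary error function are linked by the identity Phi(x) = 1 - erfc(x / sqrt(2)) / 2. The sketch below uses math.erfc from the standard library as a substitute for an erfcc-style approximation; ncdf_sketch is a hypothetical name, not the nrt function.

```python
import math

def ncdf_sketch(x):
    """Standard normal CDF via the complementary error function."""
    # Phi(x) = 1 - erfc(x / sqrt(2)) / 2
    return 1.0 - 0.5 * math.erfc(x / math.sqrt(2.0))

p_zero = ncdf_sketch(0.0)
p_upper = ncdf_sketch(1.96)
```

By symmetry the value at zero is one half, and the value at 1.96 is close to the familiar 0.975.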

nrt.utils module

nrt.utils.build_regressors(dates, trend=True, harmonic_order=3)

Build the design matrix (X) from a list or an array of datetimes

Trend assumes a temporal resolution no finer than daily. Harmonics assume annual cycles.

Parameters

dates (pandas.DatetimeIndex) – The dates to use for building regressors
trend (bool) – Whether to add a trend component
harmonic_order (int) – The order of the harmonic component

Returns

A design matrix

Return type

numpy.ndarray
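A design matrix of this kind typically combines an intercept, an optional linear trend, and cosine/sine pairs with periods of one year divided by the harmonic index. The sketch below assumes that layout and takes decimal years instead of a pandas.DatetimeIndex to stay dependency-light; the function name, the column order, and the use of raw decimal years for the trend are all assumptions, not the nrt implementation.

```python
import numpy as np

def build_regressors_sketch(decimal_dates, trend=True, harmonic_order=3):
    """Hypothetical harmonic design matrix built from decimal years."""
    t = np.asarray(decimal_dates, dtype=float)
    columns = [np.ones_like(t)]                 # intercept
    if trend:
        columns.append(t)                       # linear trend in years
    for k in range(1, harmonic_order + 1):
        # k-th harmonic of the annual cycle
        columns.append(np.cos(2 * np.pi * k * t))
        columns.append(np.sin(2 * np.pi * k * t))
    return np.column_stack(columns)

dates = 2020 + np.arange(24) / 12               # two years, monthly steps
X = build_regressors_sketch(dates, trend=True, harmonic_order=3)
X_small = build_regressors_sketch(dates, trend=False, harmonic_order=1)
```

With a trend and three harmonics the matrix has 1 + 1 + 2 * 3 = 8 columns, and the harmonic columns repeat exactly one year later.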

nrt.utils.datetimeIndex_to_decimal_dates(dates): Convert a pandas datetime index to decimal dates

nrt.utils.dt_to_decimal(dt): Helper to build a decimal date from a datetime object
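A decimal date is usually the year plus the fraction of that year already elapsed. The sketch below assumes that definition, with the fraction measured against the actual length of the year; dt_to_decimal_sketch is a hypothetical name and the nrt helper may round or count days differently.

```python
import datetime

def dt_to_decimal_sketch(dt):
    """Hypothetical decimal-year conversion; not the nrt implementation."""
    year_start = datetime.datetime(dt.year, 1, 1)
    next_year_start = datetime.datetime(dt.year + 1, 1, 1)
    # Fraction of the year elapsed, accounting for leap years
    fraction = (dt - year_start) / (next_year_start - year_start)
    return dt.year + fraction

new_year = dt_to_decimal_sketch(datetime.datetime(2021, 1, 1))
mid_year = dt_to_decimal_sketch(datetime.datetime(2021, 7, 2, 12))
```

In a non-leap year, noon on 2 July falls exactly halfway through the year, so it maps to the year plus one half.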

nrt.utils.numba_kwargs(func): Decorator that enables passing kwargs to jitted functions by selecting only those kwargs that are available in the decorated function's signature
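The idea of filtering keyword arguments against a function's signature can be sketched with inspect.signature. The decorator below, filter_kwargs_sketch, is a hypothetical illustration applied to a plain Python function; it does not involve numba and is not the nrt decorator.

```python
import functools
import inspect

def filter_kwargs_sketch(func):
    """Hypothetical decorator: drop kwargs the wrapped function cannot accept."""
    accepted = set(inspect.signature(func).parameters)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Keep only keyword arguments named in the function signature
        kept = {k: v for k, v in kwargs.items() if k in accepted}
        return func(*args, **kept)

    return wrapper

@filter_kwargs_sketch
def scale(values, factor=1):
    return [v * factor for v in values]

# unused_option is silently dropped instead of raising a TypeError
result = scale([1, 2, 3], factor=2, unused_option=True)
```

This pattern lets a caller forward one shared dictionary of options to several functions that each understand only part of it.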

nrt.utils_efp module

CUSUM utility functions

Functions defined in this module implement functionality necessary for CUSUM and MOSUM monitoring as implemented in the R packages strucchange and bfast.

Portions of this module are derived from Chris Holden’s pybreakpoints package. See the copyright statement below.

nrt.utils_efp.history_roc(X, y, alpha=0.05, crit=0.9478982340418134)

Reverse Ordered Rec-CUSUM check for stable periods

Checks for stable periods by calculating recursive OLS residuals (see _recresid()) on the reversed X and y matrices. If the cumulative sum of the residuals crosses a boundary, the index of y where this structural change occurred is returned.

Parameters

X ((M, N) np.ndarray) – Matrix of independent variables
y ((M, K) np.ndarray) – Matrix of dependent variables
alpha (float) – Significance level for the boundary (probability of type I error)
crit (float) – Critical value corresponding to the chosen alpha. Can be calculated with _cusum_rec_test_crit. Default is the value for alpha=0.05

Returns

Index of structural change in y. 0: y is completely stable; >0: y is stable after this index

Return type

int
