nrt package
Subpackages
Submodules
nrt.fit_methods module
Model fitting
Functions defined in this module always use a 2D array containing the dependent
variables (y) and return both coefficient (beta) and residuals matrices.
These functions are meant to be called in nrt.BaseNrt._fit().
The RIRLS fit is derived from Chris Holden’s yatsm package. See the copyright statement below.
- nrt.fit_methods.ols(X, y)
Fit simple OLS model
- Parameters
X ((M, N) np.ndarray) – Matrix of independent variables
y ({(M,), (M, K)} np.ndarray) – Matrix of dependent variables
- Returns
beta (numpy.ndarray): The array of regression estimators
residuals (numpy.ndarray): The array of residuals
- Return type
tuple of numpy.ndarray
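The fit described above can be sketched with plain NumPy. This is an illustration only: the helper name ols_sketch is hypothetical, np.linalg.lstsq is assumed as the solver, and the package's own implementation may differ.

```python
import numpy as np


def ols_sketch(X, y):
    """Hypothetical stand-in for an OLS fit; not nrt's actual code."""
    # Solve min ||X @ beta - y||^2 for every column of y at once
    beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    return beta, residuals


rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), np.linspace(0, 1, 50)])  # intercept + trend
true_beta = np.array([[1.0, 2.0], [0.5, -1.0]])            # shape (N=2, K=2)
y = X @ true_beta + rng.normal(scale=0.01, size=(50, 2))

beta, residuals = ols_sketch(X, y)
print(beta.shape, residuals.shape)
```

The coefficient matrix has one column per series in y, and the residuals have the same shape as y.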
- nrt.fit_methods.weighted_ols(X, y, w)
Apply a weighted OLS fit to 1D data
- Parameters
X (np.ndarray) – independent variables
y (np.ndarray) – dependent variable
w (np.ndarray) – observation weights
- Returns
coefficients and residual vector
- Return type
tuple
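One common way to obtain a weighted fit is to scale the rows of X and y by the square root of the weights and then solve an ordinary least-squares problem. The sketch below assumes that approach; weighted_ols_sketch is a hypothetical name and not the package's code.

```python
import numpy as np


def weighted_ols_sketch(X, y, w):
    """Hypothetical weighted least squares for a single series (1D y)."""
    # Scaling rows by sqrt(w) turns the weighted problem into an ordinary one
    sw = np.sqrt(w)
    beta, _, _, _ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    residuals = y - X @ beta      # residuals on the original, unweighted scale
    return beta, residuals


X = np.column_stack([np.ones(6), np.arange(6.0)])
y = 1.0 + 2.0 * np.arange(6.0)
y[5] = 100.0                                   # one gross outlier
w = np.array([1.0, 1.0, 1.0, 1.0, 1.0, 0.0])   # zero weight ignores it

beta_w, resid_w = weighted_ols_sketch(X, y, w)
print(beta_w)
```

With the outlier weighted to zero, the fit recovers the line defined by the remaining five points, and the outlier shows up as a large residual.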
nrt.log module
nrt.outliers module
Removing outliers
Functions defined in this module always use a 2D array containing the dependent
variables (y) and return y with outliers set to np.nan.
These functions are meant to be called in nrt.BaseNrt._fit().
Citations:
Brooks, E.B., Wynne, R.H., Thomas, V.A., Blinn, C.E. and Coulston, J.W., 2013. On-the-fly massively multitemporal change detection using statistical quality control charts and Landsat data. IEEE Transactions on Geoscience and Remote Sensing, 52(6), pp.3316-3332.
Zhu, Zhe, and Curtis E. Woodcock. 2014. “Continuous Change Detection and Classification of Land Cover Using All Available Landsat Data.” Remote Sensing of Environment 144 (March): 152–71. https://doi.org/10.1016/j.rse.2014.01.011.
- nrt.outliers.ccdc_rirls(X, y, green, swir, scaling_factor=1, **kwargs)
Screen for missed clouds and other outliers using green and SWIR band
- Parameters
X ((M, N) np.ndarray) – Matrix of independent variables
y ((M, K) np.ndarray) – Matrix of dependent variables
green (np.ndarray) – 2D array containing spectral values
swir (np.ndarray) – 2D array containing spectral values (~1.55-1.75 µm)
scaling_factor (int) – Scaling factor to bring green and swir values to reflectance values between 0 and 1
- Returns
y with outliers set to np.nan
- Return type
np.ndarray
- nrt.outliers.shewhart(X, y, L=5, **kwargs)
Remove outliers using a Shewhart control chart
As described in Brooks et al. 2014, following an initial OLS fit, outliers are identified using a Shewhart control chart and removed.
- Parameters
X ((M, N) np.ndarray) – Matrix of independent variables
y ({(M,), (M, K)} np.ndarray) – Matrix of dependent variables
L (float) – control limit used for outlier filtering. Must be a positive float. Lower values indicate stricter filtering. Residuals larger than L*sigma will get screened out
**kwargs – not used
- Returns
Dependent variables with outliers set to np.nan
- Return type
y (np.ndarray)
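The screening rule above (OLS fit, then drop residuals larger than L*sigma) can be sketched as follows. The function name shewhart_sketch is hypothetical, sigma is assumed to be the per-series standard deviation of the OLS residuals, and y is assumed to contain no NaN beforehand; the package may estimate sigma differently.

```python
import numpy as np


def shewhart_sketch(X, y, L=5):
    """Hypothetical Shewhart-style screening; not nrt's actual code."""
    y = y.astype(float).copy()
    beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    # Assumption: sigma is the residual standard deviation of each series
    sigma = resid.std(axis=0, ddof=X.shape[1])
    # Observations outside the control limits are masked, not deleted
    y[np.abs(resid) > L * sigma] = np.nan
    return y


M = 50
X = np.ones((M, 1))                               # intercept-only model
series = np.tile(np.array([1.0, -1.0]), M // 2)   # alternating 1, -1
y = np.column_stack([series, series])
y[10, 0] = 100.0                                  # inject one outlier

screened = shewhart_sketch(X, y, L=5)
print(np.argwhere(np.isnan(screened)))
```

Only the injected value is masked; the second series is left untouched because none of its residuals exceed the control limit.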
nrt.stats module
- nrt.stats.bisquare(resid, c=4.685)
Weight residuals using bisquare weight function
- Parameters
resid (np.ndarray) – residuals to be weighted
c (float) – tuning constant for Tukey’s Biweight (default: 4.685)
- Returns
weights for residuals
- Return type
weight (ndarray)
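For reference, Tukey's biweight assigns the weight (1 - (r/c)^2)^2 to a residual r with |r| < c and zero otherwise. A minimal sketch of that formula, with the hypothetical name bisquare_sketch:

```python
import numpy as np


def bisquare_sketch(resid, c=4.685):
    """Tukey's biweight: (1 - (r/c)^2)^2 for |r| < c, else 0."""
    r = np.asarray(resid, dtype=float)
    return np.where(np.abs(r) < c, (1 - (r / c) ** 2) ** 2, 0.0)


weights = bisquare_sketch(np.array([0.0, 1.0, 4.685, 10.0]))
print(weights)
```

Residuals near zero keep a weight close to 1, while anything at or beyond the tuning constant is ignored entirely.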
- nrt.stats.erfcc(x)
Complementary error function.
- nrt.stats.mad(resid, c=0.6745)
Returns Median-Absolute-Deviation (MAD) for residuals
- Parameters
resid (np.ndarray) – residuals
c (float) – scale factor to get to ~standard normal (default: 0.6745), i.e. the 0.75 quantile of the standard normal distribution (1 / 0.6745 ~= 1.4826)
- Returns
MAD ‘robust’ variance estimate
- Return type
float
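The textbook form of this estimator is the median of the absolute deviations from the median, divided by c. The sketch below uses that definition under the hypothetical name mad_sketch; the package's function may differ in detail.

```python
import numpy as np


def mad_sketch(resid, c=0.6745):
    """Median absolute deviation, scaled towards the normal standard deviation."""
    r = np.asarray(resid, dtype=float)
    return np.median(np.abs(r - np.median(r))) / c


rng = np.random.default_rng(42)
sample = rng.normal(loc=0.0, scale=2.0, size=100_000)
sample[:100] = 1e6          # a few gross outliers barely move the estimate
sigma_hat = mad_sketch(sample)
print(sigma_hat)
```

Unlike the ordinary standard deviation, the estimate stays close to the scale of the bulk of the data even with extreme values present.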
- nrt.stats.nan_percentile_axis0(arr, percentiles)
Faster implementation of np.nanpercentile
This implementation always takes the percentile along axis 0. Uses numba to speed up the calculation by more than 7x.
Function is equivalent to np.nanpercentile(arr, <percentiles>, axis=0)
- Parameters
arr (np.ndarray) – 2D array to calculate percentiles for
percentiles (np.ndarray) – 1D array of percentiles to calculate
- Returns
Array with first dimension corresponding to values passed in percentiles
- Return type
np.ndarray
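Since the function is documented as equivalent to np.nanpercentile along axis 0, the NumPy call below shows the result it is expected to reproduce, including the output layout. The input values are made up for illustration.

```python
import numpy as np

arr = np.array([[1.0, 10.0, np.nan],
                [2.0, np.nan, np.nan],
                [3.0, 30.0, 5.0],
                [4.0, 40.0, 7.0]])
percentiles = np.array([25, 50, 75])

# Reference result that the faster function is documented to match
expected = np.nanpercentile(arr, percentiles, axis=0)
print(expected.shape)   # one row per requested percentile
print(expected[1])      # medians of each column, ignoring NaN
```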
- nrt.stats.nanlstsq(X, y)
Return the least-squares solution to a linear matrix equation
Analogous to numpy.linalg.lstsq for dependent variables containing NaN
- Parameters
X ((M, N) np.ndarray) – Matrix of independent variables
y ({(M,), (M, K)} np.ndarray) – Matrix of dependent variables
Examples
>>> import numpy as np
>>> from sklearn.datasets import make_regression
>>> from nrt.stats import nanlstsq
>>> # Generate random data
>>> n_targets = 1000
>>> n_features = 2
>>> X, y = make_regression(n_samples=200, n_features=n_features,
...                        n_targets=n_targets)
>>> # Add random nan to y array
>>> y.ravel()[np.random.choice(y.size, 5*n_targets, replace=False)] = np.nan
>>> # Run the regression
>>> beta = nanlstsq(X, y)
>>> assert beta.shape == (n_features, n_targets)
- Returns
Least-squares solution, ignoring NaN
- Return type
np.ndarray
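Conceptually, ignoring NaN means that each series is fitted on its own set of valid observations. The loop below is a slow but explicit sketch of that idea; nanlstsq_sketch is a hypothetical name and the package's implementation is presumably optimised differently.

```python
import numpy as np


def nanlstsq_sketch(X, y):
    """Hypothetical column-by-column least squares that skips NaN rows."""
    y2d = y.reshape(len(y), -1)                 # treat (M,) as (M, 1)
    beta = np.full((X.shape[1], y2d.shape[1]), np.nan)
    for k in range(y2d.shape[1]):
        valid = ~np.isnan(y2d[:, k])
        # Fit each series using only its non-NaN observations
        beta[:, k] = np.linalg.lstsq(X[valid], y2d[valid, k], rcond=None)[0]
    return beta


t = np.arange(10.0)
X = np.column_stack([np.ones(10), t])
y = np.column_stack([1.0 + 2.0 * t, -3.0 + 0.5 * t])
y[2, 0] = np.nan
y[7, 1] = np.nan

beta = nanlstsq_sketch(X, y)
print(beta)
```

Because the example data are exactly linear, the missing values do not change the recovered coefficients.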
- nrt.stats.ncdf(x)
Normal cumulative distribution function
Source: Stackoverflow Unknown, https://stackoverflow.com/a/809402/12819237
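The standard normal CDF and the complementary error function are related by Phi(x) = 1 - 0.5 * erfc(x / sqrt(2)). The sketch below uses math.erfc from the standard library instead of the module's own erfcc; ncdf_sketch is a hypothetical name.

```python
import math


def ncdf_sketch(x):
    """Standard normal CDF via the complementary error function."""
    return 1.0 - 0.5 * math.erfc(x / math.sqrt(2.0))


print(ncdf_sketch(0.0))
print(ncdf_sketch(1.96))
```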
nrt.utils module
- nrt.utils.build_regressors(dates, trend=True, harmonic_order=3)
Build the design matrix (X) from a list or an array of datetimes
Trend assumes temporal resolution no finer than daily.
Harmonics assume annual cycles.
- Parameters
dates (pandas.DatetimeIndex) – The dates to use for building regressors
trend (bool) – Whether to add a trend component
harmonic_order (int) – The order of the harmonic component
- Returns
A design matrix
- Return type
numpy.ndarray
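A design matrix of this kind typically holds an intercept, an optional trend column and one cosine/sine pair per harmonic of the annual cycle. The sketch below works on decimal dates rather than a DatetimeIndex; the function name, the column order and the trend encoding are all assumptions, not the package's actual layout.

```python
import numpy as np


def build_regressors_sketch(decimal_dates, trend=True, harmonic_order=3):
    """Hypothetical design matrix: intercept, optional trend, annual harmonics."""
    t = np.asarray(decimal_dates, dtype=float)
    columns = [np.ones_like(t)]                 # intercept
    if trend:
        columns.append(t - t.min())             # assumed trend encoding
    for k in range(1, harmonic_order + 1):
        # One cosine/sine pair per harmonic; period is 1 year / k
        columns.append(np.cos(2 * np.pi * k * t))
        columns.append(np.sin(2 * np.pi * k * t))
    return np.column_stack(columns)


dates = 2018 + np.arange(48) / 24.0             # two years, 24 obs per year
X = build_regressors_sketch(dates)
print(X.shape)
```

With the default arguments the matrix has 1 + 1 + 2 * 3 = 8 columns, one row per date.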
- nrt.utils.datetimeIndex_to_decimal_dates(dates)
Convert a pandas datetime index to decimal dates
- nrt.utils.dt_to_decimal(dt)
Helper to build a decimal date from a datetime object
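A decimal date is usually the year plus the fraction of that year already elapsed. The sketch below assumes that convention; dt_to_decimal_sketch is a hypothetical name and the package's helper may round or handle leap years differently.

```python
import datetime


def dt_to_decimal_sketch(dt):
    """Hypothetical conversion: year plus elapsed fraction of that year."""
    year_start = datetime.datetime(dt.year, 1, 1)
    next_year_start = datetime.datetime(dt.year + 1, 1, 1)
    # Dividing two timedeltas yields a plain float
    fraction = (dt - year_start) / (next_year_start - year_start)
    return dt.year + fraction


print(dt_to_decimal_sketch(datetime.datetime(2019, 7, 2, 12)))
```

In a non-leap year, noon on 2 July is exactly half way through the year.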
- nrt.utils.numba_kwargs(func)
Decorator which enables passing of kwargs to jitted functions by selecting only those kwargs that are available in the decorated function's signature
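The filtering idea can be sketched on an ordinary Python function using inspect.signature. The decorator name filter_kwargs_sketch is hypothetical, and how the package inspects a jitted function is not shown here.

```python
import functools
import inspect


def filter_kwargs_sketch(func):
    """Hypothetical decorator: drop keyword arguments func does not declare."""
    accepted = set(inspect.signature(func).parameters)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Keep only the keyword arguments named in the wrapped signature
        kept = {k: v for k, v in kwargs.items() if k in accepted}
        return func(*args, **kept)

    return wrapper


@filter_kwargs_sketch
def scale(x, factor=2):
    return x * factor


result = scale(3, factor=10, unused_option=True)   # unused_option is dropped
print(result)
```

This lets a caller pass one shared set of options to several functions without each of them having to accept every option.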
nrt.utils_efp module
CUSUM utility functions
Functions defined in this module implement functionality necessary for CUSUM and MOSUM monitoring as implemented in the R packages strucchange and bfast.
Portions of this module are derived from Chris Holden’s pybreakpoints package. See the copyright statement below.
- nrt.utils_efp.history_roc(X, y, alpha=0.05, crit=0.9478982340418134)
Reverse Ordered Rec-CUSUM check for stable periods
Checks for stable periods by calculating recursive OLS residuals (see _recresid()) on the reversed X and y matrices. If the cumulative sum of the residuals crosses a boundary, the index of y where this structural change occurred is returned.
- Parameters
X ((M, ) np.ndarray) – Matrix of independent variables
y ((M, K) np.ndarray) – Matrix of dependent variables
alpha (float) – Significance level for the boundary (probability of type I error)
crit (float) – Critical value corresponding to the chosen alpha. Can be calculated with _cusum_rec_test_crit. Default is the value for alpha=0.05
- Returns
- (int) Index of structural change in y.
0: y completely stable
>0: y stable after this index
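The procedure can be illustrated for a single series with the simplified sketch below. It is not the package's implementation: the function names are hypothetical, the linear boundary crit * (1 + 2s) and the mapping of the first crossing back to an index of y are assumptions, and the significance test on alpha is left out.

```python
import numpy as np


def recursive_residuals(X, y):
    """Standardised one-step-ahead prediction errors for a single series."""
    n, k = X.shape
    w = np.empty(n - k)
    for t in range(k, n):
        Xt, yt = X[:t], y[:t]
        XtX_inv = np.linalg.pinv(Xt.T @ Xt)
        beta = XtX_inv @ Xt.T @ yt            # fit on the first t observations
        x_new = X[t]
        scale = np.sqrt(1.0 + x_new @ XtX_inv @ x_new)
        w[t - k] = (y[t] - x_new @ beta) / scale
    return w


def history_roc_sketch(X, y, crit=0.9478982340418134):
    """Hypothetical reverse-ordered Rec-CUSUM check for a 1D series y."""
    n, k = X.shape
    # Walk backwards in time: the most recent observations come first
    w = recursive_residuals(X[::-1], y[::-1])
    sigma = np.std(w, ddof=1)
    process = np.concatenate([[0.0], np.cumsum(w)]) / (sigma * np.sqrt(n - k))
    # Assumed boundary: crit * (1 + 2s) for s running from 0 to 1
    s = np.linspace(0.0, 1.0, len(process))
    crossed = np.abs(process) > crit * (1.0 + 2.0 * s)
    if not crossed.any():
        return 0                              # completely stable
    # Assumed convention: count back from the end of the reversed process
    return int(len(process) - np.argmax(crossed))


n = 100
X = np.ones((n, 1))                           # intercept-only model
y_stable = np.tile(np.array([1.0, -1.0]), n // 2)
y_break = y_stable.copy()
y_break[:30] += 50.0                          # level shift in the oldest 30 obs

print(history_roc_sketch(X, y_stable))
print(history_roc_sketch(X, y_break))
```

Because the series is scanned from the most recent observation backwards, the boundary is crossed some observations after the shift is first encountered, so the returned index falls inside the shifted segment rather than exactly at its edge.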