nrt package
Subpackages
- nrt.data package
- nrt.monitor package
- Submodules
- nrt.monitor.ccdc module
- nrt.monitor.cusum module
- nrt.monitor.ewma module
- nrt.monitor.iqr module
- nrt.monitor.mosum module
- Module contents
BaseNrt
BaseNrt.mask
BaseNrt.trend
BaseNrt.harmonic_order
BaseNrt.x
BaseNrt.y
BaseNrt.process
BaseNrt.boundary
BaseNrt.detection_date
BaseNrt.fit_start
BaseNrt.update_mask
BaseNrt.build_design_matrix()
BaseNrt.fit()
BaseNrt.from_netcdf()
BaseNrt.monitor()
BaseNrt.predict()
BaseNrt.report()
BaseNrt.set_xy()
BaseNrt.to_netcdf()
BaseNrt.transform
Submodules
nrt.fit_methods module
Model fitting
Functions defined in this module always use a 2D array containing the dependent
variables (y) and return both coefficient (beta) and residuals matrices.
These functions are meant to be called in nrt.BaseNrt._fit().
The RIRLS fit is derived from Chris Holden’s yatsm package. See the copyright statement below.
- nrt.fit_methods.ols(X, y)
Fit simple OLS model
- Parameters:
X ((M, N) np.ndarray) – Matrix of independent variables
y ({(M,), (M, K)} np.ndarray) – Matrix of dependent variables
- Returns:
beta (numpy.ndarray): The array of regression estimators
residuals (numpy.ndarray): The array of residuals
- Return type:
tuple of numpy.ndarray
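The contract described above (a 2D y in, beta and residuals out) can be illustrated with plain NumPy. This is a minimal sketch, not the nrt implementation: the name ols_sketch is hypothetical, and the residual sign convention (observed minus predicted) is an assumption.

```python
import numpy as np

def ols_sketch(X, y):
    """Illustrative OLS fit for all columns of y at once (hypothetical helper)."""
    # Solve min ||X @ beta - y||^2; beta has shape (N, K)
    beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    # Assumption: residuals are observed minus predicted values
    residuals = y - X @ beta
    return beta, residuals

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.normal(size=50)])  # intercept + one regressor
true_beta = np.array([[1.0, -2.0], [0.5, 3.0]])          # shape (N, K) = (2, 2)
y = X @ true_beta                                        # noise-free, shape (50, 2)
beta, residuals = ols_sketch(X, y)
```

Because the synthetic y is noise-free, the recovered beta matches true_beta and the residuals are numerically zero.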
- nrt.fit_methods.weighted_ols(X, y, w)
Apply a weighted OLS fit to 1D data
- Parameters:
X (np.ndarray) – independent variables
y (np.ndarray) – dependent variable
w (np.ndarray) – observation weights
- Returns:
coefficients and residual vector
- Return type:
tuple
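A common way to fit weighted least squares is to scale both X and y by the square root of the weights and then run an ordinary fit. The sketch below shows that technique under stated assumptions: weighted_ols_sketch is a hypothetical name, and returning unweighted residuals is an assumption rather than documented nrt behaviour.

```python
import numpy as np

def weighted_ols_sketch(X, y, w):
    """Illustrative weighted OLS for a 1D dependent variable (hypothetical helper)."""
    sw = np.sqrt(w)
    # Scaling rows by sqrt(w) turns weighted least squares into ordinary least squares
    Xw = X * sw[:, np.newaxis]
    yw = y * sw
    beta, _, _, _ = np.linalg.lstsq(Xw, yw, rcond=None)
    # Assumption: residuals are reported on the original (unweighted) scale
    resid = y - X @ beta
    return beta, resid

X = np.column_stack([np.ones(5), np.arange(5.0)])
y = np.array([1.0, 3.0, 5.0, 7.0, 100.0])  # y = 1 + 2x, except the last observation
w = np.array([1.0, 1.0, 1.0, 1.0, 0.0])    # zero weight ignores the last observation
beta, resid = weighted_ols_sketch(X, y, w)
```

With the outlier weighted to zero, the fit recovers the intercept 1 and slope 2 of the remaining points, and the outlier keeps a large residual.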
nrt.log module
nrt.outliers module
Removing outliers
Functions defined in this module always use a 2D array containing the dependent
variables (y) and return y with outliers set to np.nan.
These functions are meant to be called in nrt.BaseNrt._fit().
Citations:
Brooks, E.B., Wynne, R.H., Thomas, V.A., Blinn, C.E. and Coulston, J.W., 2013. On-the-fly massively multitemporal change detection using statistical quality control charts and Landsat data. IEEE Transactions on Geoscience and Remote Sensing, 52(6), pp.3316-3332.
Zhu, Zhe, and Curtis E. Woodcock. 2014. “Continuous Change Detection and Classification of Land Cover Using All Available Landsat Data.” Remote Sensing of Environment 144 (March): 152–71. https://doi.org/10.1016/j.rse.2014.01.011.
- nrt.outliers.ccdc_rirls(X, y, green, swir, scaling_factor=1, **kwargs)
Screen for missed clouds and other outliers using green and SWIR band
- Parameters:
X ((M, N) np.ndarray) – Matrix of independent variables
y ((M, K) np.ndarray) – Matrix of dependent variables
green (np.ndarray) – 2D array containing spectral values
swir (np.ndarray) – 2D array containing spectral values (~1.55-1.75 µm)
scaling_factor (int) – Scaling factor to bring green and swir values to reflectance values between 0 and 1
- Returns:
y with outliers set to np.nan
- Return type:
np.ndarray
- nrt.outliers.shewhart(X, y, L=5, **kwargs)
Remove outliers using a Shewhart control chart
As described in Brooks et al. 2014, following an initial OLS fit, outliers are identified using a Shewhart control chart and removed.
- Parameters:
X ((M, N) np.ndarray) – Matrix of independent variables
y ({(M,), (M, K)} np.ndarray) – Matrix of dependent variables
L (float) – control limit used for outlier filtering. Must be a positive float. Lower values indicate stricter filtering. Residuals larger than L*sigma will get screened out.
**kwargs – not used
- Returns:
Dependent variables with outliers set to np.nan
- Return type:
y(np.ndarray)
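The screening rule described above (initial OLS fit, then drop residuals beyond L*sigma) can be sketched as follows. This is an illustration only: shewhart_sketch is a hypothetical name, and estimating sigma as the residual standard deviation with M - N degrees of freedom is an assumption, since the source does not say how sigma is computed.

```python
import numpy as np

def shewhart_sketch(X, y, L=5):
    """Illustrative Shewhart-style outlier screening (hypothetical helper)."""
    beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    # Assumption: sigma is the residual standard deviation per column of y
    dof = X.shape[0] - X.shape[1]
    sigma = np.sqrt((resid ** 2).sum(axis=0) / dof)
    y_out = y.astype(float).copy()
    # Screen out observations whose absolute residual exceeds the control limit
    y_out[np.abs(resid) > L * sigma] = np.nan
    return y_out

t = np.arange(100.0)
X = np.column_stack([np.ones(100), t])  # intercept + linear trend
y = (2.0 + 0.1 * t)[:, np.newaxis]      # shape (M, K) = (100, 1)
y[50, 0] += 50.0                        # inject a single large outlier
y_screened = shewhart_sketch(X, y, L=5)
```

Only the injected observation at index 50 is replaced by np.nan; lowering L would screen out progressively smaller deviations.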
nrt.stats module
- nrt.stats.bisquare(resid, c=4.685)
Weight residuals using bisquare weight function
- Parameters:
resid (np.ndarray) – residuals to be weighted
c (float) – tuning constant for Tukey’s Biweight (default: 4.685)
- Returns:
weights for residuals
- Return type:
weight (ndarray)
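For reference, Tukey's biweight assigns weight (1 - (r/c)^2)^2 to residuals with |r| < c and zero weight otherwise. The sketch below implements that textbook formula; bisquare_sketch is a hypothetical name, and whether nrt expects raw or scaled residuals as input is not stated in the source.

```python
import numpy as np

def bisquare_sketch(resid, c=4.685):
    """Textbook Tukey biweight function (hypothetical helper)."""
    # Residuals at or beyond c get zero weight; smaller ones decay smoothly from 1
    return (np.abs(resid) < c) * (1 - (resid / c) ** 2) ** 2

resid = np.array([0.0, 4.685 / 2, 4.685, 10.0])
weights = bisquare_sketch(resid)
```

A zero residual receives full weight, a residual at half the tuning constant receives 0.5625, and anything at or beyond c is ignored.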
- nrt.stats.erfcc(x)
Complementary error function.
- nrt.stats.mad(resid, c=0.6745)
Returns Median-Absolute-Deviation (MAD) for residuals
- Parameters:
resid (np.ndarray) – residuals
c (float) – scale factor to get to ~standard normal (default: 0.6745). 0.6745 is the 0.75 quantile of the standard normal distribution, so dividing by it is equivalent to multiplying by 1 / 0.6745 ~= 1.4826
- Returns:
MAD ‘robust’ variance estimate
- Return type:
float
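The role of the scale factor can be seen in a textbook version of the estimator. The sketch below assumes residuals centred on zero and takes the median of their absolute values; mad_sketch is a hypothetical name, and the nrt implementation may differ in detail (for example in how residuals are centred or selected).

```python
import numpy as np

def mad_sketch(resid, c=0.6745):
    """Textbook scaled median absolute deviation (hypothetical helper)."""
    # Assumption: residuals are already centred on zero
    return np.median(np.abs(resid)) / c

resid = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
estimate = mad_sketch(resid)  # median(|resid|) is 1, so the result is 1 / 0.6745
```

Dividing by c makes the estimate comparable to a standard deviation when the residuals are normally distributed.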
- nrt.stats.nan_percentile_axis0(arr, percentiles)
Faster implementation of np.nanpercentile
This implementation always takes the percentile along axis 0. Uses numba to speed up the calculation by more than 7x.
Function is equivalent to np.nanpercentile(arr, <percentiles>, axis=0)
- Parameters:
arr (np.ndarray) – 2D array to calculate percentiles for
percentiles (np.ndarray) – 1D array of percentiles to calculate
- Returns:
Array with first dimension corresponding to values passed in percentiles
- Return type:
np.ndarray
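Since the function is documented as equivalent to np.nanpercentile along axis 0, the expected output layout can be illustrated with the NumPy reference call alone. The sketch below does not call nrt; the array values are made up for illustration.

```python
import numpy as np

arr = np.array([[1.0, np.nan],
                [2.0, 4.0],
                [3.0, 6.0],
                [np.nan, 8.0]])
percentiles = np.array([25, 50, 75])

# Reference result: NaN values are ignored independently in every column
reference = np.nanpercentile(arr, percentiles, axis=0)
# reference has shape (len(percentiles), arr.shape[1]) = (3, 2)
```

The first dimension of the result indexes the requested percentiles and the second indexes the columns of arr.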
- nrt.stats.nanlstsq(X, y)
Return the least-squares solution to a linear matrix equation
Analog to numpy.linalg.lstsq for dependent variables containing NaN
Note
For best performance of the multithreaded implementation, it is recommended to limit the number of threads used by MKL or OpenBLAS to 1. This avoids over-subscription and improves performance. By default the function uses all available cores; the number of cores used can be controlled using the numba.set_num_threads function or by modifying the NUMBA_NUM_THREADS environment variable.
- Parameters:
X ((M, N) np.ndarray) – Matrix of independent variables
y ({(M,), (M, K)} np.ndarray) – Matrix of dependent variables
Examples
>>> import os
>>> # Adjust linear algebra configuration (only one should be required
>>> # depending on how numpy was installed/compiled)
>>> os.environ['OPENBLAS_NUM_THREADS'] = '1'
>>> os.environ['MKL_NUM_THREADS'] = '1'
>>> import numpy as np
>>> from sklearn.datasets import make_regression
>>> from nrt.stats import nanlstsq
>>> # Generate random data
>>> n_targets = 1000
>>> n_features = 2
>>> X, y = make_regression(n_samples=200, n_features=n_features,
...                        n_targets=n_targets)
>>> # Add random nan to y array
>>> y.ravel()[np.random.choice(y.size, 5*n_targets, replace=False)] = np.nan
>>> # Run the regression
>>> beta = nanlstsq(X, y)
>>> assert beta.shape == (n_features, n_targets)
- Returns:
Least-squares solution, ignoring NaN
- Return type:
np.ndarray
- nrt.stats.ncdf(x)
Normal cumulative distribution function
Source: Stackoverflow Unknown, https://stackoverflow.com/a/809402/12819237
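The normal CDF relates to the complementary error function through ncdf(x) = 1 - 0.5 * erfc(x / sqrt(2)). The sketch below uses math.erfc from the standard library in place of nrt.stats.erfcc; ncdf_sketch is a hypothetical name, and the assumption is that the nrt version follows the same identity.

```python
import math

def ncdf_sketch(x):
    """Standard normal CDF via the complementary error function (hypothetical helper)."""
    return 1.0 - 0.5 * math.erfc(x / math.sqrt(2.0))

p_zero = ncdf_sketch(0.0)    # half of the probability mass lies below zero
p_upper = ncdf_sketch(1.96)  # close to 0.975
```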
nrt.utils module
- nrt.utils.build_regressors(dates, trend=True, harmonic_order=3)
Build the design matrix (X) from a list or an array of datetimes
Trend assumes temporal resolution no finer than daily. Harmonics assume annual cycles.
- Parameters:
dates (pandas.DatetimeIndex) – The dates to use for building regressors
trend (bool) – Whether to add a trend component
harmonic_order (int) – The order of the harmonic component
- Returns:
A design matrix
- Return type:
numpy.ndarray
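A harmonic design matrix of this kind can be sketched with NumPy alone. Everything about the layout below is an assumption made for illustration: the column order (intercept, trend in days since 1970-01-01, then cosine and sine pairs per harmonic), the use of numpy datetime64 input instead of a pandas.DatetimeIndex, and the name build_regressors_sketch.

```python
import numpy as np

def build_regressors_sketch(dates, trend=True, harmonic_order=3):
    """Illustrative design matrix from a datetime64 array (hypothetical helper)."""
    days = dates.astype("datetime64[D]")
    years = days.astype("datetime64[Y]")
    year_start = years.astype("datetime64[D]")
    year_end = (years + np.timedelta64(1, "Y")).astype("datetime64[D]")
    # Decimal year: calendar year plus the elapsed fraction of that year
    decimal = years.astype("int64") + 1970 + (days - year_start) / (year_end - year_start)
    columns = [np.ones(len(days))]  # intercept
    if trend:
        # Daily resolution trend, expressed as days since 1970-01-01
        columns.append((days - np.datetime64("1970-01-01")) / np.timedelta64(1, "D"))
    for k in range(1, harmonic_order + 1):
        # Annual cycle and its higher-frequency harmonics
        columns.append(np.cos(2 * np.pi * k * decimal))
        columns.append(np.sin(2 * np.pi * k * decimal))
    return np.column_stack(columns)

# One observation every 16 days during 2020 (23 dates)
dates = np.datetime64("2020-01-01") + np.arange(0, 366, 16).astype("timedelta64[D]")
X = build_regressors_sketch(dates, trend=True, harmonic_order=3)
X_small = build_regressors_sketch(dates, trend=False, harmonic_order=1)
```

Under these assumptions the matrix has 1 + trend + 2 * harmonic_order columns, so X has 8 columns and X_small has 3.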
- nrt.utils.datetimeIndex_to_decimal_dates(dates)
Convert a pandas datetime index to decimal dates
- nrt.utils.dt_to_decimal(dt)
Helper to build a decimal date from a datetime object
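One common definition of a decimal date is the calendar year plus the fraction of that year already elapsed. The sketch below implements that definition with the standard library; dt_to_decimal_sketch is a hypothetical name, and the exact convention used by nrt is an assumption.

```python
from datetime import datetime

def dt_to_decimal_sketch(dt):
    """Year plus elapsed fraction of the year (hypothetical helper)."""
    year_start = datetime(dt.year, 1, 1)
    year_end = datetime(dt.year + 1, 1, 1)
    # Dividing two timedeltas yields a float, so leap years are handled naturally
    return dt.year + (dt - year_start) / (year_end - year_start)

new_year = dt_to_decimal_sketch(datetime(2020, 1, 1))
mid_year = dt_to_decimal_sketch(datetime(2020, 7, 2))  # day 183 of 366 in a leap year
```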
- nrt.utils.numba_kwargs(func)
Decorator which enables passing of kwargs to jitted functions by selecting only those kwargs that are available in the decorated function's signature.
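The filtering idea can be sketched in pure Python with inspect.signature. This is an illustration on an ordinary function, not the nrt decorator: filter_kwargs and scale are hypothetical names, and how the real decorator inspects a numba-jitted function is not covered here.

```python
import functools
import inspect

def filter_kwargs(func):
    """Drop keyword arguments that func does not accept (hypothetical helper)."""
    accepted = set(inspect.signature(func).parameters)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Keep only the keyword arguments named in the wrapped function's signature
        kept = {key: value for key, value in kwargs.items() if key in accepted}
        return func(*args, **kept)

    return wrapper

@filter_kwargs
def scale(y, factor=1.0):
    return y * factor

# unused_option is silently discarded instead of raising a TypeError
result = scale(2.0, factor=3.0, unused_option=True)
```

This pattern lets a caller forward one shared set of keyword arguments to several functions with different signatures.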
nrt.utils_efp module
CUSUM utility functions
Functions defined in this module implement functionality necessary for CUSUM and MOSUM monitoring as implemented in the R packages strucchange and bFast.
Portions of this module are derived from Chris Holden’s pybreakpoints package. See the copyright statement below.
- nrt.utils_efp.history_roc(X, y, alpha=0.05, crit=0.9478982340418134)
Reverse Ordered Rec-CUSUM check for stable periods
Checks for stable periods by calculating recursive OLS residuals (see _recresid()) on the reversed X and y matrices. If the cumulative sum of the residuals crosses a boundary, the index of y where this structural change occurred is returned.
- Parameters:
X ((M, ) np.ndarray) – Matrix of independent variables
y ((M, K) np.ndarray) – Matrix of dependent variables
alpha (float) – Significance level for the boundary (probability of type I error)
crit (float) – Critical value corresponding to the chosen alpha. Can be calculated with _cusum_rec_test_crit. Default is the value for alpha=0.05
- Returns:
- (int) Index of structural change in y.
0: y completely stable
>0: y stable after this index