API Reference
nrt.data package
- nrt.data.germany_sample_points(return_meta=False)
Feature collection of 300 sample points from the Germany test area.
These points represent pixel centers sampled using a reclassified version of the
germany_stratificationlayer. In this reclassification, all disturbance strata were combined into a single stratum, resulting in stratified random sampling. An equal allocation method was employed, with 100 samples per stratum.The feature collection includes a
stratumproperty with values 1, 2, or 3, representing different strata:1: Non-forested stratum (2,333,767 pixels)
2: Stable forest stratum (972,184 pixels)
3: Disturbed forest stratum (694,049 pixels)
- Parameters:
return_meta (bool, optional) – If True, returns both the feature collection and associated metadata. Defaults to False.
Examples
>>> from nrt import data >>> fc = data.germany_sample_points() >>> set([feat['properties']['stratum'] for feat in fc]) {1,2,3} >>> len(fc) 300
- Returns:
If return_meta is False, returns a list containing the feature collection of sample points. If return_meta is True, returns a tuple of (feature collection, metadata).
- Return type:
list or tuple
- nrt.data.germany_stratification(return_meta=False)
Simple rule based classification of forest dynamics for germany_zarr
Raster layer in the EPSG:3035 CRS, spatially aligned with the
germany_zarrdata cube. The classification is based on a simple rule-based approach applied to a pair of manually selected cloud-free time steps (2019-06-27 and 2022-06-16). NDVI and the second SWIR band are used to distinguish various tree cover dynamics, based on the assumption that healthy forests are generally green in summer (indicated by high NDVI values) and dark in the SWIR region. “Unhealthy” forests, such as standing dead trees following a bark beetle attack, may still be dark in the SWIR region but with lower greenness. Non-forested land lacks these combined characteristics.Note that this approach is relatively simple and, as a result, produces an imperfect classification. Although there may be some confusion between land cover elements that are difficult to discriminate solely based on NDVI and SWIR values, the layer is well-suited for purposes such as stratification in a stratified sampling design.
Classes are encoded as follows:
1: Non-forested land 2: Stable Forest (Constantly green and dark in the SWIR region) 3: Forest Cover Loss (Conversion from healthy forest to non-forested land) 4: Forest Canopy Dieback (Loss of greenness but still dark in SWIR) 5: Salvage Logging (From 'unhealthy' forest to non-forested land)
- Returns:
2D array with encoded 2019-2022 tree cover dynamics or (data, metadata) if
return_metais True.- Return type:
numpy.ndarray or (numpy.ndarray, dict)
- nrt.data.germany_zarr(**kwargs)
Sentinel 2 datacube of a forest area affected by Bark Beetle in Germany
The data covers an area of 400 km2 located East of the city Cologne in the North Rhine-Westphalia state of Germany. It spans a five year period from 2018 to 2022 and includes the sentinel bands B02, B03, B04, B08, B11, B12 and SCL. Native 10m resolution is preserved for the visible and near-infrared bands while B11, B12 and SCL were resampled to 10m from their original 20m resolution. The data is organized as an online zarr store on the JRC’s Open Data Repository, so that running the
germany_zarr()function only creates a symbolic representation of the dataset (dask based). Lazy chunks can be eagerly loaded by invoking the.compute()method. Beware that loading the entire dataset may take a fairly long time due to its size (4.8 GB). Note that although data is stored as int16, the scaling factor is automatically applied to convert spectral channels to unscaled surface reflectance values and the corresponding data variables are casted to float64. No data pixels are also automatically converted tonp.nan- Parameters:
**kwargs – Additional keyword arguments passed to
xarray.open_zarr()
Examples
>>> import sys >>> from nrt import data >>> ds = data.germany_zarr() >>> print(ds) <xarray.Dataset> Dimensions: (time: 172, y: 2000, x: 2000) Coordinates: spatial_ref int32 ... * time (time) datetime64[ns] 2018-02-14T10:31:31.026000 ... 2022-12... * x (x) float64 4.133e+06 4.133e+06 ... 4.153e+06 4.153e+06 * y (y) float64 3.113e+06 3.113e+06 ... 3.093e+06 3.093e+06 Data variables: B02 (time, y, x) float32 dask.array<chunksize=(10, 100, 100), meta=np.ndarray> B03 (time, y, x) float32 dask.array<chunksize=(10, 100, 100), meta=np.ndarray> B04 (time, y, x) float32 dask.array<chunksize=(10, 100, 100), meta=np.ndarray> B08 (time, y, x) float32 dask.array<chunksize=(10, 100, 100), meta=np.ndarray> B11 (time, y, x) float32 dask.array<chunksize=(10, 100, 100), meta=np.ndarray> B12 (time, y, x) float32 dask.array<chunksize=(10, 100, 100), meta=np.ndarray> SCL (time, y, x) uint8 dask.array<chunksize=(10, 100, 100), meta=np.ndarray> >>> # Load a subset in memory >>> print(sys.getsizeof(ds.B02.data)) 80 >>> ds_sub = ds.isel(x=slice(100,110), y=slice(200,210), time=slice(10,20)).compute() >>> print(sys.getsizeof(ds_sub.B02.data)) 2144
- Returns:
A dask based Dataset of dimension
(time: 172, y: 2000, x: 2000)with 7 data variables.- Return type:
xarray.Dataset
- nrt.data.make_cube(*args, **kwargs)
- nrt.data.make_cube_parameters(*args, **kwargs)
- nrt.data.make_ts(*args, **kwargs)
- nrt.data.mre_crit_table()
Contains a dictionary equivalent to strucchange’s
mreCritValTableThe key ‘sig_level’ is a list of the available pre-computed significance (1-alpha) values.The other keys contain nested dictionaries, where the keys are the available relative window sizes (0.25, 0.5, 1), the second keys are the available periods (2, 4, 6, 8, 10) and the third keys are the functional types (“max”, “range”).
Example
>>> from nrt import data >>> crit_table = data.mre_crit_table() >>> win_size = 0.5 >>> period = 10 >>> functional = "max" >>> alpha=0.025 >>> crit_values = crit_table.get(str(win_size)) .get(str(period)) .get(functional) >>> sig_level = crit_table.get('sig_levels') >>> crit_level = np.interp(1-alpha, sig_level, crit_values)
- nrt.data.romania_10m(**kwargs)
Sentinel 2 datacube of a small forested area in Romania at 10 m resolution
Examples
>>> from nrt import data
>>> s2_cube = data.romania_10m() >>> # Compute NDVI >>> s2_cube['ndvi'] = (s2_cube.B8 - s2_cube.B4) / (s2_cube.B8 + s2_cube.B4) >>> # Filter clouds >>> s2_cube = s2_cube.where(s2_cube.SCL.isin([4,5,7]))
- nrt.data.romania_20m(**kwargs)
Sentinel 2 datacube of a small forested area in Romania at 20 m resolution
Examples
>>> from nrt import data
>>> s2_cube = data.romania_20m() >>> # Compute NDVI >>> s2_cube['ndvi'] = (s2_cube.B8A - s2_cube.B4) / (s2_cube.B8A + s2_cube.B4) >>> # Filter clouds >>> s2_cube = s2_cube.where(s2_cube.SCL.isin([4,5,7]))
- nrt.data.romania_forest_cover_percentage(return_meta=False)
Subset of Copernicus HR layer tree cover percentage - 20 m - Romania
- Returns:
The data or (data, metadata) if
return_metais True.- Return type:
numpy.ndarray or (numpy.ndarray, dict)
nrt.monitor package
- class nrt.monitor.BaseNrt(mask=None, trend=True, harmonic_order=3, save_fit_start=False, beta=None, x_coords=None, y_coords=None, process=None, boundary=None, detection_date=None, fit_start=None, update_mask=True, **kwargs)
Bases:
objectAbstract class for Near Real Time change detection
Every new change monitoring approach should inherit from this abstract class and must implement the abstract methods
fit(),monitor()andreport(). It contains generic method to fit time-series harmonic-trend regression models, backward test for stability, dump the instance to a netcdf file and reload a dump.- mask
A 2D numpy array containing pixels that should be monitored (1) and not (0). The mask may be updated following historing period stability check, and after a call to monitor following a confirmed break. Values are as follow:
{0: 'Not monitored', 1: 'monitored', 2: 'Unstable history', 3: 'Confirmed break - no longer monitored', 4: 'Not enough observations - not monitored'}
- Type:
numpy.ndarray
- trend
Indicate whether stable period fit is performed with trend or not
- Type:
bool
- harmonic_order
The harmonic order of the time-series regression
- Type:
int
- x
array of x coordinates
- Type:
numpy.ndarray
- y
array of y coordinates
- Type:
numpy.ndarray
- process
2D numpy array containing the process value for every pixel
- Type:
numpy.ndarray
- boundary
Process boundary for all pixels or every pixel individually
- Type:
Union[numpy.ndarray, int, float]
- detection_date
2D array signalling detection date of disturbances in days since 1970-01-01
- Type:
numpy.ndarray
- fit_start
2D integer array reporting start of history period in days since UNIX Epoch. Start of history period only varies when using stable fitting algorithms
- Type:
numpy.ndarray
- update_mask
Specifies whether to update the mask and halt further monitoring when values exceed boundary limits during a
.monitor()call. ATruevalue indicates that crossing the boundary limits will trigger a mask update and stop successive observation monitoring.- Type:
bool
- Parameters:
mask (numpy.ndarray) – A 2D numpy array containing pixels that should be monitored marked as
1and pixels that should be excluded (marked as0). Typically a stable forest mask when doing forest disturbance monitoring. If no mask is supplied all pixels are considered and a mask is created following thefit()calltrend (bool) – Indicate whether stable period fit is performed with trend or not
harmonic_order (int) – The harmonic order of the time-series regression
save_fit_start (bool) – If start of the fit should be reported in the model. Only applicable to stable fits (e.g. ‘ROC’, ‘CCDC-stable’). If true, the data will be saved in the attribute
fit_startx_coords (numpy.ndarray) – x coordinates
y_coords (numpy.ndarray) – y coordinates
process (numpy.ndarray) – 2D numpy array containing the process value for every pixel
boundary (Union[numpy.ndarray, int, float]) – Process boundary for all pixels or every pixel individually
detection_date (numpy.ndarray) – 2D array signalling detection date of disturbances in days since 1970-01-01
fit_start (numpy.ndarray) – 2D integer array reporting start of history period in days since UNIX Epoch. Start of history period only varies when using stable fitting algorithms
update_mask (bool) – Specifies whether to update the mask and halt further monitoring when values exceed boundary limits during a
.monitor()call. ATruevalue (default) indicates that crossing the boundary limits will trigger a mask update and stop successive observation monitoring.
- static build_design_matrix(dataarray, trend=True, harmonic_order=3)
Build a design matrix for temporal regression from xarray DataArray
- Parameters:
trend (bool) – Whether to include a temporal trend or not
harmonic_order (int) – The harmonic order to use (
1corresponds to annual cycles,2to annual and biannual cycles, etc)
- Returns:
- A design matrix to be passed to be passed to e.g. the
_fit()method
- Return type:
numpy.ndarray
- abstract fit()
- classmethod from_netcdf(filename, **kwargs)
- monitor(array, date, update_mask=None)
Monitor given a new acquisition
The method takes care of (1) predicting the expected pixels values, (2) updating the process value, and (3) updating self.mask in case a break is confirmed
- Parameters:
array (np.ndarray) – 2D array containing the new acquisition to be monitored
date (datetime.datetime) – Date of acquisition of data contained in the array
update_mask (bool) – Override
update_maskinstance attribute
- predict(date)
Predict the expected values for a given date
- Parameters:
date (datetime.datetime) – The date to predict
- Returns:
A 2D array of predicted values
- Return type:
numpy.ndarray
- report(filename, layers=['mask', 'detection_date'], driver='GTiff', crs=CRS.from_epsg(3035), dtype=<class 'numpy.int16'>)
Write the result of reporting to a raster geospatial file
- Parameters:
layers (list) – A list of strings indicating the layers to include in the report. Valid options are
'mask'(the main output layer containing confirmed breaks, non-monitored pixels, etc),'detection_date'(an integer value matching each confirmed break and indicating the date the break was confirmed in days since epoch),'process'(the process value). The process value has a different meaning and interpretation for each monitoring method.dtype (type) – The datatype of the stacked layers. Note that when returning process value for MoSum, CuSum or EWMA the
dtypeshould be set to a float type to retain values
- set_xy(dataarray)
- to_netcdf(filename)
- property transform
nrt.monitor.ccdc module
- class nrt.monitor.ccdc.CCDC(trend=True, harmonic_order=2, sensitivity=3, mask=None, boundary=3, **kwargs)
Bases:
BaseNrtMonitoring using CCDC-like implementation
Implementation loosely following method described in Zhu & Woodcock 2014.
Zhu, Zhe, and Curtis E. Woodcock. 2014. “Continuous Change Detection and Classification of Land Cover Using All Available Landsat Data.” Remote Sensing of Environment 144 (March): 152–71. https://doi.org/10.1016/j.rse.2014.01.011.
- mask
A 2D numpy array containing pixels that should be monitored (1) and not (0). The mask may be updated following history period stability check, and after a call to monitor following a confirmed break. Values are as follow.
{0: 'Not monitored', 1: 'monitored', 2: 'Unstable history', 3: 'Confirmed break - no longer monitored'}- Type:
numpy.ndarray
- trend
Indicate whether stable period fit is performed with trend or not
- Type:
bool
- harmonic_order
The harmonic order of the time-series regression
- Type:
int
- beta
3D array containing the model coefficients
- Type:
np.ndarray
- x
array of x coordinates
- Type:
numpy.ndarray
- y
array of y coordinates
- Type:
numpy.ndarray
- sensitivity
sensitivity of the monitoring. Lower numbers are high sensitivity. Value can’t be zero.
- Type:
float
- boundary
Number of consecutive observations identified as outliers to signal as disturbance
- Type:
int
- rmse
2D float array indicating RMSE for each pixel
- Type:
np.ndarray
- detection_date
2D array signalling detection date of disturbances in days since 1970-01-01
- Type:
numpy.ndarray
- Parameters:
mask (numpy.ndarray) – A 2D numpy array containing pixels that should be monitored marked as
1and pixels that should be excluded (marked as0). Typically a stable forest mask when doing forest disturbance monitoring. If no mask is supplied all pixels are considered and a mask is created following thefit()calltrend (bool) – Indicate whether stable period fit is performed with trend or not
harmonic_order (int) – The harmonic order of the time-series regression
sensitivity (float) – sensitivity of the monitoring. Lower numbers are high sensitivity. Value can’t be zero.
boundary (int) – Number of consecutive observations identified as outliers to signal as disturbance
**kwargs – Used to set internal attributes when initializing with
.from_netcdf()
- fit(dataarray, method='CCDC-stable', screen_outliers='CCDC_RIRLS', green=None, swir=None, scaling_factor=1, **kwargs)
Stable history model fitting
If screen outliers is required, green and swir bands must be passed.
The stability check will use the same sensitivity as is later used for detecting changes (default: 3*RMSE)
- Parameters:
dataarray (xr.DataArray) – xarray Dataarray including the historic data to be fitted
method (string) – Regression to use. See
_fit()for info.screen_outliers (string) – Outlier screening to use. See
_fit()for info.green (xr.DataArray) – Green reflectance values to be used by
screen_outliers.swir (xr.DataArray) – Short wave infrared (SWIR) reflectance values to be used by
screen_outliers.scaling_factor (int) – Optional Scaling factor to be applied to
greenandswir. Whenscreen_outliersis'CCDC_RIRLS'(default for CCDC), the outlier screening algorithm expects green and swir reflectance values in the [0,1] range. EO data are often scaled and stored as integer, with a scaling factor to convert between scaled and actual reflectance values. As an example, if scaled reflectance values are in the [0,10000] range, setscaling_factorto10000.**kwargs – to be passed to
_fit
Examples
>>> from nrt.monitor.ccdc import CCDC >>> from nrt import data
>>> # Load and prepare test data >>> mask = (data.romania_forest_cover_percentage() > 30).astype('int') >>> s2_cube = data.romania_20m()
>>> s2_cube['ndvi'] = (s2_cube.B8A - s2_cube.B04) / (s2_cube.B8A + s2_cube.B04) >>> s2_cube = s2_cube.where(s2_cube.SCL.isin([4,5,7])) >>> cube_history = s2_cube.sel(time=slice('2015-01-01', '2018-12-31'))
>>> # Instantiate monitoring class and fit the model, including outliers screening >>> ccdcMonitor = CCDC(trend=True, mask=mask) >>> ccdcMonitor.fit(dataarray=cube_history.ndvi, ... green=cube_history.B03, ... swir=cube_history.B11, ... scaling_factor=10000)
nrt.monitor.cusum module
- class nrt.monitor.cusum.CuSum(trend=True, harmonic_order=2, sensitivity=0.05, mask=None, **kwargs)
Bases:
BaseNrtMonitoring using cumulative sums (CUSUM) of residuals
Implementation following method as implemented in R package bFast.
- mask
A 2D numpy array containing pixels that should be monitored (1) and not (0). The mask may be updated following history period stability check, and after a call to monitor following a confirmed break. Values are as follow.
{0: 'Not monitored', 1: 'monitored', 2: 'Unstable history', 3: 'Confirmed break - no longer monitored'}- Type:
numpy.ndarray
- trend
Indicate whether stable period fit is performed with trend or not
- Type:
bool
- harmonic_order
The harmonic order of the time-series regression
- Type:
int
- x
array of x coordinates
- Type:
numpy.ndarray
- y
array of y coordinates
- Type:
numpy.ndarray
- sensitivity
sensitivity of the monitoring. Lower numbers correspond to lower sensitivity. Equivalent to significance level ‘alpha’ with which the boundary is computed
- Type:
float
- boundary
process boundary for each time series. Calculated from alpha and length of time series.
- Type:
numpy.ndarray
- sigma
Standard deviation for normalized residuals in history period
- Type:
numpy.ndarray
- histsize
Number of non-nan observations in history period
- Type:
numpy.ndarray
- n
Total number of non-nan observations in time-series
- Type:
numpy.ndarray
- critval
Critical test value corresponding to the chosen sensitivity
- Type:
float
- detection_date
2D array signalling detection date of disturbances in days since 1970-01-01
- Type:
numpy.ndarray
- Parameters:
mask (numpy.ndarray) – A 2D numpy array containing pixels that should be monitored marked as
1and pixels that should be excluded (marked as0). Typically a stable forest mask when doing forest disturbance monitoring. If no mask is supplied all pixels are considered and a mask is created following thefit()calltrend (bool) – Indicate whether stable period fit is performed with trend or not
harmonic_order (int) – The harmonic order of the time-series regression
sensitivity (float) – sensitivity of the monitoring. Lower numbers correspond to lower sensitivity. Equivalent to significance level ‘alpha’ with which the boundary is computed
**kwargs – Used to set internal attributes when initializing with
.from_netcdf()
- fit(dataarray, method='ROC', alpha=0.05, **kwargs)
Stable history model fitting
If method
'ROC'is used for fitting, the argumentalphahas to be passed.- Parameters:
dataarray (xr.DataArray) – xarray Dataarray including the historic data to be fitted
method (string) – Regression to use. See
_fit()for info.alpha (float) – Significance level for
'ROC'stable fit.**kwargs – to be passed to
_fit
nrt.monitor.ewma module
- class nrt.monitor.ewma.EWMA(trend=True, harmonic_order=2, sensitivity=2, mask=None, lambda_=0.3, threshold_outlier=10, **kwargs)
Bases:
BaseNrtMonitoring using EWMA control chart
Implementation following method described in Brooks et al. 2014.
- Parameters:
mask (numpy.ndarray) – A 2D numpy array containing pixels that should be monitored marked as
1and pixels that should be excluded (marked as0). Typically a stable forest mask when doing forest disturbance monitoring. If no mask is supplied all pixels are considered and a mask is created following thefit()calltrend (bool) – Indicate whether stable period fit is performed with trend or not
harmonic_order (int) – The harmonic order of the time-series regression
lambda (float) – Weight of previous observation in the monitoring process (memory). Valid range is [0,1], 1 corresponding to no memory and 0 to full memory
sensitivity (float) – Sensitivity parameter used in the computation of the monitoring boundaries. Lower values imply more sensitive monitoring
threshold_outlier (float) – Values bigger than threshold_outlier*sigma (extreme outliers) will get screened out during monitoring and will not contribute to updating the EWMA process value
**kwargs – Used to set internal attributes when initializing with
.from_netcdf()
- fit(dataarray, method='OLS', screen_outliers='Shewhart', L=5, **kwargs)
Stable history model fitting
The preferred fitting method for this monitoring type is
'OLS'with outlier screening'Shewhart'. It requires a control limit parameterL. Seenrt.outliers.shewartfor more details
nrt.monitor.iqr module
- class nrt.monitor.iqr.IQR(trend=True, harmonic_order=3, sensitivity=1.5, mask=None, boundary=3, **kwargs)
Bases:
BaseNrtOnline monitoring of disturbances based on interquartile range
- Reference:
- mask
A 2D numpy array containing pixels that should be monitored (1) and not (0). The mask may be updated following history period stability check, and after a call to monitor following a confirmed break. Values are as follow.
{0: 'Not monitored', 1: 'monitored', 2: 'Unstable history', 3: 'Confirmed break - no longer monitored'}- Type:
numpy.ndarray
- trend
Indicate whether stable period fit is performed with trend or not
- Type:
bool
- harmonic_order
The harmonic order of the time-series regression
- Type:
int
- beta
3D array containing the model coefficients
- Type:
np.ndarray
- x
array of x coordinates
- Type:
numpy.ndarray
- y
array of y coordinates
- Type:
numpy.ndarray
- sensitivity
sensitivity of the monitoring. Lower numbers are high sensitivity. Value can’t be zero.
- Type:
float
- boundary
Number of consecutive observations identified as outliers to signal as disturbance
- Type:
int
- q25
25th percentile of residuals
- Type:
numpy.ndarray
- q75
75th percentile of residuals
- Type:
numpy.ndarray
- detection_date
2D array signalling detection date of disturbances in days since 1970-01-01
- Type:
numpy.ndarray
- Parameters:
mask (numpy.ndarray) – A 2D numpy array containing pixels that should be monitored marked as
1and pixels that should be excluded (marked as0). Typically a stable forest mask when doing forest disturbance monitoring. If no mask is supplied all pixels are considered and a mask is created following thefit()calltrend (bool) – Indicate whether stable period fit is performed with trend or not
harmonic_order (int) – The harmonic order of the time-series regression
sensitivity (float) – sensitivity of the monitoring. Lower numbers are high sensitivity. Value can’t be zero.
boundary (int) – Number of consecutive observations identified as outliers to signal as disturbance
**kwargs – Used to set internal attributes when initializing with
.from_netcdf()
- fit(dataarray, method='OLS', **kwargs)
nrt.monitor.mosum module
- class nrt.monitor.mosum.MoSum(trend=True, harmonic_order=2, sensitivity=0.05, mask=None, h=0.25, **kwargs)
Bases:
BaseNrtMonitoring using moving sums (MOSUM) of residuals
Implementation following method as implemented in R package bFast.
- mask
A 2D numpy array containing pixels that should be monitored (1) and not (0). The mask may be updated following history period stability check, and after a call to monitor following a confirmed break. Values are as follow.
{0: 'Not monitored', 1: 'monitored', 2: 'Unstable history', 3: 'Confirmed break - no longer monitored'}- Type:
numpy.ndarray
- trend
Indicate whether stable period fit is performed with trend or not
- Type:
bool
- harmonic_order
The harmonic order of the time-series regression
- Type:
int
- x
array of x coordinates
- Type:
numpy.ndarray
- y
array of y coordinates
- Type:
numpy.ndarray
- sensitivity
sensitivity of the monitoring. Lower numbers correspond to lower sensitivity. Equivalent to significance level ‘alpha’ with which the boundary is computed
- Type:
float
- boundary
process boundary for each time series. Calculated from alpha and length of time series.
- Type:
numpy.ndarray
- sigma
Standard deviation for normalized residuals in history period
- Type:
numpy.ndarray
- histsize
Number of non-nan observations in history period
- Type:
numpy.ndarray
- n
Total number of non-nan observations in time-series
- Type:
numpy.ndarray
- critval
Critical test value corresponding to the chosen sensitivity
- Type:
float
- h
Moving window size relative to length of the history period. Can be one of 0.25, 0.5 and 1
- Type:
float
- winsize
2D array with absolute window size. Computed as h*histsize
- Type:
numpy.ndarray
- window
3D array containing the current values in the window
- Type:
numpy.ndarray
- detection_date
2D array signalling detection date of disturbances in days since 1970-01-01
- Type:
numpy.ndarray
- Parameters:
mask (numpy.ndarray) – A 2D numpy array containing pixels that should be monitored marked as
1and pixels that should be excluded (marked as0). Typically a stable forest mask when doing forest disturbance monitoring. If no mask is supplied all pixels are considered and a mask is created following thefit()calltrend (bool) – Indicate whether stable period fit is performed with trend or not
harmonic_order (int) – The harmonic order of the time-series regression
sensitivity (float) – sensitivity of the monitoring. Lower numbers correspond to lower sensitivity. Equivalent to significance level ‘alpha’ with which the boundary is computed
h (float) – Moving window size relative to length of the history period. Can be one of 0.25, 0.5 and 1
**kwargs – Used to set internal attributes when initializing with
.from_netcdf()
- fit(dataarray, method='ROC', alpha=0.05, **kwargs)
Stable history model fitting
If method
'ROC'is used for fitting, the argumentalphahas to be passed.- Parameters:
dataarray (xr.DataArray) – xarray Dataarray including the historic data to be fitted
method (string) – Regression to use. See
_fit()for info.alpha (float) – Significance level for
'ROC'stable fit.**kwargs – to be passed to
_fit
- get_process()
- property process
- set_process(x)
nrt.fit_methods module
Model fitting
Functions defined in this module always use a 2D array containing the dependant
variables (y) and return both coefficient (beta) and residuals matrices.
These functions are meant to be called in nrt.BaseNrt._fit().
The RIRLS fit is derived from Chris Holden’s yatsm package. See the copyright statement below.
- nrt.fit_methods.ols(X, y)
Fit simple OLS model
- Parameters:
X ((M, N) np.ndarray) – Matrix of independant variables
y ({(M,), (M, K)} np.ndarray) – Matrix of dependant variables
- Returns:
The array of regression estimators residuals (numpy.ndarray): The array of residuals
- Return type:
beta (numpy.ndarray)
- nrt.fit_methods.weighted_ols(X, y, w)
Apply a weighted OLS fit to 1D data
- Parameters:
X (np.ndarray) – independent variables
y (np.ndarray) – dependent variable
w (np.ndarray) – observation weights
- Returns:
coefficients and residual vector
- Return type:
tuple
nrt.log module
nrt.outliers module
Removing outliers
Functions defined in this module always use a 2D array containing the dependant
variables (y) and return y with outliers set to np.nan.
These functions are meant to be called in nrt.BaseNrt._fit()
Citations:
Brooks, E.B., Wynne, R.H., Thomas, V.A., Blinn, C.E. and Coulston, J.W., 2013. On-the-fly massively multitemporal change detection using statistical quality control charts and Landsat data. IEEE Transactions on Geoscience and Remote Sensing, 52(6), pp.3316-3332.
Zhu, Zhe, and Curtis E. Woodcock. 2014. “Continuous Change Detection and Classification of Land Cover Using All Available Landsat Data.” Remote Sensing of Environment 144 (March): 152–71. https://doi.org/10.1016/j.rse.2014.01.011.
- nrt.outliers.ccdc_rirls(X, y, green, swir, scaling_factor=1, **kwargs)
Screen for missed clouds and other outliers using green and SWIR band
- Parameters:
X ((M, N) np.ndarray) – Matrix of independant variables
y ((M, K) np.ndarray) – Matrix of dependant variables
green (np.ndarray) – 2D array containing spectral values
swir (np.ndarray) – 2D array containing spectral values (~1.55-1.75um)
scaling_factor (int) – Scaling factor to bring green and swir values to reflectance values between 0 and 1
- Returns:
y with outliers set to np.nan
- Return type:
np.ndarray
- nrt.outliers.shewhart(X, y, L=5, **kwargs)
Remove outliers using a Shewhart control chart
As described in Brooks et al. 2014, following an initial OLS fit, outliers are identified using a shewhart control chart and removed.
- Parameters:
X ((M, N) np.ndarray) – Matrix of independant variables
y ({(M,), (M, K)} np.ndarray) – Matrix of dependant variables
L (float) – control limit used for outlier filtering. Must be a positive float. Lower values indicate stricter filtering. Residuals larger than L*sigma will get screened out
**kwargs – not used
- Returns:
Dependant variables with outliers set to np.nan
- Return type:
y(np.ndarray)
nrt.stats module
- nrt.stats.bisquare(resid, c=4.685)
Weight residuals using bisquare weight function
- Parameters:
resid (np.ndarray) – residuals to be weighted
c (float) – tuning constant for Tukey’s Biweight (default: 4.685)
- Returns:
weights for residuals
- Return type:
weight (ndarray)
- nrt.stats.erfcc(x)
Complementary error function.
- nrt.stats.mad(resid, c=0.6745)
Returns Median-Absolute-Deviation (MAD) for residuals
- Parameters:
resid (np.ndarray) – residuals
c (float) – scale factor to get to ~standard normal (default: 0.6745) (i.e. 1 / 0.75iCDF ~= 1.4826 = 1 / 0.6745)
- Returns:
MAD ‘robust’ variance estimate
- Return type:
float
- nrt.stats.nan_percentile_axis0(arr, percentiles)
Faster implementation of np.nanpercentile
This implementation always takes the percentile along axis 0. Uses numba to speed up the calculation by more than 7x.
Function is equivalent to np.nanpercentile(arr, <percentiles>, axis=0)
- Parameters:
arr (np.ndarray) – 2D array to calculate percentiles for
percentiles (np.ndarray) – 1D array of percentiles to calculate
- Returns:
Array with first dimension corresponding to values passed in percentiles
- Return type:
np.ndarray
- nrt.stats.nanlstsq(X, y)
Return the least-squares solution to a linear matrix equation
Analog to
numpy.linalg.lstsqfor dependant variable containingNanNote
For best performances of the multithreaded implementation, it is recommended to limit the number of threads used by MKL or OpenBLAS to 1. This avoids over-subscription, and improves performances. By default the function will use all cores available; the number of cores used can be controled using the
numba.set_num_threadsfunction or by modifying theNUMBA_NUM_THREADSenvironment variable- Parameters:
X ((M, N) np.ndarray) – Matrix of independant variables
y ({(M,), (M, K)} np.ndarray) – Matrix of dependant variables
Examples
>>> import os >>> # Adjust linear algebra configuration (only one should be required >>> # depending on how numpy was installed/compiled) >>> os.environ['OPENBLAS_NUM_THREADS'] = '1' >>> os.environ['MKL_NUM_THREADS'] = '1' >>> import numpy as np >>> from sklearn.datasets import make_regression >>> from nrt.stats import nanlstsq >>> # Generate random data >>> n_targets = 1000 >>> n_features = 2 >>> X, y = make_regression(n_samples=200, n_features=n_features, ... n_targets=n_targets) >>> # Add random nan to y array >>> y.ravel()[np.random.choice(y.size, 5*n_targets, replace=False)] = np.nan >>> # Run the regression >>> beta = nanlstsq(X, y) >>> assert beta.shape == (n_features, n_targets)
- Returns:
Least-squares solution, ignoring
Nan- Return type:
np.ndarray
- nrt.stats.ncdf(x)
Normal cumulative distribution function Source: Stackoverflow Unknown, https://stackoverflow.com/a/809402/12819237
nrt.utils module
- nrt.utils.build_regressors(dates, trend=True, harmonic_order=3)
Build the design matrix (X) from a list or an array of datetimes
Trend assumes temporal resolution no finer than daily Harmonics assume annual cycles
- Parameters:
dates (pandas.DatetimeIndex) – The dates to use for building regressors
trend (bool) – Whether to add a trend component
harmonic_order (int) – The order of the harmonic component
- Returns:
A design matrix
- Return type:
numpy.ndarray
- nrt.utils.datetimeIndex_to_decimal_dates(dates)
Convert a pandas datetime index to decimal dates
- nrt.utils.dt_to_decimal(dt)
Helper to build a decimal date from a datetime object
- nrt.utils.numba_kwargs(func)
Decorator which enables passing of kwargs to jitted functions by selecting only those kwargs that are available in the decorated functions signature
nrt.utils_efp module
CUSUM utility functions
Functions defined in this module implement functionality necessary for CUSUM and MOSUM monitoring as implemented in the R packages strucchange and bFast.
Portions of this module are derived from Chris Holden’s pybreakpoints package. See the copyright statement below.
- nrt.utils_efp.history_roc(X, y, alpha=0.05, crit=0.9478982340418134)
Reverse Ordered Rec-CUSUM check for stable periods
Checks for stable periods by calculating recursive OLS-Residuals (see
_recresid()) on the reversed X and y matrices. If the cumulative sum of the residuals crosses a boundary, the index of y where this structural change occured is returned.- Parameters:
X ((M, ) np.ndarray) – Matrix of independant variables
y ((M, K) np.ndarray) – Matrix of dependant variables
alpha (float) – Significance level for the boundary (probability of type I error)
crit (float) – Critical value corresponding to the chosen alpha. Can be calculated with
_cusum_rec_test_crit. Default is the value for alpha=0.05
- Returns:
Index of structural change in y.
0- y completely stable>0- y stable after this index- Return type:
int