API Reference

nrt.data package

nrt.data.germany_sample_points(return_meta=False)

Feature collection of 300 sample points from the Germany test area.

These points represent pixel centers sampled using a reclassified version of the germany_stratification layer. In this reclassification, all disturbance strata were combined into a single stratum, resulting in stratified random sampling. An equal allocation method was employed, with 100 samples per stratum.

The feature collection includes a stratum property with values 1, 2, or 3, representing different strata:

1: Non-forested stratum (2,333,767 pixels)

2: Stable forest stratum (972,184 pixels)

3: Disturbed forest stratum (694,049 pixels)

Parameters:: return_meta (bool, optional) – If True, returns both the feature collection and associated metadata. Defaults to False.

Examples

>>> from nrt import data
>>> fc = data.germany_sample_points()
>>> set([feat['properties']['stratum'] for feat in fc])
{1,2,3}
>>> len(fc)
300

Returns:: If return_meta is False, returns a list containing the feature collection of sample points. If return_meta is True, returns a tuple of (feature collection, metadata).
Return type:: list or tuple

nrt.data.germany_stratification(return_meta=False)

Simple rule based classification of forest dynamics for germany_zarr

Raster layer in the EPSG:3035 CRS, spatially aligned with the germany_zarr data cube. The classification is based on a simple rule-based approach applied to a pair of manually selected cloud-free time steps (2019-06-27 and 2022-06-16). NDVI and the second SWIR band are used to distinguish various tree cover dynamics, based on the assumption that healthy forests are generally green in summer (indicated by high NDVI values) and dark in the SWIR region. “Unhealthy” forests, such as standing dead trees following a bark beetle attack, may still be dark in the SWIR region but with lower greenness. Non-forested land lacks these combined characteristics.

Note that this approach is relatively simple and, as a result, produces an imperfect classification. Although there may be some confusion between land cover elements that are difficult to discriminate solely based on NDVI and SWIR values, the layer is well-suited for purposes such as stratification in a stratified sampling design.

Classes are encoded as follows:

Non-forested land
Stable Forest (Constantly green and dark in the SWIR region)
Forest Cover Loss (Conversion from healthy forest to non-forested land)
Forest Canopy Dieback (Loss of greenness but still dark in SWIR)
Salvage Logging (From 'unhealthy' forest to non-forested land)

Returns:: 2D array with encoded 2019-2022 tree cover dynamics or (data, metadata) if return_meta is True.
Return type:: numpy.ndarray or (numpy.ndarray, dict)

nrt.data.germany_zarr(**kwargs)

Sentinel 2 datacube of a forest area affected by Bark Beetle in Germany

The data covers an area of 400 km² located East of the city Cologne in the North Rhine-Westphalia state of Germany. It spans a five year period from 2018 to 2022 and includes the sentinel bands B02, B03, B04, B08, B11, B12 and SCL. Native 10m resolution is preserved for the visible and near-infrared bands while B11, B12 and SCL were resampled to 10m from their original 20m resolution. The data is organized as an online zarr store on the JRC’s Open Data Repository, so that running the germany_zarr() function only creates a symbolic representation of the dataset (dask based). Lazy chunks can be eagerly loaded by invoking the .compute() method. Beware that loading the entire dataset may take a fairly long time due to its size (4.8 GB). Note that although data is stored as int16, the scaling factor is automatically applied to convert spectral channels to unscaled surface reflectance values and the corresponding data variables are casted to float64. No data pixels are also automatically converted to np.nan

Parameters:: **kwargs – Additional keyword arguments passed to xarray.open_zarr()

Examples

>>> import sys
>>> from nrt import data
>>> ds = data.germany_zarr()
>>> print(ds)
<xarray.Dataset>
Dimensions:      (time: 172, y: 2000, x: 2000)
Coordinates:
    spatial_ref  int32 ...
  * time         (time) datetime64[ns] 2018-02-14T10:31:31.026000 ... 2022-12...
  * x            (x) float64 4.133e+06 4.133e+06 ... 4.153e+06 4.153e+06
  * y            (y) float64 3.113e+06 3.113e+06 ... 3.093e+06 3.093e+06
Data variables:
    B02          (time, y, x) float32 dask.array<chunksize=(10, 100, 100), meta=np.ndarray>
    B03          (time, y, x) float32 dask.array<chunksize=(10, 100, 100), meta=np.ndarray>
    B04          (time, y, x) float32 dask.array<chunksize=(10, 100, 100), meta=np.ndarray>
    B08          (time, y, x) float32 dask.array<chunksize=(10, 100, 100), meta=np.ndarray>
    B11          (time, y, x) float32 dask.array<chunksize=(10, 100, 100), meta=np.ndarray>
    B12          (time, y, x) float32 dask.array<chunksize=(10, 100, 100), meta=np.ndarray>
    SCL          (time, y, x) uint8 dask.array<chunksize=(10, 100, 100), meta=np.ndarray>
>>> # Load a subset in memory
>>> print(sys.getsizeof(ds.B02.data))
80
>>> ds_sub = ds.isel(x=slice(100,110), y=slice(200,210), time=slice(10,20)).compute()
>>> print(sys.getsizeof(ds_sub.B02.data))
2144

Returns:: A dask based Dataset of dimension (time: 172, y: 2000, x: 2000) with 7 data variables.
Return type:: xarray.Dataset

nrt.data.make_cube(*args, **kwargs)

nrt.data.make_cube_parameters(*args, **kwargs)

nrt.data.make_ts(*args, **kwargs)

nrt.data.mre_crit_table()

Contains a dictionary equivalent to strucchange’s mreCritValTable The key ‘sig_level’ is a list of the available pre-computed significance (1-alpha) values.

The other keys contain nested dictionaries, where the keys are the available relative window sizes (0.25, 0.5, 1), the second keys are the available periods (2, 4, 6, 8, 10) and the third keys are the functional types (“max”, “range”).

Example

>>> from nrt import data
>>> crit_table = data.mre_crit_table()
>>> win_size = 0.5
>>> period = 10
>>> functional = "max"
>>> alpha=0.025
>>> crit_values = crit_table.get(str(win_size))                                    .get(str(period))                                    .get(functional)
>>> sig_level = crit_table.get('sig_levels')
>>> crit_level = np.interp(1-alpha, sig_level, crit_values)

nrt.data.romania_10m(**kwargs)

Sentinel 2 datacube of a small forested area in Romania at 10 m resolution

Examples

>>> from nrt import data

>>> s2_cube = data.romania_10m()
>>> # Compute NDVI
>>> s2_cube['ndvi'] = (s2_cube.B8 - s2_cube.B4) / (s2_cube.B8 + s2_cube.B4)
>>> # Filter clouds
>>> s2_cube = s2_cube.where(s2_cube.SCL.isin([4,5,7]))

nrt.data.romania_20m(**kwargs)

Sentinel 2 datacube of a small forested area in Romania at 20 m resolution

Examples

>>> from nrt import data

>>> s2_cube = data.romania_20m()
>>> # Compute NDVI
>>> s2_cube['ndvi'] = (s2_cube.B8A - s2_cube.B4) / (s2_cube.B8A + s2_cube.B4)
>>> # Filter clouds
>>> s2_cube = s2_cube.where(s2_cube.SCL.isin([4,5,7]))

nrt.data.romania_forest_cover_percentage(return_meta=False)

Subset of Copernicus HR layer tree cover percentage - 20 m - Romania

Returns:: The data or (data, metadata) if return_meta is True.
Return type:: numpy.ndarray or (numpy.ndarray, dict)

nrt.monitor package

class nrt.monitor.BaseNrt(mask=None, trend=True, harmonic_order=3, save_fit_start=False, beta=None, x_coords=None, y_coords=None, process=None, boundary=None, detection_date=None, fit_start=None, update_mask=True, **kwargs)

Bases: object

Abstract class for Near Real Time change detection

Every new change monitoring approach should inherit from this abstract class and must implement the abstract methods fit(), monitor() and report(). It contains generic method to fit time-series harmonic-trend regression models, backward test for stability, dump the instance to a netcdf file and reload a dump.

mask

A 2D numpy array containing pixels that should be monitored (1) and not (0). The mask may be updated following historing period stability check, and after a call to monitor following a confirmed break. Values are as follow:

{0: 'Not monitored',
'monitored',
'Unstable history',
'Confirmed break - no longer monitored',
'Not enough observations - not monitored'}

Type:: numpy.ndarray

trend

Indicate whether stable period fit is performed with trend or not

Type:: bool

harmonic_order

The harmonic order of the time-series regression

Type:: int

x

array of x coordinates

Type:: numpy.ndarray

y

array of y coordinates

Type:: numpy.ndarray

process

2D numpy array containing the process value for every pixel

Type:: numpy.ndarray

boundary

Process boundary for all pixels or every pixel individually

Type:: Union[numpy.ndarray, int, float]

detection_date

2D array signalling detection date of disturbances in days since 1970-01-01

Type:: numpy.ndarray

fit_start

2D integer array reporting start of history period in days since UNIX Epoch. Start of history period only varies when using stable fitting algorithms

Type:: numpy.ndarray

update_mask

Specifies whether to update the mask and halt further monitoring when values exceed boundary limits during a .monitor() call. A True value indicates that crossing the boundary limits will trigger a mask update and stop successive observation monitoring.

Type:: bool

Parameters:

mask (numpy.ndarray) – A 2D numpy array containing pixels that should be monitored marked as 1 and pixels that should be excluded (marked as 0). Typically a stable forest mask when doing forest disturbance monitoring. If no mask is supplied all pixels are considered and a mask is created following the fit() call
trend (bool) – Indicate whether stable period fit is performed with trend or not
harmonic_order (int) – The harmonic order of the time-series regression
save_fit_start (bool) – If start of the fit should be reported in the model. Only applicable to stable fits (e.g. ‘ROC’, ‘CCDC-stable’). If true, the data will be saved in the attribute fit_start
x_coords (numpy.ndarray) – x coordinates
y_coords (numpy.ndarray) – y coordinates
process (numpy.ndarray) – 2D numpy array containing the process value for every pixel
boundary (Union[numpy.ndarray, int, float]) – Process boundary for all pixels or every pixel individually
detection_date (numpy.ndarray) – 2D array signalling detection date of disturbances in days since 1970-01-01
fit_start (numpy.ndarray) – 2D integer array reporting start of history period in days since UNIX Epoch. Start of history period only varies when using stable fitting algorithms
update_mask (bool) – Specifies whether to update the mask and halt further monitoring when values exceed boundary limits during a .monitor() call. A True value (default) indicates that crossing the boundary limits will trigger a mask update and stop successive observation monitoring.

static build_design_matrix(dataarray, trend=True, harmonic_order=3)

Build a design matrix for temporal regression from xarray DataArray

Parameters:

trend (bool) – Whether to include a temporal trend or not
harmonic_order (int) – The harmonic order to use (1 corresponds to annual cycles, 2 to annual and biannual cycles, etc)

Returns:

A design matrix to be passed to be passed to e.g. the: _fit() method

Return type:

numpy.ndarray

abstract fit()

classmethod from_netcdf(filename, **kwargs)

monitor(array, date, update_mask=None)

Monitor given a new acquisition

The method takes care of (1) predicting the expected pixels values, (2) updating the process value, and (3) updating self.mask in case a break is confirmed

Parameters:

array (np.ndarray) – 2D array containing the new acquisition to be monitored
date (datetime.datetime) – Date of acquisition of data contained in the array
update_mask (bool) – Override update_mask instance attribute

predict(date)

Predict the expected values for a given date

Parameters:: date (datetime.datetime) – The date to predict
Returns:: A 2D array of predicted values
Return type:: numpy.ndarray

report(filename, layers=['mask', 'detection_date'], driver='GTiff', crs=CRS.from_epsg(3035), dtype=<class 'numpy.int16'>)

Write the result of reporting to a raster geospatial file

Parameters:

layers (list) – A list of strings indicating the layers to include in the report. Valid options are 'mask' (the main output layer containing confirmed breaks, non-monitored pixels, etc), 'detection_date' (an integer value matching each confirmed break and indicating the date the break was confirmed in days since epoch), 'process' (the process value). The process value has a different meaning and interpretation for each monitoring method.
dtype (type) – The datatype of the stacked layers. Note that when returning process value for MoSum, CuSum or EWMA the dtype should be set to a float type to retain values

set_xy(dataarray)

to_netcdf(filename)

property transform

nrt.monitor.ccdc module

class nrt.monitor.ccdc.CCDC(trend=True, harmonic_order=2, sensitivity=3, mask=None, boundary=3, **kwargs)

Bases: BaseNrt

Monitoring using CCDC-like implementation

Implementation loosely following method described in Zhu & Woodcock 2014.

Zhu, Zhe, and Curtis E. Woodcock. 2014. “Continuous Change Detection and Classification of Land Cover Using All Available Landsat Data.” Remote Sensing of Environment 144 (March): 152–71. https://doi.org/10.1016/j.rse.2014.01.011.

mask

A 2D numpy array containing pixels that should be monitored (1) and not (0). The mask may be updated following history period stability check, and after a call to monitor following a confirmed break. Values are as follow. {0: 'Not monitored', 1: 'monitored', 2: 'Unstable history', 3: 'Confirmed break - no longer monitored'}

Type:: numpy.ndarray

trend

Indicate whether stable period fit is performed with trend or not

Type:: bool

harmonic_order

The harmonic order of the time-series regression

Type:: int

beta

3D array containing the model coefficients

Type:: np.ndarray

x

array of x coordinates

Type:: numpy.ndarray

y

array of y coordinates

Type:: numpy.ndarray

sensitivity

sensitivity of the monitoring. Lower numbers are high sensitivity. Value can’t be zero.

Type:: float

boundary

Number of consecutive observations identified as outliers to signal as disturbance

Type:: int

rmse

2D float array indicating RMSE for each pixel

Type:: np.ndarray

detection_date

2D array signalling detection date of disturbances in days since 1970-01-01

Type:: numpy.ndarray

Parameters:

mask (numpy.ndarray) – A 2D numpy array containing pixels that should be monitored marked as 1 and pixels that should be excluded (marked as 0). Typically a stable forest mask when doing forest disturbance monitoring. If no mask is supplied all pixels are considered and a mask is created following the fit() call
trend (bool) – Indicate whether stable period fit is performed with trend or not
harmonic_order (int) – The harmonic order of the time-series regression
sensitivity (float) – sensitivity of the monitoring. Lower numbers are high sensitivity. Value can’t be zero.
boundary (int) – Number of consecutive observations identified as outliers to signal as disturbance
**kwargs – Used to set internal attributes when initializing with .from_netcdf()

fit(dataarray, method='CCDC-stable', screen_outliers='CCDC_RIRLS', green=None, swir=None, scaling_factor=1, **kwargs)

Stable history model fitting

If screen outliers is required, green and swir bands must be passed.

The stability check will use the same sensitivity as is later used for detecting changes (default: 3*RMSE)

Parameters:

dataarray (xr.DataArray) – xarray Dataarray including the historic data to be fitted
method (string) – Regression to use. See _fit() for info.
screen_outliers (string) – Outlier screening to use. See _fit() for info.
green (xr.DataArray) – Green reflectance values to be used by screen_outliers.
swir (xr.DataArray) – Short wave infrared (SWIR) reflectance values to be used by screen_outliers.
scaling_factor (int) – Optional Scaling factor to be applied to green and swir. When screen_outliers is 'CCDC_RIRLS' (default for CCDC), the outlier screening algorithm expects green and swir reflectance values in the [0,1] range. EO data are often scaled and stored as integer, with a scaling factor to convert between scaled and actual reflectance values. As an example, if scaled reflectance values are in the [0,10000] range, set scaling_factor to 10000.
**kwargs – to be passed to _fit

Examples

>>> from nrt.monitor.ccdc import CCDC
>>> from nrt import data

>>> # Load and prepare test data
>>> mask = (data.romania_forest_cover_percentage() > 30).astype('int')
>>> s2_cube = data.romania_20m()

>>> s2_cube['ndvi'] = (s2_cube.B8A - s2_cube.B04) / (s2_cube.B8A + s2_cube.B04)
>>> s2_cube = s2_cube.where(s2_cube.SCL.isin([4,5,7]))
>>> cube_history = s2_cube.sel(time=slice('2015-01-01', '2018-12-31'))

>>> # Instantiate monitoring class and fit the model, including outliers screening
>>> ccdcMonitor = CCDC(trend=True, mask=mask)
>>> ccdcMonitor.fit(dataarray=cube_history.ndvi,
...                 green=cube_history.B03,
...                 swir=cube_history.B11,
...                 scaling_factor=10000)

nrt.monitor.cusum module

class nrt.monitor.cusum.CuSum(trend=True, harmonic_order=2, sensitivity=0.05, mask=None, **kwargs)

Bases: BaseNrt

Monitoring using cumulative sums (CUSUM) of residuals

Implementation following method as implemented in R package bFast.

mask

A 2D numpy array containing pixels that should be monitored (1) and not (0). The mask may be updated following history period stability check, and after a call to monitor following a confirmed break. Values are as follow. {0: 'Not monitored', 1: 'monitored', 2: 'Unstable history', 3: 'Confirmed break - no longer monitored'}

Type:: numpy.ndarray

trend

Indicate whether stable period fit is performed with trend or not

Type:: bool

harmonic_order

The harmonic order of the time-series regression

Type:: int

x

array of x coordinates

Type:: numpy.ndarray

y

array of y coordinates

Type:: numpy.ndarray

sensitivity

sensitivity of the monitoring. Lower numbers correspond to lower sensitivity. Equivalent to significance level ‘alpha’ with which the boundary is computed

Type:: float

boundary

process boundary for each time series. Calculated from alpha and length of time series.

Type:: numpy.ndarray

sigma

Standard deviation for normalized residuals in history period

Type:: numpy.ndarray

histsize

Number of non-nan observations in history period

Type:: numpy.ndarray

n

Total number of non-nan observations in time-series

Type:: numpy.ndarray

critval

Critical test value corresponding to the chosen sensitivity

Type:: float

detection_date

2D array signalling detection date of disturbances in days since 1970-01-01

Type:: numpy.ndarray

Parameters:

mask (numpy.ndarray) – A 2D numpy array containing pixels that should be monitored marked as 1 and pixels that should be excluded (marked as 0). Typically a stable forest mask when doing forest disturbance monitoring. If no mask is supplied all pixels are considered and a mask is created following the fit() call
trend (bool) – Indicate whether stable period fit is performed with trend or not
harmonic_order (int) – The harmonic order of the time-series regression
sensitivity (float) – sensitivity of the monitoring. Lower numbers correspond to lower sensitivity. Equivalent to significance level ‘alpha’ with which the boundary is computed
**kwargs – Used to set internal attributes when initializing with .from_netcdf()

fit(dataarray, method='ROC', alpha=0.05, **kwargs)

Stable history model fitting

If method 'ROC' is used for fitting, the argument alpha has to be passed.

Parameters:

dataarray (xr.DataArray) – xarray Dataarray including the historic data to be fitted
method (string) – Regression to use. See _fit() for info.
alpha (float) – Significance level for 'ROC' stable fit.
**kwargs – to be passed to _fit

nrt.monitor.ewma module

class nrt.monitor.ewma.EWMA(trend=True, harmonic_order=2, sensitivity=2, mask=None, lambda_=0.3, threshold_outlier=10, **kwargs)

Bases: BaseNrt

Monitoring using EWMA control chart

Implementation following method described in Brooks et al. 2014.

Parameters:

mask (numpy.ndarray) – A 2D numpy array containing pixels that should be monitored marked as 1 and pixels that should be excluded (marked as 0). Typically a stable forest mask when doing forest disturbance monitoring. If no mask is supplied all pixels are considered and a mask is created following the fit() call
trend (bool) – Indicate whether stable period fit is performed with trend or not
harmonic_order (int) – The harmonic order of the time-series regression
lambda (float) – Weight of previous observation in the monitoring process (memory). Valid range is [0,1], 1 corresponding to no memory and 0 to full memory
sensitivity (float) – Sensitivity parameter used in the computation of the monitoring boundaries. Lower values imply more sensitive monitoring
threshold_outlier (float) – Values bigger than threshold_outlier*sigma (extreme outliers) will get screened out during monitoring and will not contribute to updating the EWMA process value
**kwargs – Used to set internal attributes when initializing with .from_netcdf()

fit(dataarray, method='OLS', screen_outliers='Shewhart', L=5, **kwargs)

Stable history model fitting

The preferred fitting method for this monitoring type is 'OLS' with outlier screening 'Shewhart'. It requires a control limit parameter L. See nrt.outliers.shewart for more details

nrt.monitor.iqr module

class nrt.monitor.iqr.IQR(trend=True, harmonic_order=3, sensitivity=1.5, mask=None, boundary=3, **kwargs)

Bases: BaseNrt

Online monitoring of disturbances based on interquartile range

Reference:
https://stats.stackexchange.com/a/1153

mask

A 2D numpy array containing pixels that should be monitored (1) and not (0). The mask may be updated following history period stability check, and after a call to monitor following a confirmed break. Values are as follow. {0: 'Not monitored', 1: 'monitored', 2: 'Unstable history', 3: 'Confirmed break - no longer monitored'}

Type:: numpy.ndarray

trend

Indicate whether stable period fit is performed with trend or not

Type:: bool

harmonic_order

The harmonic order of the time-series regression

Type:: int

beta

3D array containing the model coefficients

Type:: np.ndarray

x

array of x coordinates

Type:: numpy.ndarray

y

array of y coordinates

Type:: numpy.ndarray

sensitivity

sensitivity of the monitoring. Lower numbers are high sensitivity. Value can’t be zero.

Type:: float

boundary

Number of consecutive observations identified as outliers to signal as disturbance

Type:: int

q25

25th percentile of residuals

Type:: numpy.ndarray

q75

75th percentile of residuals

Type:: numpy.ndarray

detection_date

2D array signalling detection date of disturbances in days since 1970-01-01

Type:: numpy.ndarray

Parameters:

mask (numpy.ndarray) – A 2D numpy array containing pixels that should be monitored marked as 1 and pixels that should be excluded (marked as 0). Typically a stable forest mask when doing forest disturbance monitoring. If no mask is supplied all pixels are considered and a mask is created following the fit() call
trend (bool) – Indicate whether stable period fit is performed with trend or not
harmonic_order (int) – The harmonic order of the time-series regression
sensitivity (float) – sensitivity of the monitoring. Lower numbers are high sensitivity. Value can’t be zero.
boundary (int) – Number of consecutive observations identified as outliers to signal as disturbance
**kwargs – Used to set internal attributes when initializing with .from_netcdf()

fit(dataarray, method='OLS', **kwargs)

nrt.monitor.mosum module

class nrt.monitor.mosum.MoSum(trend=True, harmonic_order=2, sensitivity=0.05, mask=None, h=0.25, **kwargs)

Bases: BaseNrt

Monitoring using moving sums (MOSUM) of residuals

Implementation following method as implemented in R package bFast.

mask

A 2D numpy array containing pixels that should be monitored (1) and not (0). The mask may be updated following history period stability check, and after a call to monitor following a confirmed break. Values are as follow. {0: 'Not monitored', 1: 'monitored', 2: 'Unstable history', 3: 'Confirmed break - no longer monitored'}

Type:: numpy.ndarray

trend

Indicate whether stable period fit is performed with trend or not

Type:: bool

harmonic_order

The harmonic order of the time-series regression

Type:: int

x

array of x coordinates

Type:: numpy.ndarray

y

array of y coordinates

Type:: numpy.ndarray

sensitivity

sensitivity of the monitoring. Lower numbers correspond to lower sensitivity. Equivalent to significance level ‘alpha’ with which the boundary is computed

Type:: float

boundary

process boundary for each time series. Calculated from alpha and length of time series.

Type:: numpy.ndarray

sigma

Standard deviation for normalized residuals in history period

Type:: numpy.ndarray

histsize

Number of non-nan observations in history period

Type:: numpy.ndarray

n

Total number of non-nan observations in time-series

Type:: numpy.ndarray

critval

Critical test value corresponding to the chosen sensitivity

Type:: float

h

Moving window size relative to length of the history period. Can be one of 0.25, 0.5 and 1

Type:: float

winsize

2D array with absolute window size. Computed as h*histsize

Type:: numpy.ndarray

window

3D array containing the current values in the window

Type:: numpy.ndarray

detection_date

2D array signalling detection date of disturbances in days since 1970-01-01

Type:: numpy.ndarray

Parameters:

mask (numpy.ndarray) – A 2D numpy array containing pixels that should be monitored marked as 1 and pixels that should be excluded (marked as 0). Typically a stable forest mask when doing forest disturbance monitoring. If no mask is supplied all pixels are considered and a mask is created following the fit() call
trend (bool) – Indicate whether stable period fit is performed with trend or not
harmonic_order (int) – The harmonic order of the time-series regression
sensitivity (float) – sensitivity of the monitoring. Lower numbers correspond to lower sensitivity. Equivalent to significance level ‘alpha’ with which the boundary is computed
h (float) – Moving window size relative to length of the history period. Can be one of 0.25, 0.5 and 1
**kwargs – Used to set internal attributes when initializing with .from_netcdf()

fit(dataarray, method='ROC', alpha=0.05, **kwargs)

Stable history model fitting

If method 'ROC' is used for fitting, the argument alpha has to be passed.

Parameters:

dataarray (xr.DataArray) – xarray Dataarray including the historic data to be fitted
method (string) – Regression to use. See _fit() for info.
alpha (float) – Significance level for 'ROC' stable fit.
**kwargs – to be passed to _fit

get_process()

property process

set_process(x)

nrt.fit_methods module

Model fitting

Functions defined in this module always use a 2D array containing the dependant variables (y) and return both coefficient (beta) and residuals matrices. These functions are meant to be called in nrt.BaseNrt._fit().

The RIRLS fit is derived from Chris Holden’s yatsm package. See the copyright statement below.

nrt.fit_methods.ols(X, y)

Fit simple OLS model

Parameters:

X ((M, N) np.ndarray) – Matrix of independant variables
y ({(M,), (M, K)} np.ndarray) – Matrix of dependant variables

Returns:

The array of regression estimators residuals (numpy.ndarray): The array of residuals

Return type:

beta (numpy.ndarray)

nrt.fit_methods.weighted_ols(X, y, w)

Apply a weighted OLS fit to 1D data

Parameters:

X (np.ndarray) – independent variables
y (np.ndarray) – dependent variable
w (np.ndarray) – observation weights

Returns:

coefficients and residual vector

Return type:

tuple

nrt.log module

nrt.outliers module

Removing outliers

Functions defined in this module always use a 2D array containing the dependant variables (y) and return y with outliers set to np.nan. These functions are meant to be called in nrt.BaseNrt._fit()

Citations:

Brooks, E.B., Wynne, R.H., Thomas, V.A., Blinn, C.E. and Coulston, J.W., 2013. On-the-fly massively multitemporal change detection using statistical quality control charts and Landsat data. IEEE Transactions on Geoscience and Remote Sensing, 52(6), pp.3316-3332.
Zhu, Zhe, and Curtis E. Woodcock. 2014. “Continuous Change Detection and Classification of Land Cover Using All Available Landsat Data.” Remote Sensing of Environment 144 (March): 152–71. https://doi.org/10.1016/j.rse.2014.01.011.

nrt.outliers.ccdc_rirls(X, y, green, swir, scaling_factor=1, **kwargs)

Screen for missed clouds and other outliers using green and SWIR band

Parameters:

X ((M, N) np.ndarray) – Matrix of independant variables
y ((M, K) np.ndarray) – Matrix of dependant variables
green (np.ndarray) – 2D array containing spectral values
swir (np.ndarray) – 2D array containing spectral values (~1.55-1.75um)
scaling_factor (int) – Scaling factor to bring green and swir values to reflectance values between 0 and 1

Returns:

y with outliers set to np.nan

Return type:

np.ndarray

nrt.outliers.shewhart(X, y, L=5, **kwargs)

Remove outliers using a Shewhart control chart

As described in Brooks et al. 2014, following an initial OLS fit, outliers are identified using a shewhart control chart and removed.

Parameters:

X ((M, N) np.ndarray) – Matrix of independant variables
y ({(M,), (M, K)} np.ndarray) – Matrix of dependant variables
L (float) – control limit used for outlier filtering. Must be a positive float. Lower values indicate stricter filtering. Residuals larger than L*sigma will get screened out
**kwargs – not used

Returns:

Dependant variables with outliers set to np.nan

Return type:

y(np.ndarray)

nrt.stats module

nrt.stats.bisquare(resid, c=4.685)

Weight residuals using bisquare weight function

Parameters:

resid (np.ndarray) – residuals to be weighted
c (float) – tuning constant for Tukey’s Biweight (default: 4.685)

Returns:

weights for residuals

Return type:

weight (ndarray)

Reference:: http://statsmodels.sourceforge.net/stable/generated/statsmodels.robust.norms.TukeyBiweight.html

nrt.stats.erfcc(x): Complementary error function.

nrt.stats.mad(resid, c=0.6745)

Returns Median-Absolute-Deviation (MAD) for residuals

Parameters:

resid (np.ndarray) – residuals
c (float) – scale factor to get to ~standard normal (default: 0.6745) (i.e. 1 / 0.75iCDF ~= 1.4826 = 1 / 0.6745)

Returns:

MAD ‘robust’ variance estimate

Return type:

float

Reference:: http://en.wikipedia.org/wiki/Median_absolute_deviation

nrt.stats.nan_percentile_axis0(arr, percentiles)

Faster implementation of np.nanpercentile

This implementation always takes the percentile along axis 0. Uses numba to speed up the calculation by more than 7x.

Function is equivalent to np.nanpercentile(arr, <percentiles>, axis=0)

Parameters:

arr (np.ndarray) – 2D array to calculate percentiles for
percentiles (np.ndarray) – 1D array of percentiles to calculate

Returns:

Array with first dimension corresponding to values passed in percentiles

Return type:

np.ndarray

nrt.stats.nanlstsq(X, y)

Return the least-squares solution to a linear matrix equation

Analog to numpy.linalg.lstsq for dependant variable containing Nan

Note

For best performances of the multithreaded implementation, it is recommended to limit the number of threads used by MKL or OpenBLAS to 1. This avoids over-subscription, and improves performances. By default the function will use all cores available; the number of cores used can be controled using the numba.set_num_threads function or by modifying the NUMBA_NUM_THREADS environment variable

Parameters:

X ((M, N) np.ndarray) – Matrix of independant variables
y ({(M,), (M, K)} np.ndarray) – Matrix of dependant variables

Examples

>>> import os
>>> # Adjust linear algebra configuration (only one should be required
>>> # depending on how numpy was installed/compiled)
>>> os.environ['OPENBLAS_NUM_THREADS'] = '1'
>>> os.environ['MKL_NUM_THREADS'] = '1'
>>> import numpy as np
>>> from sklearn.datasets import make_regression
>>> from nrt.stats import nanlstsq
>>> # Generate random data
>>> n_targets = 1000
>>> n_features = 2
>>> X, y = make_regression(n_samples=200, n_features=n_features,
...                        n_targets=n_targets)
>>> # Add random nan to y array
>>> y.ravel()[np.random.choice(y.size, 5*n_targets, replace=False)] = np.nan
>>> # Run the regression
>>> beta = nanlstsq(X, y)
>>> assert beta.shape == (n_features, n_targets)

Returns:: Least-squares solution, ignoring Nan
Return type:: np.ndarray

nrt.stats.ncdf(x): Normal cumulative distribution function Source: Stackoverflow Unknown, https://stackoverflow.com/a/809402/12819237

nrt.utils module

nrt.utils.build_regressors(dates, trend=True, harmonic_order=3)

Build the design matrix (X) from a list or an array of datetimes

Trend assumes temporal resolution no finer than daily Harmonics assume annual cycles

Parameters:

dates (pandas.DatetimeIndex) – The dates to use for building regressors
trend (bool) – Whether to add a trend component
harmonic_order (int) – The order of the harmonic component

Returns:

A design matrix

Return type:

numpy.ndarray

nrt.utils.datetimeIndex_to_decimal_dates(dates): Convert a pandas datetime index to decimal dates

nrt.utils.dt_to_decimal(dt): Helper to build a decimal date from a datetime object

nrt.utils.numba_kwargs(func): Decorator which enables passing of kwargs to jitted functions by selecting only those kwargs that are available in the decorated functions signature

nrt.utils_efp module

CUSUM utility functions

Functions defined in this module implement functionality necessary for CUSUM and MOSUM monitoring as implemented in the R packages strucchange and bFast.

Portions of this module are derived from Chris Holden’s pybreakpoints package. See the copyright statement below.

nrt.utils_efp.history_roc(X, y, alpha=0.05, crit=0.9478982340418134)

Reverse Ordered Rec-CUSUM check for stable periods

Checks for stable periods by calculating recursive OLS-Residuals (see _recresid()) on the reversed X and y matrices. If the cumulative sum of the residuals crosses a boundary, the index of y where this structural change occured is returned.

Parameters:

X ((M, ) np.ndarray) – Matrix of independant variables
y ((M, K) np.ndarray) – Matrix of dependant variables
alpha (float) – Significance level for the boundary (probability of type I error)
crit (float) – Critical value corresponding to the chosen alpha. Can be calculated with _cusum_rec_test_crit. Default is the value for alpha=0.05

Returns:

Index of structural change in y. 0 - y completely stable >0 - y stable after this index

Return type:

int

API Reference

nrt.data package

nrt.monitor package

nrt.monitor.ccdc module

nrt.monitor.cusum module

nrt.monitor.ewma module

nrt.monitor.iqr module

nrt.monitor.mosum module

nrt.fit_methods module

nrt.log module

nrt.outliers module

nrt.stats module

nrt.utils module

nrt.utils_efp module

Module contents