nrt.data package

Module contents

nrt.data.make_cube(dates, params_ds, outlier_value=0.1, name='ndvi')

Generate a cube of synthetic time-series

See make_ts for more details on how each individual time-series is generated

Parameters:
  • dates (array-like) – List or array of dates (numpy.datetime64)

  • params_ds (xarray.Dataset) – Dataset containing arrays of time-series generation parameters. See make_cube_parameters for a helper to generate such Dataset. Spatial dimensions of the params_ds Dataset are used for the generated cube

  • outlier_value (float) – Value to assign to outliers

  • name (str) – Name of the generated variable in the DataArray

Returns:

Cube of synthetic time-series generated using the parameters provided via the params_ds Dataset.

Return type:

xarray.DataArray

Example

>>> import numpy as np
>>> from nrt import data
>>> import matplotlib.pyplot as plt
>>> dates = np.arange('2018-01-01', '2022-06-15', dtype='datetime64[W]')
>>> params_ds = data.make_cube_parameters(shape=(100,100),
...                                  n_outliers_interval=(0,5),
...                                  n_nan_interval=(0,7),
...                                  break_idx_interval=(100,dates.size - 20))
>>> cube = data.make_cube(dates=dates, params_ds=params_ds)
>>> # Plot one time-series
>>> cube.isel(x=5, y=5).plot()
>>> plt.show()
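
The per-pixel parameters used for the simulation remain available in params_ds; for instance, the drawn break magnitude can be mapped with standard xarray plotting. This is a short follow-up sketch using only a variable ('magnitude') shown in the make_cube_parameters example below.

>>> # Map of the break magnitude drawn for each pixel
>>> params_ds['magnitude'].plot()
>>> plt.show()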
nrt.data.make_cube_parameters(shape=(100, 100), break_idx_interval=(0, 100), intercept_interval=(0.6, 0.8), amplitude_interval=(0.12, 0.2), magnitude_interval=(0.2, 0.3), recovery_time_interval=(800, 1400), sigma_noise_interval=(0.02, 0.04), n_outliers_interval=(0, 5), n_nan_interval=(0, 5), unstable_proportion=0.5)

Create xarray.Dataset of parameters for the generation of a synthetic data cube

Prepares the main input required by the make_cube function. This intermediary step eases the creation of multiple synthetic DataArrays sharing similar characteristics (e.g. to simulate multispectral data)

Parameters:
  • shape (tuple) – A size two integer tuple giving the x,y size of the Dataset to be generated

  • break_idx_interval (tuple) – A tuple of two integers indicating the interval from which the breakpoint position in the time-series is drawn. Generates an array of random values passed to the break_idx argument of make_ts. As with Python ranges, the upper bound is excluded from the resulting array; to produce an array filled with zeros, (0,1) can therefore be used. TODO: add a default to allow breakpoints at any location (conflicts with NaN, which indicates no break)

  • intercept_interval (tuple) – A tuple of two floats providing the interval from which the intercept is drawn. Generates an array of random values passed to the intercept argument of make_ts

  • amplitude_interval (tuple) – A tuple of two floats indicating the interval from which the seasonal amplitude parameter is drawn. Generates an array of random values passed to the amplitude argument of make_ts

  • magnitude_interval (tuple) – A tuple of two floats indicating the interval from which the breakpoint magnitude parameter is drawn. Generates an array of random values passed to the magnitude argument of make_ts

  • recovery_time_interval (tuple) – A tuple of two integers indicating the interval from which the recovery time parameter (in days) is drawn. Generates an array of random values passed to the recovery_time argument of make_ts

  • sigma_noise_interval (tuple) – A tuple of two floats indicating the interval from which the white noise level is drawn. Generates an array of random values passed to the sigma_noise argument of make_ts

  • n_outliers_interval (tuple) – A tuple of two integers indicating the interval from which the number of outliers is drawn. Generates an array of random values passed to the n_outliers argument of make_ts

  • n_nan_interval (tuple) – A tuple of two integers indicating the interval from which the number of no-data observations is drawn. Generates an array of random values passed to the n_nan argument of make_ts

  • unstable_proportion (float) – Proportion of time-series containing a breakpoint. The other time-series are stable.

Returns:

Dataset with arrays of parameters required for the generation of synthetic time-series using the spatialized version of make_ts (see make_cube)

Return type:

xarray.Dataset

Examples

>>> import numpy as np
>>> import xarray as xr
>>> from nrt import data
>>> import matplotlib.pyplot as plt
>>> params_nir = data.make_cube_parameters(shape=(20,20),
...                                        n_outliers_interval=(0,1),
...                                        n_nan_interval=(0,1),
...                                        break_idx_interval=(50,100))
>>> params_red = params_nir.copy(deep=True)
>>> # create parameters for red, green, blue cubes by slightly adjusting intercept,
>>> # magnitude and amplitude parameters
>>> params_red['intercept'].data = np.random.uniform(0.09, 0.12, size=(20,20))
>>> params_red['magnitude'].data = np.random.uniform(-0.1, -0.03, size=(20,20))
>>> params_red['amplitude'].data = np.random.uniform(0.03, 0.07, size=(20,20))
>>> params_green = params_nir.copy(deep=True)
>>> params_green['intercept'].data = np.random.uniform(0.12, 0.20, size=(20,20))
>>> params_green['magnitude'].data = np.random.uniform(0.05, 0.1, size=(20,20))
>>> params_green['amplitude'].data = np.random.uniform(0.05, 0.08, size=(20,20))
>>> params_blue = params_nir.copy(deep=True)
>>> params_blue['intercept'].data = np.random.uniform(0.08, 0.13, size=(20,20))
>>> params_blue['magnitude'].data = np.random.uniform(-0.01, 0.01, size=(20,20))
>>> params_blue['amplitude'].data = np.random.uniform(0.02, 0.04, size=(20,20))
>>> dates = np.arange('2018-01-01', '2022-06-15', dtype='datetime64[W]')
>>> # Create cubes (DataArrays) and merge them into a single Dataset
>>> nir = data.make_cube(dates, name='nir', params_ds=params_nir)
>>> red = data.make_cube(dates, name='red', params_ds=params_red)
>>> green = data.make_cube(dates, name='green', params_ds=params_green)
>>> blue = data.make_cube(dates, name='blue', params_ds=params_blue)
>>> cube = xr.merge([blue, green, red, nir]).to_array()
>>> # Plot the time-series of one pixel, faceted by band
>>> cube.isel(x=5, y=5).plot(row='variable')
>>> plt.show()
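
A rough check on the unstable_proportion setting, using the parameters generated above. Both the break_idx variable name and the NaN-means-no-break encoding are assumptions here (suggested by the note under break_idx_interval), not documented guarantees.

>>> # Share of pixels with a simulated breakpoint; expected to be close to
>>> # unstable_proportion (0.5 by default). 'break_idx' is an assumed variable name.
>>> float(params_nir['break_idx'].notnull().mean())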
nrt.data.make_ts(dates, break_idx=-1, intercept=0.7, amplitude=0.15, magnitude=0.25, recovery_time=1095, sigma_noise=0.02, n_outlier=3, outlier_value=-0.1, n_nan=3)

Simulate a harmonic time-series with optional breakpoint, noise and outliers

The time-series is generated by adding:

  • an intercept/trend component which varies depending on the phase of the time-series (stable, recovery)

  • an annual seasonal component

  • random noise drawn from a normal distribution (white noise)

Optional outliers are then added to randomly chosen observations, as well as np.nan values. Note that the seasonal cycle simulated here is rather simplistic: a sinusoidal model is used, which assumes a symmetrical and regular behaviour around the peak of the simulated variable. Actual vegetation signal is often more asymmetrical and irregular.
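
For intuition only, here is a minimal sketch of the additive structure described above (intercept, annual sinusoid whose amplitude is scaled by the intercept, white noise) for a stable series. It is an illustration, not nrt's actual implementation, and it omits the breakpoint/recovery phase, outliers and no-data handling.

>>> import numpy as np
>>> dates = np.arange('2018-01-01', '2022-06-15', dtype='datetime64[W]')
>>> # Days elapsed since the first observation
>>> days = (dates - dates[0]).astype('timedelta64[D]').astype(float)
>>> intercept, amplitude, sigma_noise = 0.7, 0.15, 0.02
>>> # Annual sinusoid; the effective amplitude is scaled by the intercept
>>> seasonal = intercept * amplitude * np.sin(2 * np.pi * days / 365.25)
>>> # White noise drawn from a normal distribution
>>> noise = np.random.normal(0, sigma_noise, size=dates.size)
>>> ts_sketch = intercept + seasonal + noise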

Parameters:
  • dates (array-like) – List or array of dates (numpy.datetime64)

  • break_idx (int) – Breakpoint index in the date array provided. Defaults to -1, corresponding to a stable time-series

  • intercept (float) – Intercept of the time-series

  • amplitude (float) – Amplitude of the harmonic model (note that at every point of the time-series, the actual model amplitude is multiplied by the intercept)

  • magnitude (float) – Break magnitude (always a drop in y value)

  • recovery_time (int) – Time (in days) to recover the initial intercept value following a break

  • sigma_noise (float) – Sigma value of the normal distribution (mean = 0) from which noise values are drawn

  • n_outlier (int) – Number of outliers randomly assigned to observations of the time-series

  • outlier_value (float) – Value to assign to outliers

  • n_nan (int) – Number of np.nan (no data) assigned to observations of the time-series

Returns:

Array of simulated values of the same size as dates

Return type:

np.ndarray

Example

>>> from nrt import data
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> dates = np.arange('2018-01-01', '2022-06-15', dtype='datetime64[W]')
>>> ts = data.make_ts(dates=dates, break_idx=30)
>>> plt.plot(dates, ts)
>>> plt.show()

nrt.data.mre_crit_table()

Contains a dictionary equivalent to strucchange’s mreCritValTable. The key ‘sig_level’ is a list of the available pre-computed significance (1-alpha) values.

The other keys map to nested dictionaries: the first-level keys are the available relative window sizes (0.25, 0.5, 1), the second-level keys the available periods (2, 4, 6, 8, 10), and the third-level keys the functional types (“max”, “range”).
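
As a minimal sketch of the lookup path (assuming, as in the example below, that the window size and period keys are stored as strings):

>>> from nrt import data
>>> crit_table = data.mre_crit_table()
>>> # relative window size -> period -> functional type -> critical values
>>> crit_values = crit_table['0.5']['10']['max']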

Example

>>> import numpy as np
>>> from nrt import data
>>> crit_table = data.mre_crit_table()
>>> win_size = 0.5
>>> period = 10
>>> functional = "max"
>>> alpha = 0.025
>>> crit_values = (crit_table.get(str(win_size))
...                .get(str(period))
...                .get(functional))
>>> sig_level = crit_table.get('sig_levels')
>>> crit_level = np.interp(1-alpha, sig_level, crit_values)
nrt.data.romania_10m(**kwargs)

Sentinel 2 datacube of a small forested area in Romania at 10 m resolution

Examples

>>> from nrt import data
>>> s2_cube = data.romania_10m()
>>> # Compute NDVI
>>> s2_cube['ndvi'] = (s2_cube.B8 - s2_cube.B4) / (s2_cube.B8 + s2_cube.B4)
>>> # Filter clouds: keep vegetation (4), not-vegetated (5) and unclassified (7) SCL classes
>>> s2_cube = s2_cube.where(s2_cube.SCL.isin([4,5,7]))
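
As a quick visual check, the computed NDVI can be plotted for a single acquisition with xarray's built-in plotting; the temporal dimension is assumed here to be named time, which may differ in practice.

>>> import matplotlib.pyplot as plt
>>> # 'time' is an assumed name for the temporal dimension
>>> s2_cube['ndvi'].isel(time=0).plot()
>>> plt.show()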
nrt.data.romania_20m(**kwargs)

Sentinel 2 datacube of a small forested area in Romania at 20 m resolution

Examples

>>> from nrt import data
>>> s2_cube = data.romania_20m()
>>> # Compute NDVI
>>> s2_cube['ndvi'] = (s2_cube.B8A - s2_cube.B4) / (s2_cube.B8A + s2_cube.B4)
>>> # Filter clouds: keep vegetation (4), not-vegetated (5) and unclassified (7) SCL classes
>>> s2_cube = s2_cube.where(s2_cube.SCL.isin([4,5,7]))
nrt.data.romania_forest_cover_percentage()

Subset of Copernicus HR layer tree cover percentage - 20 m - Romania
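
A minimal usage sketch, assuming the function returns an xarray object that can be plotted directly, like the other datasets in this module.

>>> from nrt import data
>>> import matplotlib.pyplot as plt
>>> tcp = data.romania_forest_cover_percentage()
>>> tcp.plot()
>>> plt.show()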