nrt.data package
Module contents
- nrt.data.make_cube(dates, params_ds, outlier_value=0.1, name='ndvi')
Generate a cube of synthetic time-series
See make_ts for more details on how every single time-series is generated.
- Parameters:
dates (array-like) – List or array of dates (numpy.datetime64)
params_ds (xarray.Dataset) – Dataset containing arrays of time-series generation parameters. See make_cube_parameters for a helper to generate such a Dataset. The spatial dimensions of params_ds are used for the generated cube
outlier_value (float) – Value to assign to outliers
name (str) – Name of the generated variable in the DataArray
- Returns:
- Cube of synthetic time-series generated using the parameters provided via the params_ds Dataset.
- Return type:
xarray.DataArray
Example
>>> import numpy as np
>>> from nrt import data
>>> import matplotlib.pyplot as plt
>>> dates = np.arange('2018-01-01', '2022-06-15', dtype='datetime64[W]')
>>> params_ds = data.make_cube_parameters(shape=(100,100),
...                                       n_outliers_interval=(0,5),
...                                       n_nan_interval=(0,7),
...                                       break_idx_interval=(100,dates.size - 20))
>>> cube = data.make_cube(dates=dates, params_ds=params_ds)
>>> # Plot one ts
>>> cube.isel(x=5, y=5).plot()
>>> plt.show()
- nrt.data.make_cube_parameters(shape=(100, 100), break_idx_interval=(0, 100), intercept_interval=(0.6, 0.8), amplitude_interval=(0.12, 0.2), magnitude_interval=(0.2, 0.3), recovery_time_interval=(800, 1400), sigma_noise_interval=(0.02, 0.04), n_outliers_interval=(0, 5), n_nan_interval=(0, 5), unstable_proportion=0.5)
Create an xarray.Dataset of parameters for the generation of a synthetic data cube
Prepares the main input required by the make_cube function. This intermediary step eases the creation of multiple synthetic DataArrays sharing similar characteristics (e.g. to simulate multispectral data).
- Parameters:
shape (tuple) – A size two integer tuple giving the x,y size of the Dataset to be generated
break_idx_interval (tuple) – A tuple of two integers indicating the interval from which the breakpoint position in the time-series is drawn. Generates an array of random values passed to the break_idx argument of make_ts. As with Python ranges, the upper bound is excluded from the resulting array; (0,1) can therefore be used to produce a zero-filled array. TODO: add a default to allow a breakpoint at any location (conflicts with NaN indicating no break)
intercept_interval (tuple) – A tuple of two floats providing the interval from which the intercept is drawn. Generates an array of random values passed to the intercept argument of make_ts
amplitude_interval (tuple) – A tuple of two floats indicating the interval from which the seasonal amplitude parameter is drawn. Generates an array of random values passed to the amplitude argument of make_ts
magnitude_interval (tuple) – A tuple of two floats indicating the interval from which the breakpoint magnitude parameter is drawn. Generates an array of random values passed to the magnitude argument of make_ts
recovery_time_interval (tuple) – A tuple of two integers indicating the interval from which the recovery time parameter (in days) is drawn. Generates an array of random values passed to the recovery_time argument of make_ts
sigma_noise_interval (tuple) – A tuple of two floats indicating the interval from which the white noise level is drawn. Generates an array of random values passed to the sigma_noise argument of make_ts
n_outliers_interval (tuple) – A tuple of two integers indicating the interval from which the number of outliers is drawn. Generates an array of random values passed to the n_outlier argument of make_ts
n_nan_interval (tuple) – A tuple of two integers indicating the interval from which the number of no-data observations is drawn. Generates an array of random values passed to the n_nan argument of make_ts
unstable_proportion (float) – Proportion of time-series containing a breakpoint. The other time-series are stable.
- Returns:
- Dataset with arrays of parameters required for the generation of synthetic time-series using the spatialized version of make_ts (see make_cube)
- Return type:
xarray.Dataset
Examples
>>> import numpy as np
>>> import xarray as xr
>>> from nrt import data
>>> import matplotlib.pyplot as plt
>>> params_nir = data.make_cube_parameters(shape=(20,20),
...                                        n_outliers_interval=(0,1),
...                                        n_nan_interval=(0,1),
...                                        break_idx_interval=(50,100))
>>> params_red = params_nir.copy(deep=True)
>>> # create parameters for red, green, blue cubes by slightly adjusting intercept,
>>> # magnitude and amplitude parameters
>>> params_red['intercept'].data = np.random.uniform(0.09, 0.12, size=(20,20))
>>> params_red['magnitude'].data = np.random.uniform(-0.1, -0.03, size=(20,20))
>>> params_red['amplitude'].data = np.random.uniform(0.03, 0.07, size=(20,20))
>>> params_green = params_nir.copy(deep=True)
>>> params_green['intercept'].data = np.random.uniform(0.12, 0.20, size=(20,20))
>>> params_green['magnitude'].data = np.random.uniform(0.05, 0.1, size=(20,20))
>>> params_green['amplitude'].data = np.random.uniform(0.05, 0.08, size=(20,20))
>>> params_blue = params_nir.copy(deep=True)
>>> params_blue['intercept'].data = np.random.uniform(0.08, 0.13, size=(20,20))
>>> params_blue['magnitude'].data = np.random.uniform(-0.01, 0.01, size=(20,20))
>>> params_blue['amplitude'].data = np.random.uniform(0.02, 0.04, size=(20,20))
>>> dates = np.arange('2018-01-01', '2022-06-15', dtype='datetime64[W]')
>>> # Create cubes (DataArrays) and merge them into a single Dataset
>>> nir = data.make_cube(dates, name='nir', params_ds=params_nir)
>>> red = data.make_cube(dates, name='red', params_ds=params_red)
>>> green = data.make_cube(dates, name='green', params_ds=params_green)
>>> blue = data.make_cube(dates, name='blue', params_ds=params_blue)
>>> cube = xr.merge([blue, green, red, nir]).to_array()
>>> # Plot one ts
>>> cube.isel(x=5, y=5).plot(row='variable')
>>> plt.show()
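The interval semantics described above can be illustrated with a minimal NumPy-only sketch. It is not the library's implementation: the helper name draw_parameter_grids is hypothetical, only two of the parameters are drawn, and marking stable pixels with -1 (the default break_idx of make_ts) is an assumption about how "no break" might be encoded.

```python
import numpy as np

def draw_parameter_grids(shape=(100, 100),
                         break_idx_interval=(0, 100),
                         intercept_interval=(0.6, 0.8),
                         unstable_proportion=0.5,
                         seed=None):
    """Hypothetical helper: draw per-pixel parameter grids from intervals."""
    rng = np.random.default_rng(seed)
    # Integer parameter: the upper bound is excluded, like a Python range
    break_idx = rng.integers(*break_idx_interval, size=shape)
    # Float parameter: uniform draw over the interval
    intercept = rng.uniform(*intercept_interval, size=shape)
    # Assumption: stable pixels are flagged with -1 (no breakpoint)
    stable = rng.random(shape) >= unstable_proportion
    break_idx = np.where(stable, -1, break_idx)
    return {'break_idx': break_idx, 'intercept': intercept}

grids = draw_parameter_grids(shape=(20, 20),
                             break_idx_interval=(50, 100),
                             seed=0)
# A (0, 1) interval yields a zero-filled grid when every pixel is unstable
zeros = draw_parameter_grids(shape=(5, 5),
                             break_idx_interval=(0, 1),
                             unstable_proportion=1.0,
                             seed=0)
```

Each grid has one value per pixel, which is the layout a per-pixel generator such as make_ts would consume when applied over the spatial dimensions.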
- nrt.data.make_ts(dates, break_idx=-1, intercept=0.7, amplitude=0.15, magnitude=0.25, recovery_time=1095, sigma_noise=0.02, n_outlier=3, outlier_value=-0.1, n_nan=3)
Simulate a harmonic time-series with optional breakpoint, noise and outliers
The time-series is generated by adding:
- an intercept/trend component which varies depending on the phase of the time-series (stable, recovery)
- an annual seasonal component
- random noise drawn from a normal distribution (white noise)
Optional outliers are then added to randomly chosen observations, as well as np.nan values. Note that the seasonal cycle simulation approach used here is rather simplistic, using a sinusoidal model and therefore assuming symmetrical and regular behaviour around the peak of the simulated variable. Actual vegetation signal is often more asymmetrical and irregular.
- Parameters:
dates (array-like) – List or array of dates (numpy.datetime64)
break_idx (int) – Breakpoint index in the date array provided. Defaults to -1, corresponding to a stable time-series
intercept (float) – Intercept of the time-series
amplitude (float) – Amplitude of the harmonic model (note that at every point of the time-series, the actual model amplitude is multiplied by the intercept)
magnitude (float) – Break magnitude (always a drop in y value)
recovery_time (int) – Time (in days) to recover the initial intercept value following a break
sigma_noise (float) – Sigma value of the normal distribution (mean = 0) from which noise values are drawn
n_outlier (int) – Number of outliers randomly assigned to observations of the time-series
outlier_value (float) – Value to assign to outliers
n_nan (int) – Number of np.nan (no data) values assigned to observations of the time-series
Example
>>> from nrt import data
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> dates = np.arange('2018-01-01', '2022-06-15', dtype='datetime64[W]')
>>> ts = data.make_ts(dates=dates, break_idx=30)
>>> plt.plot(dates, ts)
>>> plt.show()
- Returns:
Array of simulated values of same size as dates
- Return type:
np.ndarray
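The components listed in the make_ts description can be sketched with NumPy alone. This is an illustration under stated assumptions, not the library's code: the function name simulate_harmonic_ts and the seed argument are hypothetical, the recovery is assumed to be linear over recovery_time days, and a year is taken as 365.25 days.

```python
import numpy as np

def simulate_harmonic_ts(dates, break_idx=-1, intercept=0.7, amplitude=0.15,
                         magnitude=0.25, recovery_time=1095, sigma_noise=0.02,
                         n_outlier=3, outlier_value=-0.1, n_nan=3, seed=None):
    """Hypothetical re-creation of a harmonic series with an optional break."""
    rng = np.random.default_rng(seed)
    dates = np.asarray(dates, dtype='datetime64[D]')
    # Days elapsed since the first observation
    days = (dates - dates[0]).astype('timedelta64[D]').astype(float)
    # Intercept/trend component: constant while stable
    level = np.full(days.shape, intercept, dtype=float)
    if break_idx >= 0:
        # Assumption: drop by `magnitude` at the break, then linear recovery
        since_break = days[break_idx:] - days[break_idx]
        recovered = np.clip(since_break / recovery_time, 0, 1)
        level[break_idx:] = intercept - magnitude * (1 - recovered)
    # Annual seasonal component, with amplitude scaled by the current level
    season = amplitude * level * np.sin(2 * np.pi * days / 365.25)
    # White noise
    noise = rng.normal(0, sigma_noise, size=days.size)
    ts = level + season + noise
    # Outliers and no-data values at distinct, randomly chosen positions
    idx = rng.choice(days.size, size=n_outlier + n_nan, replace=False)
    ts[idx[:n_outlier]] = outlier_value
    ts[idx[n_outlier:]] = np.nan
    return ts

dates = np.arange('2018-01-01', '2022-06-15', dtype='datetime64[W]')
ts = simulate_harmonic_ts(dates, break_idx=30, seed=0)
```

Setting sigma_noise, n_outlier and n_nan to 0 in this sketch leaves only the trend and seasonal components, which makes the effect of break_idx and recovery_time easier to inspect.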
- nrt.data.mre_crit_table()
Contains a dictionary equivalent to strucchange’s mreCritValTable
The key ‘sig_levels’ is a list of the available pre-computed significance (1-alpha) values. The other top-level keys are the available relative window sizes (0.25, 0.5, 1); each maps to a nested dictionary keyed by the available periods (2, 4, 6, 8, 10), which in turn maps to a dictionary keyed by the functional type (“max”, “range”).
Example
>>> import numpy as np
>>> from nrt import data
>>> crit_table = data.mre_crit_table()
>>> win_size = 0.5
>>> period = 10
>>> functional = "max"
>>> alpha = 0.025
>>> crit_values = (crit_table.get(str(win_size))
...                          .get(str(period))
...                          .get(functional))
>>> sig_level = crit_table.get('sig_levels')
>>> crit_level = np.interp(1-alpha, sig_level, crit_values)
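The lookup and interpolation in the example above can be tried without the package by using a stand-in table. The sketch below assumes the nested layout described for mre_crit_table; the helper name critical_value is hypothetical and the numbers are placeholders, not strucchange's critical values.

```python
import numpy as np

# Placeholder table mimicking the described layout (values are made up)
crit_table = {
    'sig_levels': [0.90, 0.95, 0.975, 0.99],
    '0.5': {'10': {'max': [1.0, 1.2, 1.4, 1.6],
                   'range': [1.5, 1.7, 1.9, 2.1]}},
}

def critical_value(table, win_size, period, functional, alpha):
    """Hypothetical helper: interpolate a critical value at 1 - alpha."""
    # Window size and period are looked up through their string form
    crit_values = table[str(win_size)][str(period)][functional]
    # Linear interpolation between the tabulated significance levels
    return float(np.interp(1 - alpha, table['sig_levels'], crit_values))

crit_exact = critical_value(crit_table, 0.5, 10, 'max', alpha=0.025)
crit_between = critical_value(crit_table, 0.5, 10, 'max', alpha=0.0375)
```

Note that np.interp clamps to the first or last tabulated value when 1 - alpha falls outside the range of sig_levels, so very small or very large alpha values are not extrapolated.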
- nrt.data.romania_10m(**kwargs)
Sentinel 2 datacube of a small forested area in Romania at 10 m resolution
Examples
>>> from nrt import data
>>> s2_cube = data.romania_10m()
>>> # Compute NDVI
>>> s2_cube['ndvi'] = (s2_cube.B8 - s2_cube.B4) / (s2_cube.B8 + s2_cube.B4)
>>> # Filter clouds
>>> s2_cube = s2_cube.where(s2_cube.SCL.isin([4,5,7]))
- nrt.data.romania_20m(**kwargs)
Sentinel 2 datacube of a small forested area in Romania at 20 m resolution
Examples
>>> from nrt import data
>>> s2_cube = data.romania_20m()
>>> # Compute NDVI
>>> s2_cube['ndvi'] = (s2_cube.B8A - s2_cube.B4) / (s2_cube.B8A + s2_cube.B4)
>>> # Filter clouds
>>> s2_cube = s2_cube.where(s2_cube.SCL.isin([4,5,7]))
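The NDVI and cloud-filtering steps used in the two Romania examples above can be reproduced on plain NumPy arrays. The sketch uses randomly generated stand-in arrays rather than the packaged datacubes, and the variable names are illustrative. In the Sentinel-2 scene classification layer, classes 4, 5 and 7 are generally documented as vegetation, not vegetated and unclassified.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in reflectance and scene classification arrays, shape (time, y, x)
nir = rng.uniform(0.2, 0.5, size=(4, 3, 3))
red = rng.uniform(0.02, 0.1, size=(4, 3, 3))
scl = rng.integers(0, 12, size=(4, 3, 3))

# Normalized difference vegetation index
ndvi = (nir - red) / (nir + red)
# Keep only observations whose SCL class is in the accepted list
valid = np.isin(scl, [4, 5, 7])
# Everything else becomes no-data, as with the xarray `where` call
ndvi_masked = np.where(valid, ndvi, np.nan)
```

As in the xarray version, masked observations become NaN instead of being dropped, so the time axis keeps its original length.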
- nrt.data.romania_forest_cover_percentage()
Subset of Copernicus HR layer tree cover percentage - 20 m - Romania