06. Interpolation - Multiple Linear Regression (+ residual correction)#

In this tutorial, we’ll cover the interpolation of point data using the Multiple Linear Regression (MLR) methodology and applying a residual correction, available in PyMica as mlr+id2d and mlr+id3d depending on the residual correction interpolation method. This methodology requires location (lon and lat), predictor variables such as altitude (altitude) or distance to coast line (among others), and value to interpolate. If mlr+id3d is selected, altitude must be provided in the variables_files.

We’ll use Meteorological Service of Catalonia sample data to demonstrate how to apply these interpolation techniques. Therefore, we need to import the required modules. First, we need to load observation data and also the PyMica class.

import json

from pymica.pymica import PyMica

Interpolation mlr+id2d#

Let’s call the PyMica class with the appropriate parameters, setting the methodology to mlr+id2d and the configuration dictionary as follows:

config_file = 'sample-data/configuration_sample.json'

with open('sample-data/configuration_sample.json', 'r') as f_p:
    config = json.load(f_p)

config['mlr+id2d']
{'id_power': 2.5,
 'id_smoothing': 0.0,
 'clusters': 'None',
 'variables_files': {'altitude': 'sample-data/explanatory/cat_dem_25831.tif',
  'dist': 'sample-data/explanatory/cat_distance_coast.tif'},
 'interpolation_bounds': [260000, 4488100, 530000, 4750000],
 'resolution': 270,
 'EPSG': 25831}

where:

  • id_power: rate at which the influence of distant data points diminishes as we move away from them.

  • id_smoothing: if 0.0 the interpolated value at that point location becomes identical to the observation value recorded at that precise data point.

  • clusters: set to None as no clusters will be used.

  • variables_files: dictionary with predictor variables as keys and their corresponding GeoTIFF path as values. Here, altitude as altitude and distance to coast line as dist.

  • interpolation_bounds: [minimum_x_coordinate, minimum_y_coordinate, maximum_x_coordinate, maximum_y_coordinate], it must be the same as the variable files.

  • resolution: spatial resolution.

  • EPSG: EPSG projection code.

With all these parameters and configurations set, let’s initialize the PyMica class with the methodology set to ‘mlr+id2d’.

mlr_id2d_method = PyMica(methodology='mlr+id2d', config=config_file)

Now that we have the interpolator set, we can input some data for interpolation. We will use data from the Meteorological Service of Catalonia AWS network.

with open('sample-data/data/smc_data.json', 'r') as f_p:
    data = json.load(f_p)

data[0]
{'id': 'C6',
 'value': 8.8,
 'lon': 0.9517200000000001,
 'lat': 41.6566,
 'altitude': 264.0,
 'dist': 0.8587308027349195}

As we can see, the first element of the data meets the requirements of PyMica input data and has the same predictor variables as the ones provided in the configuration dictionary. Therefore, we only need to call the interpolate method from the mlr_id2d_method interpolator class.

data_field = mlr_id2d_method.interpolate(data)

Now, we can get a quick look of the data_field array using matplotlib.

import matplotlib.pyplot as plt

plt.imshow(data_field)
plt.colorbar(label='Air temperature (\u00b0C)')
_images/06_howto_int_mlr_residuals_12_1.png

Now, we can save the result into a GeoTIFF file using save_file() from PyMica class.

mlr_id2d_method.save_file("sample-data/results/mlr_id2d.tif")

We have now completed the first part of this tutorial on how to interpolate station data using the mlr+id2d methodology. The obtained result is similar to the one in 05 Interpolation - Multiple linear regression, but with the additional application of residual correction, which is evident in the interpolated field. You can experiment with changing the variables_files, id_power, and id_smoothing parameters in the configuration dictionary to observe how each parameter affects the interpolation result.

mlr+id3d#

Let’s call the PyMica class with the appropriate parameters, setting the methodology to mlr+id2d and the configuration dictionary as follows:

config_file = 'sample-data/configuration_sample.json'

with open('sample-data/configuration_sample.json', 'r') as f_p:
    config = json.load(f_p)

config['mlr+id3d']
{'id_power': 2.5,
 'id_smoothing': 0.0,
 'id_penalization': 30,
 'clusters': 'None',
 'variables_files': {'altitude': 'sample-data/explanatory/cat_dem_25831.tif',
  'dist': 'sample-data/explanatory/cat_distance_coast.tif'},
 'interpolation_bounds': [260000, 4488100, 530000, 4750000],
 'resolution': 270,
 'EPSG': 25831}

where:

  • id_power: rate at which the influence of distant data points diminishes as we move away from them.

  • id_smoothing: if 0.0 the interpolated value at that point location becomes identical to the observation value recorded at that precise data point.

  • clusters: set to None as no clusters will be used.

  • variables_files: dictionary with predictor variables as keys and their corresponding GeoTIFF path as values. Here, altitude as altitude and distance to coast line as dist. altitude is mandatory as selected residual correction is id3d.

  • interpolation_bounds: [minimum_x_coordinate, minimum_y_coordinate, maximum_x_coordinate, maximum_y_coordinate], it must be the same as the variable files.

  • resolution: spatial resolution.

  • EPSG: EPSG projection code.

With all these parameters and configurations set, let’s initialize the PyMica class with the methodology set to ‘mlr+id3d’.

mlr_id3d_method = PyMica(methodology='mlr+id3d', config=config_file)

The data we’ll use for interpolation is the same as the one used in the mlr+id2d section. Then, let’s call the interpolate class method.

data_field = mlr_id3d_method.interpolate(data)

Now, we can get a quick look of the data_field array using matplotlib.

import matplotlib.pyplot as plt

plt.imshow(data_field)
plt.colorbar(label='Air temperature (\u00b0C)')
_images/06_howto_int_mlr_residuals_23_1.png

Finally, we can save the result into a GeoTIFF file using pymica.pymica.PyMica.save_file() from PyMica class.

mlr_id3d_method.save_file("sample-data/results/mlr_id3d.tif")

We have now completed this tutorial on how to interpolate station data using the mlr methodology combined with residuals correction (id2d and id3d). You can experiment with changing the variables_files in the configuration dictionary to observe how the behavior of each variable affects the interpolation result.