01. Preparing Weather Station Data for PyMica
=============================================

In this tutorial, we’ll cover the preparation of weather station data for use in PyMica.

PyMica expects weather station data as a list containing one dictionary per weather station. Each dictionary must include at least the following keys:

- ``id``: Identification code.
- ``lon``: Longitude coordinate.
- ``lat``: Latitude coordinate.
- ``value``: Observation value.

Each dictionary can also contain other keys for the variables used in the interpolation, such as altitude or distance to the coast. Altitude must be named ``altitude``; PyMica does not require specific names for the other explanatory variables.

An element of the list containing these variables is organized as follows:

::

    {
        "id": "id_code",
        "lon": "longitude coordinate value",
        "lat": "latitude coordinate value",
        "value": "value",
        "altitude": "altitude value"
    }

The weather station data is supplied to :py:meth:`pymica.pymica.PyMica.interpolate()` as a list of dictionaries, one for each station.

As an example, we’ll work with data from the Automatic Weather Station Network (XEMA) of the Meteorological Service of Catalonia. However, you can also provide your own data to PyMica.

First, let’s import the required library.

.. code:: python

    import pandas as pd

Now, let’s suppose that our data is in .csv format. The ``sample-data/data`` directory contains data from the XEMA network for 2017/02/21 12:00 UTC and the corresponding metadata. We’ll open both .csv files, ``XEMA_20170221_1200.csv`` and ``XEMA_metadata.csv``, using the pandas library and display the first rows of the data file.

.. code:: python

    file_data = 'sample-data/data/XEMA_20170221_1200.csv'
    file_metadata = 'sample-data/data/XEMA_metadata.csv'

    station_data = pd.read_csv(file_data)
    metadata = pd.read_csv(file_metadata)

    station_data.head()

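+---+-----+----------+----------+------+----------+---------+------+
|   | key | altitude | dist     | hr   | lat      | lon     | temp |
+===+=====+==========+==========+======+==========+=========+======+
| 0 | C6  | 264.0    | 0.858731 | 80.0 | 41.65660 | 0.95172 | 8.8  |
+---+-----+----------+----------+------+----------+---------+------+
| 1 | C7  | 427.0    | 0.839116 | 86.0 | 41.66695 | 1.16234 | 7.1  |
+---+-----+----------+----------+------+----------+---------+------+
| 2 | C8  | 554.0    | 0.825381 | 76.0 | 41.67555 | 1.29609 | 9.3  |
+---+-----+----------+----------+------+----------+---------+------+
| 3 | C9  | 240.0    | 0.448604 | 47.0 | 40.71825 | 0.39988 | 15.7 |
+---+-----+----------+----------+------+----------+---------+------+
| 4 | CC  | 626.0    | 0.849968 | 47.0 | 42.07398 | 2.20862 | 15.2 |
+---+-----+----------+----------+------+----------+---------+------+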

We can also display the first rows of the metadata.

.. code:: python

    metadata.head()

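+---+-----+----------+----------+----------+---------+---------------------+
|   | key | altitude | dist     | lat      | lon     | name                |
+===+=====+==========+==========+==========+=========+=====================+
| 0 | C6  | 264.0    | 0.858731 | 41.65660 | 0.95172 | Castellnou de Seana |
+---+-----+----------+----------+----------+---------+---------------------+
| 1 | C7  | 427.0    | 0.839116 | 41.66695 | 1.16234 | Tàrrega             |
+---+-----+----------+----------+----------+---------+---------------------+
| 2 | C8  | 554.0    | 0.825381 | 41.67555 | 1.29609 | Cervera             |
+---+-----+----------+----------+----------+---------+---------------------+
| 3 | C9  | 240.0    | 0.448604 | 40.71825 | 0.39988 | Mas de Barberans    |
+---+-----+----------+----------+----------+---------+---------------------+
| 4 | CC  | 626.0    | 0.849968 | 42.07398 | 2.20862 | Orís                |
+---+-----+----------+----------+----------+---------+---------------------+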
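
Before combining the two tables, it can be worth confirming that every station key in the observations also appears in the metadata, and that no key is repeated there, since either case would make the per-station lookup fail or become ambiguous. The following is a minimal sketch of such a check. The two small DataFrames are rebuilt from rows shown above so that the snippet runs on its own, and the names ``obs``, ``meta``, ``missing`` and ``duplicated`` are illustrative, not part of PyMica.

.. code:: python

    import pandas as pd

    # Illustrative subsets of the two tables, copied from the rows shown above.
    obs = pd.DataFrame({
        'key': ['C6', 'C7', 'C8'],
        'temp': [8.8, 7.1, 9.3],
    })
    meta = pd.DataFrame({
        'key': ['C6', 'C7', 'C8'],
        'lon': [0.95172, 1.16234, 1.29609],
        'lat': [41.65660, 41.66695, 41.67555],
    })

    # Station keys present in the observations but absent from the metadata.
    missing = sorted(set(obs['key']) - set(meta['key']))

    # Keys listed more than once in the metadata would make the lookup ambiguous.
    duplicated = sorted(meta.loc[meta['key'].duplicated(), 'key'].unique())

    print('Missing from metadata:', missing)
    print('Duplicated in metadata:', duplicated)

If either list is non-empty, the affected stations can be dropped or corrected before the data is converted.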

Now, let’s prepare the data in the format required by PyMica, selecting the air temperature variable (``temp``) as the observation value and using ``altitude`` and ``dist`` as predictor variables. The variable ``dist`` is the distance from a station to the coastline, which accounts for the influence of the sea.

.. code:: python

    data = []
    for key in station_data['key']:
        # Select the observation and metadata rows for this station.
        df_data = station_data[station_data['key'] == key]
        df_meta = metadata[metadata['key'] == key]
        # Build the station dictionary with plain Python floats.
        data.append(
            {
                'id': key,
                'lon': float(df_meta['lon'].iloc[0]),
                'lat': float(df_meta['lat'].iloc[0]),
                'value': float(df_data['temp'].iloc[0]),
                'altitude': float(df_meta['altitude'].iloc[0]),
                'dist': float(df_meta['dist'].iloc[0])
            }
        )

If we print the first element of ``data``, we can see all the variables prepared for a station: identification code, longitude, latitude, temperature value, altitude, and distance to the coastline.

.. code:: python

    print('Sample data: ', data[0])
    print('Number of points: ', len(data))

.. parsed-literal::

    Sample data:  {'id': 'C6', 'lon': 0.95172, 'lat': 41.6566, 'value': 8.8, 'altitude': 264.0, 'dist': 0.8587308027349195}
    Number of points:  180

This completes the tutorial on preparing raw station observations so that they can be passed to the ``PyMica`` class.
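
As a supplementary note, the same list of dictionaries could also be assembled without an explicit loop by using a pandas merge. The sketch below assumes the column names shown in the tables (``key``, ``temp``, ``lon``, ``lat``, ``altitude``, ``dist``) and rebuilds two stations inline so that it runs standalone. The ``required`` set simply mirrors the four mandatory keys listed at the start of this tutorial and is not a PyMica API.

.. code:: python

    import pandas as pd

    # Two stations copied from the tables above, for illustration only.
    obs = pd.DataFrame({'key': ['C6', 'C7'], 'temp': [8.8, 7.1]})
    meta = pd.DataFrame({
        'key': ['C6', 'C7'],
        'altitude': [264.0, 427.0],
        'dist': [0.858731, 0.839116],
        'lat': [41.65660, 41.66695],
        'lon': [0.95172, 1.16234],
    })

    # Keep only the key and the observed variable, so that columns shared with
    # the metadata do not collide, then inner-join on the station key.
    merged = obs[['key', 'temp']].merge(meta, on='key', how='inner')

    # Rename to the key names used in the station dictionaries.
    merged = merged.rename(columns={'key': 'id', 'temp': 'value'})
    columns = ['id', 'lon', 'lat', 'value', 'altitude', 'dist']
    data = merged[columns].to_dict('records')

    # Sanity check: every station carries the four mandatory keys.
    required = {'id', 'lon', 'lat', 'value'}
    assert all(required.issubset(station) for station in data)

    print(data[0])

The inner join silently drops stations that have no metadata row, whereas the loop raises an error in that case, so which approach fits better depends on how incomplete metadata should be handled.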