Time Series for Climate Change: Using Deep Learning for Precision Agriculture
Precision Agriculture
Hands-on: Spatio-Temporal Forecasting of Dew Point Temperature using Deep Learning
Key Takeaways

In the remainder of this article, we'll forecast the dew point temperature at several locations. You'll learn how to build a spatio-temporal forecasting model using deep learning.

The full code for this tutorial is available on GitHub:

Primer on Spatio-Temporal Forecasting

Spatio-temporal data is a particular case of multivariate time series. These data sets involve observing a variable, such as dew point temperature, at several locations.

This kind of data contains both a temporal dependency and a spatial dependency. A data point collected at a particular location is correlated with its own lags and with the current and past lags of nearby locations. Modeling both dependencies can be important for improving forecasts at each location.

Spatio-temporal forecasting is often done with techniques such as VAR (vector auto-regression) or STAR (spatio-temporal auto-regression). We'll use a VAR approach with a deep neural network.

Data set

We'll use a real-world data set collected by the U.S. Department of Agriculture. More details are available in reference [1]. Among other things, the data set contains information about the dew point temperature. This variable is captured at 6 nearby stations.

After downloading the data, you can read it using the following code:

import pandas as pd

DATE_TIME_COLS = ['month', 'day', 'calendar_year', 'hour', 'water_year']

# reading the data set
data = pd.read_csv(filepath)

# parsing the datetime column
data['datetime'] = pd.to_datetime(
    [f'{year}/{month}/{day} {hour}:00'
     for year, month, day, hour in zip(data['calendar_year'],
                                       data['month'],
                                       data['day'],
                                       data['hour'])])

# dropping the raw date columns and indexing the series by datetime
data = data.drop(DATE_TIME_COLS, axis=1).set_index('datetime')
data.columns = data.columns.str.replace('_dpt_C', '')

Here's what the data looks like:

The time series at the different locations appear to be correlated.
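The figure isn't reproduced here, but this is easy to check yourself, for example by plotting the series and computing their pairwise correlation (a quick sketch, assuming matplotlib is installed):

import matplotlib.pyplot as plt

# plotting the dew point temperature at each station
data.plot(figsize=(12, 5))
plt.show()

# pairwise correlation between stations
print(data.corr())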

VAR — preparing the data for supervised learning

We'll use a VAR approach to prepare the data for training a deep neural network. VAR methods aim to capture temporal dependencies among different variables. In this case, the variables represent the dew point temperature collected at 6 locations.

We can do this by transforming each variable into a matrix format using a sliding window and then combining the results. You can check a previous article for more details on this process.
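The transformation is done with the transform_mv_series function from the tutorial's repository (imported from src.tde below). As a reference, here's a minimal sketch of what such a sliding-window transformation might look like; the actual implementation in the repository may differ:

import numpy as np
import pandas as pd

def transform_mv_series(data: pd.DataFrame, n_lags: int, horizon: int):
    # slide a window of size n_lags + horizon over the series
    X, Y = [], []
    for i in range(len(data) - n_lags - horizon + 1):
        # the past n_lags observations of all stations are the predictors
        X.append(data.iloc[i:i + n_lags].values)
        # the next horizon observations of all stations are the targets
        Y.append(data.iloc[i + n_lags:i + n_lags + horizon].values)

    return np.array(X), np.array(Y)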

from sklearn.model_selection import train_test_split

from src.tde import transform_mv_series

N_LAGS, HORIZON = 12, 12

# number of stations
N_STATIONS = data.shape[1]

# leaving last 20% of observations for testing
train, test = train_test_split(data, test_size=0.2, shuffle=False)

# computing the average of each series in the training set
mean_by_location = train.mean()

# mean-scaling: dividing each series by its mean value
train_scaled = train / mean_by_location
test_scaled = test / mean_by_location

# transforming the data for supervised learning
X_train, Y_train = transform_mv_series(train_scaled, n_lags=N_LAGS, horizon=HORIZON)
X_test, Y_test = transform_mv_series(test_scaled, n_lags=N_LAGS, horizon=HORIZON)
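It's worth confirming the shapes produced by this transformation, as they must match the input and output layers of the network defined next:

# the predictors have shape (n_samples, N_LAGS, N_STATIONS)
print(X_train.shape)

# the targets have shape (n_samples, HORIZON, N_STATIONS)
print(Y_train.shape)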

Then, we build a stacked LSTM model using Keras. An LSTM (Long Short-Term Memory) is a special type of recurrent neural network that can capture temporal dependencies.

from keras.models import Sequential
from keras.layers import Dense, Dropout, LSTM, RepeatVector, TimeDistributed

model = Sequential()
# encoder: reads the past N_LAGS observations of all stations
model.add(LSTM(64, activation='relu', input_shape=(N_LAGS, N_STATIONS)))
model.add(Dropout(.2))
# repeating the encoded state for each of the HORIZON forecasting steps
model.add(RepeatVector(HORIZON))
# decoder: produces one output vector per forecasting step
model.add(LSTM(32, activation='relu', return_sequences=True))
model.add(Dropout(.2))
# one prediction per station at each step of the horizon
model.add(TimeDistributed(Dense(N_STATIONS)))

model.compile(optimizer='adam', loss='mse')

After defining and compiling the model, we can train it as follows:

from keras.callbacks import ModelCheckpoint

# model checkpoint for saving the best model during training
model_checkpoint = ModelCheckpoint(
    filepath='best_model_weights.h5',
    save_weights_only=True,
    monitor='val_loss',
    mode='min',
    save_best_only=True)

# fitting the model
history = model.fit(X_train, Y_train,
                    epochs=25,
                    validation_split=0.2,
                    callbacks=[model_checkpoint])
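Before moving on, it can be useful to inspect the learning curves for signs of overfitting. The history object returned by fit stores the training and validation loss per epoch; here's a quick sketch of how to plot them (again assuming matplotlib is installed):

import matplotlib.pyplot as plt

# plotting the training and validation loss per epoch
plt.plot(history.history['loss'], label='training loss')
plt.plot(history.history['val_loss'], label='validation loss')
plt.xlabel('epoch')
plt.ylabel('MSE')
plt.legend()
plt.show()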

After training, we can load the best weights that were saved by the model checkpoint callback:

# loading the best model weights saved during training
model.load_weights('best_model_weights.h5')

# predictions on the test set
preds = model.predict_on_batch(X_test)
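Note that the model was trained on mean-scaled data, so the predictions are also on that scale. To obtain forecasts in the original units (degrees Celsius), we can revert the scaling. Here's a sketch that also computes the mean absolute error on the scaled test set:

import numpy as np

# reverting the mean-scaling to get forecasts in the original units
preds_original = preds * mean_by_location.values

# mean absolute error on the scaled test set
mae = np.abs(Y_test - preds).mean()
print(f'Test MAE (scaled): {mae:.3f}')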

Extending the model

We could extend this model in different ways. For instance, we could include other meteorological information as explanatory variables. Meteorological data such as the dew point temperature is affected by various factors, and their inclusion may be key for better forecasting performance (a sketch of this idea is shown below).
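As a sketch, suppose the data set also contained, say, relative humidity columns (a hypothetical example; the helper transform_with_covariates below is not part of the tutorial's repository). The extra variables could serve as additional input features while only the dew point series are forecast:

import numpy as np

def transform_with_covariates(data, target_cols, n_lags, horizon):
    X, Y = [], []
    for i in range(len(data) - n_lags - horizon + 1):
        # all variables (targets plus covariates) serve as predictors
        X.append(data.iloc[i:i + n_lags].values)
        # only the target series are predicted
        Y.append(data[target_cols].iloc[i + n_lags:i + n_lags + horizon].values)

    return np.array(X), np.array(Y)

The encoder's input_shape would then be (N_LAGS, number of input variables), while the final TimeDistributed(Dense(...)) layer would output len(target_cols) values per horizon step.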

Testing other neural network configurations could also prove helpful. We applied a stacked LSTM, but other methods have also shown promising forecasting performance, such as N-BEATS, DeepAR, or ES-RNN.

We could also include spatial information, such as geographical coordinates. This way, the model could better capture the spatial dependencies among the different locations.
