## Forecasting with deep neural networks

Supervised learning involves training a machine learning model with an input data set. This data set is usually a matrix: a two-dimensional data structure composed of rows (samples) and columns (features).

A time series is a sequence of values ordered in time. So, it must be transformed for supervised learning.

In a previous article, we learned how to transform a univariate time series from a sequence into a matrix. This is done with a sliding window. Each observation of the series is modeled based on its recent past values, also called lags.

Here's an example of this transformation using the sequence from 1 to 10:

This transformation enables a type of modeling called auto-regression. In auto-regression, a model is built using the recent past values (lags) of a time series as explanatory variables. These are used to predict future observations (target variable). The intuition for the name auto-regression is that the time series is regressed on itself.

In the example above, the lags are the first 5 columns. The target variable is the last column (the value of the series in the next time step).
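As a minimal sketch of this transformation, the snippet below builds such a matrix for the sequence from 1 to 10 with pandas. The column names and the use of `shift` are illustrative assumptions, not the code from the previous article.

```python
import pandas as pd

# the sequence from 1 to 10
series = pd.Series(range(1, 11), name='y')

N_LAGS = 5  # the example in the text uses 5 lags

columns = {}
# lags: y(t-4), ..., y(t-1), y(t)
for k in range(N_LAGS - 1, -1, -1):
    name = f'y(t-{k})' if k > 0 else 'y(t)'
    columns[name] = series.shift(k)

# target: the value of the series in the next time step
columns['y(t+1)'] = series.shift(-1)

# rows with an incomplete window contain NaN and are dropped
matrix = pd.DataFrame(columns).dropna().astype(int)

print(matrix)
```

Each row holds five consecutive values as lags, followed by the value that comes right after them.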

While most methods work with matrices, deep neural networks need a different structure.

The input to deep neural networks such as LSTMs or CNNs is a three-dimensional array. The actual data is the same as the one you'd put in a matrix. But, it's structured differently.

Besides rows (samples) and columns (lags), the extra dimension refers to the number of variables in the series. In a matrix, you concatenate all attributes together regardless of their source. Neural networks are a bit tidier. The input is organized by each variable in the series using a third dimension.
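To make the difference concrete, here is a small NumPy sketch that arranges the same lagged values both ways. The two variables and their values are made up for illustration.

```python
import numpy as np

n_samples, n_lags = 4, 3

# lagged values of two hypothetical variables, one matrix per variable
var_a = np.arange(12).reshape(n_samples, n_lags)
var_b = var_a + 100

# matrix format: the lags of all variables sit side by side
matrix = np.concatenate([var_a, var_b], axis=1)

# 3-D format: one slice along the last axis per variable
array_3d = np.stack([var_a, var_b], axis=2)

print(matrix.shape)    # (4, 6)
print(array_3d.shape)  # (4, 3, 2)
```

Both structures contain the same 24 numbers. Only the 3-D array keeps track of which variable each lag belongs to.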

Let's do a practical example to make this clear.

In this tutorial, you'll learn how to transform a time series for supervised learning with an LSTM (Long Short-Term Memory). An LSTM is a type of neural network that is particularly useful for modeling time series.

We'll split the time series transformation process into two steps:

- From a sequence of values into a matrix;
- From a matrix into a 3-D array for deep learning.

First, we'll do an example with a univariate time series. Multivariate time series are covered next.

## Univariate Time Series

Let's start by reading the data. We'll use a time series related to the sales of different types of wine. You can check the source in reference [1].

```python
import pandas as pd

# https://github.com/vcerqueira/blog/tree/major/data
data = pd.read_csv('data/wine_sales.csv', parse_dates=['date'])
data.set_index('date', inplace=True)

series = data['Sparkling']
```

We focus on the sales of sparkling wine to do an example for the univariate case. This time series looks like this:

## From a sequence of values into a matrix

We apply a sliding window to transform this series for supervised learning. You can learn more about this process in a previous article.

```python
# src module here: https://github.com/vcerqueira/blog/tree/major/src
from src.tde import time_delay_embedding

# using 3 lags as explanatory variables
N_LAGS = 3
# forecasting the next 2 values
HORIZON = 2

# using a sliding window method called time delay embedding
X, Y = time_delay_embedding(series, n_lags=N_LAGS, horizon=HORIZON, return_Xy=True)
```

Here's a sample of the explanatory variables (X) and corresponding target variables (Y):
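The `time_delay_embedding` function lives in the `src` module and is not shown in this article. As a rough idea of what such a function might do, here is a minimal sketch applied to a toy series. `sliding_window_embedding` is a hypothetical name, and the column naming (`name(t-1)`, `name(t)`, `name(t+1)`) is an assumption inferred from the column patterns used elsewhere in this article, so the real implementation may differ.

```python
import pandas as pd


def sliding_window_embedding(series: pd.Series, n_lags: int, horizon: int):
    """Hypothetical stand-in for time_delay_embedding (not the src.tde code)."""
    name = series.name if series.name is not None else 'Series'

    columns = {}
    # explanatory variables: name(t-(n_lags-1)), ..., name(t-1), name(t)
    for k in range(n_lags - 1, -1, -1):
        col = f'{name}(t-{k})' if k > 0 else f'{name}(t)'
        columns[col] = series.shift(k)
    # target variables: name(t+1), ..., name(t+horizon)
    for h in range(1, horizon + 1):
        columns[f'{name}(t+{h})'] = series.shift(-h)

    # drop the rows where the window falls outside the series
    df = pd.DataFrame(columns).dropna()

    # future columns are the ones with a plus sign in their name
    is_future = df.columns.str.contains('+', regex=False)
    return df.loc[:, ~is_future], df.loc[:, is_future]


# toy series, for illustration only
toy = pd.Series(range(1, 11), name='Sparkling', dtype=float)
X_toy, Y_toy = sliding_window_embedding(toy, n_lags=3, horizon=2)

print(X_toy.head(3))
print(Y_toy.head(3))
```

With 3 lags and a horizon of 2, each row of `X_toy` holds three consecutive values and the matching row of `Y_toy` holds the two values that follow them.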

This data set is the basis for training traditional machine learning methods. For example, a linear regression or xgboost.

```python
from sklearn.linear_model import RidgeCV

# training a ridge regression model
model = RidgeCV()
model.fit(X, Y)
```

## From a matrix into a 3-D structure for deep learning

You need to reshape this data set to train a neural network like an LSTM. The following function can be used to do that:

```python
import re

import numpy as np
import pandas as pd


def from_matrix_to_3d(df: pd.DataFrame) -> np.ndarray:
    """
    Transforming a time series from matrix into 3-D structure for deep learning

    :param df: (pd.DataFrame) Time series in the matrix format after embedding

    :return: Reshaped time series into 3-D structure
    """
    cols = df.columns

    # getting unique variables in the time series
    # (the regex strips the "(t-1)", "(t)", "(t+1)" suffix from each column name)
    # this list has a single element for univariate time series
    var_names = np.unique([re.sub(r'\([^)]*\)', '', c) for c in cols]).tolist()

    # getting the observations of each variable
    arr_by_var = [df.loc[:, cols.str.contains(v)].values for v in var_names]
    # reshaping the data of each variable into a 3-D format
    arr_by_var = [x.reshape(x.shape[0], x.shape[1], 1) for x in arr_by_var]

    # concatenating the arrays of each variable into a single array
    ts_arr = np.concatenate(arr_by_var, axis=2)

    return ts_arr


# transforming the matrices
X_3d = from_matrix_to_3d(X)
Y_3d = from_matrix_to_3d(Y)
```

Finally, you can train an LSTM using the resulting data set:

```python
from sklearn.model_selection import train_test_split

from keras.models import Sequential
from keras.layers import (Dense,
                          LSTM,
                          TimeDistributed,
                          RepeatVector)

# number of variables in the time series
# 1 because the series is univariate
N_FEATURES = 1

# creating a simple stacked LSTM
model = Sequential()
model.add(LSTM(8, activation='relu', input_shape=(N_LAGS, N_FEATURES)))
model.add(RepeatVector(HORIZON))
model.add(LSTM(4, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(N_FEATURES)))

# compiling the model
model.compile(optimizer='adam', loss='mse')

# basic train/validation split
X_train, X_valid, Y_train, Y_valid = train_test_split(X_3d, Y_3d, test_size=.2, shuffle=False)

# training the model
model.fit(X_train, Y_train, epochs=100, validation_data=(X_valid, Y_valid))

# making predictions
preds = model.predict_on_batch(X_valid)
```

## Multivariate Time Series

Now, let's look at a multivariate time series example. In this case, the goal is to forecast the future values of several variables, not just one. So, you need a model for multivariate and multi-step forecasting.

The transformation process is similar to before.

To transform the multivariate time series into a matrix format, you can apply the sliding window approach to each variable. Then, you combine all resulting matrices into a single one.

Here's an example:

```python
# transforming each variable into a matrix format
mat_by_variable = []
for col in data:
    col_df = time_delay_embedding(data[col], n_lags=N_LAGS, horizon=HORIZON)
    mat_by_variable.append(col_df)

# concatenating all variables
mat_df = pd.concat(mat_by_variable, axis=1).dropna()

# defining target (Y) and explanatory variables (X)
# predictors end in "(t-k)" or "(t)"; targets end in "(t+k)"
predictor_variables = mat_df.columns.str.contains(r'\(t-|\(t\)')
target_variables = mat_df.columns.str.contains(r'\(t\+')

X = mat_df.iloc[:, predictor_variables]
Y = mat_df.iloc[:, target_variables]
```

The explanatory variables look like this for two of the variables (others are omitted for conciseness):
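The split between explanatory and target columns relies only on the column names. The sketch below illustrates the idea on a hand-written list of names. The two variable names and the `name(t-1)` / `name(t+1)` naming scheme are assumptions made for illustration.

```python
import pandas as pd

# hypothetical column names for two variables, 3 lags and a horizon of 2
cols = pd.Index([
    'A(t-2)', 'A(t-1)', 'A(t)', 'A(t+1)', 'A(t+2)',
    'B(t-2)', 'B(t-1)', 'B(t)', 'B(t+1)', 'B(t+2)',
])

# parentheses and the plus sign are regex metacharacters, so they are escaped
is_predictor = cols.str.contains(r'\(t-|\(t\)')
is_target = cols.str.contains(r'\(t\+')

print(cols[is_predictor].tolist())
print(cols[is_target].tolist())
```

Every column falls into exactly one of the two groups: past and present values become explanatory variables, and future values become targets.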

You can use the same function to transform the data into three dimensions:

```python
X_3d = from_matrix_to_3d(X)
Y_3d = from_matrix_to_3d(Y)
```

The training part is also like before. The information about the number of variables in the series is provided in the *N_FEATURES* constant. As the name implies, this constant is the number of variables in the time series.

```python
from keras.layers import Dropout

# number of variables in the series, read from the last axis of X_3d
N_FEATURES = X_3d.shape[2]

model = Sequential()
model.add(LSTM(8, activation='relu', input_shape=(N_LAGS, N_FEATURES)))
model.add(Dropout(.2))
model.add(RepeatVector(HORIZON))
model.add(LSTM(4, activation='relu', return_sequences=True))
model.add(Dropout(.2))
model.add(TimeDistributed(Dense(N_FEATURES)))

model.compile(optimizer='adam', loss='mse')

X_train, X_valid, Y_train, Y_valid = train_test_split(X_3d, Y_3d, test_size=.2, shuffle=False)

model.fit(X_train, Y_train, epochs=500, validation_data=(X_valid, Y_valid))

preds = model.predict_on_batch(X_valid)
```
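The predictions come out as a 3-D array with the same layout as the targets: samples, horizon steps, and variables. As a minimal sketch of how to work with that layout, the snippet below uses random stand-in arrays (not the actual model output) to pull out the one-step ahead forecasts and compute an error per horizon step and variable. The array sizes are arbitrary.

```python
import numpy as np

# stand-in arrays with the layout (samples, horizon, variables);
# the values are random and only serve to illustrate the indexing
rng = np.random.default_rng(0)
n_samples, horizon, n_features = 20, 2, 3
y_true = rng.normal(size=(n_samples, horizon, n_features))
y_pred = y_true + rng.normal(scale=0.1, size=y_true.shape)

# one-step ahead forecasts: first position along the horizon axis
one_step_pred = y_pred[:, 0, :]

# mean absolute error for each horizon step and each variable
mae = np.abs(y_true - y_pred).mean(axis=0)

print(one_step_pred.shape)  # (20, 3)
print(mae.shape)            # (2, 3)
```

The same indexing should apply to `preds` and `Y_valid`, assuming both are arrays of shape (samples, `HORIZON`, `N_FEATURES`).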

The following plot shows a sample of one-step ahead forecasts.

The forecasts are not that good. The time series is small and we didn't optimize the model in any way. Deep learning methods are known to be data-hungry. So, if you go for this kind of approach, make sure you have enough data.