## Forecasting with deep neural networks

Supervised learning involves training a machine learning model with an input data set. This data set is generally a matrix: a two-dimensional data structure composed of rows (samples) and columns (features).

A time series is a sequence of values ordered in time. So, it must be transformed for supervised learning.

In a previous article, we learned how to transform a univariate time series from a sequence into a matrix. This is done with a sliding window. Each observation of the series is modeled based on its recent past values, also called lags.

Here’s an example of this transformation using the sequence from 1 to 10:

This transformation enables a type of modeling called auto-regression. In auto-regression, a model is built using the recent past values (lags) of a time series as explanatory variables. These are used to predict future observations (the target variable). The intuition for the name auto-regression is that the time series is regressed on itself.

In the example above, the lags are the first 5 columns. The target variable is the last column (the value of the series at the next time step).
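As a minimal sketch of this transformation, the snippet below applies a sliding window to the sequence from 1 to 10 using NumPy. The variable names (`n_lags`, `lagged`, `target`) are chosen for illustration only, and `sliding_window_view` assumes NumPy 1.20 or later.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# the sequence from 1 to 10
sequence = np.arange(1, 11)

# number of past values used as explanatory variables
n_lags = 5

# each window holds 5 lags followed by the value at the next time step
windows = sliding_window_view(sequence, window_shape=n_lags + 1)

lagged = windows[:, :n_lags]  # explanatory variables (first 5 columns)
target = windows[:, -1]       # target variable (last column)

print(lagged)
print(target)
```

Each row of `lagged` is one sample, and the matching entry of `target` is the value that follows it in the sequence.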

While most methods work with matrices, deep neural networks need a different structure.

The input to deep neural networks such as LSTMs or CNNs is a three-dimensional array. The actual data is the same as the one you’d put in a matrix. But it’s structured differently.

Besides rows (samples) and columns (lags), the extra dimension refers to the number of variables in the series. In a matrix, you concatenate all attributes together regardless of their source. Neural networks are a bit tidier. The input is organized by each variable in the series using a third dimension.
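The difference between the two layouts can be sketched with NumPy. The two variables below (`var_a`, `var_b`) are hypothetical placeholders with arbitrary values; only the shapes matter here.

```python
import numpy as np

n_samples, n_lags = 4, 3

# lag matrices of two hypothetical variables, each with shape (samples, lags)
var_a = np.arange(n_samples * n_lags).reshape(n_samples, n_lags)
var_b = var_a + 100

# matrix format: all attributes side by side -> (samples, lags * variables)
as_matrix = np.concatenate([var_a, var_b], axis=1)

# 3-D format: one slice per variable on the last axis -> (samples, lags, variables)
as_3d = np.stack([var_a, var_b], axis=2)

print(as_matrix.shape)  # (4, 6)
print(as_3d.shape)      # (4, 3, 2)
```

Both arrays hold the same numbers. In the 3-D version, `as_3d[:, :, 0]` recovers the lags of the first variable and `as_3d[:, :, 1]` those of the second.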

Let’s do a practical example to make this clear.

In this tutorial, you’ll learn how to transform a time series for supervised learning with an LSTM (Long Short-Term Memory). An LSTM is a type of neural network that is particularly useful for modeling time series.

We’ll split the time series transformation process into two steps:

- From a sequence of values into a matrix;
- From a matrix into a 3-D array for deep learning.

First, we’ll do an example with a univariate time series. Multivariate time series are covered next.

## Univariate Time Series

Let’s start by reading the data. We’ll use a time series related to the sales of different types of wine. You can check the source in reference [1].

```python
import pandas as pd

# https://github.com/vcerqueira/blog/tree/principal/data
data = pd.read_csv('data/wine_sales.csv', parse_dates=['date'])
data.set_index('date', inplace=True)

series = data['Sparkling']
```

We focus on the sales of sparkling wine as an example for the univariate case. This time series looks like this:

## From a sequence of values into a matrix

We apply a sliding window to transform this series for supervised learning. You can learn more about this process in a previous article.

```python
# src module here: https://github.com/vcerqueira/blog/tree/principal/src
from src.tde import time_delay_embedding

# using 3 lags as explanatory variables
N_LAGS = 3
# forecasting the next 2 values
HORIZON = 2

# using a sliding window method called time delay embedding
X, Y = time_delay_embedding(series, n_lags=N_LAGS, horizon=HORIZON, return_Xy=True)
```

Here’s a sample of the explanatory variables (X) and corresponding target variables (Y):
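To make the layout concrete, here is a minimal sketch of a sliding-window embedding applied to a toy series. The helper `embed_series` is hypothetical and is not the `time_delay_embedding` function from the repository, which may differ. The column naming (`name(t-1)`, `name(t)`, `name(t+1)`) is an assumption inferred from the patterns used later in this article.

```python
import pandas as pd


def embed_series(series: pd.Series, n_lags: int, horizon: int) -> pd.DataFrame:
    """Hypothetical stand-in for a time delay embedding (not the src.tde code)."""
    name = series.name if series.name is not None else 'Series'

    columns = {}
    # past values: name(t-2), name(t-1), name(t) when n_lags=3
    for lag in range(n_lags - 1, -1, -1):
        label = f'{name}(t-{lag})' if lag > 0 else f'{name}(t)'
        columns[label] = series.shift(lag)
    # future values: name(t+1), ..., name(t+horizon)
    for step in range(1, horizon + 1):
        columns[f'{name}(t+{step})'] = series.shift(-step)

    # rows with incomplete windows contain NaN and are dropped
    return pd.concat(columns, axis=1).dropna()


# toy series with the values 1 to 10
toy = pd.Series(range(1, 11), name='toy', dtype=float)
embedded = embed_series(toy, n_lags=3, horizon=2)

# columns with a '+' in their name refer to future values (targets)
is_future = embedded.columns.str.contains('+', regex=False)
toy_X = embedded.loc[:, ~is_future]
toy_Y = embedded.loc[:, is_future]

print(toy_X.head(3))
print(toy_Y.head(3))
```

With 3 lags and a horizon of 2, each row of `toy_X` holds three consecutive values and the matching row of `toy_Y` holds the two values that come next.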

This data set is the basis for training traditional machine learning methods, for instance a linear regression or XGBoost.

```python
from sklearn.linear_model import RidgeCV

# training a ridge regression model
model = RidgeCV()
model.fit(X, Y)
```

## From a matrix into a 3-D structure for deep learning

You need to reshape this data set to train a neural network like an LSTM. The following function can be used to do that:

```python
import re

import numpy as np
import pandas as pd


def from_matrix_to_3d(df: pd.DataFrame) -> np.ndarray:
    """
    Transforming a time series from matrix into 3-D structure for deep learning

    :param df: (pd.DataFrame) Time series in the matrix format after embedding

    :return: Reshaped time series into 3-D structure
    """
    cols = df.columns

    # getting the unique variables in the time series
    # (the parenthesized suffix, e.g. "(t-1)", is stripped from each column name)
    # this list has a single element for univariate time series
    var_names = np.unique([re.sub(r'\([^)]*\)', '', c) for c in cols]).tolist()

    # getting the observations of each variable
    arr_by_var = [df.loc[:, cols.str.contains(v)].values for v in var_names]
    # reshaping the data of each variable into a 3-D format
    arr_by_var = [x.reshape(x.shape[0], x.shape[1], 1) for x in arr_by_var]

    # concatenating the arrays of each variable into a single array
    ts_arr = np.concatenate(arr_by_var, axis=2)

    return ts_arr


# transforming the matrices
X_3d = from_matrix_to_3d(X)
Y_3d = from_matrix_to_3d(Y)
```

Finally, you can train an LSTM using the resulting data set:

```python
from sklearn.model_selection import train_test_split

from keras.models import Sequential
from keras.layers import (Dense,
                          LSTM,
                          TimeDistributed,
                          RepeatVector)

# number of variables in the time series
# 1 because the series is univariate
N_FEATURES = 1

# creating a simple stacked LSTM
model = Sequential()
model.add(LSTM(8, activation='relu', input_shape=(N_LAGS, N_FEATURES)))
model.add(RepeatVector(HORIZON))
model.add(LSTM(4, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(N_FEATURES)))

# compiling the model
model.compile(optimizer='adam', loss='mse')

# basic train/validation split (no shuffling, to preserve the temporal order)
X_train, X_valid, Y_train, Y_valid = train_test_split(X_3d, Y_3d, test_size=.2, shuffle=False)

# training the model
model.fit(X_train, Y_train, epochs=100, validation_data=(X_valid, Y_valid))

# making predictions
preds = model.predict_on_batch(X_valid)
```
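Given the output layers above, the predictions should have the same 3-D layout as the targets: (samples, horizon, variables). The sketch below shows one way to work with arrays of that shape. The arrays `y_true` and `y_pred` are stand-ins filled with arbitrary random values, not actual model output, and every name here is illustrative.

```python
import numpy as np
import pandas as pd

horizon, n_features = 2, 1
n_valid = 5

# stand-in arrays with the same layout as Y_valid and preds (values are arbitrary)
rng = np.random.default_rng(0)
y_true = rng.normal(size=(n_valid, horizon, n_features))
y_pred = y_true + rng.normal(scale=0.1, size=y_true.shape)

# mean absolute error for each forecasting step,
# averaged over samples (axis 0) and variables (axis 2)
mae_by_step = np.abs(y_true - y_pred).mean(axis=(0, 2))

# flattening the forecasts of the first variable back into a matrix
pred_df = pd.DataFrame(y_pred[:, :, 0],
                       columns=[f't+{h}' for h in range(1, horizon + 1)])

print(mae_by_step.shape)
print(pred_df.head())
```

Slicing the last axis gives back a familiar two-dimensional table per variable, with one column per forecasting step.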

## Multivariate Time Series

Now, let’s have a look at a multivariate time series example. In this case, the goal is to forecast the future values of several variables, not just one. So, you need a model for multivariate and multi-step forecasting.

The transformation process is similar to the univariate case.

To transform the multivariate time series into a matrix format, you can apply the sliding window approach to each variable. Then, you combine all resulting matrices into a single one.

Here’s an example:

```python
# transforming each variable into a matrix format
mat_by_variable = []
for col in data:
    col_df = time_delay_embedding(data[col], n_lags=N_LAGS, horizon=HORIZON)
    mat_by_variable.append(col_df)

# concatenating all variables
mat_df = pd.concat(mat_by_variable, axis=1).dropna()

# defining target (Y) and explanatory variables (X)
# columns ending in "(t-k)" or "(t)" are lags; columns ending in "(t+k)" are targets
predictor_variables = mat_df.columns.str.contains(r'\(t-|\(t\)')
target_variables = mat_df.columns.str.contains(r'\(t\+')

X = mat_df.iloc[:, predictor_variables]
Y = mat_df.iloc[:, target_variables]
```

The explanatory variables look like this for two of the variables (others are omitted for conciseness):

You can use the same function to transform the data into three dimensions:

```python
X_3d = from_matrix_to_3d(X)
Y_3d = from_matrix_to_3d(Y)
```

The training part is also like before. The information about the number of variables in the series is provided in the *N_FEATURES* constant. As the name implies, this constant is the number of variables in the time series.

```python
from keras.layers import Dropout

# number of variables in the time series
# (assumes every column of data is a variable of the series)
N_FEATURES = data.shape[1]

model = Sequential()
model.add(LSTM(8, activation='relu', input_shape=(N_LAGS, N_FEATURES)))
model.add(Dropout(.2))
model.add(RepeatVector(HORIZON))
model.add(LSTM(4, activation='relu', return_sequences=True))
model.add(Dropout(.2))
model.add(TimeDistributed(Dense(N_FEATURES)))

model.compile(optimizer='adam', loss='mse')

X_train, X_valid, Y_train, Y_valid = train_test_split(X_3d, Y_3d, test_size=.2, shuffle=False)

model.fit(X_train, Y_train, epochs=500, validation_data=(X_valid, Y_valid))

preds = model.predict_on_batch(X_valid)
```

The following plot shows a sample of one-step-ahead forecasts.

The forecasts are not that good. The time series is small and we didn’t optimize the model in any way. Deep learning methods are known to be data-hungry. So, if you go for this kind of approach, make sure you have enough data.