Deep Learning for Forecasting: Preprocessing and Training


How to train deep neural networks using several time series


This article is a follow-up to a previous one. There, we learned how to transform a time series for deep learning.

We continue to explore deep neural networks for forecasting. In this post, we’ll:

  • Learn how to train a global forecasting model using deep learning, including basic preprocessing steps;
  • Explore keras callbacks to drive the training process of a neural network.

Deep neural networks tackle forecasting problems using auto-regression. Auto-regression is a modeling technique that involves using past observations to predict future ones.
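
To make the idea of auto-regression concrete, here is a minimal sketch that uses a linear model instead of a neural network. The helper name `make_lagged_matrix` and the toy series are assumptions made for illustration only; the point is that each training row holds a few past values and the target is the value that follows them.

```python
import numpy as np

def make_lagged_matrix(series, n_lags):
    """Return (X, y): each row of X holds n_lags past values, y the next one."""
    windows = np.lib.stride_tricks.sliding_window_view(series, n_lags + 1)
    return windows[:, :-1], windows[:, -1]

# toy series (made up for illustration): 0, 1, ..., 9
series = np.arange(10, dtype=float)
X, y = make_lagged_matrix(series, n_lags=3)

# fitting a linear auto-regressive model by least squares (with intercept)
design = np.column_stack([X, np.ones(len(X))])
coefs, *_ = np.linalg.lstsq(design, y, rcond=None)

# one-step-ahead forecast from the last 3 observations
last_lags = np.append(series[-3:], 1.0)
forecast = last_lags @ coefs
```

A deep neural network replaces the linear map from lags to future values with a more flexible one, but the input and output layout stays the same.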

Deep neural networks can be designed in different ways, such as recurrent or convolutional architectures. Recurrent neural networks are often preferred for time series data. Among other reasons, such networks excel at modeling long-term dependencies. This feature can have a strong impact on forecasting performance.

Here’s how to define a particular type of recurrent neural network called LSTM (Long Short-Term Memory). The comments provide a brief description of each model element.

from keras.models import Sequential
from keras.layers import (Dense,
                          LSTM,
                          TimeDistributed,
                          RepeatVector,
                          Dropout)

# Number of variables in the time series.
# 1 means the time series is univariate
N_FEATURES = 1
# Number of lags in the auto-regressive model
N_LAGS = 24
# Number of future steps to be predicted
# (illustrative value, matching the horizon used later in this article)
HORIZON = 2

# 'Sequential' instance is used to create a linear stack of layers
# ... each layer feeds into the next one.
model = Sequential()
# Adding an LSTM layer with 32 units and relu activation
model.add(LSTM(32, activation='relu', input_shape=(N_LAGS, N_FEATURES)))
# Using dropout to avoid overfitting
model.add(Dropout(0.2))
# Repeating the input vector HORIZON times to match the shape of the output.
model.add(RepeatVector(HORIZON))
# Another LSTM layer, this time with 16 units
# Also returning the output of each time step (return_sequences=True)
model.add(LSTM(16, activation='relu', return_sequences=True))
# Using dropout again with 0.2 dropout rate
model.add(Dropout(0.2))
# Adding a standard fully connected neural network layer
# And distributing the layer to each time step
model.add(TimeDistributed(Dense(N_FEATURES)))

# Compiling the model using ADAM and setting the objective to minimize MSE
model.compile(optimizer='adam', loss='mse')

Before, we learned how to transform a time series to train this model. But sometimes you have several time series available.

How do you handle such cases?

The rise of global methods

Forecasting models are usually created with the historical data of a single time series. Such models can be described as local to that time series. In contrast, global methods pool the historical data of many time series to build a model.

The interest in global models surged when a method called ES-RNN won the M4 competition, a forecasting competition featuring 100000 different time series.

When and why to use a global model

Global models can provide considerable value in forecasting problems involving many time series. One example is retail, where the goal is to predict the sales of many products.

Another motivation for using this kind of approach is to have more data. Machine learning algorithms are likely to perform better with larger training sets. This is especially so for methods with many parameters, such as deep neural networks. These are known to be data-hungry.

Global forecasting models don’t assume that the underlying time series are dependent, that is, that the lags of one series can be used to forecast the future values of another series.

Rather, these techniques exploit information from many time series to estimate the parameters of the model. When forecasting the future of a time series, the main input to the model is the recent past lags of that series.

In the rest of this article, we’ll explore how to train a deep neural network using many time series.


We’ll use a data set about the power consumption in 8 regions across the USA:

Daily power consumption (log) in 8 regions across the USA. Data source in reference [1]. Image by author.

The goal is to forecast power consumption in the following days. This problem is relevant for power systems operators. Accurate predictions help balance the supply and demand of energy.

We can read the data as follows:

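import pandas as pd

# assumption: the first column of the CSV holds the dates and is used as the index
data = pd.read_csv('data/daily_energy_demand.csv',
                   index_col=0, parse_dates=True)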


Preprocessing steps

When training a deep neural network with multiple time series, you need to apply some preprocessing steps. Here, we’ll explore the following two:

  • Mean-scaling
  • Log transformation

The available set of time series can have different scales. Thus, it’s important to normalize each series into a common value range. For global forecasting models, this is usually done by dividing each observation by the mean value of the respective series.

from sklearn.model_selection import train_test_split

# leaving last 20% of observations for testing
train, test = train_test_split(data, test_size=0.2, shuffle=False)

# computing the average of each series in the training set
mean_by_series = train.mean()

# mean-scaling: dividing each series by its mean value
train_scaled = train / mean_by_series
test_scaled = test / mean_by_series

After mean-scaling, the log transformation may also be helpful.

In a previous article, we explored how taking the log of a time series is a useful transformation to handle heteroskedasticity. The log transformation can also help avoid saturation areas of the neural network. Saturation occurs when the neural network becomes insensitive to different inputs. This hampers the learning process, resulting in a poor model.

import numpy as np

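class LogTransformation:

    @staticmethod
    def transform(x):
        # signed log: keeps the sign of x and compresses its magnitude
        xt = np.sign(x) * np.log(np.abs(x) + 1)

        return xt

    @staticmethod
    def inverse_transform(xt):
        # exact inverse of transform
        x = np.sign(xt) * (np.exp(np.abs(xt)) - 1)

        return x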

# log transformation
train_scaled_log = LogTransformation.transform(train_scaled)
test_scaled_log = LogTransformation.transform(test_scaled)
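
As a quick sanity check, the two preprocessing steps can be reverted in the opposite order in which they were applied. The sketch below uses a small made-up data frame (the column names and values are hypothetical) and plain functions that mirror the two methods of the class above.

```python
import numpy as np
import pandas as pd

def log_transform(x):
    # signed log: keeps the sign and compresses the magnitude
    return np.sign(x) * np.log(np.abs(x) + 1)

def inverse_log_transform(xt):
    return np.sign(xt) * (np.exp(np.abs(xt)) - 1)

# toy data with two series on very different scales (made-up values)
toy = pd.DataFrame({'A': [10.0, 12.0, 14.0], 'B': [1000.0, 900.0, 1100.0]})

toy_mean = toy.mean()                        # one mean per series
toy_scaled = toy / toy_mean                  # mean-scaling
toy_scaled_log = log_transform(toy_scaled)   # log transformation

# reverting: first the log transformation, then the mean-scaling
recovered = inverse_log_transform(toy_scaled_log) * toy_mean
```

After mean-scaling, both toy series fluctuate around 1, so a single model can learn from them jointly even though their raw magnitudes differ by two orders.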


After pre-processing each time series, we need to transform them from sequences into a set of observations. For a single time series, you can check the previous article to learn the details of this process.

For several time series, the idea is similar. We create a set of observations for each series individually. Then, these are concatenated into a single data set.

Here’s how you can do that:

# src module here:
from src.tde import time_delay_embedding

N_FEATURES = 1 # time series is univariate
N_LAGS = 3 # number of lags
HORIZON = 2 # forecasting horizon

# transforming time series for supervised learning
train_by_series, test_by_series = {}, {}
# iterating over each time series
for col in data:
    train_series = train_scaled_log[col]
    test_series = test_scaled_log[col]

    # giving every series the same name so the resulting columns match
    train_series.name = 'Series'
    test_series.name = 'Series'

    # creating observations using a sliding window method
    train_df = time_delay_embedding(train_series, n_lags=N_LAGS, horizon=HORIZON)
    test_df = time_delay_embedding(test_series, n_lags=N_LAGS, horizon=HORIZON)

    train_by_series[col] = train_df
    test_by_series[col] = test_df
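
The `time_delay_embedding` function lives in the `src` module and is not shown here. For intuition, the following is a hypothetical stand-in (not the actual `src.tde` code) that builds the same kind of table with pandas. The function name and the column labels `Series(t-1)`, `Series(t)`, `Series(t+1)` are assumptions, chosen to match the column patterns used when splitting predictors from targets.

```python
import pandas as pd

def time_delay_embedding_sketch(series, n_lags, horizon):
    """Hypothetical stand-in for time_delay_embedding (not the src.tde code)."""
    name = series.name if series.name is not None else 'Series'
    columns = {}
    # shifts n_lags-1, ..., 0 give the lags; -1, ..., -horizon give the targets
    for shift in range(n_lags - 1, -horizon - 1, -1):
        if shift > 0:
            label = f'{name}(t-{shift})'
        elif shift == 0:
            label = f'{name}(t)'
        else:
            label = f'{name}(t+{-shift})'
        columns[label] = series.shift(shift)
    # rows with an incomplete window contain NaN and are dropped
    return pd.DataFrame(columns).dropna()

# toy series (made up): 1, 2, ..., 7
toy = pd.Series(range(1, 8), name='Series', dtype=float)
toy_df = time_delay_embedding_sketch(toy, n_lags=3, horizon=2)
```

With 3 lags and a horizon of 2, each row holds five consecutive values: the first three are the explanatory variables and the last two are the targets.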

After that, you combine the data of each time series by a row-wise concatenation:

train_df = pd.concat(train_by_series, axis=0)


Finally, we split the target variables from the explanatory ones as described before:

# defining target (Y) and explanatory variables (X)
predictor_variables = train_df.columns.str.contains(r'\(t-|\(t\)')
target_variables = train_df.columns.str.contains(r'\(t\+')
X_tr = train_df.iloc[:, predictor_variables]
Y_tr = train_df.iloc[:, target_variables]

# transforming the data from matrix into a 3-D format for deep learning
X_tr_3d = from_matrix_to_3d(X_tr)
Y_tr_3d = from_matrix_to_3d(Y_tr)

# defining the neural network (same architecture as described above)
model = Sequential()
model.add(LSTM(32, activation='relu', input_shape=(N_LAGS, N_FEATURES)))
model.add(Dropout(0.2))
model.add(RepeatVector(HORIZON))
model.add(LSTM(16, activation='relu', return_sequences=True))
model.add(Dropout(0.2))
model.add(TimeDistributed(Dense(N_FEATURES)))
model.compile(optimizer='adam', loss='mse')

# splitting training into a development and validation set
X_train, X_valid, Y_train, Y_valid = train_test_split(
    X_tr_3d, Y_tr_3d, test_size=.2, shuffle=False)

# training the neural network
model.fit(X_train, Y_train, validation_data=(X_valid, Y_valid), epochs=100)
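
The helpers `from_matrix_to_3d` and `from_3d_to_matrix` are not defined in this article. Below is a minimal sketch of what such helpers might look like in the univariate case, assuming each row of the matrix is one observation and there is a single feature. The function names ending in `_sketch` are hypothetical. The key point is that an LSTM layer expects inputs shaped as (samples, time steps, features).

```python
import numpy as np
import pandas as pd

def matrix_to_3d_sketch(df):
    """Hypothetical helper: (n_samples, n_steps) -> (n_samples, n_steps, 1)."""
    values = df.to_numpy(dtype=float)
    return values.reshape(values.shape[0], values.shape[1], 1)

def matrix_from_3d_sketch(arr, col_names):
    """Hypothetical inverse: drops the trailing feature axis."""
    return pd.DataFrame(arr[:, :, 0], columns=col_names)

# toy matrix with two observations and three lags (made-up values)
X_toy = pd.DataFrame([[1, 2, 3], [2, 3, 4]],
                     columns=['Series(t-2)', 'Series(t-1)', 'Series(t)'])
X_toy_3d = matrix_to_3d_sketch(X_toy)
X_back = matrix_from_3d_sketch(X_toy_3d, X_toy.columns)
```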


Deep neural networks are iterative methods. They go over the training dataset several times in cycles called epochs.

In the above example, we ran 100 epochs. But it’s not clear how many epochs one should run to train a network. Too few epochs can lead to underfitting; too many can lead to overfitting.

One way to handle this problem is by monitoring the performance of the neural network after each epoch. Every time the model improves its performance, you save it before continuing the training process. Then, after the training is over, you get the best model that was saved.

In keras, you can use callbacks to handle this process for you. A callback is a function that performs some action during the training process. You can check the keras documentation for a complete list of the available callbacks. Or learn how to write your own!

The callback that’s used to save the model during training is called ModelCheckpoint:

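from keras.callbacks import ModelCheckpoint

# file path is illustrative; only the weights with the lowest validation loss are kept
model_checkpoint = ModelCheckpoint(
    filepath='best_model.weights.h5', save_weights_only=True,
    monitor='val_loss', mode='min', save_best_only=True)

model = Sequential()
model.add(LSTM(32, activation='relu', input_shape=(N_LAGS, N_FEATURES)))
model.add(Dropout(0.2))
model.add(RepeatVector(HORIZON))
model.add(LSTM(16, activation='relu', return_sequences=True))
model.add(Dropout(0.2))
model.add(TimeDistributed(Dense(N_FEATURES)))
model.compile(optimizer='adam', loss='mse')

# the number of epochs is an assumption, kept at 100 as in the previous example
history = model.fit(X_train, Y_train, epochs=100,
                    validation_data=(X_valid, Y_valid),
                    callbacks=[model_checkpoint])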

Another interesting callback you can use for training is EarlyStopping. It can be used to stop training when performance has stopped improving.
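
To illustrate the logic behind early stopping, here is a plain-Python sketch that does not depend on keras. The function name and the validation losses are made up for illustration. In keras, the corresponding callback is created with something like `EarlyStopping(monitor='val_loss', patience=10)` and passed to `fit` through the `callbacks` argument, next to the model checkpoint.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the (1-based) epoch at which training would stop."""
    best_loss = float('inf')
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            # new best validation loss: reset the counter
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
        if epochs_without_improvement >= patience:
            return epoch
    # never triggered: training runs for all epochs
    return len(val_losses)

# made-up validation losses: improvement stalls after the third epoch
losses = [1.0, 0.8, 0.7, 0.71, 0.72, 0.73, 0.69]
stop_epoch = early_stopping_epoch(losses, patience=2)
```

A small patience stops training quickly but may miss a later improvement, such as the last value in the toy list above. A larger patience is more tolerant at the cost of extra epochs.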

Making predictions

After training, we can retrieve the best model and make predictions on the test set.

# The best model weights are loaded into the model
# (the path is assumed to be the one passed to ModelCheckpoint)
model.load_weights('best_model.weights.h5')

# Inference on DAYTON region
test_dayton = test_by_series['DAYTON']

# splitting target variables from explanatory ones
X_ts = test_dayton.iloc[:, predictor_variables]
Y_ts = test_dayton.iloc[:, target_variables]
X_ts_3d = from_matrix_to_3d(X_ts)

# predicting on normalized data
preds = model.predict_on_batch(X_ts_3d)
preds_df = from_3d_to_matrix(preds, Y_ts.columns)

# reverting log transformation
preds_df = LogTransformation.inverse_transform(preds_df)
# reverting mean scaling
preds_df *= mean_by_series['DAYTON']

