Unleashing the Power of Multiple Timeseries Forecasting 📊💡
Create forecasts with Stats & ML methods.
Stats Methods with StatsForecast
ML Methods with MLForecast
Forecast plots
Validate Model’s Performance
Plot CV
Aggregate sales forecasts for all items
Evaluate forecasts per store_item and CV window
Distribution of errors
Select Best Model Per Item_id & Cross Validation Fold
We will find the best performing model per store_item (=item_id), evaluation metric & cross validation fold
For which cross validation folds & metrics does each model perform best?
What's the best model for store_item="1_1" sales forecasting?
Visualize the forecasts of the best model for unique_id == "1_1"
Visualize AutoETS forecasts for more unique_ids
Visualize XGBRegressor forecasts for more unique_ids
Sources


Kaggle Competition

Store Item Demand Forecasting Challenge

Goal

Predict sales for 50 different items at 10 different stores. 📈🛒

Python Notebook

The Multiple Timeseries Forecasting notebook is available on GitHub.

Installing Requirements

Activate a Python environment and install the following:

%%capture
!pip install pandas
!pip install numpy
!pip install darts
!pip install matplotlib
!pip install -U "gluonts[torch]==0.11.0" "optuna~=2.10.0"
!pip install ipykernel
!pip install --upgrade nbformat
!pip install lightgbm xgboost
!pip install seaborn
!pip install distributed
!pip install datasetsforecast
!pip install mlforecast
!pip install statsforecast

Import packages

%matplotlib inline

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

Dataset

In this tutorial, we'll train and evaluate multiple time-series forecasting models using the Store Item Demand Forecasting Challenge dataset from Kaggle. The dataset covers 10 different stores, each carrying 50 items, i.e. a total of 500 daily-level time series spanning five years (2013–2017).

Download data

  • Download the train.csv from here
  • Create a ./data directory inside the directory of this Python notebook
  • Save train.csv inside the ./data directory (a quick existence check is sketched below)
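Before loading the data, it can help to fail fast if the file is missing. A minimal sketch, assuming the ./data layout described above:

from pathlib import Path

# Fail early with a helpful message if train.csv has not been downloaded yet
data_path = Path("./data/train.csv")
assert data_path.exists(), "Download train.csv from the Kaggle competition page into ./data/"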

Load data

data = pd.read_csv("./data/train.csv", index_col=0)
data = data.reset_index()
data.head()
print(f"The dataset has {data.shape[0]} rows and {data.shape[1]} columns")

The dataset has 913000 rows and 4 columns
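As a sanity check, 913000 rows is exactly 10 stores × 50 items × 1826 daily dates (2013-01-01 through 2017-12-31). A quick sketch to confirm the dimensions:

# 10 stores x 50 items x 1826 days = 913000 rows
print(data["store"].nunique(), "stores,", data["item"].nunique(), "items")
print(pd.to_datetime(data["date"]).min(), "to", pd.to_datetime(data["date"]).max())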

Data fields

  • date — The date of the sale. There are no holiday effects or store closures.
  • store — Store ID
  • item — Item ID
  • sales — Number of items sold at a particular store on a particular date.
data.describe()
  • We don't have any negative "sales" values (a quick check is sketched below)
  • We need to convert the "store" and "item" columns to type string
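A quick check backing the first point, as a sketch:

# Confirm the dataset contains no negative sales values
assert (data["sales"] >= 0).all(), "found negative sales values"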

Reduce the amount of data

For simplicity & speed, we'll only keep data from the first 2 of the 10 stores.

data = data.loc[data["store"] <= 2]
print(f"The dataset now has {data.shape[0]} rows and {data.shape[1]} columns")print(f"The dataset now has {data.shape[0]} rows and {data.shape[1]} columns")
The dataset now has 182600 rows and 4 columns

Convert “store” and “item” columns to type string

data[["store", "item"]] = data[["store", "item"]].astype(str)

Check for duplicate rows

count_duplicate_rows = len(data)-len(data.drop_duplicates())
print(f"There are {count_duplicate_rows} duplicate rows")
There are 0 duplicate rows

Create a new column by concatenating the "store" and "item" columns

data["store_item"] = data["store"] + "_" + data["item"]

Convert date column to datetime

data["date"] = pd.to_datetime(data["date"])

Calculate total sales per date, store and item

item_sales_per_date = data.groupby(["date","store_item"])["sales"].aggregate("sum")
item_sales_per_date = item_sales_per_date.reset_index()
item_sales_per_date.columns = ["date","store_item","sales"]
item_sales_per_date = item_sales_per_date.sort_values("date", ascending=True)
item_sales_per_date.head()
item_sales_per_date.tail()

Plot total sales for all products over time

total_sales_per_date = item_sales_per_date.groupby(["date"])["sales"].aggregate("sum")
total_sales_per_date.plot()

Not all items sell daily

  • For each "store_item" value, add all the missing "date" values and fill in "sales" with 0s

Generate an index with all combinations of "date" and "store_item" values

import itertools

multiindex = list(zip(item_sales_per_date["date"], item_sales_per_date["store_item"]))
multiindex = pd.MultiIndex.from_tuples(multiindex, names=('index_1', 'index_2'))

dataset = item_sales_per_date.copy()
dataset.index = multiindex

idx_dates = list(pd.date_range(min(item_sales_per_date["date"]), max(item_sales_per_date["date"])))
idx_ids = list(dataset["store_item"].unique())
idx = list(itertools.product(idx_dates, idx_ids))
dataset = dataset.reindex(idx)
dataset = dataset.reset_index()
dataset.head()


For each "store_item" value, add all the missing "date" values and fill in "sales" with 0s

dataset["date"] = dataset["date"].fillna(dataset["index_1"])
dataset["store_item"] = dataset["store_item"].fillna(dataset["index_2"])
dataset["sales"] = dataset["sales"].fillna(0)
dataset = dataset.drop(columns=["index_1", "index_2"])
dataset = dataset.set_index("date")
dataset.index.name = "date"
dataset.head()
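For reference, an equivalent and more compact way to build the full date × store_item grid uses pd.MultiIndex.from_product. A sketch that should yield the same table:

# A sketch of an equivalent approach using pd.MultiIndex.from_product
full_idx = pd.MultiIndex.from_product(
    [pd.date_range(item_sales_per_date["date"].min(), item_sales_per_date["date"].max()),
     item_sales_per_date["store_item"].unique()],
    names=["date", "store_item"],
)
dataset_alt = (
    item_sales_per_date.set_index(["date", "store_item"])
    .reindex(full_idx, fill_value=0)  # missing (date, store_item) pairs get sales = 0
    .reset_index()
)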

Create the dataset that we’ll use for training the forecasting models

# Rename columns to match the Nixtlaverse's expectations
# The 'store_item' becomes 'unique_id' representing the unique identifier of the time series
# The 'date' becomes 'ds' representing the time stamp of the data points
# The 'sales' becomes 'y' representing the target variable we want to forecast
# Y_df = dataset.query('unique_id.str.startswith("6_46")').copy()
Y_df = dataset.copy()
Y_df = Y_df.reset_index()
Y_df = Y_df.rename(columns={
    'store_item': 'unique_id',
    'date': 'ds',
    'sales': 'y'
})
Y_df['y'] = Y_df['y'].astype(int)
# Convert the 'ds' column to datetime format to ensure proper handling of date-related operations in subsequent steps
Y_df['ds'] = pd.to_datetime(Y_df['ds'])
Y_df.tail()

Y_df['unique_id'] = Y_df['unique_id'].astype(str)

%%capture
!pip install tqdm
from tqdm.autonotebook import tqdm
from statsforecast import StatsForecast
# Plot random series for EDA
StatsForecast.plot(Y_df)

Check for seasonality in the total number of 'sales' per 'date'

import darts
from darts import TimeSeries
from darts.utils.statistics import plot_acf, check_seasonality

seasonality_check_data = total_sales_per_date.reset_index()
seasonality_check_data = seasonality_check_data.set_index("date")
seasonality_check_data_series = TimeSeries.from_times_and_values(seasonality_check_data.index, seasonality_check_data["sales"].values)

plot_acf(seasonality_check_data_series, m=7, alpha=0.05, max_lag=30)


The ACF presents spikes at x in [1, 7, 14, 21], which suggests a weekly seasonality (highlighted). The blue zone marks the significance of the statistics for a confidence level of α = 5%. We can also run a statistical test of seasonality for each candidate period m, as sketched below.
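darts ships such a test as check_seasonality (imported above). A minimal sketch for the weekly candidate:

# Statistical test of seasonality for the weekly candidate period m=7
is_seasonal, period = check_seasonality(seasonality_check_data_series, m=7, alpha=0.05)
print(f"seasonal at m=7: {is_seasonal} (period={period})")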

plot_acf(seasonality_check_data_series, m=7, alpha=0.05, max_lag=12*30)

We also see a 6-month seasonality.
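The same test can probe this longer candidate period too. A sketch, assuming max_lag is raised so that it exceeds m:

# Test the ~6-month candidate period; max_lag must be larger than m
is_seasonal_180, period_180 = check_seasonality(seasonality_check_data_series, m=180, alpha=0.05, max_lag=400)
print(f"seasonal at m=180: {is_seasonal_180}")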

We will now train multiple statistical & ML models and evaluate which one performs best.

from time import time
# Import needed models from the statsforecast library
from statsforecast.models import (
    # SeasonalNaive: uses the value from the same period of the previous season as the forecast
    SeasonalNaive,
    # Naive: a simple model that uses the last observed value as the forecast
    Naive,
    # HistoricAverage: uses the average of all historical data as the forecast
    HistoricAverage,
    # CrostonOptimized: a model specifically designed for intermittent demand forecasting
    CrostonOptimized,
    # ADIDA: Aggregate-Disaggregate Intermittent Demand Approach, a model designed for intermittent demand
    ADIDA,
    # IMAPA: Intermittent Multiple Aggregation Prediction Algorithm, a model for intermittent series
    IMAPA,
    # AutoETS: automatically selects the best Exponential Smoothing (ETS) model based on an information criterion
    AutoETS
)

# horizon is the number of days into the future for which we'll make a forecast
horizon = 30
# the seasonality window is 7 because we have weekly seasonality
season_length = 7
# the number of past days used when calibrating the prediction intervals (see PredictionIntervals below)
window_size = 6*30
models = [
    SeasonalNaive(season_length=season_length),
    Naive(),
    HistoricAverage(),
    CrostonOptimized(),
    ADIDA(),
    IMAPA(),
    AutoETS(season_length=season_length)
]

# Instantiate the StatsForecast class
sf = StatsForecast(
    models=models, # a list of models to use for forecasting
    freq='D', # the frequency of the time series data (here 'D' stands for daily frequency)
    n_jobs=-1, # the number of CPU cores to use for parallel execution (-1 means use all available cores)
)

# Record the current time before forecasting starts; this will be used to measure the execution time
init = time()

# Call the forecast method of the StatsForecast instance to predict the next `horizon` days
# level is set to [90], which means it will compute the 90% prediction interval
fcst_df = sf.forecast(df=Y_df, h=horizon, level=[90])

# Record the current time after the forecasting ends
end = time()

# Calculate and print the total time taken for the forecasting, in minutes
print(f'Forecast Minutes: {(end - init) / 60}')

Forecast Minutes: 0.9659132162729899
fcst_df.head()
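With level=[90], the returned frame carries interval columns next to each model's point forecast; statsforecast names them '<model>-lo-90' and '<model>-hi-90'. A sketch to list them:

# Collect the 90% prediction-interval columns from the forecast frame
interval_cols = [c for c in fcst_df.columns if c.endswith(('-lo-90', '-hi-90'))]
print(interval_cols)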
from mlforecast import MLForecast
from mlforecast.target_transforms import Differences
from mlforecast.utils import PredictionIntervals
from window_ops.expanding import expanding_mean

# Import the needed models from various libraries

# LGBMRegressor: A gradient boosting framework that uses tree-based learning algorithms from the LightGBM library
from lightgbm import LGBMRegressor

# XGBRegressor: A gradient boosting regressor model from the XGBoost library
from xgboost import XGBRegressor

# LinearRegression: A simple linear regression model from the scikit-learn library
from sklearn.linear_model import LinearRegression

# Instantiate the MLForecast object
mlf = MLForecast(
    models=[LGBMRegressor(max_depth=10), XGBRegressor(max_depth=10, eval_metric='rmse'), LinearRegression()], # list of models for forecasting: LightGBM, XGBoost and Linear Regression
    freq='D', # frequency of the data - 'D' for daily frequency
    lags=list(range(1, 7)), # specific lags to use as regressors: 1 to 6 days
    lag_transforms={
        1: [expanding_mean], # apply an expanding mean transformation to the lag of 1 day
    },
    date_features=['year', 'month', 'day', 'dayofweek', 'quarter', 'week', 'dayofyear'], # date features to use as regressors
)
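To see what the expanding_mean lag transform feeds the models, here is a toy sketch on a short series (values are illustrative only):

# expanding_mean(x)[i] is the mean of x[:i+1]
# mlforecast applies it to the lag-1 series, so the feature at time t only uses values up to t-1
print(expanding_mean(np.array([1.0, 2.0, 3.0, 4.0]))) # -> [1. 1.5 2. 2.5]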

# Start the timer to calculate the time taken for fitting the models
init = time()

# Fit the MLForecast models to the data, with prediction intervals calibrated over a window of window_size days
mlf.fit(Y_df, prediction_intervals=PredictionIntervals(window_size=window_size))

# Record the end time after fitting the models
end = time()

# Print the time taken to fit the MLForecast models, in minutes
print(f'MLForecast Minutes: {(end - init) / 60}')

MLForecast Minutes: 0.3328384002049764
fcst_mlf_df = mlf.predict(horizon=horizon, level=[90])
fcst_mlf_df.head()

Plot the forecasts per store, item and forecasting model

fcst_df = fcst_df.merge(fcst_mlf_df, how='left', on=['unique_id', 'ds'])
sf.plot(Y_df, fcst_df, max_insample_length=30)

Cross Validation in StatsForecast

init = time()
cv_df = sf.cross_validation(df=Y_df, h=horizon, n_windows=3, step_size=horizon, level=[90])
end = time()
print(f'CV Minutes: {(end - init) / 60}')
CV Minutes: 2.01016796429952

Cross Validation in MLForecast

init = time()
cv_mlf_df = mlf.cross_validation(
    data=Y_df,
    window_size=horizon,
    n_windows=3,
    step_size=horizon,
    level=[90],
)
end = time()
print(f'CV Minutes: {(end - init) / 60}')
CV Minutes: 0.4411295692125956
cv_df = cv_df.merge(cv_mlf_df.drop(columns=['y']), how='left', on=['unique_id', 'ds', 'cutoff'])
cv_df.head()

Visualize cross validation splits for a selected unique_id

cutoffs = cv_df['cutoff'].unique()

for cutoff in cutoffs:
    img = sf.plot(
        Y_df,
        cv_df.query('cutoff == @cutoff').drop(columns=['y', 'cutoff']),
        max_insample_length=30*4,
        unique_ids=['1_1'],
    )

agg_cv_df = cv_df.loc[:,~cv_df.columns.str.contains('hi|lo')].groupby(['ds', 'cutoff']).sum(numeric_only=True).reset_index()
agg_cv_df.insert(0, 'unique_id', 'Total sales (all items)')

agg_Y_df = Y_df.groupby(['ds']).sum(numeric_only=True).reset_index()
agg_Y_df.insert(0, 'unique_id', 'Total sales (all items)')

for cutoff in cutoffs:
    img = sf.plot(
        agg_Y_df,
        agg_cv_df.query('cutoff == @cutoff').drop(columns=['y', 'cutoff']),
        max_insample_length=30*4,
    )

from typing import List, Callable

from distributed import Client
from fugue import transform
from fugue_dask import DaskExecutionEngine
from datasetsforecast.losses import mse, mae, smape, mape

def evaluate(df: pd.DataFrame, metrics: List[Callable]) -> pd.DataFrame:
    eval_ = {}
    models = df.loc[:, ~df.columns.str.contains('unique_id|y|ds|cutoff|lo|hi')].columns
    for model in models:
        eval_[model] = {}
        for metric in metrics:
            eval_[model][metric.__name__] = metric(df['y'], df[model])
    eval_df = pd.DataFrame(eval_).rename_axis('metric').reset_index()
    eval_df.insert(0, 'cutoff', df['cutoff'].iloc[0])
    eval_df.insert(0, 'unique_id', df['unique_id'].iloc[0])
    return eval_df

str_models = cv_df.loc[:, ~cv_df.columns.str.contains('unique_id|y|ds|cutoff|lo|hi')].columns
str_models = ','.join([f"{model}:float" for model in str_models])
cv_df['cutoff'] = cv_df['cutoff'].astype(str)
cv_df = cv_df.reset_index()
cv_df['unique_id'] = cv_df['unique_id'].astype(str)

Let's create a Dask client.

client = Client() # without this, dask will not run in distributed mode
# fugue.dask.dataframe.default.partitions sets the default number of partitions for a new DaskDataFrame
engine = DaskExecutionEngine({"fugue.dask.dataframe.default.partitions": 96})

The transform function takes the evaluate function defined above and applies it to each combination of time series (unique_id) and cross validation window (cutoff), using the Dask client we created before.

evaluation_df = transform(
    cv_df.loc[:, ~cv_df.columns.str.contains('lo|hi')],
    evaluate,
    engine="dask",
    params={'metrics': [mse, mae, mape, smape]},
    schema=f"unique_id:str,cutoff:str,metric:str,{str_models}",
    as_local=True,
    partition={'by': ['unique_id', 'cutoff']}
)
evaluation_df.head()
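If you prefer to skip Dask, an equivalent single-process version is a plain pandas groupby-apply. A sketch (slower, but same output; the filter also drops the helper 'index' column created by reset_index above):

# Single-process equivalent of the Fugue/Dask transform
evaluation_df_pd = (
    cv_df.loc[:, ~cv_df.columns.str.contains('lo|hi|index')]
    .groupby(['unique_id', 'cutoff'], group_keys=False)
    .apply(evaluate, metrics=[mse, mae, mape, smape])
    .reset_index(drop=True)
)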
# Calculate the mean metric for each cross validation window
evaluation_df.groupby(['unique_id','cutoff', 'metric']).mean(numeric_only=True)
import matplotlib.pyplot as plt
import seaborn as sns

evaluation_df_melted = pd.melt(evaluation_df, id_vars=['unique_id', 'cutoff', 'metric'], var_name='model', value_name='error')
evaluation_df_melted


Distribution of MSE errors per model and unique_id

sns.violinplot(evaluation_df_melted.query('metric=="mse"'), x='error', y='model').set_title('Distribution of MSE errors')

Distribution of SMAPE errors per model and unique_id

sns.violinplot(evaluation_df_melted.query('metric=="smape"'), x='error', y='model').set_title('Distribution of SMAPE errors')
trained_models = list(evaluation_df_melted.model.unique())

evaluation_df[trained_models] = evaluation_df[trained_models].astype(float)
evaluation_df

# Select the best model for each time series, metric, and cross validation window
evaluation_df['best_model'] = evaluation_df.idxmin(axis=1, numeric_only=True)
# count how many times each model wins per unique_id and metric
count_best_model = evaluation_df.groupby(['unique_id', 'metric', 'best_model']).size().rename('n').to_frame().reset_index()
# plot results
sns.barplot(count_best_model, x='n', y='best_model', hue='metric')

AutoETS is the best performing model across all evaluation metrics

This doesn't mean that AutoETS is the best performing model for every individual "store_item"; the per-series breakdown below makes this concrete.

count_best_model
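To make the per-series picture explicit, here is a sketch that picks a winner per unique_id by mean MSE across the CV folds (assuming lower is better):

# For each series, average the MSE across CV folds and pick the model with the lowest mean
per_series_mse = (
    evaluation_df.query('metric == "mse"')
    .groupby('unique_id')[trained_models].mean()
)
print(per_series_mse.idxmin(axis=1).value_counts())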
# add colors
colors = ['#ff9999','#66b3ff','#99ff99','#ffcc99']
# For the mse, calculate how many times each model wins
count_best_model_mse = count_best_model.query('metric == "mse" & unique_id == "1_1"')
counts_series_mse = count_best_model_mse["n"]
plt.pie(counts_series_mse, labels=count_best_model_mse["best_model"], autopct='%.0f%%', colors=colors)
plt.title("Winning Models Based On MSE - Store_item='1_1'")
plt.show()

XGBRegressor was the best performing model based on MSE for 2 out of the 3 validation folds of store_item "1_1".

# For the smape, calculate how many times each model wins
count_best_model_smape = count_best_model.query('metric == "smape" & unique_id == "1_1"')
counts_series_smape = count_best_model_smape["n"]
plt.pie(counts_series_smape, labels=count_best_model_smape["best_model"], autopct='%.0f%%', colors=colors)
plt.title("Winning Models Based On SMAPE - Store_item='1_1'")
plt.show()

LGBMRegressor was the best performing model based on SMAPE for 2 out of the 3 validation folds of store_item "1_1".

  • The best model based on MSE is XGBRegressor
  • The best model based on SMAPE is LGBMRegressor
eval_series_df = evaluation_df.query('metric == "mse" & unique_id == "1_1"').groupby(['unique_id']).mean(numeric_only=True)
eval_series_df['best_model'] = eval_series_df.idxmin(axis=1)

sf.plot(Y_df, cv_df.drop(columns=['cutoff', 'y']),
        max_insample_length=30 * 2,
        models=['XGBRegressor'],
        unique_ids=eval_series_df.query('best_model == "XGBRegressor"').index[:8])

eval_series_df = evaluation_df.query('metric == "smape" & unique_id == "1_1"').groupby(['unique_id']).mean(numeric_only=True)
eval_series_df['best_model'] = eval_series_df.idxmin(axis=1)

sf.plot(Y_df, cv_df.drop(columns=['cutoff', 'y']),
        max_insample_length=30 * 2,
        models=['LGBMRegressor'],
        unique_ids=eval_series_df.query('best_model == "LGBMRegressor"').index[:8])

eval_series_df = evaluation_df.query('metric == "mape"').groupby(['unique_id']).mean(numeric_only=True)
eval_series_df['best_model'] = eval_series_df.idxmin(axis=1)

sf.plot(Y_df, cv_df.drop(columns=['cutoff', 'y']),
        max_insample_length=30 * 2,
        models=['AutoETS'],
        unique_ids=eval_series_df.query('best_model == "AutoETS"').index[:8])

eval_series_df = evaluation_df.query('metric == "mape"').groupby(['unique_id']).mean(numeric_only=True)
eval_series_df['best_model'] = eval_series_df.idxmin(axis=1)

sf.plot(Y_df, cv_df.drop(columns=['cutoff', 'y']),
        max_insample_length=30 * 2,
        models=['XGBRegressor'],
        unique_ids=eval_series_df.query('best_model == "XGBRegressor"').index[:8])

This code is based on the following publicly available resources.
