
Hitting Time Forecasting: The Other Way for Time Series Probabilistic Forecasting


How long does it take to reach a specific value?

Photo by Mick Haupt on Unsplash

The ability to make accurate predictions is essential for every time series forecasting application. To this end, data scientists usually pick the model that minimizes errors from a point forecast perspective. That is correct, but it is not always the most effective approach.

Data scientists should also consider developing probabilistic forecasting models. These models produce, along with point estimates, upper and lower reliability bands within which future observations are likely to fall. Although probabilistic forecasting may seem to be a prerogative of statistical or deep learning solutions, any model can be used to produce probabilistic forecasts. The concept is explained in one of my previous posts, where I introduced conformal prediction as a way to estimate prediction intervals with any scikit-learn model.

Of course, a point forecast is considerably easier to communicate to non-technical stakeholders. At the same time, the ability to generate KPIs on the reliability of our predictions is an added value. A probabilistic output may carry more information to support decision-making. Communicating that there is a 60% chance of rain in the next few hours may be more informative than reporting how many millimeters of rain will fall.

In this post, we propose a forecasting technique, known as hitting time forecasting, used to estimate when a specific event or condition will occur. It proves to be accurate because it is based on conformal prediction, interpretable because its output is a probability, and reproducible with any forecasting technique.

Forecasting hitting time is a concept commonly used in various fields. It refers to predicting or estimating the time it takes for a certain event or condition to occur, often in the context of reaching a specific threshold or level.

Simulated seasonality and trend [image by the author]
Simulated time series (seasonality + trend) with an example of hitting time level [image by the author]

The best-known applications of hitting time are in fields like reliability analysis and survival analysis. There, it involves estimating the time it takes for a system or process to experience a specific event, such as a failure or reaching a particular state. In finance, hitting time is often used to determine the probability of a signal or index moving in a desired direction.

Overall, forecasting hitting time involves making predictions about the time it takes for a specific event, which follows temporal dynamics, to occur.
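
For readers who prefer notation, the idea can be written down as follows. This is only a sketch under assumed symbols that do not appear elsewhere in the post: y is the series, T the last observed time, h the forecast horizon, and c the level to be reached.

```latex
\tau_c = \min\{\, h \ge 1 : y_{T+h} \ge c \,\}, \qquad
P(\tau_c \le h) = P\Big(\max_{1 \le k \le h} y_{T+k} \ge c\Big)
```

Since the event "the level has been reached by step h" is contained in "the level has been reached by step h+1", this probability can never decrease as h grows.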

To accurately estimate hitting times, we have to start from point forecasting. As a first step, we choose the desired forecasting algorithm. For this article, we adopt a simple recursive estimator, available in scikit-learn style from tspiral.

Predicted vs real data points on test set [image by the author]
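from sklearn.linear_model import Ridge
from tspiral.forecasting import ForecastingCascade

# recursive forecaster using one week of hourly lags as features
model = ForecastingCascade(
    Ridge(),
    lags=range(1, 24*7+1),
    use_exog=False,
)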
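
To make the word "recursive" concrete, here is a minimal NumPy-only sketch of the general idea: fit a linear model on lagged values, then feed each prediction back in as an input for the next step. It is not tspiral's implementation. The helper names (`make_lag_matrix`, `fit_ridge`, `recursive_forecast`), the closed-form ridge fit, and the toy series are all assumptions made for illustration.

```python
import numpy as np

def make_lag_matrix(y, n_lags):
    """Each row holds n_lags consecutive values (oldest first); the target is the next value."""
    X = np.column_stack([y[i:len(y) - n_lags + i] for i in range(n_lags)])
    return X, y[n_lags:]

def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    Xb = np.column_stack([np.ones(len(X)), X])
    penalty = alpha * np.eye(Xb.shape[1])
    penalty[0, 0] = 0.0  # do not shrink the intercept
    return np.linalg.solve(Xb.T @ Xb + penalty, Xb.T @ y)

def recursive_forecast(y_hist, coef, n_lags, steps):
    """Predict one step ahead, append the prediction to the lag window, repeat."""
    window = list(y_hist[-n_lags:])
    preds = []
    for _ in range(steps):
        y_next = coef[0] + np.dot(coef[1:], window)
        preds.append(y_next)
        window = window[1:] + [y_next]  # drop the oldest lag, add the newest prediction
    return np.array(preds)

# toy hourly-like series: linear trend plus a 24-step seasonality
t = np.arange(200)
y = 0.05 * t + np.sin(2 * np.pi * t / 24)

X, target = make_lag_matrix(y, n_lags=24)
coef = fit_ridge(X, target, alpha=1.0)
forecast = recursive_forecast(y, coef, n_lags=24, steps=12)
print(forecast.shape)
```

Because every step after the first depends on earlier predictions, errors can accumulate along the horizon, which is one reason to put a distribution around the point forecasts.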

Our aim is to produce a forecast distribution for every predicted point, from which to extract probabilistic insights. This is done with a three-step approach that relies on the theory behind conformal prediction:

  • Forecasts are collected on the training set through cross-validation and then averaged together.
import numpy as np
from tspiral.model_selection import TemporalSplit

CV = TemporalSplit(n_splits=10, test_size=y_test.shape[0])

# one column per fold; NaN where a training point was not in that fold's validation window
pred_val_matrix = np.full(
    shape=(X_train.shape[0], CV.get_n_splits(X_train)),
    fill_value=np.nan,
    dtype=float,
)

for i, (id_train, id_val) in enumerate(CV.split(X_train)):

    pred_val = model.fit(
        X_train[id_train],
        y_train[id_train]
    ).predict(X_train[id_val])

    pred_val_matrix[id_val, i] = np.array(
        pred_val, dtype=float
    )

# average the out-of-fold predictions available for each training point
pred_val = np.nanmean(pred_val_matrix, axis=1)

  • Conformity scores are calculated on the training data as absolute residuals between cross-validated predictions and actual values.
# keep only the points that received at least one out-of-fold prediction
conformity_scores = np.abs(
    np.subtract(
        y_train[~np.isnan(pred_val)],
        pred_val[~np.isnan(pred_val)]
    )
)
  • Future forecast distributions are obtained by adding conformity scores to test predictions.
pred_test = model.fit(
    X_train,
    y_train
).predict(X_test)

# broadcast: one row per future step, one column per conformity score
estimated_test_distributions = np.add(
    pred_test[:, None], conformity_scores
)
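
The last step relies on NumPy broadcasting, which is easy to check on a toy case. The numbers below are made up for illustration and stand in for real test predictions and conformity scores.

```python
import numpy as np

# three future point forecasts (illustrative values)
pred_test = np.array([30.0, 35.0, 39.0])
# four absolute residuals collected on the training set (illustrative values)
conformity_scores = np.array([0.5, 2.0, 4.0, 6.0])

# column vector (3, 1) plus row vector (4,) broadcasts to a (3, 4) matrix
dist = pred_test[:, None] + conformity_scores

print(dist.shape)        # (3, 4)
print(dist[2].tolist())  # [39.5, 41.0, 43.0, 45.0]
```

Each row is the set of simulated values for one future step. Because the scores are absolute residuals, every simulated value sits at or above the corresponding point forecast.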

Predicted distribution on test data [image by the author]

Following the procedure described above, we end up with a collection of plausible trajectories that future values may follow. We have everything we need to provide a probabilistic representation of our forecasts.

For every future time point, we record how many times the values in the estimated test distributions exceed a predefined threshold (our target hit level). This count is turned into a probability by simply normalizing by the number of values in each estimated test distribution.

Finally, a transformation is applied to the array of probabilities to obtain a series of monotonically non-decreasing probabilities.

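import numpy as np
import pandas as pd

THRESHOLD = 40

# share of simulated values above the threshold at each future step
prob_test = np.mean(estimated_test_distributions > THRESHOLD, axis=1)

# running maximum makes the probabilities non-decreasing over time
prob_test = pd.Series(prob_test).expanding(1).max()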
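
A small worked example may help to see what the two operations do. The matrix below is made up for illustration, and `np.maximum.accumulate` is used here as a NumPy-only stand-in for the pandas expanding maximum.

```python
import numpy as np

THRESHOLD = 40

# four future steps (rows), four simulated values per step (columns); illustrative numbers
dist = np.array([
    [30.5, 32.0, 34.0, 36.0],
    [38.5, 40.5, 42.0, 44.0],
    [36.5, 38.0, 40.0, 42.0],
    [39.5, 41.0, 43.0, 45.0],
])

# fraction of simulated values strictly above the threshold at each step
raw = np.mean(dist > THRESHOLD, axis=1)
# running maximum: once a probability level is reached it is never lowered
mono = np.maximum.accumulate(raw)

print(raw.tolist())   # [0.0, 0.75, 0.25, 0.75]
print(mono.tolist())  # [0.0, 0.75, 0.75, 0.75]
```

The third step dips in the raw probabilities because its point forecast is lower, but the running maximum keeps the curve from going back down. That matches the reading "probability that the level has been reached by this step".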

Predicted vs real data points on test set plus hitting time probabilities [image by the author]

Whatever the event we are trying to forecast, we can generate a curve of probabilities simply starting from the point forecasts. The interpretation stays straightforward: for every forecasted time point, we can derive the probability of our target series reaching a predefined level.
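
If a single "when" is needed for reporting, one option is to read off the first forecast step at which the probability curve reaches a chosen confidence level. The helper `hitting_step`, the 0.5 default level, and the example curve are assumptions made for this sketch. The post itself does not prescribe a cutoff.

```python
import numpy as np

def hitting_step(prob_curve, level=0.5):
    """Return the first step whose probability is >= level, or None if it is never reached."""
    idx = np.flatnonzero(np.asarray(prob_curve) >= level)
    return int(idx[0]) if idx.size else None

# illustrative non-decreasing probability curve over six future steps
prob_curve = [0.0, 0.1, 0.1, 0.4, 0.7, 0.9]

print(hitting_step(prob_curve, level=0.5))   # 4
print(hitting_step(prob_curve, level=0.95))  # None
```

A stricter level pushes the reported hitting time later, or returns nothing at all when the forecast horizon is too short to reach that level of confidence.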

In this post, we introduced a way to add probabilistic outcomes to our forecasting models. It does not require complex or computationally intensive additional estimation techniques. Starting simply from a point forecasting problem, it is possible to add a probabilistic view of the task by applying a hitting time approach.
