Forecasting with Granger Causality: Checking for Time Series Spurious Correlations

Photo by Phoenix Han on Unsplash

In time series forecasting, it is usually helpful to examine the available data graphically. This helps us understand the dynamics of the phenomena we're analyzing and make decisions accordingly. Although a colourful plot of our time series can be fascinating, it may lead to incorrect conclusions.

An example of spurious correlation [SOURCE]

As rational individuals, we can easily rule out any relationship between the number of people who died by becoming tangled in their bedsheets and per capita cheese consumption, even though we are not experts in either field.

Those who work with data know that such patterns occur often, including in contexts where we struggle to interpret the data and to tell true correlations from spurious ones. For that reason, methodologies that help discriminate between these situations are crucial.


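Granger causality is built on the intuition that if a series Y1 Granger-causes another series Y2, then past observations of Y1 should contain information that helps predict Y2, along with the information contained in past observations of Y2.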

Example of possible Granger-causality between time series [image by the author]

Testing for Granger causality doesn't mean that Y1 must be a cause of Y2. It simply means that past values of Y1 are informative enough to improve the forecast of Y2's future values. From this implication, we may derive a naive definition of causality.

Adopting the Granger causality test implies strict assumptions on the underlying data (i.e. stationarity and linear dependency), which may be difficult to meet in real-world applications. For that reason, in this post, we propose an alternative to the standard test that is based on forecasting models.

For the scope of this post, we simulate two different time series as the result of autoregressive processes.

Simulated AR processes [image by the author]
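A minimal sketch of how two such series could be simulated. The coefficients, the noise scale, and the function name `simulate_series` are assumptions chosen for illustration, not the values used to produce the plots in this post. The only structural property taken from the text is that past values of Y1 feed into Y2 and not the other way round.

```python
import numpy as np

def simulate_series(n=1000, seed=0):
    """Simulate y1 as an AR(1) process and y2 as driven by its own past and by past y1."""
    rng = np.random.default_rng(seed)
    y1 = np.zeros(n)
    y2 = np.zeros(n)
    for t in range(1, n):
        # y1 depends only on its own previous value (hypothetical coefficient)
        y1[t] = 0.7 * y1[t - 1] + rng.normal()
        # y2 depends on its own previous value AND on the previous value of y1
        y2[t] = 0.5 * y2[t - 1] + 0.6 * y1[t - 1] + rng.normal()
    return y1, y2

y1, y2 = simulate_series()
print("Pearson correlation:", np.corrcoef(y1, y2)[0, 1])
```

Both processes are stationary with these coefficients, and the one-directional link is enough to produce a clearly positive correlation between the two series.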

Both series are correlated with some of their past timesteps (autocorrelation).

Autocorrelation of AR processes [image by the author]
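As a rough illustration of how autocorrelation can be measured, the sketch below computes the sample autocorrelation of a toy AR(1) series with plain NumPy. The helper `autocorr` and the toy coefficient 0.8 are assumptions made for this example.

```python
import numpy as np

def autocorr(x, max_lag=10):
    """Sample autocorrelation of x for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    values = [1.0]
    for k in range(1, max_lag + 1):
        # correlate the series with a copy of itself shifted by k steps
        values.append(np.dot(x[:-k], x[k:]) / denom)
    return np.array(values)

# Toy AR(1) series (hypothetical coefficient 0.8)
rng = np.random.default_rng(0)
series = np.zeros(2000)
for t in range(1, len(series)):
    series[t] = 0.8 * series[t - 1] + rng.normal()

acf = autocorr(series, max_lag=10)
print(np.round(acf, 2))
```

For an autoregressive process the values decay gradually with the lag, which is the pattern to look for in an autocorrelation plot.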

The time series exhibit an overall Pearson correlation of 0.637, with a fairly strong positive relationship preserved over time.

Pearson correlation of AR processes over time [image by the author]
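The overall and over-time correlations can be sketched with pandas as follows. The simulated data, the column names, and the window of 100 observations are assumptions for illustration, so the printed value will not match the 0.637 reported above.

```python
import numpy as np
import pandas as pd

# Toy data with hypothetical coefficients: past y1 feeds into y2
rng = np.random.default_rng(0)
n = 1000
y1, y2 = np.zeros(n), np.zeros(n)
for t in range(1, n):
    y1[t] = 0.7 * y1[t - 1] + rng.normal()
    y2[t] = 0.5 * y2[t - 1] + 0.6 * y1[t - 1] + rng.normal()
df = pd.DataFrame({"y1": y1, "y2": y2})

# Pearson correlation over the whole sample
overall = df["y1"].corr(df["y2"])

# Pearson correlation over a sliding window, to see whether it is stable in time
window = 100
rolling = df["y1"].rolling(window).corr(df["y2"])

print("overall:", round(overall, 3))
print(rolling.dropna().describe())
```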

At first sight, it seems we're in the presence of two events with a positive connection. The Pearson correlation coefficient is the most commonly used statistic to measure linear relationships between variables. It's so common that people often misinterpret it by giving it a causal meaning. That is a mistake!

Pearson correlation formula [image by the author]
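For reference, the sample Pearson correlation coefficient between the two series can be written as below. The notation (observations $y_{1,t}$ and $y_{2,t}$ for $t = 1, \dots, n$, with sample means $\bar{y}_1$ and $\bar{y}_2$) is an assumption of this note and may differ from the one in the original image.

```latex
r_{Y_1 Y_2} =
\frac{\sum_{t=1}^{n} (y_{1,t} - \bar{y}_1)(y_{2,t} - \bar{y}_2)}
     {\sqrt{\sum_{t=1}^{n} (y_{1,t} - \bar{y}_1)^2}\;
      \sqrt{\sum_{t=1}^{n} (y_{2,t} - \bar{y}_2)^2}}
```

The formula only pairs observations taken at the same timestep, so it says nothing about which series leads the other.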

In our simulated scenario, the positive relationship is merely a mathematical result, since we know the two series are related in just one direction. More precisely, past values of Y1 are linearly related to current values of Y2 (the reverse does not hold). Our aim is to give a practical demonstration of this statement.

The standard Granger causality test checks whether past values of one series help forecast another. This is done by fitting a linear model on the lagged series values.

The null hypothesis of the test states that the coefficients corresponding to past values of Y1 are zero. We reject the null hypothesis if the p-value is below a chosen threshold. In that case, Y1 Granger-causes Y2.
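The linear test can be sketched with NumPy as a comparison of two least-squares fits: a restricted model that uses only the lags of the target, and an unrestricted one that also uses the lags of the candidate cause. The helper names, the number of lags, and the simulated data are assumptions for illustration. The sketch stops at the F-statistic; the p-value would come from an F distribution with (p, dof) degrees of freedom.

```python
import numpy as np

def lag_matrix(x, p):
    """Columns are x[t-1], ..., x[t-p] for t = p, ..., len(x) - 1."""
    n = len(x)
    return np.column_stack([x[p - k : n - k] for k in range(1, p + 1)])

def granger_f_stat(cause, effect, p=2):
    """F-statistic for H0: all coefficients on the lags of `cause` are zero."""
    cause = np.asarray(cause, dtype=float)
    effect = np.asarray(effect, dtype=float)
    target = effect[p:]
    const = np.ones((len(target), 1))
    restricted = np.hstack([const, lag_matrix(effect, p)])
    unrestricted = np.hstack([restricted, lag_matrix(cause, p)])

    def ssr(X):
        # sum of squared residuals of the least-squares fit
        beta, *_ = np.linalg.lstsq(X, target, rcond=None)
        resid = target - X @ beta
        return resid @ resid

    ssr_r, ssr_u = ssr(restricted), ssr(unrestricted)
    dof = len(target) - unrestricted.shape[1]
    return ((ssr_r - ssr_u) / p) / (ssr_u / dof)

# Toy data with hypothetical coefficients: only past y1 feeds into y2
rng = np.random.default_rng(0)
n = 1000
y1, y2 = np.zeros(n), np.zeros(n)
for t in range(1, n):
    y1[t] = 0.7 * y1[t - 1] + rng.normal()
    y2[t] = 0.5 * y2[t - 1] + 0.6 * y1[t - 1] + rng.normal()

f_12 = granger_f_stat(y1, y2)  # does y1 help predict y2?
f_21 = granger_f_stat(y2, y1)  # does y2 help predict y1?
print("F (y1 -> y2):", f_12)
print("F (y2 -> y1):", f_21)
```

A large F-statistic means the lags of the candidate cause reduce the residual error by more than chance would explain, which is the evidence needed to reject the null hypothesis.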

In other words, Y1 Granger-causes Y2 when adding its past values significantly improves the forecast of Y2.

As a first step, we fit two autoregressive models, one on Y1 and one on Y2, without additional exogenous variables, and store the predictions obtained on test data.

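import numpy as np
from sklearn.base import clone
from sklearn.ensemble import RandomForestRegressor
from tspiral.forecasting import ForecastingCascade  # assumed source of ForecastingCascade

# `lags`, `df`, `df_train` and `df_test` are defined elsewhere in the experiment.
# Autoregressive forecaster: only past values of the target are used as features.
forecaster = ForecastingCascade(
    RandomForestRegressor(30, random_state=42, n_jobs=-1),
    lags=lags,
    use_exog=False,
)

model_y1 = clone(forecaster).fit(None, df_train['y1'])
model_y2 = clone(forecaster).fit(None, df_train['y2'])

# One-step-ahead forecasts on the test period: at step i the model
# sees the true history up to (but not including) observation i.
y1_pred = np.concatenate([
    model_y1.predict(
        [[0.]],
        last_y=df['y1'].iloc[:i]
    ) for i in range(len(df_train), len(df_train) + len(df_test))
])
y2_pred = np.concatenate([
    model_y2.predict(
        [[0.]],
        last_y=df['y2'].iloc[:i]
    ) for i in range(len(df_train), len(df_train) + len(df_test))
])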

Secondly, we repeat the same forecasting procedure but add lagged exogenous variables (i.e. when forecasting Y1 we use past values of Y2 in addition to past values of Y1).

import pandas as pd
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer

# Same forecaster as before, now with lagged values of the other series as features.
forecaster = ForecastingCascade(
    make_pipeline(
        FunctionTransformer(
            lambda x: x[:,1:]  # remove current values of exog series
        ),
        RandomForestRegressor(30, random_state=42, n_jobs=-1)
    ),
    lags=lags,
    use_exog=True,
    exog_lags=lags,
)

model_y1y2 = clone(forecaster).fit(df_train[['y2']], df_train['y1'])  # Y1 from past Y1 and Y2
model_y2y1 = clone(forecaster).fit(df_train[['y1']], df_train['y2'])  # Y2 from past Y2 and Y1

# One-step-ahead forecasts; the placeholder exog value is dropped by the transformer.
y1y2_pred = np.concatenate([
    model_y1y2.predict(
        pd.DataFrame({'y2': [0.]}),
        last_y=df['y1'].iloc[:i],
        last_X=df[['y2']].iloc[:i]
    ) for i in range(len(df_train), len(df_train) + len(df_test))
])
y2y1_pred = np.concatenate([
    model_y2y1.predict(
        pd.DataFrame({'y1': [0.]}),
        last_y=df['y2'].iloc[:i],
        last_X=df[['y1']].iloc[:i]
    ) for i in range(len(df_train), len(df_train) + len(df_test))
])
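To make the two feature sets concrete, here is a self-contained sketch of the same idea using plain lag matrices and scikit-learn only. It is not the forecasting cascade used above: the helpers `lagged` and `squared_errors_on_test`, the simulated data, the number of lags, and the `min_samples_leaf` setting are all assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def lagged(x, lags):
    """Stack x[t-1], ..., x[t-lags] as columns, one row per t >= lags."""
    n = len(x)
    return np.column_stack([x[lags - k : n - k] for k in range(1, lags + 1)])

# Toy data with hypothetical coefficients: only past y1 feeds into y2
rng = np.random.default_rng(0)
n, lags, n_test = 800, 2, 200
y1, y2 = np.zeros(n), np.zeros(n)
for t in range(1, n):
    y1[t] = 0.7 * y1[t - 1] + rng.normal()
    y2[t] = 0.5 * y2[t - 1] + 0.6 * y1[t - 1] + rng.normal()

target = y2[lags:]                              # values of y2 to forecast
X_own = lagged(y2, lags)                        # past y2 only
X_both = np.hstack([X_own, lagged(y1, lags)])   # past y2 plus past y1

def squared_errors_on_test(X):
    """Fit on the first part of the sample, return squared errors on the last n_test rows."""
    model = RandomForestRegressor(30, min_samples_leaf=5, random_state=42)
    model.fit(X[:-n_test], target[:-n_test])
    return (target[-n_test:] - model.predict(X[-n_test:])) ** 2

se_own = squared_errors_on_test(X_own)
se_both = squared_errors_on_test(X_both)
print("MSE without y1 lags:", se_own.mean())
print("MSE with y1 lags:   ", se_both.mean())
```

Because every row only contains values observed before the target timestep, each prediction is a one-step-ahead forecast, and the two error vectors can be compared sample by sample.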

At the end of the forecasting phase, we have the predictions of four different models (two for forecasting Y1 and the other two for forecasting Y2). It's time to compare the results.

Squared residuals are computed at the sample level for all prediction types. The distributions of the squared residuals are then compared for the same prediction target. We use the standard Kolmogorov-Smirnov test to check for differences between the distributions.

Comparison of squared residual distributions [image by the author]

The forecasts for Y1 appear to be the same with and without the addition of Y2's features.

Comparison of squared residual distributions [image by the author]

On the contrary, the forecasts of Y2 are significantly different with and without the addition of Y1's features. That means that Y1 has a positive impact on predicting Y2, i.e. Y1 Granger-causes Y2 (the reverse is not true).
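The residual comparison can be sketched with SciPy's two-sample Kolmogorov-Smirnov test. The predictions below are synthetic stand-ins (a true signal plus noise of different scales), and the helper `compare_squared_residuals` and the 0.05 threshold are assumptions for illustration, not the values behind the figures above.

```python
import numpy as np
from scipy.stats import ks_2samp

def compare_squared_residuals(y_true, pred_a, pred_b, alpha=0.05):
    """Two-sample KS test on the squared residuals of two sets of predictions."""
    sq_a = (y_true - pred_a) ** 2
    sq_b = (y_true - pred_b) ** 2
    result = ks_2samp(sq_a, sq_b)
    return result.statistic, result.pvalue, result.pvalue < alpha

# Synthetic stand-ins for test targets and predictions
rng = np.random.default_rng(0)
y_true = rng.normal(size=500)
pred_without_exog = y_true + rng.normal(scale=1.0, size=500)  # noisier forecasts
pred_with_exog = y_true + rng.normal(scale=0.3, size=500)     # more accurate forecasts

stat, pvalue, differ = compare_squared_residuals(
    y_true, pred_without_exog, pred_with_exog
)
print("KS statistic:", stat)
print("p-value:", pvalue)
print("distributions differ:", differ)
```

When the two residual distributions differ significantly and the model with the extra lags has the smaller errors, the added series carries predictive information. When the test finds no difference, the extra lags did not help.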

In this post, we proposed an alternative to the standard Granger causality test to verify causation dynamics in the time series domain. We didn't stop at the Pearson correlation coefficient to draw conclusions about the data. We analyzed, empirically, the possible presence of reciprocal influences between the events at our disposal, spotting spurious relationships. The ease of use of the proposed methodology and its adaptability, with few assumptions, make it suitable for adoption in any time series analysis.
