Evaluation of Sales Shift in Retail with Causal Impact: A Case Study at Carrefour

A store’s assortment is the complete and varied range of products sold to customers. It evolves based on various factors such as economic conditions, consumer trends, profitability, quality or compliance issues, renewal of some product ranges, stock levels, and seasonal changes.

When a product is no longer available on store shelves, some of its sales may shift to other products. For a major food retailer like Carrefour, it’s crucial to estimate this sales shift accurately in order to manage the risk of loss due to product unavailability and to approximate the size of that loss.

This measurement serves as an indicator of the consequences of a product’s unavailability. Moreover, it progressively builds a valuable history of sales shift impact estimates.

Yet estimating sales shifts is complex. Customer behavior (influenced by hard-to-predict emotional factors), the seasonality of certain products, or the introduction of new products can all affect sales shifts. In addition, many products become unavailable across all stores at the same time, making it impossible to define a control population.

The Causal Impact synthetic control approach, developed by a Google team, fits the particularities of our analysis framework. It enables us to isolate the effect of product unavailability on sales from other influencing factors, and is suitable for both quasi-experimental and observational studies. Based on Bayesian structural time-series models, Causal Impact performs a counterfactual analysis: it calculates the effect on sales as the difference between the sales observed after a product becomes unavailable and, through a synthetic control, the sales that would have been observed had the product remained available.

This article presents our Causal Impact approach for estimating the sales shift effect following product unavailability, as well as a heuristic for selecting control group time series.

I) Specifying the Use Case

Product unavailability occurs in two primary forms:

Complete unavailability: the product is no longer available in the national assortment, affecting all stores.
Partial unavailability: the product is no longer available in some, but not all, stores. It remains available in others.

We consider that a reliable sales shift impact estimate should accurately assess both lost sales and the portion of sales transferred to other products. Yet knowing the exact value of these quantities is impossible, which makes this challenge complex.

Our study analyzes cases of complete product unavailability, as these cases are the most significant in terms of sales impact.

Please also note that causal inference is not a predictive framework for future events: it identifies causal links in the past rather than forecasting future events.

II) Why did we choose Google’s Causal Impact model?

Causal approaches aim to understand causal relationships between variables, explaining how one affects another by isolating the effect we are trying to analyze from all other existing effects.

Among these tools, Causal Impact is a user-friendly library that operates within a fully Bayesian framework, allowing the integration of prior information while providing inherent credibility intervals in its results. Its predictions represent expected outcomes had the intervention not occurred, expressed as distributions rather than single values.

Causal Impact generates predictions by combining endogenous components, such as seasonality and local level, with user-chosen external time series (covariates). These covariates must be unaffected by the intervention and should capture trends or factors that could influence the main time series. We’ll discuss covariate selection later.
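
To make the counterfactual logic concrete, here is a minimal sketch that is deliberately not Causal Impact itself: it swaps the Bayesian structural time-series model for an ordinary least-squares fit on a single covariate, and all names and figures are hypothetical. It fits the target on the covariate over the pre-intervention period, predicts the post-intervention counterfactual, and reads the effect as observed minus predicted.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b * x (a stand-in for the BSTS model)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def relative_effect(target, covariate, intervention_index):
    """Relative effect: observed post-period sales vs. the predicted counterfactual."""
    pre_x = covariate[:intervention_index]
    pre_y = target[:intervention_index]
    intercept, slope = fit_line(pre_x, pre_y)
    # Counterfactual: what the pre-period relationship predicts after the intervention.
    counterfactual = [intercept + slope * x for x in covariate[intervention_index:]]
    observed = target[intervention_index:]
    return (sum(observed) - sum(counterfactual)) / sum(counterfactual)


# Hypothetical weekly sales: the target follows 2 * covariate before the
# intervention, then runs 10% above that relationship afterwards.
covariate = [100, 110, 105, 120, 115, 108, 112, 118]
target = [200, 220, 210, 240, 253, 237.6, 246.4, 259.6]
effect = relative_effect(target, covariate, intervention_index=4)
print(f"Estimated relative effect: {effect:.1%}")
```

Unlike this point estimate, Causal Impact returns a full posterior distribution for the counterfactual, which is what yields the credibility intervals mentioned above.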

III) Managing Outliers and Anomalies in data

To ensure accurate analysis, we addressed sales data anomalies by following two key steps:

We excluded time series with negative sales or a large number of zero sales from the analysis.
For time series with occasional zero sales, we replaced these values with the average of the preceding and following weeks’ sales.
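
The two steps above can be sketched as follows. This is an illustration under assumptions rather than our production code: the function names are hypothetical, and so is the 10% zero-share threshold, since no cutoff is stated here. Zeros at the start or end of a series, and runs of consecutive zeros, are not treated specially in this sketch.

```python
def is_usable(series, max_zero_share=0.1):
    """Reject series with any negative value or too many zero-sales weeks.

    max_zero_share is a hypothetical threshold chosen for illustration.
    """
    if any(value < 0 for value in series):
        return False
    zero_share = sum(1 for value in series if value == 0) / len(series)
    return zero_share <= max_zero_share


def fill_isolated_zeros(series):
    """Replace a zero week with the mean of the preceding and following weeks."""
    cleaned = list(series)
    for i in range(1, len(series) - 1):
        if series[i] == 0:
            # Neighbors are read from the original series, so a filled value
            # never feeds into another replacement.
            cleaned[i] = (series[i - 1] + series[i + 1]) / 2
    return cleaned


# Hypothetical weekly sales with a single zero-sales week.
weekly_sales = [120, 130, 0, 110, 125, 140, 135, 128, 131, 119, 122, 127]
if is_usable(weekly_sales):
    weekly_sales = fill_isolated_zeros(weekly_sales)
print(weekly_sales[:4])
```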

IV) Model Design

The selection of covariates significantly influences counterfactual prediction accuracy. These time series must capture trends or external factors likely to influence the target time series without being affected by the intervention.

In addition, it’s crucial to consider the size of the estimated sales shift effect relative to the time series being studied: if the intervention is expected to affect the target series by only a few percent, the series may not be appropriate, as small effects are difficult to distinguish from random noise (especially since the library designers have shown that effects smaller than 1% are difficult to attribute to the intervention). Therefore, we analyzed sales shift only when the theoretical maximum sales shift rate exceeds 5% of sales in its sub-family. We calculated this rate as S/(1-S), where S represents the share of turnover the product generated in its sub-family before becoming unavailable.
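
One way to read the S/(1-S) formula: the rest of the sub-family accounts for a share 1-S of turnover, so if all of the product’s sales (S) moved to it, the relative increase would be S/(1-S). The short sketch below applies the formula and the 5% rule; the function names and the example shares are hypothetical.

```python
def max_shift_rate(share):
    """Theoretical maximum sales shift rate S / (1 - S) for a turnover share S."""
    if not 0 <= share < 1:
        raise ValueError("share must be in [0, 1)")
    return share / (1 - share)


def is_analyzable(share, threshold=0.05):
    """Apply the 5% rule: analyze only if the maximum rate exceeds the threshold."""
    return max_shift_rate(share) > threshold


for share in (0.02, 0.05, 0.20):  # hypothetical turnover shares
    print(f"S = {share:.0%}: max rate = {max_shift_rate(share):.1%}, "
          f"analyzable = {is_analyzable(share)}")
```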

Given those preliminary considerations, we designed our Causal Impact model as follows:

Target

As the target time series, we selected the sum of sales for the product’s sub-family, excluding the product that became unavailable.

Covariates

We first excluded the following types of time series:

Products from the same sub-family as the discontinued product, to prevent any influence from its unavailability.
Products from a different family than the discontinued product, since covariates should remain business-relevant.
Time series that showed correlation but not co-integration with the target series, to avoid spurious relationships.

Using these filters, we selected 60 covariates:

20 covariates were selected based on their highest co-integration with the target series over the year before the intervention.
40 additional covariates were selected from the top 200 co-integrated series, based on their strongest correlation with the target series over the year before the intervention.

Note that these numbers (20, 40, and 60) are rules of thumb derived from our previous model fits.

This empirical approach combines time series that capture both long-term trends (through co-integration) and short-term variations (through correlation). We deliberately selected a large number of covariates because Causal Impact employs a “spike and slab” method, which automatically reduces the influence of less important series by assigning them near-zero regression coefficients, while giving greater weight to important ones.
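
The two-pass selection can be sketched as below, assuming the co-integration and correlation scores have already been computed for each candidate series. Several choices are assumptions of this sketch rather than details given above: ranking co-integration by a test p-value (lower is stronger), ranking correlation by absolute value, and the function and series names.

```python
def select_covariates(scores, n_coint=20, pool_size=200, n_corr=40):
    """Pick covariates in two passes.

    scores maps a series name to (cointegration_p_value, correlation), both
    computed against the target over the pre-intervention year. Using a
    p-value to rank co-integration is an assumption of this sketch.
    """
    # Pass 1: rank by co-integration strength, lower p-value first.
    by_coint = sorted(scores, key=lambda name: scores[name][0])
    coint_picks = by_coint[:n_coint]

    # Pass 2: among the top co-integrated series not already picked,
    # rank by absolute correlation with the target.
    pool = [name for name in by_coint[:pool_size] if name not in coint_picks]
    by_corr = sorted(pool, key=lambda name: abs(scores[name][1]), reverse=True)
    return coint_picks + by_corr[:n_corr]


# Hypothetical candidates, scaled down to 1 + 2 picks from a pool of 4.
toy_scores = {
    "pasta": (0.001, 0.40),
    "flour": (0.010, 0.90),
    "sugar": (0.020, 0.75),
    "salt": (0.030, 0.95),
    "oil": (0.200, 0.99),
}
selected = select_covariates(toy_scores, n_coint=1, pool_size=4, n_corr=2)
print(selected)
```

In the toy data, "oil" has the strongest correlation but is never selected because it falls outside the co-integrated pool, which mirrors the filter against correlated but not co-integrated series.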

V) Model Validation

To validate our covariate selection strategy, we drew heavily on the approach used by the Causal Impact designers. We conducted a study of partial product unavailability as follows:

We examined cases where products became partially unavailable and performed an initial conventional statistical analysis using difference-in-differences.
We applied Causal Impact using, as covariates, a control population that consisted of the product’s sub-family sales (excluding the unavailable product) in stores where the product remained available. These covariates provided the best available counterfactual since these stores were unaffected by the intervention.
Finally, we applied Causal Impact without a control population, instead using our selection process based on co-integration and correlation as outlined in the Model Design section.

Consistent estimates across multiple analyses (spanning different products, quantities, and categories) would demonstrate that we can reliably apply this approach on a broader scale.
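
For readers unfamiliar with the first step, here is a textbook difference-in-differences point estimate on mean weekly sales. It is a generic sketch, not necessarily the exact specification we used, and it does not produce the significance probability reported later; the function names and data are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def did_relative_effect(treated_pre, treated_post, control_pre, control_post):
    """Difference-in-differences on mean weekly sales, expressed relative to
    the treated group's expected post-period level without the intervention."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    effect = treated_change - control_change
    # Expected treated level if it had followed the control group's change.
    expected_without = mean(treated_pre) + control_change
    return effect / expected_without


# Hypothetical sub-family sales (excluding the product) per group and period.
treated_pre = [100, 102, 98, 100]    # stores that lost the product, before
treated_post = [112, 114, 110, 112]  # same stores, after
control_pre = [200, 198, 202, 200]   # stores that kept the product, before
control_post = [204, 206, 202, 204]  # same stores, after

did_effect = did_relative_effect(treated_pre, treated_post, control_pre, control_post)
print(f"DiD relative effect: {did_effect:.1%}")
```

This additive form assumes both groups would have changed by the same absolute amount without the intervention, which is the usual parallel-trends assumption.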

Moreover, we developed two metrics to evaluate the synthetic control’s quality: a fitness measure and a predictive capability measure.

The fitness measure, scored between 0 and 1, assesses how well the synthetic control models the target over the pre-intervention period.
The predictive capability measure is a form of backtesting that evaluates the synthetic control’s quality during a simulated false intervention in the past.
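
The exact formulas behind these two metrics are not given here, so the sketch below uses plausible stand-ins that are assumptions, not our definitions: an R-squared clipped to [0, 1] for fitness, and a mean absolute percentage error over a placebo post-period for predictive capability. The function names and numbers are hypothetical.

```python
def fitness_score(actual, predicted):
    """R-squared clipped to [0, 1]: 1 means a perfect pre-period fit.

    Assumes the actual series is not constant (non-zero variance).
    """
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return max(0.0, 1.0 - ss_res / ss_tot)


def placebo_error(actual, predicted):
    """Mean absolute percentage error over a simulated (false) post-period.

    Assumes no zero values in the actual series.
    """
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical pre-period fit of the synthetic control.
fitness = fitness_score([100, 110, 120, 130], [102, 108, 121, 129])

# Hypothetical placebo window: a false intervention date is placed in the past,
# where no real intervention happened, so predictions should match actuals.
backtest = placebo_error([140, 150], [147, 144])
print(f"fitness = {fitness:.2f}, placebo MAPE = {backtest:.1%}")
```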

A Practical Validation Example

To validate the method described above with a practical example, we analyzed a case where a yogurt pack became unavailable in certain stores. We established treatment and control groups by matching each store where the product became unavailable with a similar store that still carried the product, based on criteria such as sales performance, customer characteristics, and geographic location.

The theoretical maximum sales shift rate for this product was 9.5%, and our previous analyses showed very high sales shift rates within the dairy product family. Consequently, we expected to obtain an estimate close to the theoretical maximum rate.

Following our three-step validation method, we obtained these results:

The difference-in-differences analysis estimated the causal effect at 8.7% with 98.7% probability.
As shown in Figure 2, the Causal Impact analysis using a control population estimated a causal effect of 9.0%, with a confidence interval of [3.7%, 14.4%] and 99.9% probability. We can also see that while the model effectively tracks the time series fluctuations, it does show some minor deviations.

In addition, when using covariates selected based on co-integration and correlation instead of a control population, the Causal Impact analysis estimated a causal effect of 8.5%, with a confidence interval of [2.4%, 15.1%] and 99.9% probability, as shown in Figure 3. Again, the model effectively tracks the time series fluctuations, while showing some minor deviations.

Here’s a summary of the estimates obtained across the three different analysis methods:

Analysis	Effect estimate	Causal effect probability
Difference-in-differences	8.7%	98.7% (significant)
Causal Impact with a control population	9.0% CI: [3.7%, 14.4%]	99.9% (significant)
Causal Impact without a control population	8.5% CI: [2.4%, 15.1%]	99.1% (significant)

The estimates remain consistent in magnitude whether or not a control population is used, which validates our covariate selection process for cases where no control population is available.

VI) Complete unavailability: A rice pack no longer available

We examined a nationwide case where a branded pack of rice became unavailable. We restricted our analysis to the few months after the product became unavailable to avoid capturing unrelated effects that may emerge over a longer period. The theoretical maximum sales shift rate for the product was 31.2%. We applied the covariate selection methodology described earlier to estimate the potential sales shift effect.

As shown in Figure 4, the synthetic control models the target very well over the period before the intervention. The prediction accurately captures seasonal trends after the intervention. The credibility interval is very narrow around the estimate.

We obtained a statistically significant estimate of a 22% increase in turnover due to the product unavailability over the following months, with over 99.9% probability. This amount represents roughly 70% of the rice pack’s total sales before the product became unavailable, implying that 30% of the rice pack’s sales did not shift.
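
The 70% figure follows from dividing the estimated rate by the theoretical maximum rate. The short calculation below reproduces it from the two numbers quoted in this section; the variable names are ours, and the implied turnover share S is derived by inverting S/(1-S) rather than taken from the data.

```python
max_rate = 0.312        # theoretical maximum sales shift rate (from the text)
estimated_rate = 0.22   # estimated increase in turnover (from the text)

# Inverting S / (1 - S) recovers the product's turnover share S.
share = max_rate / (1 + max_rate)

# The estimated increase, expressed as a fraction of the product's own sales.
shifted_fraction = estimated_rate / max_rate

print(f"Implied turnover share S: {share:.1%}")
print(f"Shifted: {shifted_fraction:.1%}, not shifted: {1 - shifted_fraction:.1%}")
```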

VII) Usage recommendations and experience report

Causal Impact is a robust and user-friendly tool for causal inference. Yet even after significant time spent specifying the model and improving its accuracy, we encountered challenges in fine-tuning it to obtain an industrializable solution.

The first point we want to highlight is the importance of the “garbage in, garbage out” principle, which is especially relevant when using Causal Impact. Whatever the covariates used, Causal Impact will always produce a result, sometimes with very high probability, even in cases where results are unrealistic or impossible.
Time series selected solely based on the co-integration criterion sometimes overshadow others in model feature importance, which can drastically reduce the estimation accuracy when adjustment is not well-controlled.
The choice of 20 series for co-integration and 40 for correlation is an empirical rule of thumb. While effective in most cases we encountered, it could benefit from further refinement.

Conclusion

In this article, we proposed a causal approach to estimate the sales shift effect when a product becomes unavailable, using Causal Impact. We outlined a method for selecting analyzable products and covariates.

Although this approach is functional and robust in most cases, it has limitations and areas for improvement. Some are structural, while others require spending more time on model adjustment.

We tested the methodology on different products with promising results, but the testing is not exhaustive. Some highly seasonal products or ones with little historical data pose challenges. Moreover, products that became unavailable in just a few stores are rare, limiting our ability to validate the method on a large number of diverse cases.
Another structural limitation is the model’s requirement for post-hoc analysis: the tool doesn’t allow sales shift effect prediction before a product becomes unavailable. Being able to do so would greatly benefit business teams. Work is underway to approach sales shift prediction using Bayesian structural time series forecasting.
The sales shift effect analysis ignores margin impacts: the product that became unavailable may have a higher unit margin than the products to which its sales shifted. The business conclusions to be drawn could then differ, but analysis at a sub-family level precludes this level of detail.
Finally, we could explore alternative synthetic controls, such as Augmented SC, Robust SC, Penalized SC, or even other causal approaches such as the two-way fixed effects model.
