
Practical Approaches to Optimizing Budget in Marketing Mix Modeling


Photo by Joel Filipe on Unsplash

Marketing Mix Modeling (MMM) is a data-driven approach used to identify and analyze the key drivers of a business outcome, such as sales or revenue, by examining the impact of the various factors that may influence the response. The goal of MMM is to provide insights into how marketing activities, including advertising, pricing, and promotions, can be optimized to improve business performance. Among all the factors influencing the business outcome, marketing contribution, such as advertising spend in various media channels, is considered to have a direct and measurable impact on the response. By analyzing the effectiveness of advertising spend in different media channels, MMM can provide valuable insights into which channels are the most effective for increasing sales or revenue, and which channels may need to be optimized or eliminated to maximize marketing ROI.

Marketing Mix Modeling (MMM) is a multi-step process involving a series of distinct steps that are driven by the marketing effects being analyzed. First, the coefficients of the media channels are constrained to be positive to account for the positive effect of advertising activity.

Second, an adstock transformation is applied to capture the lagged and decayed impact of advertising on consumer behavior.

Third, the relationship between advertising spend and the corresponding business outcome is not linear and follows the law of diminishing returns. In most MMM solutions, the modeler typically employs linear regression to train the model, which presents two key challenges. First, the modeler must apply a saturation transformation step to establish the non-linear relationship between the media activity variables and the response variable. Second, the modeler must develop hypotheses about the possible transformation functions that are applicable to each media channel. However, more complex machine learning models may capture non-linear relationships without applying the saturation transformation.

The last step is to build a marketing mix model by estimating the coefficients and the parameters of the adstock and saturation functions.
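To make the adstock step more concrete, here is a minimal sketch of a geometric adstock, which is one common way to model carry-over. The function name geometric_adstock, the decay value, and the spend numbers are assumptions for illustration and are not taken from the pipeline used in this article.

```python
import numpy as np

def geometric_adstock(spend, decay):
    """Add a fraction `decay` of the previous adstocked value to each period."""
    spend = np.asarray(spend, dtype=float)
    adstocked = np.zeros_like(spend)
    carryover = 0.0
    for t, value in enumerate(spend):
        # current effect = current spend + decayed effect of earlier spend
        carryover = value + decay * carryover
        adstocked[t] = carryover
    return adstocked

# Hypothetical weekly spend: a burst in week 1 keeps influencing later weeks
weekly_spend = [100.0, 0.0, 0.0, 50.0]
adstocked_spend = geometric_adstock(weekly_spend, decay=0.5)
print(adstocked_spend)
```

With a decay of 0.5, half of each week's accumulated effect carries into the next week, so weeks without spend still show a non-zero advertising effect.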

Both saturation curves and a trained model can be used in marketing mix modeling to optimize budget spend. The advantages of using saturation curves are:

  • Simplicity in visualizing the influence of spend on the outcome
  • The underlying model is no longer required, so the budget optimization procedure is simplified and requires only the parameters of the saturation transformation

One of the disadvantages is that saturation curves are based on historical data and may not always accurately predict the response to future spend.

The advantage of using the trained model for budget optimization is that the model uses the complex relationships between media activities and other variables, including trend and seasonality, and can better capture diminishing returns over time.

Data

I continue using the dataset made available by Robyn under the MIT License, as in my previous articles, for practical examples, and follow the same data preparation steps by applying Prophet to decompose trends, seasonality, and holidays.

The dataset consists of 208 weeks of revenue (from 2015-11-23 to 2019-11-11) having:

  • 5 media spend channels: tv_S, ooh_S, print_S, facebook_S, search_S
  • 2 media channels that also have exposure information (impressions, clicks): facebook_I, search_clicks_P (not used in this article)
  • Organic media without spend: newsletter
  • Control variables: events, holidays, competitor sales (competitor_sales_B)

Modeling

I built a complete working MMM pipeline that can be applied in a real-life scenario for analyzing the impact of media spend on the response variable, consisting of the following components:

A note on coefficients

In scikit-learn, Ridge regression does not offer the option to constrain only a subset of coefficients to be positive. However, a possible workaround is to reject the Optuna solution if some of the media coefficients turn out to be negative. This can be achieved by returning a very large value from the objective, indicating that the negative coefficients are unacceptable and must be excluded.
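The rejection rule can be sketched as a small helper whose return value would be handed back from the Optuna objective. This assumes the study minimizes an error metric. The helper name penalized_error, the LARGE_PENALTY constant, and the example coefficients are hypothetical and are not part of the original pipeline.

```python
import numpy as np

LARGE_PENALTY = 1e12  # assumed sentinel, far above any realistic error value

def penalized_error(coefficients, feature_names, media_channels, error):
    """Return `error`, or a large penalty if any media coefficient is negative."""
    media_coefs = [
        coef for name, coef in zip(feature_names, coefficients)
        if name in media_channels
    ]
    if any(coef < 0 for coef in media_coefs):
        return LARGE_PENALTY  # the optimizer will steer away from this trial
    return error

features = ["trend", "tv_S", "search_S"]
media = ["tv_S", "search_S"]

# a negative non-media coefficient (trend) is allowed
accepted = penalized_error(np.array([-0.3, 0.8, 0.1]), features, media, error=12.5)
# a negative media coefficient (tv_S) is rejected
rejected = penalized_error(np.array([0.4, -0.2, 0.1]), features, media, error=9.0)
print(accepted, rejected)
```

Only the media coefficients are checked, so control variables such as competitor sales remain free to take either sign.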

A note on saturation transformation

The Hill saturation function, as parameterized here, assumes that the input variable falls within a range of 0 to 1, which means that the input variable must be normalized before applying the transformation. This is important because the half saturation parameter is defined relative to an input whose maximum value is 1.

However, it is possible to apply the Hill transformation to non-normalized data by scaling the half saturation parameter to the spend range, using the following equation:

half_saturation_unscaled = half_saturation * (spend_max - spend_min) + spend_min

where half_saturation is the original half saturation parameter in the range between 0 and 1, and spend_min and spend_max represent the minimum and maximum spend values, respectively.

The complete transformation function is provided below:
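As a quick worked example of this scaling, the sketch below uses made-up spend values and two hypothetical helper functions, unscale_half_saturation and hill. It checks that the response equals 0.5 when the spend equals the unscaled half saturation point.

```python
def unscale_half_saturation(half_saturation, spend_min, spend_max):
    # map the [0, 1] parameter onto the observed spend range
    return half_saturation * (spend_max - spend_min) + spend_min

def hill(spend, slope, half_saturation_unscaled):
    # Hill saturation: 1 / (1 + (k / x)^s)
    return 1.0 / (1.0 + (half_saturation_unscaled / spend) ** slope)

# Hypothetical channel with weekly spend between 0 and 10,000
k = unscale_half_saturation(0.5, spend_min=0.0, spend_max=10_000.0)
response_at_k = hill(5_000.0, slope=2.0, half_saturation_unscaled=k)
response_at_max = hill(10_000.0, slope=2.0, half_saturation_unscaled=k)
print(k, response_at_k, response_at_max)
```

By construction, the saturation level is exactly one half at spend = k, which is why the parameter is called the half saturation point.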

import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin

class HillSaturation(BaseEstimator, TransformerMixin):
    def __init__(self, slope_s, half_saturation_k):
        self.slope_s = slope_s
        self.half_saturation_k = half_saturation_k

    def fit(self, X, y=None):
        return self

    def transform(self, X: np.ndarray, x_point=None):
        # scale the half saturation parameter from [0, 1] to the spend range of X
        self.half_saturation_k_transformed = self.half_saturation_k * (np.max(X) - np.min(X)) + np.min(X)

        if x_point is None:
            return (1 + self.half_saturation_k_transformed**self.slope_s / X**self.slope_s)**-1

        # calculate y at x_point
        return (1 + self.half_saturation_k_transformed**self.slope_s / x_point**self.slope_s)**-1

Budget Optimization using Saturation Curves

Once the model is trained, we can visualize the impact of media spend on the response variable using response curves generated through Hill saturation transformations for each media channel. The plot below illustrates the response curves for five media channels, depicting the relationship between the spend of each channel (on a weekly basis) and the response over a period of 208 weeks.

Image by Author

Optimizing the budget using saturation curves involves identifying the optimal spend for each media channel that results in the highest overall response while keeping the total budget fixed for a selected time period.

To initiate optimization, the average spend for a specific time period is usually used as a baseline. The optimizer then uses the budget per channel, which can fluctuate within predetermined minimum and maximum limits (boundaries), for constrained optimization.

The following code snippet demonstrates how budget optimization can be achieved using the minimize function from the scipy.optimize package. However, it's worth noting that alternative optimization packages, such as nlopt or nevergrad, can also be used for this purpose.

from functools import partial

import numpy as np
from scipy import optimize

# each channel may move at most 20% away from its average spend
optimization_percentage = 0.2

media_channel_average_spend = result["model_data"][media_channels].mean(axis=0).values

lower_bound = media_channel_average_spend * np.ones(len(media_channels))*(1-optimization_percentage)
upper_bound = media_channel_average_spend * np.ones(len(media_channels))*(1+optimization_percentage)

boundaries = optimize.Bounds(lb=lower_bound, ub=upper_bound)

def budget_constraint(media_spend, budget):
    # equals zero when the allocated spend matches the fixed budget
    return np.sum(media_spend) - budget

def saturation_objective_function(coefficients,
                                  hill_slopes,
                                  hill_half_saturations,
                                  media_min_max_dictionary,
                                  media_inputs):

    responses = []
    for i in range(len(coefficients)):
        coef = coefficients[i]
        hill_slope = hill_slopes[i]
        hill_half_saturation = hill_half_saturations[i]

        min_max = np.array(media_min_max_dictionary[i])
        media_input = media_inputs[i]

        hill_saturation = HillSaturation(slope_s = hill_slope, half_saturation_k=hill_half_saturation).transform(X = min_max, x_point = media_input)
        response = coef * hill_saturation
        responses.append(response)

    responses = np.array(responses)
    responses_total = np.sum(responses)
    # negate because the optimizer minimizes
    return -responses_total

# fix the model parameters so that only the media spend remains as input
partial_saturation_objective_function = partial(saturation_objective_function,
                                                media_coefficients,
                                                media_hill_slopes,
                                                media_hill_half_saturations,
                                                media_min_max)

max_iterations = 100
solver_func_tolerance = 1.0e-10

solution = optimize.minimize(
    fun=partial_saturation_objective_function,
    x0=media_channel_average_spend,
    bounds=boundaries,
    method="SLSQP",
    jac="3-point",
    options={
        "maxiter": max_iterations,
        "disp": True,
        "ftol": solver_func_tolerance,
    },
    constraints={
        "type": "eq",
        "fun": budget_constraint,
        "args": (np.sum(media_channel_average_spend), )
    })

Some details:

  • fun: the objective function to be minimized. In this case, it takes the following parameters:
    media coefficients: Ridge regression coefficients for each media channel, which are multiplied by the corresponding saturation level to estimate the response level for each media channel.
    slopes and half saturations: the two parameters of the Hill transformation.
    spend min-max values for each media channel, used to correctly estimate the response level for a given media spend.
    The objective function iterates over all media channels and calculates the total response as the sum of the individual response levels per media channel. To maximize the response, we need to convert the problem into a minimization problem. Therefore, we take the negative value of the total response and use it as the objective for the optimization function.
  • method = SLSQP: the Sequential Least Squares Programming (SLSQP) algorithm is a popular method for constrained optimization problems, and it is often used for optimizing budget allocation in marketing mix modeling.
  • x0: the initial guess, an array of real elements of size (n,), where n is the number of independent variables. In this case, x0 corresponds to the media channel average spend, i.e., an array of average spends per channel.
  • bounds: the bounds of media spend per channel.
  • constraints: constraints for SLSQP are defined as a dictionary or a list of dictionaries, where budget_constraint is a function that ensures that the sum of media spends is equal to the fixed budget: np.sum(media_channel_average_spend)
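To see these pieces working together without the full pipeline, here is a self-contained toy version of the same setup. The three channels, their coefficients, and their Hill parameters are invented for illustration, and the half saturation points are assumed to be already expressed in spend units.

```python
import numpy as np
from scipy import optimize

# Hypothetical parameters for three channels (not estimated from the Robyn data)
coefficients = np.array([3.0, 2.0, 1.0])
slopes = np.array([1.0, 1.5, 2.0])
half_saturations = np.array([50.0, 80.0, 40.0])  # already in spend units
average_spend = np.array([60.0, 60.0, 60.0])
budget = average_spend.sum()

def negative_total_response(spend):
    # Hill saturation per channel, weighted by the channel coefficient
    saturation = 1.0 / (1.0 + (half_saturations / spend) ** slopes)
    return -np.sum(coefficients * saturation)

toy_solution = optimize.minimize(
    fun=negative_total_response,
    x0=average_spend,
    bounds=optimize.Bounds(lb=0.8 * average_spend, ub=1.2 * average_spend),
    method="SLSQP",
    constraints={"type": "eq", "fun": lambda spend: np.sum(spend) - budget},
)

print("optimized spend:", toy_solution.x)
print("response before:", -negative_total_response(average_spend))
print("response after: ", -toy_solution.fun)
```

The equality constraint keeps the total at the original budget, so any gain in response comes purely from moving spend between channels within the allowed bounds.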

After the optimization process is complete, we can generate response curves for each media channel and compare the spend allocation before and after optimization to assess the impact of the optimization process.

Image by Author

Budget Optimization using the Trained Model

The process of optimizing the budget using the trained model is quite similar to the previous approach, and can be applied both to models that have the saturation transformation and to those that do not. This approach offers greater flexibility for optimizing the marketing mix, allowing for optimization across various time periods, including future ones.

The following code highlights the differences between the current and the previous approach:

The average spend per channel is multiplied by the desired optimization period:

optimization_period = result["model_data"].shape[0]
print(f"optimization period: {optimization_period}")

optimization_percentage = 0.2

media_channel_average_spend = optimization_period * result["model_data"][media_channels].mean(axis=0).values

lower_bound = media_channel_average_spend * np.ones(len(media_channels))*(1-optimization_percentage)
upper_bound = media_channel_average_spend * np.ones(len(media_channels))*(1+optimization_percentage)

boundaries = optimize.Bounds(lb=lower_bound, ub=upper_bound)

We can interpret the results of the optimization as “what is the appropriate amount of spending per channel during a specific time period?”

The objective function expects two additional parameters: optimization_period and additional_inputs, which holds all other variables, such as trend, seasonality, and control variables, that were used for model training and are available for the selected time period:

import numpy as np
import pandas as pd

def model_based_objective_function(model,
                                   optimization_period,
                                   model_features,
                                   additional_inputs,
                                   hill_slopes,
                                   hill_half_saturations,
                                   media_min_max_ranges,
                                   media_channels,
                                   media_inputs):

    # media_inputs holds the total spend per channel over the whole period
    media_channel_period_average_spend = media_inputs/optimization_period

    # transform original spend into Hill-transformed spend
    transformed_media_spends = []
    for index, media_channel in enumerate(media_channels):
        hill_slope = hill_slopes[media_channel]
        hill_half_saturation = hill_half_saturations[media_channel]

        min_max_spend = media_min_max_ranges[index]
        media_period_spend_average = media_channel_period_average_spend[index]

        transformed_spend = HillSaturation(slope_s = hill_slope, half_saturation_k=hill_half_saturation).transform(np.array(min_max_spend), x_point = media_period_spend_average)
        transformed_media_spends.append(transformed_spend)

    transformed_media_spends = np.array(transformed_media_spends)

    # replicate the average period spends across the whole optimization period
    replicated_media_spends = np.tile(transformed_media_spends, optimization_period).reshape((-1, len(transformed_media_spends)))

    # add _hill to the media channels
    media_channels_input = [media_channel + "_hill" for media_channel in media_channels]
    media_channels_df = pd.DataFrame(replicated_media_spends, columns = media_channels_input)

    # prepare data for predictions
    new_data = pd.concat([additional_inputs, media_channels_df], axis = 1)[model_features]

    predictions = model.predict(X = new_data)

    total_sum = predictions.sum()

    # negate because the optimizer minimizes
    return -total_sum

The objective function takes in media spends that are bounded by our constraints within the time period through the media_inputs parameter. We assume that these media spends are equally distributed across all weeks of the time period. Therefore, we first divide media_inputs by the time period to obtain the average spend and then replicate it using np.tile. After that, we concatenate the non-media variables with the media spends and use them to predict the response with model.predict(X=new_data) for each week within the time period. Finally, we calculate the total response as the sum of the weekly responses and return the negative value of the total response for minimization.

Optimizing budget spend in marketing mix modeling is important because it allows marketers to allocate their resources in the most effective way possible, maximizing the impact of their marketing efforts and achieving their business objectives.

I showed two practical approaches to optimizing the marketing mix, using saturation curves and trained models.

For a detailed implementation, please refer to the complete code available for download on my GitHub repo.
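The divide-and-replicate step is easy to check in isolation. The snippet below uses hypothetical period totals for two channels and applies the same np.tile and reshape pattern to raw spends, rather than to Hill-transformed values.

```python
import numpy as np

period_total_spend = np.array([400.0, 200.0])  # hypothetical totals for 2 channels
optimization_period = 4                        # number of weeks

# spread each channel's total evenly across the weeks
weekly_average = period_total_spend / optimization_period

# one row per week, one column per channel
weekly_grid = np.tile(weekly_average, optimization_period).reshape((-1, len(weekly_average)))
print(weekly_grid)
```

Each row holds the same per-channel weekly average, and the column sums recover the original period totals.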

Thanks for reading!
