Have you ever played a co-operative game or sport? Let's consider another example, but this time in the professional world. Let's say you're part of a company whose primary means of driving product sales is through its e-commerce site. Within this organization, you likely have various marketing teams that drive customers to the website through online advertisements, email campaigns, and other channels. The website itself is maintained by another set of teams whose responsibilities include design, merchandising, recommendation strategies, and many more. The organization must also consider the product itself and the teams that create, improve, and develop new products. This poses a major question to the organization: which teams are actually driving value?
If that question was not already difficult enough to answer, we must consider a more realistic viewpoint. All of these teams within the business rely on one another to make a sale. The product teams need a way to sell the products. The organization has a website to facilitate this. For the website to make a sale, it requires customers; that's where the marketing teams come in, creating, launching, and maintaining campaigns that drive traffic to the website. Recognizing the intertwined dependencies among the organization's teams, the business must understand value in terms of team contributions. That is where Harsanyi dividends come into play.
Harsanyi Dividends
Harsanyi dividends, a concept from cooperative game theory, measure the surplus value of coalitions in a cooperative game. The key word here is cooperative. A cooperative game is a concept from game theory, the study of how participants interact in a game or activity with a shared goal.
Let's take a bird's-eye view of cooperative game theory, specifically transferable-utility (TU) cooperative games. In TU cooperative game theory, players can form coalitions to achieve a collective payoff in an agreed-upon way. Those of us who have built robust predictive models using frameworks such as XGBoost or other ensemble methods have probably found ourselves using Shapley values to understand the contribution of each feature, because the model itself is a black box. Shapley values can also be used to determine each player's payoff in a coalition in TU cooperative games. There is certainly a lot of value in a framework such as Shapley values for understanding individual contributions; however, Harsanyi dividends help us understand the additional value generated by coalitions. Let's look at a hypothetical example.
A cooperative game — Dragon Slayer
Let's say three friends get together to play a new co-op video game where the goal is to work together to inflict as much damage as possible on a dragon. The players are Andrew, Bryan, and Carson. They have played this game many times; however, not all of them play together each time. Sometimes it is only Andrew & Carson, Carson & Bryan, Carson by himself, and so on. They have played this game so much that every possible subset of the group has played it many times, including sessions with just one player.
Carson, a data scientist by trade, wants to gather a deeper understanding of the group's performance. He gathered the scores of every session and eventually had an average score for each coalition and individual. Take a look at these aggregate scores below. Each player is represented by their first initial, and the average score of each coalition/individual is represented with v(·).
- v(a) = 10
- v(b) = 12
- v(c) = 18
- v(a,b) = 27
- v(a,c) = 23
- v(b,c) = 29
- v(a,b,c) = 37
We can clearly see that Carson is the top individual player, while Bryan & Carson are the top duo. You are probably not surprised that the three-player coalition yields the highest score. Carson, the curious data scientist, wants to study these interactions among his friends in greater depth. To do so, he decides to calculate Harsanyi dividends to see which group collaborated the most effectively. Now that is a complicated question. We can easily see the scores by each coalition; however, what if we adjusted for what individual players already contributed? We can identify which coalitions enhance one another or act as a detriment to what the smaller coalitions of players already contribute. In other words, where does 1+1 equal something greater than two, and where does 1+1 equal something less than two?
To accomplish this, we'll use the formula below:

d_v(S) = Σ_{T ⊆ S} (-1)^(|S| - |T|) * v(T)
Let’s break it down piece by piece.

d_v(S): This represents the Harsanyi dividend for coalition S.

Σ_{T ⊆ S}: The Σ (sigma) symbol is used in mathematics to denote taking the sum of a series of terms in compact form. The expression just under it shows the terms we are summing over. In this case, it reads T ⊆ S: T is a subset of the coalition S, which is the coalition for which we are calculating the dividend. Together with sigma, this says we are summing over all possible subsets of S in a particular manner that we will discuss next. One final note on this part: the coalition S itself counts as one of its own subsets.

(-1)^(|S| - |T|) * v(T): The pipe symbols around S and T indicate that we are taking the sizes of the sets. The difference of these sizes is the power to which we raise negative one in each term of the sum. The result is then multiplied by the value of subset T.
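Putting the pieces together for a generic two-player coalition S = {i, j} (using i and j as placeholder players), the sum expands to:

```latex
d_v(\{i, j\}) = (-1)^{2-1} v(\{i\}) + (-1)^{2-1} v(\{j\}) + (-1)^{2-2} v(\{i, j\})
             = v(\{i, j\}) - v(\{i\}) - v(\{j\})
```

This expanded form is exactly the "pair value minus individual values" intuition we will work through below.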
Calculating the Dividends — Individuals
Let's start with the individual players (Andrew, Bryan, and Carson), as this will be the most straightforward. For convenience, here are their individual aggregated scores mentioned earlier:
- v(a) = 10
- v(b) = 12
- v(c) = 18
Can you guess what their Harsanyi dividends are? Let's start with Andrew and calculate his step by step:

d_v({a}) = (-1)^(|{a}| - |{a}|) * v({a})
For each term of the sum, we need the subset's value and its size. For individual players, that leaves us with just one subset (the player himself), so we only need to go through the loop once.
Starting with the exponent, the size of set S is just one. T will also be of size one. This leaves us raising -1 to the 0 power, which yields 1. We then multiply that by our value for T, which is 10, yielding a Harsanyi dividend of 10 for Andrew. For individuals, the dividend is simply the value.
The same logic applies to Bryan and Carson, giving dividends of 12 and 18, respectively.
Calculating the Dividends — Pairs
Let's calculate the dividend for Andrew and Bryan, d_v({a,b}). The subsets are (a,b), (a), and (b); therefore, we will have three terms.

- Term #1, subset (a), v(a) = 10, size of (a) = 1:
  (-1)^(2-1) * v(a) = -1 * 10 = -10
- Term #2, subset (b), v(b) = 12, size of (b) = 1:
  (-1)^(2-1) * v(b) = -1 * 12 = -12
- Term #3, subset (a,b), v(a,b) = 27, size of (a,b) = 2:
  (-1)^(2-2) * v(a,b) = 1 * 27 = 27

Add them all together, and we get: -10 - 12 + 27 = 5.
Let's pause here to discuss some quick intuition behind calculating Harsanyi dividends for pairs. To put it simply, the dividend is the value of the pair minus the values of the individuals in the pair. In other words, it shows whether the pair generates a surplus of value or loses value when they work together. In this example, Andrew & Bryan demonstrate that they played the game more effectively together. Take a look at the dividends for the remaining pairs:

- d_v({a,c}) = 23 - 10 - 18 = -5
- d_v({b,c}) = 29 - 12 - 18 = -1

What insights can we derive? The first thing that comes to my mind is that Carson is probably not the best teammate, at least when he is in a pair. Let's see how things change when we look at the trio.
Calculating the Dividends — Trios
Buckle up: there are many terms here. However, it is important to understand which values are added versus subtracted in the trio calculation.
- Term #1, subset (a), v(a) = 10, size of (a) = 1:
  (-1)^(3-1) * v(a) = 1 * 10 = 10
- Term #2, subset (b), v(b) = 12, size of (b) = 1:
  (-1)^(3-1) * v(b) = 1 * 12 = 12
- Term #3, subset (c), v(c) = 18, size of (c) = 1:
  (-1)^(3-1) * v(c) = 1 * 18 = 18
- Term #4, subset (a,b), v(a,b) = 27, size of (a,b) = 2:
  (-1)^(3-2) * v(a,b) = -1 * 27 = -27
- Term #5, subset (a,c), v(a,c) = 23, size of (a,c) = 2:
  (-1)^(3-2) * v(a,c) = -1 * 23 = -23
- Term #6, subset (b,c), v(b,c) = 29, size of (b,c) = 2:
  (-1)^(3-2) * v(b,c) = -1 * 29 = -29
- Term #7, subset (a,b,c), v(a,b,c) = 37, size of (a,b,c) = 3:
  (-1)^(3-3) * v(a,b,c) = 1 * 37 = 37

Adding them all together: 10 + 12 + 18 - 27 - 23 - 29 + 37 = -2.
So there is quite a lot of math, but it is straightforward. What about the intuition behind what is happening? As you just saw, calculating dividends for one- and two-player coalitions is quite easy to do without the formula; however, once you get to three-player coalitions and above, the number of steps grows exponentially. With the three-player coalition specifically, it is easy to see that the two-player coalition values get subtracted, while the one-player coalition values get added back in. What about four-player coalitions? Three-player coalitions would get subtracted, two-player coalitions added back in, singles subtracted again, and so on. You can easily extrapolate the pattern here; however, what does this pattern of subtracting and adding actually do? Let's focus on the three-player example. By subtracting the two-player coalition values, we remove the synergy those coalitions generated, along with the lower-level values of the smaller coalitions within them. This over-subtracts value, so when the single-player values are added back in, we correct for the over-subtraction and are left with the pure synergy of the three-player coalition.
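This subtract-and-add pattern is just inclusion-exclusion, and it is easy to automate. Here is a minimal sketch in Python (the function name and data layout are my own, not from the project's repo) that reproduces the numbers above, using the convention that the empty coalition contributes nothing:

```python
from itertools import combinations

def harsanyi_dividend(S, v):
    """d_v(S) = sum over non-empty T subseteq S of (-1)^(|S|-|T|) * v(T)."""
    S = frozenset(S)
    total = 0.0
    for r in range(1, len(S) + 1):                  # subset sizes 1..|S|
        for T in combinations(sorted(S), r):        # every subset of that size
            total += (-1) ** (len(S) - len(T)) * v[frozenset(T)]
    return total

# Average scores from the Dragon Slayer example
v = {
    frozenset("a"): 10, frozenset("b"): 12, frozenset("c"): 18,
    frozenset("ab"): 27, frozenset("ac"): 23, frozenset("bc"): 29,
    frozenset("abc"): 37,
}

print(harsanyi_dividend("ab", v))   # 5
print(harsanyi_dividend("abc", v))  # -2
```

Running it confirms the hand calculations: 5 for Andrew & Bryan and -2 for the trio.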
Real World Application — E-commerce Website

Going back to our original example, let's build an application that calculates Harsanyi dividends for an e-commerce website across all of the actions a customer can perform, so that we can get a sense of which elements of the website work well together. These insights can help stakeholders decide which combinations of channels, pages, and devices deserve further investment.
The Harsanyi Application
The entire project can be found on my GitHub here. I'll walk you through the three core files: synthetic_data.py, dividends.py, and app.py.
synthetic_data.py
Why include a synthetic data feature? One of my goals for this project is to be educational, and the synthetic data generation portion allows an end user to quickly explore the tool and gain a sense of the type of data it is designed to handle. Note that there is also an option for a user to upload their own data via a CSV file.
Here's a simplified view of what the data should look like:

| SEO | Product Page | Desktop | Conversion |   |
|-----|--------------|---------|------------|---|
| 1 | 0 | 1 | 0 | 1 |
| 0 | 0 | 1 | 0 | 1 |
| 0 | 0 | 0 | 0 | 0 |
| 0 | 1 | 1 | 1 | 0 |
As you can see, each feature and the target column (Conversion) are Boolean. Each observation can be interpreted as a customer or a website session. In the synthetic data section, the feature variables fall into three categories: channel, page, and device; however, if you are uploading your own data, you can use whatever features you want as long as they are Boolean.
Feature Variable Propensities
Generating "good" synthetic data means making it as realistic as possible. In this project, that means we must include realistic propensities for each feature variable.
In the file, I added a list of feature propensity ranges. These can be easily configured and are used to model the propensity to convert. Some of them intentionally do not have a range, but for those that do, the range is passed through a custom randomization function that outputs a value within it.
FEATURE_PROPENSITY_RANGES
FEATURE_PROPENSITY_RANGES: Dict[str, Tuple[float, float]] = {
    # Channels
    "email": (2.0, 2.0),
    "seo": (6.0, 6.0),
    "sem": (6.0, 6.0),
    "direct": (5.0, 5.0),
    "display": (1.0, 1.0),
    "social": (1.0, 1.0),
    "affiliate": (7.0, 7.0),
    # Pages (A-F, with ranges where specified)
    "product_page_a": (5.0, 7.0),
    "product_page_b": (4.0, 8.0),
    "product_page_c": (5.0, 7.0),
    "product_page_d": (4.0, 8.0),
    "product_page_e": (5.0, 7.0),
    "product_page_f": (4.0, 8.0),
    "deals_page": (6.0, 6.0),
    "search_page": (5.0, 5.0),
    "homepage": (4.0, 4.0),
    "account_page": (7.0, 7.0),
    "support_page": (3.0, 3.0),
    # Device
    "device_desktop": (6.0, 6.0),
    "device_mobile": (3.0, 3.0),
}
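The custom randomization function mentioned above is not shown in the article, but a minimal version (the helper name here is hypothetical, not necessarily the one in the repo) might look like this:

```python
import numpy as np

def _sample_propensity(rng: np.random.Generator, lo: float, hi: float) -> float:
    """Return the fixed score when the range is degenerate (lo == hi),
    otherwise a uniform draw within the configured range."""
    if lo == hi:
        return lo
    return float(rng.uniform(lo, hi))

rng = np.random.default_rng(42)
fixed = _sample_propensity(rng, 6.0, 6.0)    # e.g. "seo": always 6.0
varied = _sample_propensity(rng, 4.0, 8.0)   # e.g. "product_page_b": somewhere in [4, 8]
```

Re-running the generator produces a fresh draw for every ranged feature, which is what keeps each synthetic dataset distinct.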
_coef_range_for_score
The ranges themselves can't be plugged directly into a model to generate a sample that yields an average conversion rate of around 5%. We accomplish this via a logistic regression, which requires realistic coefficients in the linear function. To convert these ranges into meaningful coefficients, I created the following function:
def _coef_range_for_score(score: float) -> Tuple[float, float]:
    if score <= 2.0:
        return (-1.0, -0.3)  # negative effect
    elif score <= 4.0:
        return (-0.3, 0.3)   # near neutral
    elif score <= 6.0:
        return (0.3, 1.0)    # moderate positive
    elif score <= 8.0:
        return (1.0, 2.5)    # strong positive
    else:
        return (2.5, 4.0)    # very strong positive
_sample_marginal_probabilities
While propensity is crucial, one must also consider how often we expect a user to interact with each channel, page, or device. Therefore, we need a function that determines how often each element is interacted with by a customer. Note that only the channel section is shown below; the remaining sections are handled similarly. Keep in mind that all of the functions you have seen so far ensure the synthetic data is different every time it is generated.
def _sample_marginal_probabilities(
    rng: np.random.Generator,
) -> Tuple[Dict[str, float], float]:
    probs: Dict[str, float] = {}
    # Channels - fairly sparse, some more common (SEO, direct)
    probs["email"] = rng.uniform(0.03, 0.15)
    probs["seo"] = rng.uniform(0.10, 0.60)
    probs["sem"] = rng.uniform(0.05, 0.40)
    probs["direct"] = rng.uniform(0.10, 0.50)
    probs["display"] = rng.uniform(0.01, 0.10)
    probs["social"] = rng.uniform(0.03, 0.20)
    probs["affiliate"] = rng.uniform(0.02, 0.15)
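With marginal probabilities in hand, each feature column can be drawn as Bernoulli trials. A simplified sketch of that step (my own function name; treating features as independent is a simplification of whatever the repo actually does):

```python
import numpy as np
import pandas as pd

def sample_features(rng: np.random.Generator, probs: dict, n: int) -> pd.DataFrame:
    """Draw n observations; each feature is 1 with its marginal probability."""
    return pd.DataFrame(
        {feature: (rng.random(n) < p).astype(int) for feature, p in probs.items()}
    )

rng = np.random.default_rng(7)
df = sample_features(rng, {"seo": 0.3, "email": 0.1}, n=10_000)
print(df["seo"].mean())  # roughly 0.3
```

The observed column means land near the sampled marginal probabilities, so each generated dataset reflects the randomized "how often" profile above.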
_build_logistic_spec
The next function builds the logistic regression specification. Here are the first few lines.
def _build_logistic_spec(
    rng: np.random.Generator,
) -> LogisticSpec:
    scores = _sample_feature_scores(rng)
    # Main effects
    main_effects = {}
    for feature in ALL_BINARY_FEATURES:
        score = scores[feature]
        lo, hi = _coef_range_for_score(score)
        main_effects[feature] = rng.uniform(lo, hi)
To introduce interactions among the variables, we'll need to add interaction terms to the model. To accomplish this, we'll add helper functions within our spec builder that register interaction terms for combinations of two and three features. These can be configured in the function itself, as you can see in the second code block.
interactions_2 = {}
interactions_3 = {}

strong_2 = (1.0, 3.0)
moderate_2 = (0.5, 1.5)
weak_2 = (-0.3, 0.3)
strong_3 = (1.5, 3.5)
moderate_3 = (0.7, 2.0)

def add_interaction_2(a, b, coef_range):
    key = tuple(sorted((a, b)))
    interactions_2[key] = rng.uniform(*coef_range)

def add_interaction_3(a, b, c, coef_range):
    key = tuple(sorted((a, b, c)))
    interactions_3[key] = rng.uniform(*coef_range)

add_interaction_3("sem", "product_page_a", "deals_page", strong_3)
add_interaction_3("seo", "product_page_c", "search_page", moderate_3)
Finally, we add the intercept. Note that while this starts our baseline model at around a 5% conversion rate, we'll need to fine-tune it later to keep it near 5%.
intercept = float(np.log(0.05 / (1.0 - 0.05)))
return LogisticSpec(intercept, main_effects, interactions_2, interactions_3)
_compute_linear_predictor
Now, the previous function doesn't actually build the model; it sets the stage by creating a dictionary of features, feature interactions, and their associated coefficients. The function below iterates over them and returns the linear predictor once the values for a given observation are plugged in.
def _compute_linear_predictor(
    df: pd.DataFrame,
    spec: LogisticSpec,
) -> np.ndarray:
    z = np.full(shape=len(df), fill_value=spec.intercept, dtype=float)
    # Main effects
    for f, beta in spec.main_effects.items():
        if f in df.columns:
            z += beta * df[f].values
    # 2-way interactions
    for (a, b), beta in spec.interactions_2.items():
        if a in df.columns and b in df.columns:
            z += beta * (df[a].values * df[b].values)
    # 3-way interactions
    for (a, b, c), beta in spec.interactions_3.items():
        if a in df.columns and b in df.columns and c in df.columns:
            z += beta * (df[a].values * df[b].values * df[c].values)
    return z
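The final step of the generator is to squash the linear predictor through the sigmoid and sample the Conversion labels. Here is a sketch under my own naming (the repo's exact wiring may differ):

```python
import numpy as np

def sample_conversions(z: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Convert log-odds into probabilities via the sigmoid
    (equivalent to scipy.special.expit), then draw Bernoulli labels."""
    p = 1.0 / (1.0 + np.exp(-z))
    return (rng.random(len(z)) < p).astype(int)

rng = np.random.default_rng(0)
z = np.full(100_000, np.log(0.05 / 0.95))  # intercept-only model at ~5% rate
conversions = sample_conversions(z, rng)
print(conversions.mean())  # close to 0.05
```

With a large enough sample, the realized conversion rate hovers near the rate implied by the intercept, which is exactly what the calibration step below exploits.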
_calibrate_intercept_to_global_rate
Conversion rates can vary significantly; however, I believe it is safe to assume that most websites receive conversions from only a small fraction of their customers. In this tool, we'll calibrate the data to a conversion rate of around 5%. There are a few ways we can do this; however, I find the most efficient approach is to adjust the intercept term until the mean predicted rate lands near the 5% goal. The function below does just that. The final function that follows this one combines everything preceding it and is what is actually called in the application.
def _calibrate_intercept_to_global_rate(
    df: pd.DataFrame,
    spec: LogisticSpec,
    target_rate: float = 0.05,
    max_iter: int = 8,
) -> LogisticSpec:
    for _ in range(max_iter):
        z = _compute_linear_predictor(df, spec)
        p = expit(z)
        mean_p = float(p.mean())
        if mean_p <= 0 or mean_p >= 1:
            break  # something degenerate; give up
        current_odds = mean_p / (1.0 - mean_p)
        target_odds = target_rate / (1.0 - target_rate)
        delta = np.log(target_odds / current_odds)
        spec.intercept += float(delta)
        # Early stop if close enough
        if abs(mean_p - target_rate) < 0.002:
            break
    return spec
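The update rule inside the loop can be motivated with a little algebra. Shifting the intercept shifts every observation's log-odds by the same delta, so choosing

```latex
\Delta = \log\left( \frac{p^{*} / (1 - p^{*})}{\bar{p} / (1 - \bar{p})} \right)
```

(where p* is the target rate and p-bar is the current mean predicted rate) maps the average odds exactly onto the target odds for an intercept-only model. With main effects and interaction terms present, the mean probability is a nonlinear function of the intercept, so a single shift only lands near the target; that is why the function iterates up to max_iter times with a 0.2-percentage-point early-stopping tolerance.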
dividends.py
As you probably guessed, this file is the engine that computes the Harsanyi dividends. We already went through a solid exercise reviewing how they are calculated; therefore, I think it is much more productive to discuss how the dividends are calculated in the context of this tool.
Clickstream data, in itself, can be very sparse, as a typical customer journey may involve only several individual actions. This poses a challenge when calculating coalition values. Say we have a dataset of 100k customers with all of the actions they took, and we want to calculate the coalition value for customers who interacted with the homepage and a product page. We may find only a handful of customers who performed those two actions alone; therefore, for each coalition, we check whether a customer performed those actions regardless of what else they did. From there, we take the average to obtain the coalition's value. One important note: there is no formal definition of how a value score should be calculated in the context of Harsanyi dividends; therefore, one needs to use one's best judgment. In this example, taking the average works well because we are using binary data, so the average yields a proportion or percentage. If we were using revenue instead, taking the average could be significantly misleading due to potential outliers.
Lastly, I should mention that this file uses parallel programming along with several dynamic configurations. Parallel programming can significantly reduce the time required to compute Harsanyi dividends when working with large datasets. There is also an option to designate the maximum size of the coalitions for which you want to calculate dividends. The goal of this tool is to give stakeholders something actionable to work with; delivering coalitions of customer journeys that include several interactions can result in many fragmented opportunities that stretch available resources, rather than a few small, high-value coalitions to focus on. The last configuration I'll mention is the minimum proportion of the data a coalition must cover to be included in the calculations. This ensures that any opportunities the tool uncovers have a decent sample size.
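The exact parallelism module used in the repo is not shown here, but the overall flow, including the coalition-value rule, the maximum-coalition-size cap, and the minimum-support filter, can be sketched with the standard library (function names and the thread-based executor are my own choices; a process pool would suit larger datasets):

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial
from itertools import combinations

import pandas as pd

def coalition_value(df: pd.DataFrame, coalition, target: str = "conversion") -> float:
    """v(S): mean conversion rate among sessions that performed every action in S,
    regardless of what else they did."""
    mask = df[list(coalition)].all(axis=1)
    return float(df.loc[mask, target].mean()) if mask.any() else float("nan")

def all_coalition_values(df, features, target="conversion",
                         max_size=2, min_support=0.01, workers=4):
    # Enumerate coalitions up to max_size, dropping those with thin support
    coalitions = [c for k in range(1, max_size + 1)
                  for c in combinations(features, k)
                  if df[list(c)].all(axis=1).mean() >= min_support]
    # Compute each coalition's value in parallel
    with ThreadPoolExecutor(max_workers=workers) as pool:
        values = list(pool.map(partial(coalition_value, df, target=target), coalitions))
    return dict(zip(coalitions, values))

# Hypothetical toy data: four sessions, two tracked actions
df = pd.DataFrame({
    "homepage":     [1, 1, 0, 1],
    "product_page": [1, 0, 1, 1],
    "conversion":   [1, 0, 0, 1],
})
values = all_coalition_values(df, ["homepage", "product_page"])
print(values[("homepage", "product_page")])  # 1.0
```

Feeding these coalition values into the dividend formula from earlier then yields the Harsanyi dividends the app reports.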
Demo using Synthetic Data
Now, let's do a quick demonstration with the tool. We will go from start to finish using the synthetic dataset option and end with a few insights.
Step 1: Generate a Synthetic Dataset

Step 2: Configure the maximum coalition size and the minimum % of data required for a coalition to be counted, then calculate the Harsanyi Dividends.

Step 3: Analyze the Results

The resulting dataframe is sorted by the Harsanyi Dividend column; therefore, the first few rows will likely be the single-player coalitions. Given the context in which one would use Harsanyi dividends, individual players aren't where the insight lies, but they are practical reference points. The real impact comes from analyzing multi-player coalitions. Let's take a look at a few via the export of the table above.

These are the multi-player coalitions with the largest Harsanyi dividends; in other words, the players who generate the most synergy together. So, what do we do with this information?
The top multi-player coalition is "deals page" & "SEM"; more practically speaking, customers who went to the deals page from an SEM campaign. One suggestion you could make as a professional is that more investment could be beneficial for these types of campaigns.
What about the next few coalitions? There appear to be various combinations of product pages. You could recommend upsell or cross-sell experiences for these products, as conversion rates increase measurably when customers interact with these pages during the same journey.
Conclusion
I could go on and on about the limitless opportunities a Harsanyi Dividend-derived analysis could deliver, especially in a high-volume marketing or online-store environment where countless variables are always at work. To conclude, I want to leave you with a few suggestions when it comes to driving ideas and opportunities via Harsanyi dividends:
- Find a balance between coalition value and volume: You will undoubtedly encounter situations where you discover valuable coalitions, but focusing on them would affect only a fraction of the business or customers. It's important to find a healthy balance from this perspective.
- Stick to reasonably sized coalitions: Pitching opportunities or ideas built on large coalitions could prove costly from several angles. In my e-commerce example, there may be instances where a valuable coalition spans multiple pages and perhaps numerous marketing channels. If I tell stakeholders to focus on those combinations, it could require complex investments across various teams and technologies. That said, if it is a large coalition of several similar pages, then any investment increase could be streamlined. Ultimately, what counts as a reasonably sized coalition will depend on the business case. As with any data science project, domain knowledge is key here.
- Translate dividends into measurable impact: Any opportunity or idea pitched to a stakeholder will likely require a financial impact estimate. Therefore, one must be able to translate a Harsanyi dividend into an investment return. It might be as simple as reverting to the coalition value metric and adding a multiplier if you recommend a project that would grow the coalition size, for example, more campaigns from a specific channel to a particular page, as I discussed earlier. There will likely be countless ways to accomplish this kind of mathematical translation.
I hope you enjoyed this article! I find this area of cooperative game theory a lot of fun! If you would like to learn more, be sure to check out John Harsanyi's original paper on the topic, published in 1963.
