The Role of Luck in Sports: Can We Measure It?

When Skill Isn’t Enough

You’re watching your team dominate possession, double the number of shots… and still lose. Is it just bad luck?

Fans blame referees. Players blame “off days.” Coaches mention “momentum.” But what if we told you that randomness—not talent or tactics—could be a serious hidden variable in sports outcomes?

This post dives deep into how luck influences sports, how we can try to quantify randomness using data, and how data science helps us separate skill from chance.

So, as always, here’s a quick summary of what we’ll go through today:

  1. Defining luck in sports
  2. Measuring luck
  3. Case study
  4. Famous randomness moments
  5. What if we could remove luck?
  6. Final Thoughts

Defining Luck in Sports

This could be controversial, as different people might define it differently and all interpretations can be equally acceptable. Here’s mine: luck in sports is about variance and uncertainty.

In other words, let’s imagine luck is all the variance in outcomes not explained by skill.

Now, for the fellow data scientists, another way of saying it: luck is the residual noise our models can’t explain or predict correctly (the modeled event might be a football match, for instance). Here are some examples:

  • A shot at an empty goal hitting the post instead of going in.
  • A tennis net cord that changes the ball direction.
  • A controversial VAR decision.
  • A coin toss win in cricket or American football.

Luck is everywhere; I’m not discovering anything new here. But can we measure it?

Measuring Luck

We could measure luck in many ways, but we’ll visit three, going from basic to advanced.

Regression Residuals

We normally focus on modeling the expected outcome of an event: how many goals a team will score, what the point difference between two NBA teams will be…

No perfect model exists, and it’s unrealistic to aim for 100% accuracy; everyone knows that. But it’s precisely that difference, what separates our model from a perfect one, that we can define as the regression residuals.

Let’s see a very simple example: we want to predict the final score of a football (soccer) match. We use metrics like xG, possession %, home advantage, player metrics… And our model predicts the home team will score 3.1 goals and the visitors’ scoreboard will show a 1.2 (obviously, we’d have to round them, because goals are integers in real matches).

Yet the outcome is 1-0 (instead of 3.1-1.2, or the rounded 3-1). This noise, the difference between the outcome and our prediction, is the luck component we’re talking about.

The goal will always be for our models to reduce this luck component (error), but we could also use it to rank teams by overperformance vs. expectation, thus seeing which teams are more affected by luck (according to our model).
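To make this concrete, here’s a minimal sketch with made-up numbers: we treat each team’s season-total xG as the model’s prediction, compute the residual (actual minus expected goals), and rank teams by it. All the data below is synthetic.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic season: each team's total xG and actual goals (noisy around xG).
teams = ["A", "B", "C", "D", "E", "F"]
xg = rng.uniform(0.8, 2.0, size=len(teams)) * 10   # season-total expected goals
goals = rng.poisson(xg)                            # actual goals, Poisson noise

# The simplest possible "model": predicted goals = xG.
# The residual is what the model can't explain, i.e., our "luck" score.
residuals = goals - xg

# Rank teams by overperformance vs. expectation.
ranking = sorted(zip(teams, residuals), key=lambda t: -t[1])
for team, res in ranking:
    print(f"Team {team}: {res:+.1f} goals vs expected")
```

Teams at the top of this ranking scored more than the model expected (good luck or hidden skill); teams at the bottom, less.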

Monte Carlo Method

Of course, MC had to appear in this post. I already have a post digging deeper into it (well, more specifically into Markov Chain Monte Carlo), but I’ll introduce it anyway.

The Monte Carlo method, or Monte Carlo simulations, consists of repeated random sampling to obtain numerical results, in the form of the probability of a range of outcomes occurring.

Basically, it’s used to estimate or approximate the possible outcomes or distribution of an uncertain event.

Sticking with our sports examples, let’s say a basketball player makes 75% of his free throws. With this percentage, we could simulate 10,000 seasons, supposing the player keeps the same skill level, and generate outcomes stochastically.

With the results, we could compare the skill-based predicted outcomes with the simulated distributions. If the player’s actual FT% record lies outside 95% of the simulation range, then that’s probably luck (good or bad, depending on which extreme it lies in).
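Here’s a minimal sketch of that simulation (the attempt count per season and the observed FT% are hypothetical numbers, not data from the post):

```python
import numpy as np

rng = np.random.default_rng(0)

p_skill = 0.75      # assumed true free-throw skill
n_attempts = 400    # hypothetical attempts per season
n_seasons = 10_000

# Simulate 10,000 seasons of free throws at a constant skill level.
made = rng.binomial(n_attempts, p_skill, size=n_seasons)
ft_pct = made / n_attempts

# 95% range of season FT% explained by variance alone.
lo, hi = np.percentile(ft_pct, [2.5, 97.5])
print(f"95% of simulated seasons fall in [{lo:.3f}, {hi:.3f}]")

observed = 0.81     # hypothetical observed season FT%
print("Outside range -> probably luck:", not (lo <= observed <= hi))
```

With 400 attempts, skill alone keeps seasons within roughly ±4 percentage points of 75%, so an observed 81% season would land outside the interval.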

Bayesian Inference

By far my favorite way to measure luck, thanks to Bayesian models’ ability to separate underlying skill from noisy performance.

Suppose you’re on a football scouting team, and you’re checking out a very young striker from the best team in the local Norwegian league. You’re particularly interested in his goal conversion, because that’s what your team needs, and you see that he scored 9 goals in the last 10 games. Is he elite? Or lucky?

With a Bayesian prior (e.g., average conversion rate = 15%), we update our belief after each match and end up with a posterior distribution showing whether his performance is sustainably above average or a fluke.

If you’d like to get into the topic of Bayesian inference, I wrote a post attempting to predict last season’s Champions League using these methods: https://towardsdatascience.com/using-bayesian-modeling-to-predict-the-champions-league-8ebb069006ba/

Case Study

Let’s get our hands dirty.

The scenario is the following: we have a round-robin season between 6 teams where each team played each other twice (home and away), each match generated expected goals (xG) for both teams, and the actual goals were sampled from a Poisson distribution around the xG:

| Home | Away | xG Home | xG Away | Goals Home | Goals Away |
|------|------|---------|---------|------------|------------|
| Team A | Team B | 1.65 | 1.36 | 2 | 0 |
| Team B | Team A | 1.87 | 1.73 | 0 | 2 |
| Team A | Team C | 1.36 | 1.16 | 1 | 1 |
| Team C | Team A | 1.00 | 1.59 | 0 | 1 |
| Team A | Team D | 1.31 | 1.38 | 2 | 1 |

Picking up where we left off in the previous section, let’s estimate the true goal-scoring ability of each team and see how much their actual performance diverges from it, which we’ll interpret as luck.

We’ll use a Bayesian Poisson model:

  • Let λₜ be the latent goal-scoring rate for each team.
  • Then our prior is λₜ ∼ Gamma(α, β)
  • And we assume Goals ∼ Poisson(λₜ), updating our beliefs about λₜ using the actual goals scored across matches.

λₜ | data ∼ Gamma(α+total goals, β+total matches)

Right, now we need to pick our values for α and β:

  • My initial belief (without any data) is that most teams score around 2 goals per match. I also know that in a Gamma distribution, the mean is α/β.
  • But I’m not very confident about it, so I want the standard deviation to be relatively high, definitely above 1 goal. Again, in a Gamma distribution, the standard deviation is √α/β.

Solving the simple equations that emerge from this reasoning, we find that α=2 and β=1 are probably good prior assumptions (mean 2, standard deviation √2 ≈ 1.41).

With that, if we run our model, we get the following results:

| Team | Games Played | Total Goals | Posterior Mean (λ) | Posterior Std | Observed Mean | Luck (Obs − Post) |
|------|--------------|-------------|--------------------|---------------|---------------|--------------------|
| Team A | 10 | 14 | 1.45 | 0.36 | 1.40 | −0.05 |
| Team D | 10 | 13 | 1.36 | 0.35 | 1.30 | −0.06 |
| Team E | 10 | 12 | 1.27 | 0.34 | 1.20 | −0.07 |
| Team F | 10 | 10 | 1.09 | 0.31 | 1.00 | −0.09 |
| Team B | 10 | 9 | 1.00 | 0.30 | 0.90 | −0.10 |
| Team C | 10 | 9 | 1.00 | 0.30 | 0.90 | −0.10 |
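Because the Gamma prior is conjugate to the Poisson likelihood, the posterior columns above follow directly from each team’s totals. A minimal sketch that reproduces them:

```python
# Gamma-Poisson conjugate update per team: posterior is
# Gamma(alpha + total goals, beta + matches played).
alpha, beta = 2.0, 1.0   # prior from the previous section

season = {               # (games played, total goals) from the results table
    "A": (10, 14), "B": (10, 9), "C": (10, 9),
    "D": (10, 13), "E": (10, 12), "F": (10, 10),
}

for team, (games, goals) in sorted(season.items()):
    a_post = alpha + goals
    b_post = beta + games
    post_mean = a_post / b_post          # Gamma posterior mean
    post_std = a_post ** 0.5 / b_post    # Gamma posterior std
    luck = goals / games - post_mean     # observed mean minus posterior mean
    print(f"Team {team}: λ≈{post_mean:.2f} ± {post_std:.2f}, luck {luck:+.2f}")
```

For Team A, for instance, this gives (2+14)/(1+10) ≈ 1.45 with std √16/11 ≈ 0.36 and luck 1.40 − 1.45 ≈ −0.05, matching the table.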

How do we interpret these results?

  • All teams slightly underperformed their posterior expectations, which is common in short seasons due to variance (the prior mean of 2 goals also pulls each posterior slightly above the observed rate).
  • Team B and Team C had the largest negative “luck” gap: their actual scoring was 0.10 goals per game lower than the Bayesian estimate.
  • Team A was closest to its predicted strength: the most “neutral luck” team.

This was a fake example using fake data, but I bet you can already sense its power.

Let’s now look at some historical randomness moments in the world of sports.

Famous Randomness Moments

Any NBA fan remembers the 2016 Finals. It’s Game 7, Cleveland are playing at the Warriors’, and they’re tied at 89 with less than a minute left. Kyrie Irving faces Stephen Curry and hits a memorable, clutch 3. The Cavaliers go on to win the Finals.

Was this skill or luck? Kyrie is a top player, and probably a very good shooter too. But given the opposition he faced, the time, and the scoreboard pressure… we simply can’t know which one it was.

Moving on to football, let’s focus on the 2019 Champions League semifinals: Liverpool vs Barcelona. This one is personally hurtful. Barça won the first leg at home 3-0, but lost 4-0 at Liverpool in the second leg, sending the Reds to the final.

Liverpool’s overperformance? Or a statistical anomaly?

One last example: NFL overtime coin-toss wins. Entire playoff outcomes can be decided by a simple 50/50 scenario where the coin (luck) has all the power to decide.

What if we could remove luck?

Can we remove luck? The answer is a clear NO.

Yet why are so many of us trying to? For professionals it’s clear: this uncertainty affects performance. The more control we have over everything, the more we can optimize our methods and strategies.

More certainty (less luck) means more money.

And we’re rightfully doing so: luck isn’t removable, but we can diminish it. That’s why we build complex xG models, or betting models with probabilistic reasoning.

But sports are meant to be unpredictable. That’s what makes them thrilling for spectators. Most wouldn’t watch a game if we already knew the result.

Final Thoughts

Today we had the chance to talk about the role of luck in sports, and that role is huge. Understanding it can help fans avoid overreacting. But it can also help scouting and team management, or inform smarter betting and fantasy-league decisions.

All in all, we must remember that the best team doesn’t always win, but data can tell us how often they should have.


What are your thoughts on this topic?
Let us know in the comments below.
