reading the thought-provoking book by Daniel Kahneman (Nobel Prize winner in Economics and best-selling author of Thinking, Fast and Slow) and Professors Olivier Sibony and Cass Sunstein. Noise highlights the looming, but often well-hidden, presence of persistent noise in human affairs — defined as the variability in decision-making outcomes for the same tasks across experts in a specific field. The book supplies many compelling anecdotes about the real effects of noise in fields such as Insurance, Medicine, Forensic Science and Law.
Noise is distinguished from bias, which is the magnitude and direction of the error in decision making across that same set of experts. The key difference is best explained in the following diagram:
The diagram illustrates the distinction between bias and noise in human judgment. Each target represents repeated judgments of the same problem, with the bullseye symbolising the correct answer. Bias occurs when judgments are systematically shifted away from the truth, as in Teams A and B, where the shots are consistently off-center. Noise, in contrast, reflects variability: the judgments scatter unpredictably, as seen in Teams A, C and D. In this example, Team A shows a large degree of both noise and bias.
We can summarise this as follows:
- Team A: The shots are all off-center (bias) and not tightly clustered (noise). This shows both bias and noise.
- Team B: Shots are tightly clustered but systematically away from the bullseye. This shows bias.
- Team C: Shots are spread out and inconsistent, with no clear cluster. This is noise, with less systematic bias.
- Team D: Also spread out, showing noise.
Artificial Intelligence (AI) practitioners may have an "aha" moment just now, because the bias and noise described above are reminiscent of the bias-variance trade-off in AI, where we seek models that explain the data well without fitting to the noise. Noise here is synonymous with variance.
The two major components of human judgement error can be broken down through what is known as the error equation, with mean squared error (MSE) used to aggregate the errors across individual decisions:
Overall Error (MSE) = Bias² + Noise²
Bias is the average error, while noise is the standard deviation of judgments. Overall error can be reduced by addressing either, since both contribute equally. Bias is usually the more visible component — it is often obvious when a set of decisions systematically leans in one direction. Noise, in contrast, is harder to detect because it hides in variability. Consider the target presented earlier: bias is when all the arrows cluster off-center, while noise is when arrows are scattered all over the board. Both reduce accuracy, but in different ways. The practical takeaway from the error equation is clear: we should aim to reduce both bias and noise, rather than fixating on the more visible bias alone. Reducing noise also has the benefit of making any underlying bias far easier to spot.
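To see the error equation in action, here is a minimal Python sketch (the judgments are made up for illustration): with errors measured against a known true value, bias is the mean error, noise is the population standard deviation of the judgments, and their squares add up exactly to the MSE.

```python
import numpy as np

# Hypothetical example: ten experts estimate a quantity whose true value is 100.
judgments = np.array([112, 108, 115, 104, 110, 118, 106, 111, 109, 113])
truth = 100

errors = judgments - truth
bias = errors.mean()          # average error: the systematic offset
noise = errors.std()          # standard deviation of judgments: the scatter
mse = (errors ** 2).mean()    # overall error

print(f"bias = {bias:.2f}, noise = {noise:.2f}, MSE = {mse:.2f}")
print(f"bias² + noise² = {bias**2 + noise**2:.2f}")  # matches MSE exactly
```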
To solidify our understanding of bias and noise, another useful visualisation from the book is shown below. These diagrams plot error distributions: the x-axis shows the magnitude of the error (difference between judgment and truth), and the y-axis shows its probability. In the left plot, noise is reduced while bias remains: the distribution narrows, but its mean stays offset from zero. In the right plot, bias is reduced: the entire distribution shifts toward zero, while its width (the noise) remains unchanged.

Noise and bias help explain why organisations often reach decisions that are both inaccurate and inconsistent, with outcomes swayed by factors such as mood, timing, or context. Court rulings are a good example: two judges — or even the same judge on different days — may decide similar cases differently. External factors as trivial as the weather or a local sports result can also shape a judgment. To counter this, startups like Bench IQ are using AI to reveal noise and bias in judicial decision-making. Their pitch highlights a tool that maps judges' patterns to give lawyers a clearer view of how a ruling might unfold. This tool aims to tackle a core concern of Noise: when randomness distorts high-stakes decisions, tools that measure and predict judgment patterns could help restore consistency.
Another compelling example presented in the book comes from the insurance industry. In Noise, the authors show how judgments by underwriters and adjusters varied dramatically. A noise audit revealed that quotes often depended on who was assigned — essentially a lottery. On average, the difference between two underwriters' estimates was 55% of their mean, five times higher than what a group of surveyed CEOs expected. For the same case, one underwriter might set a premium at $9,500 while another set it at $16,700 — an incredibly wide margin. Noise is clearly at play here, and this is only one example among many.
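As a quick sanity check on that figure using the quoted premiums: the two estimates differ by $16,700 − $9,500 = $7,200, their mean is $13,100, and 7,200 / 13,100 ≈ 55%, exactly the kind of relative spread the audit reported.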
Ask yourself this question: when relying on professional judgement in your own organisation, how confident are you that two experts given the same case would reach the same conclusion?
By now it should be apparent that noise is a very real phenomenon, costing organisations hundreds of millions in errors, inefficiencies, and lost opportunities through ineffective decision making.
Why Group Decisions Are Even Noisier: Information Cascades and Group Polarisation
The wisdom of crowds suggests that group decisions can approximate the truth — when people make judgments independently, their errors cancel out. The concept goes back to Francis Galton in 1906. At a livestock fair, he asked 800 people to guess the weight of an ox. Individually, their estimates varied widely. But when averaged, the crowd's judgment was almost perfect — only one pound off. This illustrates the promise of aggregation: independent errors cancel out, and the group judgment converges on the truth.
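A tiny simulation sketch (illustrative only; these are not Galton's actual data) shows why aggregation helps: individually noisy but independent guesses average out, and the crowd estimate homes in on the true value as more guesses are pooled.

```python
import numpy as np

rng = np.random.default_rng(0)
true_weight = 1200                      # hypothetical ox weight in pounds
# 800 independent, unbiased but individually very noisy guesses.
guesses = rng.normal(loc=true_weight, scale=80, size=800)

for n in (1, 10, 100, 800):
    estimate = guesses[:n].mean()
    print(f"crowd of {n:3d}: estimate = {estimate:7.1f} lb, "
          f"error = {abs(estimate - true_weight):5.1f} lb")
```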
But in reality, psychological and social factors often derail this process. In groups, outcomes are swayed by who speaks first, who sits next to whom, or who gestures at the right moment. The same group, faced with the same problem, can reach very different conclusions on different days.
In Noise, the authors highlight a study on music popularity as an example of how group decisions can be distorted by social influence. When people saw that a particular song had already been downloaded many times, they were more likely to download it themselves, creating a self-reinforcing cycle of popularity. Strikingly, the same song could end up with very different levels of success across different groups, depending largely on whether it happened to attract early momentum. The study shows how social influence can shape collective judgment, often amplifying noise in unpredictable ways.
Two key mechanisms help explain the dynamics of group-based decision making:
- Information Cascades — Like dominoes falling after the first push, small early signals can tip an entire group. People copy what has already been said instead of voicing their own true judgment. Social pressure compounds the effect — few want to appear silly or contrarian.
- Group Polarization — Deliberation often drives groups toward more extreme positions. Instead of balancing out, discussion amplifies tendencies. Kahneman and colleagues illustrate this with juries: statistical juries, where members judge independently, show much less noise than deliberating juries, where discussion pushes the group toward either greater leniency or greater severity compared with the median member's initial view.
Paradoxically, talking together can make groups less accurate and noisier than if individuals had judged alone. There is a salient lesson here for management: group discussions should ideally be orchestrated in a way that is noise-sensitive, using strategies that aim to reduce both bias and noise.
Mapping the Landscape of Noisy Decisions
The key lesson from Noise is that all human decision-making, both individual and group-based, is noisy. This may or may not come as a surprise, depending on how often you have personally been affected by the variance in professional judgments. But the evidence is overwhelming: medicine is noisy, child-custody rulings are noisy, forecasts are noisy, asylum decisions are noisy, personnel judgments are noisy, bail hearings are noisy. Even forensic science and patent reviews are noisy. Noise is everywhere, yet it is rarely noticed — and even more rarely counteracted.
To help get a grasp on noise, it can be useful to categorise it. Let's begin with a taxonomy of decisions. Two important distinctions help us organise noisy decisions — recurrent vs singular and evaluative vs predictive. Together, these form a simple mental framework for guidance:
- Recurrent vs Singular decisions: Recurrent decisions involve repeated judgments of similar cases — underwriting insurance policies, hiring employees, or diagnosing patients. Here, noise is easier to spot because patterns of inconsistency emerge across decision-makers. Singular decisions, in contrast, are essentially recurrent decisions made just once: granting a patent, approving bail, or deciding an asylum case. Each decision stands alone, so the noise is present but largely invisible — we cannot easily compare what another decision-maker would have done in the same case.
- Evaluative vs Predictive decisions: Evaluative decisions are judgments of quality or merit — such as rating a job candidate, evaluating a scientific paper, or assessing performance. Predictive decisions, on the other hand, forecast outcomes — estimating whether a defendant will reoffend, how a patient will respond to treatment, or whether a startup will succeed. Both types are subject to noise, but the mechanisms differ: evaluative noise often reflects inconsistent standards or criteria, while predictive noise stems from variability in how people imagine and weigh the future.
Together, these categories provide a framework for understanding the noise within human judgment. Noise influences both how we evaluate and how we predict. Recognising these distinctions is the first step toward designing systems that reduce variability and improve decision quality. Later, I'll present some concrete measures that can be taken to reduce noise in both kinds of judgements.
Not All Noise Is the Same: A Guide to Its Varieties
A noise audit, which is typically possible only for recurrent decisions, can reveal just how inconsistent human judgment can be. Management can conduct a noise audit by having multiple individuals evaluate the same case, which makes the variability in the responses visible and measurable. The results can be very revealing — a good example is the underwriting case summarised earlier.
To strike at the heart of the beast, the authors of Noise distinguish between several types of noise. At the broadest level is system noise — the overall variability in judgments across a group of professionals assessing the same case. System noise can be further divided into the following three sub-components:
- Level Noise — Differences in the overall average judgments across individuals — some judges are stricter, some underwriters more generous.
- Pattern Noise — These are the personal, idiosyncratic tendencies that skew an individual's decisions — always a bit lenient, always a bit pessimistic, always harsher on certain types of cases. Pattern noise can be broken down into stable pattern noise, which reflects enduring personal tendencies that persist across time and situations, and transient pattern noise, which arises from temporary states such as mood, fatigue, or context that may shift from decision to decision.
- Occasion Noise — Variation in the same person's judgments at different times, influenced by mood, fatigue, or context. Occasion noise is often a smaller component of total system noise. In other words, and thankfully, we are usually more consistent with ourselves across time than we are interchangeable with another person in the same role.
The relative impact of each type of noise varies across tasks, domains and individuals, with level noise often contributing the most to system noise, followed by pattern noise and then occasion noise. These forms of noise highlight the complexity of untangling how variability affects decision-making, and their differing effects explain why organisations so often reach inconsistent outcomes even when applying the same rules to the same information.
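To make the decomposition concrete, here is a rough Python sketch of my own (not a procedure from the book) that splits the variability in a hypothetical judges-by-cases audit into level noise and pattern noise; with only one judgment per judge per case, occasion noise cannot be separated out and is folded into the pattern term.

```python
import numpy as np

# Hypothetical noise audit: rows = judges, columns = cases, values = severity scores.
scores = np.array([
    [6.0, 7.5, 5.0, 8.0],
    [4.5, 6.0, 4.0, 6.5],
    [7.0, 9.0, 6.5, 9.5],
])

judge_means = scores.mean(axis=1)   # each judge's average level of severity
case_means = scores.mean(axis=0)    # each case's average score (case difficulty)
grand_mean = scores.mean()

# Level noise: variance of the judges' overall average levels.
level_var = judge_means.var()

# Pattern noise: judge-by-case interaction left after removing judge level
# and case difficulty (here it also absorbs any occasion noise).
residual = scores - judge_means[:, None] - case_means[None, :] + grand_mean
pattern_var = residual.var()

# System noise: average variability across judges for the same case.
system_var = scores.var(axis=0).mean()

print(f"level variance   = {level_var:.3f}")
print(f"pattern variance = {pattern_var:.3f}")
print(f"system variance  = {system_var:.3f} (= level + pattern: {level_var + pattern_var:.3f})")
```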
By recognising both the types of decisions and the sources of noise that shape them, we can design more deliberate strategies to reduce variability and enhance the quality of our judgments.
Strategies for Minimising Noise in our Judgements
Noise in decision-making can never be fully eliminated, but it can be reduced through well-designed processes and habits — what Kahneman and colleagues call decision hygiene. Like hand-washing, it prevents problems we cannot see or trace directly, yet still lowers risk.
Key strategies include:
- Conduct a noise audit: Acknowledge that noise is possible and assess the magnitude of variation in judgments by asking multiple decision-makers to evaluate the same cases. This makes noise visible and quantifiable. For instance, in the table below, three raters scored the same case 4/10, 7/10, and 8/10, producing a mean rating of 6.3/10 and a spread of 4 points. The calculated standard deviation highlights how much individual judgments deviate from the group, making inconsistency explicit.

- Use a decision observer: Having a neutral participant in the room helps guide the conversation, surface biases, and keep the group aligned with decision principles. A decision observer is most useful for reducing bias in decision making — which is more visible and easier to detect than noise.
- Assemble a diverse, expert team: Diversity of experience reduces correlated errors and provides complementary perspectives, limiting the risk of systematic blind spots.
- Sequence information carefully: Present only relevant information, in the right order. Exposing irrelevant details early can anchor judgments in unhelpful ways. For example, fingerprint analysts could be swayed by extraneous details of the case, or by the judgement of a colleague.
- Adopt checklists: Simple checklists, as championed in The Checklist Manifesto, can be highly effective in high-stakes, high-stress situations by ensuring that critical factors are not ignored. For example, in medicine the Apgar score began as a guideline for systematically assessing newborn health but was translated into a checklist: clinicians tick through predefined dimensions — heart rate, breathing, reflexes, muscle tone, and skin colour — within a minute of birth. In this way a complex decision is decomposed into sub-judgments, reducing cognitive load and improving consistency.
- Use a shared scale: Decisions should be anchored to a common, external frame of reference rather than each judge relying on personal criteria. This approach has been shown to reduce noise in contexts such as hiring and workplace performance evaluations. By structuring each performance dimension separately, comparing multiple team members concurrently, applying a standardised rating scale, and using forced anchors for reference (e.g., case studies showing what good and great mean), evaluators are much less likely to introduce idiosyncratic biases and variability.
- Harness the wisdom of crowds: Independent judgments, when aggregated, are often more accurate than collective deliberation. Francis Galton's famous "village fair" study showed that the median of many independent estimates can outperform even experts.
- Create an “inner crowd”: Individuals can reduce their own noise by simulating multiple perspectives — making the same judgment again after time has passed, or deliberately arguing against their initial conclusion. This effectively samples responses from an internal probability distribution, reminiscent of how large language models (LLMs) generate alternative completions. A great source of examples of this technique in action can be found in Ben Horowitz's excellent book The Hard Thing About Hard Things. You can see Horowitz forming an inner crowd to test every angle when facing high-stakes decisions — for example, weighing whether to replace a struggling executive, or deciding if the company should pivot its strategy in the midst of a crisis. Rather than relying on a single instinct, he systematically challenges his own assumptions, replaying the decision from multiple standpoints until the most resilient path forward becomes clear.
- Anchor to an external baseline: When making predictive judgments, begin by identifying a suitable external baseline average. Then assess how strongly the information at hand correlates with the outcome. If the correlation is high, adjust the baseline accordingly; if it is weak or nonexistent, stick with the average as your best estimate. For example, imagine you are trying to predict a student's GPA. The natural baseline is the statistical average GPA of 3.2. If the student has consistently excelled across similar courses, that record is strongly correlated with future performance, and you can reasonably adjust your forecast upward toward your intuitive guess of, say, 3.8. But if your primary piece of information is something weakly predictive — like the student participating in a debate club — you should resist making adjustments and stay close to the baseline. This approach not only reduces noise but also guards against the common bias of ignoring regression to the mean: the statistical tendency for extreme performances (good or bad) to move closer to the average over time. Starting with the baseline and only shifting when strong evidence justifies it is the essence of noise reduction in predictive judgments, as the diagram below illustrates (see also the short sketch after the diagram).

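A minimal sketch of the anchoring logic just described, using the GPA numbers from the text; the correlation values are illustrative assumptions, not figures from the book.

```python
def anchored_prediction(baseline: float, intuitive: float, correlation: float) -> float:
    """Move from the baseline toward the intuitive estimate only in proportion
    to how strongly the available evidence correlates with the outcome."""
    return baseline + correlation * (intuitive - baseline)

# Strong evidence (consistent course record): forecast moves most of the way to 3.8.
print(anchored_prediction(baseline=3.2, intuitive=3.8, correlation=0.8))  # ≈ 3.68
# Weak evidence (debate club): forecast stays close to the 3.2 baseline.
print(anchored_prediction(baseline=3.2, intuitive=3.8, correlation=0.1))  # ≈ 3.26
```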
Last, but by no means least, we can also turn to AI as a helper in our decision making: from simple rules to machine learning models, algorithms can radically reduce noise in judgments. Used with a human in the loop for oversight and verification, they provide a consistent baseline while leaving space for human discretion where it is most valuable.
Finding the Broken Legs: Leveraging AI in Judgment
One of the most important questions in decision-making is when to trust algorithms and when to let human judgment take the lead. A useful starting point is the broken leg principle: when you have decisive information the model cannot know about (the proverbial broken leg), you should override its prediction.
For example, if a model predicts that somebody will run their usual morning 5k because they never miss a day, but they are down with the flu, you don't need the algorithm's forecast — you already know the jog isn't happening.
To understand what a broken leg is, imagine a commuter who regularly bikes to work every day; on the one morning there is a severe snowstorm, the odds of biking collapse — an anomaly that the data, and an appropriately tuned AI, can still catch.
The book highlights how Sendhil Mullainathan and colleagues explored this idea in the context of bail decisions. They trained an AI system on over 758,000 bail cases. Judges had access to the same information — rap sheets, prior failures to appear, and other case details — but the AI was also given the outcomes: whether defendants were released, failed to appear in court, or were rearrested. The AI produced a simple numerical score estimating risk. Crucially, no matter where the threshold was set, the model outperformed human judges: the AI was significantly more accurate at predicting failures to appear and rearrests.
The advantage comes from AI's ability to detect complex combinations of variables. While a human judge might focus on obvious cues, the model can weigh thousands of subtle correlations simultaneously. This is especially powerful in identifying the highest-risk individuals, where rare but telling patterns predict dangerous outcomes. In other words, the AI excels at picking up rare but decisive signals — the "broken legs" — that humans either overlook or can't consistently evaluate.
“The algorithm makes mistakes, of course. But if human judges make even more mistakes, whom should we trust?” Source: Noise (HarperCollins, 2021).
AI models, if designed and applied carefully, can reduce discrimination and improve accuracy. As we have seen, AI can enhance human decision making by uncovering hidden structure in messy, complex data. The challenge therefore becomes how to balance the two and establish an effective division of labour: when to trust the statistical patterns, and when to step in with human judgment for the broken legs the model can't yet see.

When large-scale data isn’t available to train advanced AI models, all is not lost. We can go simpler: either by using equally weighted predictors — where each factor or input is given the same importance rather than a learned weight (as in multiple regression) — or by applying simple rules. Both approaches can significantly outperform human judgment. Psychologist Robyn Dawes demonstrated this counterintuitive finding, coining the term improper linear models to describe the equal-weighting method.
For example, imagine forecasting next quarter's sales using four independent predictors: historical trend extrapolation (+8%), market sentiment index (+12%), analyst consensus (+6%), and manager gut-feel (+10%). Instead of trusting any single forecast, the improper linear model simply averages them, producing a final prediction of +9%. By cancelling out random variation in individual inputs, this method often beats expert judgment and shows why equal weighting can be surprisingly powerful.
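A minimal sketch of that calculation; the predictor names and values come from the illustrative example above, not from any real forecasting system.

```python
# Hypothetical growth forecasts for next quarter, in percent.
predictors = {
    "trend_extrapolation": 8.0,
    "market_sentiment": 12.0,
    "analyst_consensus": 6.0,
    "manager_gut_feel": 10.0,
}

# Improper linear model: every predictor gets the same weight instead of a
# weight fitted by regression; random errors in the inputs tend to cancel.
forecast = sum(predictors.values()) / len(predictors)
print(f"Equal-weight forecast: +{forecast:.0f}%")   # +9%
```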
AI practitioners can view Dawes' breakthrough as an early form of capacity control: in low-data settings, giving every input equal weight prevents the model from overfitting to noise.
Rules are arguably even simpler and can dramatically cut down the noise. Kahneman, Sibony, and Sunstein highlight a team of researchers who built a simple model to assess flight risk for defendants awaiting trial. Using just two predictors — age and the number of missed court dates — the model produced a risk score that rivalled human assessments. The formula was so simple it could be calculated by hand.
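The text doesn't give the researchers' actual formula, so the sketch below uses purely hypothetical weights; it only illustrates how a frugal two-predictor rule of this kind could be expressed and computed by hand.

```python
def flight_risk_score(age: int, missed_court_dates: int) -> float:
    """Toy frugal rule in the spirit of the two-predictor model described above.
    Hypothetical weights: younger defendants and more missed appearances
    both push the score upward."""
    return max(0.0, 40 - age) * 0.5 + 10.0 * missed_court_dates

print(flight_risk_score(age=22, missed_court_dates=3))   # 9.0 + 30.0 = 39.0
print(flight_risk_score(age=55, missed_court_dates=0))   # 0.0
```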
Conclusions and Final Thoughts
We have explored the main lessons from Noise by Kahneman, Sibony, and Sunstein. The book highlights how noise is the proverbial elephant in the room — ever present yet rarely acknowledged or addressed. Unlike bias, noise in judgment is silent, but its impact is real: it costs money, shapes decisions, and affects lives. Kahneman and his co-authors make a compelling case for systematically analysing noise and its consequences wherever important decisions are made.

In this article, we examined the different kinds of decisions — recurrent vs singular, evaluative vs predictive — and the corresponding types of noise, including system noise, level noise, pattern noise, and occasion noise. We also linked noise to bias through the error equation, highlighting the importance of addressing both. While bias is often more visible, the book makes clear that noise is equally damaging, and efforts to reduce it are just as essential.
Noise is less visible than bias not because it can't be seen, but because it rarely announces itself without systematic comparison. Bias is systematic: after a handful of cases, you can spot a consistent drift in one direction, such as a judge who is always harsher than average. Noise, in contrast, shows up as inconsistency — lenient one day, harsh the next. In principle, this variance is visible, but in practice each decision, viewed in isolation, still feels reasonable. Unless judgments are lined up and compared side by side — a process Kahneman and colleagues call a noise audit — the silent cost of variability goes unnoticed.
Thankfully, there are concrete steps we can take to improve our judgments and make our decisions noise-aware. We touched on the importance of a noise audit as a first step: accepting that noise exists and measuring how large it is. Building on that, and depending on the situation, we can embrace better decision hygiene through, for example, structured decision protocols, multiple independent assessments, or AI applied carefully and responsibly — concrete shifts that help reduce variability and make our judgments more consistent.
📚 Further Learning
Some suggested further reading to deepen your understanding of noise in judgment, forecasting, and decision hygiene:
- Noise: A Flaw in Human Judgment: An overview of the book — its publication details, core concepts, and key examples.
- The Signal and the Noise (Nate Silver): A related work focusing on forecasting under uncertainty and distinguishing meaningful signals from irrelevant noise — a thematic complement to Kahneman's analysis.
- Barron’s interview: “Daniel Kahneman Says Noise Is Wrecking Your Judgment. Here’s Why, and What to Do About It.” Elaborates on the types of noise (level, occasion, and pattern) and offers practical “decision hygiene” strategies for noise reduction in specific domains like insurance and investment.
- SuperSummary’s Study Guide for Noise: A structured and detailed breakdown of the book’s chapters, themes, and evaluation, ideal for writers or readers looking for a deeper structural understanding or quick reference material.
- LA Review of Books: “Dissecting ‘Noise’” by Vasant Dhar: Unpacks how noise manifests in real-world scenarios like sentencing variability among judges and the inconsistency of decisions made under different circumstances.
- Human Decisions and Machine Predictions (Kleinberg, Lakkaraju, Leskovec, Ludwig, Mullainathan). A landmark study showing how machine learning can outperform human judges in bail decisions by detecting rare but decisive patterns — so-called “broken legs” — hidden in large datasets.
- The Checklist Manifesto (Atul Gawande, 2009): Demonstrates how structured checklists dramatically improve outcomes in fields like surgery and aviation.
- The Hard Thing About Hard Things (Ben Horowitz, 2014): Shows how leaders can confront complex, high-stakes decisions by deliberately stress-testing their very own judgments — an approach akin to creating an “inner crowd.”
