How Not to Mislead with Your Data-Driven Story

Data storytelling is all over the place. There are countless books, articles, tutorials, and videos, some of which I have written or created myself.

In my experience, most of these resources tend to present data storytelling in an overwhelmingly positive light. But recently, one concern has been on my mind:

What if our stories, instead of clarifying, mislead?

Image 1. Change the perspective, and you see an entirely different story. Photos by the author.

The image above shows one of the apartment buildings in my neighborhood. Now, take a look at the photo on the left and imagine one of the apartments in the white building is up for sale. You’re considering buying it. You’ll likely focus on the immediate surroundings, especially as presented in the seller’s photos. Notice anything unusual? Probably not, at least not immediately.

Should the immediate setting be a dealbreaker? In my opinion, not necessarily. It’s not the most picturesque or charming spot, just a typical block in an average neighborhood in Warsaw. Or is it?

Let’s take a short walk around to the back of the building. And… surprise: there’s a public toilet right there. Still feel good about the location? Maybe yes, maybe no. One thing is clear: you’d want to know that a public toilet sits just below your future balcony.

Moreover, the apartment is located in the lower part of the building, while the rest of the towers rise above it. That is another factor that may be significant. Both of these “issues” can obviously be brought up in price negotiations.

This simple example illustrates how easily stories (in this case, told with photos) can be misinterpreted. From one angle, everything looks fine, even inviting. Take a few steps to the right, and… whoops.

The same situation can occur in our professional lives. What if audiences, convinced they’re making informed, data-backed decisions, are being subtly steered in the wrong direction, not by false data, but by the way it’s presented?

This post builds on an article I wrote in 2024 about misleading visualizations [1]. Here, I want to take a somewhat broader perspective, exploring how the structure and flow of a story itself can unintentionally (or deliberately) lead people to incorrect conclusions, and how we can avoid that.

Data storytelling is subjective

We often want to believe that “data speaks for itself.” But in reality, it rarely does. Every chart, dashboard, or headline built around a dataset is shaped by human choices:

  • what to include,
  • what to leave out,
  • how to frame the message.

This highlights a core challenge of data-driven storytelling: it’s inherently subjective. That subjectivity comes from the discretion we have in proving the point we intend to make:

  • selecting which data to present,
  • choosing the appropriate analysis technique,
  • deciding which arguments to emphasize,
  • and even which visuals to use.

Subjectivity also lies in interpretation, both ours and our audience’s, and in their willingness to act on the data. This opens the door to biases. If we are not careful, we can easily cross the line from subjectivity into unethical storytelling.

This article examines the hidden biases embedded in data storytelling and how we can move from manipulation to meaningful insight.

We need stories

Subjective or not, we need stories. Stories are essential to us because they help us make sense of the world. They carry our values, preserve our history, and spark our imagination. Through stories, we connect with others, learn from past experiences, and explore what it means to be human. No matter your nationality, culture, or religion, we’ve all heard countless stories that have shaped us, told to us by our grandparents, parents, teachers, friends, and colleagues at work. Stories evoke emotion, inspire action, and shape our identity, both individually and collectively. In every culture and across all ages, storytelling has been a powerful means of understanding life, sharing knowledge, and building community.

But while stories can enlighten, they can also mislead. A compelling narrative has the power to shape perception, even when it distorts facts or oversimplifies complex issues. Stories often rely on emotion, selective detail, and a clear message, which can make them persuasive, but also dangerously reductive. When used carelessly or manipulatively, storytelling can reinforce biases, obscure truth, or drive decisions based more on feeling than reason.

In the next part of this article, I’ll explore the potential problems with stories, especially in data-driven contexts, and how their power can unintentionally (or intentionally) misguide our understanding.

Image 2. Stories have always been an essential part of our lives. Image generated by the author in ChatGPT.

Narrative biases in data-driven storytelling

Bias 1. Data is far, far away from interpretation

Here’s an example of a visual from a report titled “Kentucky Juvenile Justice Reform Evaluation: Assessing the Effects of SB 200 on Youth Dispositional Outcomes and Racial and Ethnic Disparities.”

Image 3. Image from “Kentucky Juvenile Justice Reform Evaluation…”, page 18 of the report.

The graph shows that young offenders in Kentucky are less likely to reoffend if, after their first offense, they’re routed through a diversion program. This program connects them with community support, such as social workers and therapists, to address deeper life challenges. That’s a powerful narrative with real-world implications: it supports reducing our reliance on an expensive criminal justice system, justifies increased funding for non-profits, and points toward meaningful ways to improve lives.

But here’s the issue: unless you already have strong data literacy and subject knowledge, those conclusions are not immediately obvious from the graph. While the report does make this point, it doesn’t do so until nearly 20 pages later. This is a classic example of how the structure of academic reporting can mute a story’s impact: the data is presented visually in one section and interpreted textually in different (and often distant) sections of the document.

Bias 2. The Tale of the Missing Map: Selection Bias

Image 4. Photo by Ashleigh Shea, Unsplash.

Selecting which data points (cherries 😊) to include (and which to ignore) is one of the strongest, and often most overlooked, acts of bias. And perhaps no industry illustrated this better than Big Tobacco.

The now-famous summary of their legal strategy says it all:

Yes, smoking causes lung cancer, but not in individuals who sue us.

That quote perfectly captures the tone of tobacco litigation in the late twentieth century, when companies faced a wave of lawsuits from customers suffering from diseases linked to smoking. Despite overwhelming medical and scientific consensus, tobacco companies routinely deflected responsibility using a series of arguments that, while sometimes legally strategic, were scientifically absurd.

Here are four of the most egregious cherry-picking tactics they used in court, based on this article [2].

Cherry-pick tactic 1: use the “exception fallacy” in legal or rhetorical contexts.

Yes, smoking causes cancer — but not this one.

  • The plaintiff had a rare type of cancer, like bronchioloalveolar carcinoma (BAC) or mucoepidermoid carcinoma, which they claimed were not conclusively linked to smoking.
  • In one case, they argued the cancer originated in the thymus, not the lungs, despite overwhelming medical evidence.

Cherry-pick tactic 2: highlight obscure exceptions or rare cancer types to challenge general epidemiological evidence.

It wasn’t our brand.

  • “Sure, tobacco may have caused the disease, but not our cigarettes.”
  • In Ierardi v. Lorillard, the company argued that the plaintiff’s exposure to asbestos-laced cigarette filters (Micronite) occurred outside the narrow four-year window when they were used, even though 585 million packs were sold during that time.

Cherry-pick tactic 3: focus on brand or product variation as a way to shift blame.

In several cases, such as Ierardi v. Lorillard and Lacy v. Lorillard, the defense admitted that cigarettes can cause cancer but argued that the plaintiff:

  • Didn’t use their brand at the time of exposure,
  • Or didn’t use the specific version of the product that was most dangerous (e.g., Kent cigarettes with the asbestos-containing Micronite filter),
  • Or that the dangerous version was sold only during a narrow window years ago, making it unlikely the plaintiff was exposed.

This tactic shifts the narrative from

Our product caused harm.

to

Maybe smoking caused harm, but not ours.

Cherry-pick tactic 4: emphasize every other possible risk factor, no matter how implausible, to deflect from tobacco’s role.

There were other risk factors.

  • In many lawsuits, companies pointed to alternative causes of illness: asbestos, diesel fumes, alcohol, genetics, diet, obesity, and even spicy food.
  • In Allgood v. RJ Reynolds, the defense blamed the plaintiff’s condition partly on his fondness for “Tex-Mex food.”

Cherry-picking isn’t always obvious. It can hide in legal defenses, marketing copy, dashboards, and even academic reports. But when only the data that serves the story gets told, it stops being insight and starts becoming manipulation.
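Cherry-picking is easy to demonstrate with a toy dataset. The sketch below uses synthetic numbers, invented solely for illustration: it fits a trend line to a declining series, then to a hand-picked recent window, and the same data yields opposite conclusions.

```python
import numpy as np

# Synthetic monthly "sales": a long-term decline with a brief rebound at the end.
rng = np.random.default_rng(42)
months = np.arange(24)
sales = 100 - 1.5 * months + rng.normal(0, 1.0, size=24)
sales[18:] += np.arange(6) * 2.0  # short rebound in the last six months

def trend_slope(x, y):
    """Least-squares slope of y against x (degree-1 polynomial fit)."""
    return np.polyfit(x, y, 1)[0]

full_slope = trend_slope(months, sales)              # whole series: declining
cherry_slope = trend_slope(months[18:], sales[18:])  # last 6 months only

print(f"Full-period slope:   {full_slope:+.2f} per month")
print(f"Cherry-picked slope: {cherry_slope:+.2f} per month")
```

Fitting only the hand-picked window turns a clearly falling series into a "growth story", which is exactly the move Big Tobacco’s lawyers made with exposure windows and cancer subtypes.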

Bias 3: The Mirror in the Forest: How the Same Data Tells Different Tales

How we phrase results can skew interpretation. Should we say “Unemployment drops to 4.9%” or “Millions still jobless despite gains”? Both can be accurate. The difference lies in emotional framing.

In essence, framing is a strategic storytelling technique that can significantly impact how a story is received, understood, and remembered. By understanding the power of framing, storytellers can craft narratives that resonate deeply with their audience and achieve their desired goals. I present some examples in Table 1.

  Topic | Frame A | Frame B | Objective description
  Unemployment | Suggests progress, recovery, and strong leadership. | Highlights the persistent problem and unmet needs. | A modest drop in the unemployment rate.
  Vaccine Effectiveness | Emphasizes protection, encourages uptake. | Focuses on vulnerability and doubt. | A clinical trial showed a 95% relative risk reduction.
  Climate Data | Calls attention to the global crisis. | Implies nothing unusual is happening. | Long-term temperature records.
  Company Financial Reports | Celebrates short-term gain. | Signals underperformance in the long term. | Quarterly earnings report.
  Election Polls | Creates a sense of momentum. | Emphasizes uncertainty. | A poll with a +/- 3% margin.
  Health Warnings | Sounds scientific, neutral. | Sounds excessive and dangerous. | 25 grams of sugar.
Table 1. Different ways of framing the same story. Examples generated by the author using ChatGPT.
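The vaccine-effectiveness row in Table 1 hinges on the difference between relative and absolute risk reduction. The sketch below uses invented trial numbers (not from any real study) to show how the very same result feeds both frames:

```python
# Hypothetical trial counts, invented for illustration only:
# 10 cases among 10,000 vaccinated vs 200 cases among 10,000 on placebo.
cases_vax, n_vax = 10, 10_000
cases_placebo, n_placebo = 200, 10_000

risk_vax = cases_vax / n_vax              # 0.1% infection risk
risk_placebo = cases_placebo / n_placebo  # 2.0% infection risk

relative_risk_reduction = 1 - risk_vax / risk_placebo  # Frame A material
absolute_risk_reduction = risk_placebo - risk_vax      # Frame B material

print(f"Frame A: cuts your risk by {relative_risk_reduction:.0%}")
print(f"Frame B: absolute risk falls by only {absolute_risk_reduction:.1%} points")
```

“95% effective” and “a 1.9-point drop in risk” describe the same trial; neither is false, but each pushes the audience toward a different conclusion.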

Bias 4: The Dragon of Design: How Beauty Beguiles the Truth

Visuals simplify data, but they can also manipulate perception. In my earlier article [1], I listed 14 deceptive visualization tactics. Here’s a summary of them.

  1. Using the wrong chart type: Choosing charts that confuse rather than clarify, like 3D pie charts or inappropriate comparisons, makes it harder to see the story the data tells.
  2. Adding distracting elements: Stuffing visuals with logos, decorations, dark gridlines, or clutter hides the key insights behind noise and visual overload.
  3. Overusing colors: Too many colors distract from the focus. Without a clear color hierarchy, nothing stands out, and the viewer is overwhelmed.
  4. Random data ordering: Scrambling categories or time-series data obscures patterns and prevents clear comparisons.
  5. Manipulating axis scales: Truncating the y-axis exaggerates differences; extending it minimizes meaningful variation. Both distort perception.
  6. Creating trend illusions: Using inconsistent time frames, selective data points, or poorly spaced axes to make non-trends look significant.
  7. Cherry-picking data: Showing only the parts of the data that support your point, ignoring the full story or contradicting evidence.
  8. Omitting visual cues: Removing labels, legends, gridlines, or axis scales makes data hard to interpret, or hard to challenge.
  9. Overloading charts: Packing too much data into one chart can be distracting and confusing, especially when critical data is buried in visual chaos.
  10. Showing only cumulative values: Using cumulative plots to imply smooth progress while hiding volatility or declines in individual periods.
  11. Using 3D effects: 3D charts skew perception and make comparisons harder, often leading to misleading impressions of size or proportion.
  12. Applying gradients and shading: Fancy textures or gradients shift focus and add visual weight to areas that may not deserve it.
  13. Misleading or vague titles: A neutral or technical title can downplay the urgency of findings; a dramatic one can exaggerate a minor change.
  14. Using junk charts: Visually overdesigned, complex, or overly artistic charts that are hard to interpret and easy to misread.
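Tactic 5 (manipulating axis scales) is easy to quantify even without drawing anything. Using two hypothetical values that differ by only 2%, the sketch below computes how much taller one bar appears depending on where the axis baseline starts:

```python
# Two hypothetical values differing by 2%.
a, b = 98.0, 100.0

def visual_ratio(low, high, baseline):
    """Ratio of drawn bar heights when the y-axis starts at `baseline`
    instead of zero (heights are measured from the baseline)."""
    return (high - baseline) / (low - baseline)

ratio_honest = visual_ratio(a, b, 0)     # axis starts at 0
ratio_truncated = visual_ratio(a, b, 95) # axis truncated to start at 95

print(f"Axis from 0:  bar B appears {ratio_honest:.2f}x as tall as bar A")
print(f"Axis from 95: bar B appears {ratio_truncated:.2f}x as tall as bar A")
```

A 2% difference is rendered as a roughly 1.7x height difference once the baseline is moved to 95; the numbers on the axis are still technically correct, but the visual impression is not.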

Bias 5: The Story-Spinning Machine: But Who Holds the Thread?

Modern tools like Power BI Copilot or Tableau Pulse increasingly generate summaries and “insights” on your behalf, not to mention the summaries, narratives, and whole presentations prepared by LLMs like ChatGPT or Gemini.

But here’s the catch:
These tools are trained on patterns, not ethics.

AI can’t tell when it’s creating a misleading story. If your prompt or dataset is biased, the output will be biased as well, and at machine scale.

This raises a critical question: are we using AI to democratize insight, or to mass-produce narrative spin?

Image 5. Photo by Aerps.com on Unsplash.

A recent BBC investigation found that leading AI chatbots frequently distort or misrepresent current events, even when using BBC articles as their source. Over half of the tested responses contained significant issues, including outdated facts, fabricated or altered quotes, and confusion between opinion and reporting. Examples ranged from incorrectly stating that Rishi Sunak was still the UK prime minister to omitting key legal context in high-profile criminal cases. BBC executives warned that these inaccuracies threaten public trust in news and urged AI companies to collaborate with publishers to improve transparency and accountability [3].

Feeling overwhelmed? You’ve only seen the beginning. Data storytelling can fall prey to numerous cognitive biases, each subtly distorting the narrative.

Take confirmation bias, where the storyteller highlights only data that supports their assumptions while ignoring contradictory evidence. Then there’s outcome bias, which credits success to sound strategy even when luck played a significant role.

Survivorship bias focuses only on the winners, the startups that scaled or the campaigns that went viral, while neglecting the many that failed using the same methods. Narrative bias oversimplifies complexity, shaping messy realities into tidy conclusions without sufficient context.

Anchoring bias causes people to fixate on the first number presented, like a 20% forecast, distorting how subsequent information is interpreted. Omission bias arises when essential data is left out, for instance, highlighting only top-performing regions while ignoring underperforming ones.

Projection bias assumes that others interpret data the same way the analyst does, yet they may not, especially stakeholders unfamiliar with the context. Scale bias misleads with disproportionate framing: 200% growth sounds impressive until you learn it means going from one user to three.

Finally, causality bias draws unfounded conclusions from correlations, such as crediting a new popup for a sales increase without testing whether the popups were the actual cause.

How to “Unbias” Data Storytelling

Every data story is a choice. In a world where attention spans are short and AI writes faster than humans, those choices are more powerful, and more dangerous, than ever.

As data scientists, analysts, and storytellers, we must approach narrative choices with the same level of rigor and thoughtfulness that we apply to statistical models. Crafting a story from data isn’t just about clarity or engagement; it’s about responsibility. Every choice we make in framing, emphasis, and interpretation shapes how others perceive the truth. And at the end of the day, the most dangerous stories are not the false ones; they’re the ones that feel like facts.

In this part of the article, I’ll share several practical strategies to help you strengthen your data storytelling. These ideas focus on being both compelling and credible: crafting narratives that engage your audience without oversimplifying or misleading them. Because when done well, data storytelling doesn’t just communicate insight; it builds trust.

Strategy 1: The Wise Wizard’s Rule: Ask, Don’t Enchant

In the world of data and analysis, the most insightful storytellers don’t announce their conclusions with dramatic flair; they lead with thoughtful questions. Instead of presenting bold declarations, they invite reflection by asking, “What do you see?” This approach encourages others to discover insights on their own, fostering understanding rather than passive acceptance.

Consider a graph showing a decline in test scores. A surface-level interpretation might immediately claim, “Our schools are failing,” sparking concern or blame. But a more careful, analytical response would ask what else changed: the test itself, the student population, or the conditions under which it was administered. Similarly, when sales rise following the launch of a new feature, it’s tempting to attribute the increase solely to the feature. Yet a more rigorous approach would ask whether other factors, such as seasonality or concurrent marketing, played a role.

By leading with questions, we create space for interpretation, dialogue, and deeper thinking. This method guards against false certainty and encourages a more collaborative, thoughtful exploration of the data. A strong narrative should guide the audience rather than forcing them toward a predetermined conclusion.

Strategy 2: The Mirror of Many Truths: Offer Counter-Narratives

Good data storytelling doesn’t stop at a single interpretation. Complex datasets often allow for multiple valid perspectives, and it’s the storyteller’s responsibility to acknowledge them. Presenting a counter-narrative, “here’s another way to look at this,” invites critical thinking and builds credibility.

For example, a chart may show that heart disease rates are declining overall. That looks like a success. But a closer look may reveal that the improvement is concentrated in higher-income areas, while rates in rural or underserved communities remain high. Presenting both views, progress and disparity, provides a more comprehensive and honest picture of the problem.

By offering counter-narratives, we guard against oversimplification and help our audience understand the nuance behind the numbers.
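The heart-disease example can be sketched numerically. The figures below are fake, invented only to illustrate how a population-weighted average can show “progress” while one subgroup barely improves:

```python
# Hypothetical heart-disease rates per 100,000, invented for illustration.
population = {"higher_income": 70_000, "rural": 30_000}
rates = {
    "higher_income": {2015: 300, 2020: 200},  # large improvement
    "rural": {2015: 350, 2020: 345},          # almost no improvement
}

def overall_rate(year):
    """Population-weighted overall rate per 100,000 for a given year."""
    total_pop = sum(population.values())
    cases = sum(population[g] * rates[g][year] / 100_000 for g in population)
    return cases / total_pop * 100_000

for year in (2015, 2020):
    print(year, round(overall_rate(year)), "per 100,000 overall")
print("rural change:", rates["rural"][2020] - rates["rural"][2015])
```

The headline number falls substantially, but only because the larger, better-off group improved; disaggregating by group is what surfaces the counter-narrative.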

Image 6. Adding the income-class dimension allows for better insight discovery. Chart generated in ChatGPT, fake data.

Strategy 3: The Curse of Crooked Charts: Avoid Deceptive Visuals

Visuals are powerful, but that power must be used responsibly. Misleading charts can distort perception through subtle tricks, such as truncated axes that exaggerate differences, unlabeled units that obscure the scale, or decorative clutter that distracts from the message. To avoid these pitfalls, always clearly label axes, start scales from zero when appropriate, and choose chart types that best fit the data, not just their aesthetic appeal. Deception doesn’t always come from malice; sometimes it’s just careless design. But either way, it erodes trust. A clean, honest visual is far more persuasive than a flashy one that hides the details.

Image 7. Two versions of the same visual. One is telling the story; the other…? Image by the author.

Take, for example, the two charts shown in Image 7. The one on the left is cluttered and hard to interpret. Its title is vague, the excessive use of color is distracting, and unnecessary elements, like heavy borders, gridlines, and shading, only add to the confusion. There are no visual cues to guide the viewer, leaving the audience to guess what the author is trying to say.

In contrast, the chart on the right is far simpler. It strips away the noise, using just three colors: grey for context, blue to highlight key information, and a clean white background. Most importantly, the title conveys the essential message, allowing the audience to grasp the point at a glance.

Strategy 4: Speak Truthfully of Shadows: The Wisdom of Embracing Uncertainty

Uncertainty is an inherent part of working with data, and acknowledging it doesn’t weaken your story; it strengthens your credibility. Transparency around uncertainty is a hallmark of responsible data communication. When you communicate elements like confidence intervals, margins of error, or the assumptions behind a model, you’re not only being technically accurate; you’re demonstrating honesty and humility. It shows that you respect your audience’s ability to engage with complexity rather than oversimplifying to maintain a clean narrative.

Uncertainty can arise from various sources, including limited sample sizes, noisy or incomplete data, changing conditions, or the assumptions inherent in predictive models. Instead of ignoring or smoothing over these limitations, good storytellers bring them to the forefront, visually and verbally. Doing so encourages critical thinking and opens the door for discussion. It also protects your work from misinterpretation, misuse, or overconfidence in results. In short, by being open about what the data can’t tell us, we give more weight to what it can. Below, I present several ways to include information about uncertainty in your data story.

  1. Confidence intervals
    Report a range rather than a single point estimate.
  2. Margin of error
    State the margin of error alongside the headline number.
  3. Missing-data indicators
    Use visual cues, such as faded bars, dashed lines, or shaded areas, on charts to indicate gaps, and explain them in footnotes.
  4. Model assumptions
    Spell out the key assumptions behind any model or forecast.
  5. Multiple scenarios
    Present best-case, worst-case, and most-likely scenarios to reflect the range of possible outcomes.
  6. Probabilistic language
    Prefer hedged phrasing (“likely,” “around”) over absolute claims.
  7. Data quality notes
    Highlight issues like small sample sizes or self-reported data.
  8. Error bars on charts
    Visually show uncertainty by including error bars or shaded confidence bands in graphs.
  9. Transparency about limitations
    State explicitly what the data does not cover.
  10. Qualitative clarification
    Use captions or callouts in presentations or dashboards to explain what the numbers can and cannot support.
You may wonder, “Doesn’t admitting uncertainty make my story weaker?” On the contrary: acknowledging uncertainty doesn’t signal a lack of confidence; it shows depth, professionalism, and integrity. It conveys to your audience that you understand the complexity of the data and are not trying to oversell a simplistic conclusion. Sharing what you do know, alongside what you don’t, creates a more balanced and credible narrative. People are far more likely to trust your insights when they see that you’re being honest about the limitations. It’s not about dampening your story; it’s about grounding it in reality.

Strategy 5: Reveal the Roots of the Tale: Let Truth Travel with Its Sources

Every story needs roots, and in the world of data storytelling, those roots are your sources. A beautiful chart or striking number means little if your audience can’t see where it came from. Was it a randomized survey? Administrative data? Social media scraping? Just as a traveler trusts a guide who knows the path, readers are more likely to trust your insights when they can trace them back to their origins. Transparency about data sources, collection methods, assumptions, and even limitations isn’t a sign of weakness; it’s a mark of integrity. When we reveal the roots of the story, we give it depth, credibility, and resilience. Informed decisions can only grow in well-tended soil.

Image 8. Image generated by the author in ChatGPT.

Closing remarks

Data-driven storytelling is both an art and a responsibility. It gives us the power to make information meaningful, but also the power to mislead, even unintentionally. In this article, we’ve explored a forest of biases, design traps, and narrative temptations that can subtly shape perception and warp the truth. Whether you’re a data scientist, communicator, or decision-maker, your stories carry weight, not just for what they show, but for how they’re told.

So let us tell stories that illuminate, not obscure. Let us lead with questions, not conclusions. Let us reveal uncertainty, not hide behind false clarity. And above all, let us anchor our insights in transparent sources and humble interpretation. The goal isn’t perfection; it’s integrity. Because in a world full of noise and narrative spin, the most powerful story you can tell is one that’s both clear and honest.

In the end, storytelling isn’t about controlling the message; it’s about earning trust. And trust, once lost, isn’t easily won back. So choose your stories carefully. Shape them with care. And remember: the truth may not always be flashy, but it always finds its way to the light.

And one more thing: if you’ve ever spotted (or unintentionally created) a biased data story, share your experience in the comments. The more we surface these narratives, the better we all get at telling data truths, not just data tales.

References

[1] How Not to Cheat with Data Visualizations, Michal Szudejko, Towards Data Science

[2] Tobacco manufacturers’ defence against plaintiffs’ claims of cancer causation: throwing mud at the wall and hoping some of it will stick, multiple authors, National Library of Medicine

[3] AI chatbots distort and mislead when asked about current affairs, BBC finds, Matthew Weaver
