Data Visualization Explained: What It Is and Why It Matters

attract all of the hype in data science today, but I’d argue they are each secondary to a more vital, and often ignored, part of the field.

When working with data, there are two essential steps:

  1. Processing and analyzing the data to extract meaningful insights.
  2. Communicating those insights to others.

The second point is crucial and often overlooked. The world’s most advanced algorithm or most useful insight is worthless if nobody can understand it. As a data scientist, you need to learn to convey your insights to others. There is more than one reason for this. The obvious one is that if the right people understand the data, the world at large benefits. However, there is another, equally important reason: it is often in describing our findings to others that we discover errors, deeper knowledge, or further areas for exploration.

In this article, we’ll examine a powerful and effective tool that can help achieve the second step above: data visualization. This is the first in a series of articles that will take absolute beginners deep into the realm of data visualization. This first article is general and lightweight, intended as an introduction to the field as a whole. In later articles, I’ll get into the more technical points, eventually concluding by teaching you how to build your own data visualizations.

With that knowledge, you’ll be equipped to tackle your data in new, exciting ways.

What Counts as a Data Visualization?

Many people view data visualization through a restricted lens, classifying only standard graphs, such as bar charts, line charts, and the like, as true data visualizations. Viewed from this angle, data visualization didn’t materialize until the middle of the 18th century. (We’ll see some examples below.)

However, we would do well to broaden our minds. Visual transformations of data are by no means limited to our traditional ideas. They have been around for centuries. For instance, here is the Babylonian Map of the World [1], the oldest known map in the world, discovered as a relic of the ancient city of Babylon:

Image Source: Wikimedia Commons

This map places Babylon at the center and was likely an extremely useful tool for visualizing what we now formally call geospatial data. It is one of the world’s earliest data visualizations.

There is a plethora of similar figures and images from various ancient civilizations: cave paintings, calendars, stone carvings, even Egyptian hieroglyphics. These are all effectively visual representations of data that were obscure in their initial form. Viewing these examples as data visualizations leads us to a crucial principle:

At its core, data visualization is nothing more than taking some data, be it numerical, textual, or otherwise, and applying a transformation to represent it visually.

This foundational principle leads to several related topics, primarily involving the most effective ways to carry out these transformations, where “effective” loosely translates to “honest, easy to understand, and informative.”
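To make the principle above concrete, here is a minimal sketch of such a transformation in Python: it takes numbers and turns each one into a bar of characters. The function name `to_text_bars` and the rainfall figures are invented purely for illustration, and the sketch assumes non-negative values with at least one greater than zero.

```python
def to_text_bars(values, width=20, mark="#"):
    """Transform a {label: number} mapping into one text bar per label.

    Assumes non-negative numbers with at least one positive value.
    """
    largest = max(values.values())
    pad = max(len(label) for label in values)
    lines = []
    for label, value in values.items():
        # Scale each value so that the largest one fills the full width.
        bar = mark * round(width * value / largest)
        lines.append(f"{label.ljust(pad)} | {bar} {value}")
    return "\n".join(lines)


# Hypothetical data, invented for this example.
rainfall_mm = {"Mon": 4, "Tue": 10, "Wed": 2}
print(to_text_bars(rainfall_mm))
# Mon | ######## 4
# Tue | #################### 10
# Wed | #### 2
```

Even in this tiny case, the relative sizes are easier to take in from the bars than from the raw numbers, which is the whole point of the transformation.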

Early Examples of Data Visualizations

Now that we have broadened our perspective on what constitutes a data visualization, let us take a look at some more modern examples. Below is a chart from 1644 developed by Michael Florent van Langren [2]. It is one of the earliest graphical representations of what we consider to be traditional statistical data, depicting estimates of the difference in longitude between Rome and Toledo.

This chart depicts 12 estimates of the difference in longitude between the cities of Rome and Toledo.

Let’s consider a more involved example next, one that directly highlights Tukey’s quote above.

Below is a map of London’s Soho district in 1854 [3]. It was designed by John Snow to determine whether there were any patterns in the cholera outbreak that was debilitating the city at the time:

A map of London’s Soho District depicting deaths from cholera during an outbreak in 1854. Image Source: Picryl Public Domain

Looking toward the center of the map, we can see an exceptionally large number of deaths near the water pump on Broad Street. An investigation determined that this pump was contaminated and was a major cause of the spread of the disease.

This example highlights precisely the principle from John Tukey we noted above: one of the most effective uses of data visualization is to quickly see insights that are difficult to find in the data’s initial form.

Precision and Flexibility

Data visualization is a broad and deep topic that can be approached in many ways. That said, there are two principles you should keep in mind regardless of the specific form of data visualization you engage in: precision and flexibility.

A good data visualization doesn’t try to perform ill-defined tasks, such as displaying the or summarizing about a data set. Goals like these are subjective and essentially impossible to achieve.

Rather, a good data visualization highlights a specific and well-defined aspect of the relevant data in a way that makes it easier for the user to understand. You should always articulate exactly what you want to express about your data before you even begin designing a visualization.

To internalize this principle, it helps to recall what the purpose of a data visualization is in the first place: to display insights from a data set in a clear and useful way. Being precise ensures we achieve this goal. A visualization that attempts to do too much might end up confusing the viewer even more. It is much better to produce a visualization that covers less data in a clearer way. Quality is more important than quantity.

Take a look at the data table below, which contains information about salaries from different cities around the US.

Name             City          Income    Occupation
Sarah Mitchell   Denver, CO    $72,500   Marketing Manager
Jamal Rodriguez  Houston, TX   $58,300   Electrician
Priya Desai      Seattle, WA   $91,200   Software Engineer
Thomas Nguyen    Chicago, IL   $64,800   Nurse

Which of the following is the better visualization choice for the above data?

  1. A visualization that attempts to simplify the information in the data table using a bar chart that has names on one axis and salaries on the other, uses color to distinguish among cities, and uses a texture on the bars (dashed lines, diagonal lines, etc.) to distinguish among occupations.
  2. The same visualization as above, but this time excluding the occupations. In other words, a bar chart of names and salaries that colors the bars based on location.

It’s tempting to choose the first one, but the fact is, it tries to do too much. It is better to display limited, targeted information than to confuse your audience.
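As a rough illustration of the second option, the sketch below builds a bar chart of names and salaries, colored by city, as an SVG string using only the Python standard library. The function name `salary_bar_chart_svg`, the color palette, and the layout sizes are all assumptions made for this example; a plotting library would normally handle these details for you.

```python
from html import escape

# Rows copied from the salary table above: (name, city, income in USD).
ROWS = [
    ("Sarah Mitchell", "Denver, CO", 72500),
    ("Jamal Rodriguez", "Houston, TX", 58300),
    ("Priya Desai", "Seattle, WA", 91200),
    ("Thomas Nguyen", "Chicago, IL", 64800),
]

# Hypothetical palette: any set of clearly distinct colors would do.
PALETTE = ["#4e79a7", "#f28e2b", "#59a14f", "#e15759"]


def salary_bar_chart_svg(rows, bar_height=24, gap=10, label_width=140, max_bar=300):
    """Return an SVG string with one horizontal bar per person, colored by city."""
    cities = sorted({city for _, city, _ in rows})
    color_of = {city: PALETTE[i % len(PALETTE)] for i, city in enumerate(cities)}
    top_income = max(income for _, _, income in rows)

    parts = []
    for i, (name, city, income) in enumerate(rows):
        y = gap + i * (bar_height + gap)
        # Bar length is proportional to income; the top earner fills max_bar.
        width = round(max_bar * income / top_income)
        text_y = y + bar_height * 0.7
        parts.append(f'<text x="0" y="{text_y:.0f}" font-size="12">{escape(name)}</text>')
        parts.append(
            f'<rect x="{label_width}" y="{y}" width="{width}" '
            f'height="{bar_height}" fill="{color_of[city]}"/>'
        )
        parts.append(
            f'<text x="{label_width + width + 6}" y="{text_y:.0f}" '
            f'font-size="12">${income:,} ({escape(city)})</text>'
        )

    height = gap + len(rows) * (bar_height + gap)
    total_width = label_width + max_bar + 200
    body = "\n".join(parts)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{total_width}" '
        f'height="{height}">\n{body}\n</svg>'
    )
```

Writing the returned string to a file such as `salaries.svg` and opening it in a browser shows the chart. Note what is deliberately left out: occupation is not encoded at all, so the viewer only has to read bar length and color.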

In addition to being precise, maintaining flexibility is also important. There is no such thing as a perfect data visualization. There is always room for improvement, and data visualizations generally become better with each revision. Of course, at some point, a data visualization must be shared with others and serve its purpose.

This leads to a quandary: how much revision is enough revision? There is no definitive answer to this question. The process of revising a visualization must be undertaken with care. Asking too many people for advice will likely result in a bunch of half-baked, conflicting opinions. On the other hand, publishing the first draft of a visualization (i.e., not revising it at all) is likely to lead to a subpar result.

Although there is no perfect solution, there are a few guidelines you can follow:

  • Find 2-3 people to give you feedback on your visualization.
  • Try to ensure your list of people includes the following:
    • A reviewer who is proficient in designing data visualizations
    • A reviewer who has a strong understanding of the data being used to develop the visualization (e.g., a political scientist for election data)
    • A reviewer who is part of the intended audience for the visualization
  • Go through 2-3 rounds of feedback and revision with this same list of people. This will ensure that improvements to the visualization are continuous and logical.

Final Thoughts and Looking Forward

In many ways, data visualization is akin to writing. Even the most prolific and talented authors have editors, and their books undergo extensive revision before being approved for publication. Why? For the simple reason that good writing is largely dependent on the audience, and carefully curated revision ensures the best experience for the eventual readers of a book. The same idea applies to data visualization.

By following these guidelines, you can make sure you develop a strong data visualization that is grounded in best practices, accurately displays the data at hand, and is understandable for the intended audience.

They are the key to effective data visualization and the foundation for the advanced visualization techniques that will be discussed in future articles. Until then.

References

[1] https://commons.wikimedia.org/wiki/File:The_Babylonian_map_of_the_world,_from_Sippar,_Mesopotamia..JPG
[2] , Edward Tufte
[3] https://picryl.com/media/snow-cholera-map-1-cbadea
