Visualization of Data with Pie Charts in Matplotlib


Examples of how to create different types of pie charts using Matplotlib to visualize the results of database analysis in a Jupyter Notebook with Pandas

Photo by Niko Nieminen on Unsplash

While working on my Master’s thesis, titled “Aspects Related to Impactful Scientific Publications in NIH-Funded Heart Disease Research”, I used different types of pie charts to illustrate some of the key findings from the database analysis.

A pie chart can be an effective choice for data visualization when a dataset contains a limited number of categories representing parts of a whole, making it well suited for displaying categorical data with an emphasis on comparing the relative proportions of each category.

In this article, I’ll demonstrate how to create four different types of pie charts using the same dataset to provide a more comprehensive visual representation and deeper insight into the data. To achieve this, I’ll use Matplotlib, Python’s plotting library, to display pie chart visualizations of the statistical data stored in the dataframe. If you are not familiar with the Matplotlib library, a good place to start is the Python Data Science Handbook by Jake VanderPlas, specifically the chapter on Visualization with Matplotlib, and matplotlib.org.

First, let’s import all the necessary libraries and extensions:
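As a rough illustration of this step, a minimal import cell might look like the one below. It assumes that only Matplotlib and Pandas are needed; the exact libraries and notebook extensions used in the original notebook may differ.

```python
# Core libraries assumed for the examples in this article.
import matplotlib.pyplot as plt  # plotting interface
import pandas as pd              # dataframe handling

# In a Jupyter Notebook, the IPython magic below renders figures inline.
# It is not valid plain Python, so it is left commented out here.
# %matplotlib inline
```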

Next, we’ll prepare the CSV file for processing:
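A minimal sketch of loading such a file with Pandas is shown below. The journal names and counts are placeholders invented for illustration and are not the thesis data, and the "Journal" column name and the file name in the comment are assumptions. Only the "Female", "Male", "Unknown" and "Total" columns follow the dataset described in this article.

```python
import io

import pandas as pd

# Placeholder CSV content standing in for the real file.
# All names and numbers here are invented for illustration only.
csv_text = """Journal,Female,Male,Unknown,Total
Journal A,120,300,80,500
Journal B,90,210,50,350
Journal C,60,150,40,250
"""

# With a real file you would pass its path instead, for example:
# df = pd.read_csv("top_journals.csv")  # hypothetical file name
df = pd.read_csv(io.StringIO(csv_text))

# Show the first rows to confirm the columns were parsed as expected.
print(df.head())
```

Reading from an in-memory string keeps the sketch self-contained; pd.read_csv accepts a file path in exactly the same way.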

The mini dataset used in this article highlights the top 10 journals for heart disease research publications from 2002 to 2020 and is part of a larger database collected for the Master’s thesis research. The columns “Female,” “Male,” and “Unknown” represent the gender of the first author of the published articles, while the “Total” column reflects the total number of heart disease research articles published in each journal.

Image by the author, showing the output of the Pie_Chart_Artcile_2.py sample code above.

For smaller datasets with fewer categories, a pie chart with exploding slices can effectively highlight a key category by pulling it out slightly from the rest of the chart. This visual effect draws attention to specific categories, making them stand out from the whole. Each slice represents a portion of the total, with its size proportional to the data it represents. Labels can be added to each slice to indicate the category, along with percentages to show their proportion of the total. The exploded slice stands out without losing the context of the full data representation.
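The idea can be sketched with Matplotlib's pie() function and its explode parameter. This is only a minimal illustration with invented placeholder counts, not the thesis data, and the choice to pull out the "Female" slice is my assumption for the example.

```python
import matplotlib.pyplot as plt

# Placeholder counts for a single journal (invented, not the thesis data).
labels = ["Female", "Male", "Unknown"]
counts = [120, 300, 80]

# Offset of each slice from the center, as a fraction of the radius.
# Only the first slice ("Female") is pulled out.
explode = [0.1, 0, 0]

fig, ax = plt.subplots(figsize=(5, 5))
wedges, texts, autotexts = ax.pie(
    counts,
    labels=labels,        # category name next to each slice
    explode=explode,
    autopct="%1.1f%%",    # print each slice's share of the total
    startangle=90,        # start the first slice at 12 o'clock
)
ax.set_title("First-author gender, Journal A (placeholder data)")
```

In a notebook the figure renders when the cell finishes; in a script, call plt.show() afterwards.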

Image by the author, showing the output of the Pie_Chart_Artcile_3.py sample code above.

The same exploding slices technique can be applied to all other entries in the sample dataset, and the resulting charts can be displayed within a single figure. This type of visualization helps to highlight the overrepresentation or underrepresentation of a particular category across the dataset. In the example provided, presenting all 10 charts in a single figure reveals that none of the top 10 journals in heart disease research published more articles authored by women than by men, thereby emphasizing the gender disparity.
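One way to place several pie charts in a single figure is to create a grid of axes with plt.subplots() and draw one chart per row of the dataframe. The sketch below uses four invented placeholder journals rather than the ten in the real dataset, and the "Journal" column name is an assumption.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder data (invented, not the thesis data).
df = pd.DataFrame({
    "Journal": ["Journal A", "Journal B", "Journal C", "Journal D"],
    "Female": [120, 90, 60, 45],
    "Male": [300, 210, 150, 130],
    "Unknown": [80, 50, 40, 25],
})
genders = ["Female", "Male", "Unknown"]

# A 2 x 2 grid of axes: one pie chart per journal.
fig, axes = plt.subplots(2, 2, figsize=(8, 8))
for ax, journal, counts in zip(axes.flat, df["Journal"], df[genders].to_numpy()):
    ax.pie(
        counts,
        labels=genders,
        explode=[0.1, 0, 0],   # pull out the "Female" slice in every chart
        autopct="%1.1f%%",
        startangle=90,
    )
    ax.set_title(journal)

fig.suptitle("First-author gender by journal (placeholder data)")
fig.tight_layout()
```

For ten journals, a grid such as plt.subplots(2, 5) would hold all charts; any unused axes can be hidden with ax.axis("off").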

Gender distributions for the top 10 journals for heart disease research publications, 2002–2020. Image by the author, showing the output of the Pie_Chart_Artcile_4.py sample code above.

A variation of the pie chart, known as a donut chart, can also be used to visualize data. Donut charts, like pie charts, display the proportions of categories that make up a whole, but the center of the donut chart can be used to present additional data. This format is less cluttered visually and can make it easier to compare the relative sizes of slices than a standard pie chart. In the example used in this article, the donut chart highlights that among the top 10 journals for heart disease research publications, the American Journal of Physiology, Heart and Circulatory Physiology published the most articles, accounting for 21.8%.
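A donut chart can be sketched by giving the wedges a width smaller than the pie radius through the wedgeprops argument, which leaves a hole in the middle for extra text. The journal names and totals below are invented placeholders, and showing the overall article count in the center is just one possible use of that space.

```python
import matplotlib.pyplot as plt

# Placeholder totals per journal (invented, not the thesis data).
journals = ["Journal A", "Journal B", "Journal C", "Journal D"]
totals = [500, 350, 250, 200]

fig, ax = plt.subplots(figsize=(6, 6))
wedges, texts, autotexts = ax.pie(
    totals,
    labels=journals,
    autopct="%1.1f%%",
    pctdistance=0.8,     # place the percentages inside the ring
    startangle=90,
    # A wedge width below the radius (1.0) leaves a hole in the center.
    wedgeprops=dict(width=0.4, edgecolor="white"),
)

# Use the empty center for additional information.
ax.text(0, 0, f"{sum(totals)}\narticles", ha="center", va="center", fontsize=14)
ax.set_title("Share of articles by journal (placeholder data)")
```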

Image by the author, showing the output of the Pie_Chart_Artcile_5.py sample code above.

We can enhance the visualization of additional information from the sample dataset by building on the previous donut chart and creating a nested version. The add_artist() method from Matplotlib’s figure module is used to incorporate any additional Artist (such as figures or objects) into the base figure. Similar to the earlier donut chart, this variation displays the distribution of publications across the top 10 journals for heart disease research. However, it also includes an additional layer that shows the gender distribution of first authors for each journal. This visualization highlights that a larger percentage of the first authors are male.
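A nested donut can be sketched by drawing two pies of different radii on the same axes and then covering the center with a white circle. The version below is an assumption-heavy illustration: the data are invented placeholders, the colors are arbitrary, and the circle is added with the axes-level add_artist() method, which may differ from how the original notebook uses add_artist().

```python
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.patches import Circle, Patch

# Placeholder data (invented, not the thesis data).
df = pd.DataFrame({
    "Journal": ["Journal A", "Journal B", "Journal C"],
    "Female": [120, 90, 60],
    "Male": [300, 210, 150],
    "Unknown": [80, 50, 40],
})
genders = ["Female", "Male", "Unknown"]
gender_colors = ["tab:orange", "tab:blue", "tab:gray"]  # arbitrary choice

fig, ax = plt.subplots(figsize=(7, 7))

# Outer layer: one slice per journal, sized by its total article count.
ax.pie(df[genders].sum(axis=1), labels=df["Journal"], radius=1.0,
       startangle=90, wedgeprops=dict(edgecolor="white"))

# Inner layer: gender counts flattened row by row, so each journal's
# three gender slices line up underneath its outer slice.
ax.pie(df[genders].to_numpy().ravel(), radius=0.7, startangle=90,
       colors=gender_colors * len(df), wedgeprops=dict(edgecolor="white"))

# Cover the center with a white circle to turn the pies into rings.
ax.add_artist(Circle((0, 0), 0.4, color="white"))

# Legend for the inner layer, built from proxy patches.
handles = [Patch(color=c, label=g) for c, g in zip(gender_colors, genders)]
ax.legend(handles=handles, title="First author", loc="upper right")
ax.set_title("Articles by journal and first-author gender (placeholder data)")
```

Because both layers start at the same angle and the outer values are the row sums of the inner ones, the slice boundaries of the two layers coincide at each journal.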

Image by the author, showing the output of the Pie_Chart_Artcile_6.py sample code above.

In conclusion, pie charts are effective for visualizing data with a limited number of categories, as they allow viewers to quickly understand the most important categories or dominant proportions at a glance. In this specific example, the use of four different types of pie charts provides a clear visualization of the gender distribution among first authors in the top 10 journals for heart disease research publications, based on the 2002 to 2020 mini dataset used in this study. It is evident that a higher percentage of the publications’ first authors are male, and none of the top 10 journals for heart disease research published more articles authored by women than by men during the examined period.

The Jupyter Notebook and dataset used for this article can be found on GitHub.

Thanks for reading,

Diana

Note: I used GitHub embeds to publish this article.
