need a straightforward, engaging, and scalable method for presenting geospatial data, they often turn to a heatmap. This 2D visualization divides a map into equal-sized grid cells and uses color to represent the magnitude of the aggregated data values within the cells.
Overlaying heatmaps on geographical maps permits quick visualization of spatial phenomena. Patterns such as clusters, hotspots, outliers, or gradients become immediately apparent. This format can be useful to decision-makers and the general public, who may not be well-versed in raw statistical outputs.
Heatmaps may be composed of square cells (often called grid-based or matrix heatmaps) or of smoothly contoured "continuous" values (often called kernel density or KDE heatmaps). The following maps show the density of tornado locations using both approaches.
If you squint your eyes while viewing the top map, you should see trends similar to those in the bottom map.
I prefer grid-based heatmaps because the sharp, distinct boundaries make it easier to compare the values of adjacent cells, and outliers don't get "smoothed away." I also have a soft spot for their pixelated appearance, as my first video games were Pong and Wolfenstein 3D.
In addition, kernel density heatmaps can be computationally expensive and sensitive to the input parameters. Their appearance is highly dependent on the chosen kernel function and its bandwidth or radius. A poor parameterization choice can either over-smooth or under-smooth the data, obscuring patterns.
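As a quick illustration of that sensitivity, the sketch below evaluates the same random points with two different bandwidths using SciPy's gaussian_kde. SciPy isn't used elsewhere in this project, so treat this purely as a side demonstration.

import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(42)
points = rng.normal(loc=[0, 0], scale=1.0, size=(200, 2)).T  # shape (2, N)

# Same data, two bandwidths: a small value chases noise, a large one blurs it.
kde_narrow = gaussian_kde(points, bw_method=0.1)
kde_wide = gaussian_kde(points, bw_method=1.0)

grid = np.mgrid[-3:3:100j, -3:3:100j].reshape(2, -1)
print(kde_narrow(grid).max(), kde_wide(grid).max())  # peak densities differ sharply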
In this project, we'll use Python to make static, grid-based heatmaps of tornado activity in the continental United States.
The Dataset
For this tutorial, we'll use tornado data from a wonderful public-domain database. Extending back to 1950, it covers tornado starting and ending locations, magnitudes, injuries, fatalities, financial costs, and more.
The data is available through NOAA's Storm Prediction Center. The CSV-format dataset, highlighted in yellow in the following figure, covers the period 1950 to 2023. Don't bother downloading it; for convenience, I've provided a link in the code to access it programmatically.

This CSV file contains 29 columns for nearly 69,000 tornadoes. You can find a key to the column names here. We'll work primarily with the tornado starting locations (slat, slon), the year (yr), and the storm magnitudes (mag).
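If you'd like a quick preview before writing any plotting code, a few lines of pandas will confirm the shape of the table and the columns we'll rely on (the shortened URL is the same one used in the project code below):

import pandas as pd

TORNADO_DATA_URL = 'https://bit.ly/40xJCMK'   # same shortened link used later
df = pd.read_csv(TORNADO_DATA_URL)
print(df.shape)                                # about 69,000 rows x 29 columns
print(df[['slat', 'slon', 'yr', 'mag']].head())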
Installing Libraries
You'll want to install NumPy, Matplotlib, pandas, and GeoPandas. The previous links provide installation instructions. We'll also use the Shapely library, which is part of the GeoPandas installation.
For reference, this project used the following versions:
python 3.10.18
numpy 2.2.5
matplotlib 3.10.0
pandas 2.2.3
geopandas 1.0.1
shapely 2.0.6
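If you want to confirm that your own environment matches, each of these libraries exposes a __version__ attribute you can print:

import matplotlib
import numpy
import pandas
import geopandas
import shapely

# Print the installed version of each library used in this project:
for lib in (numpy, matplotlib, pandas, geopandas, shapely):
    print(f'{lib.__name__:<12}{lib.__version__}')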
The Simplified Code
It doesn't take a lot of code to overlay a heatmap on a geographical map. The following code illustrates the basic process, although much of it serves ancillary purposes, such as limiting the US data to the Lower 48 states and improving the look of the colorbar.
In the following section, we’ll refactor and expand this code to perform additional filtering, employ more configuration constants for straightforward updates, and use helper functions for clarity.
# --- Plotting ---
print("Plotting results...")
fig, ax = plt.subplots(figsize=FIGURE_SIZE)
fig.subplots_adjust(top=0.85) # Make room for titles.
# Plot state boundaries and heatmap:
clipped_states.plot(ax=ax, color='none',
                    edgecolor='white', linewidth=1)
vmax = np.max(heatmap)
img = ax.imshow(heatmap.T,
                extent=[x_edges[0], x_edges[-1],
                        y_edges[0], y_edges[-1]],
                origin='lower',
                cmap=cmap, norm=norm,
                alpha=1.0, vmin=0, vmax=vmax)
# Annotations:
plt.text(TITLE_X, TITLE_Y, 'Tornado Density Heatmap',
         fontsize=22, fontweight='bold', ha='left')
plt.text(x=SUBTITLE_X, y=SUBTITLE_Y, s=(
    f'{START_YEAR}–{END_YEAR} EF Magnitude 3–5 '
    f'{GRID_SIZE_MILES}×{GRID_SIZE_MILES} mile cells'),
    fontsize=15, ha='left')
plt.text(x=SOURCE_X, y=SOURCE_Y,
         s='Data Source: NOAA Storm Prediction Center',
         color='white', fontsize=11,
         fontstyle='italic', ha='left')
# Clean up the axes:
ax.set_xlabel('')
ax.set_ylabel('')
ax.axis('off')
# Add the colorbar:
ticks = np.linspace(0, vmax, 6, dtype=int)
cbar = plt.colorbar(img, ax=ax, shrink=0.6, ticks=ticks)
cbar.set_label('\nTornado Count per Grid Cell', fontsize=15)
cbar.ax.set_yticklabels(list(map(str, ticks)))
print(f"Saving plot as '{SAVE_FILENAME}'...")
plt.savefig(SAVE_FILENAME, bbox_inches='tight', dpi=SAVE_DPI)
print("Plot saved.n")
plt.show()
Here’s the result:

The Expanded Code
The following Python code was written in JupyterLab and is presented cell by cell.
Importing Libraries / Assigning Constants
After importing the necessary libraries, we define a series of configuration constants that let us easily adjust the filter criteria, map boundaries, plot dimensions, and more. For this analysis, we focus on tornadoes within the contiguous United States, filtering out states and territories outside this area and selecting only significant events (those rated EF3 to EF5 on the Enhanced Fujita Scale) from the full dataset spanning 1950 to 2023. The results are aggregated into 50×50-mile grid cells.
import matplotlib.pyplot as plt
import pandas as pd
import geopandas as gpd
import numpy as np
from shapely.geometry import box
# --- Configuration Constants ---
# Data URLs:
TORNADO_DATA_URL = 'https://bit.ly/40xJCMK'
STATES_DATA_URL = ("https://www2.census.gov/geo/tiger/TIGER2020/STATE/"
                   "tl_2020_us_state.zip")
# Geographic Filtering:
EXCLUDED_STATES_ABBR = ['AK', 'HI', 'PR', 'VI']
TORNADO_MAGNITUDE_FILTER = [3, 4, 5]
# Year Filtering (inclusive):
START_YEAR = 1950
END_YEAR = 2023
# Coordinate Reference Systems (CRS):
CRS_LAT_LON = "EPSG:4326" # WGS 84 geographic CRS (lat/lon)
CRS_ALBERS_EQUAL_AREA = "EPSG:5070" # NAD83/Conus Albers (projected CRS in m)
# Bounding box for Contiguous US (CONUS) in Albers Equal Area (EPSG:5070 meters):
CONUS_BOUNDS_MIN_X = -2500000
CONUS_BOUNDS_MIN_Y = 100000
CONUS_BOUNDS_MAX_X = 2500000
CONUS_BOUNDS_MAX_Y = 3200000
# Grid Parameters for Heatmap (50x50 mile cells):
GRID_SIZE_MILES = 50
HEATMAP_GRID_SIZE = 80500 # ~50 miles in meters.
# Plotting Configuration:
FIGURE_SIZE = (15, 12)
SAVE_DPI = 600
SAVE_FILENAME = 'tornado_heatmap.png'
# Annotation positions (in EPSG:5070 meters, relative to plot extent):
TITLE_X = CONUS_BOUNDS_MIN_X
TITLE_Y = CONUS_BOUNDS_MAX_Y + 250000 # Offset above max Y
SUBTITLE_X = CONUS_BOUNDS_MIN_X
SUBTITLE_Y = CONUS_BOUNDS_MAX_Y + 100000 # Offset above max Y
SOURCE_X = CONUS_BOUNDS_MIN_X + 50000 # Slightly indented from min X
SOURCE_Y = CONUS_BOUNDS_MIN_Y + 20000 # Slightly above min Y
Defining Helper Functions
The next cell defines two helper functions: one for loading and filtering the tornado dataset and one for loading and filtering the state boundaries. Note that we call on the previous configuration constants throughout the process.
# --- Helper Functions ---
def load_and_filter_tornado_data():
    """Load data, apply filters, and create a GeoDataFrame."""
    print("Loading and filtering tornado data...")
    df_raw = pd.read_csv(TORNADO_DATA_URL)
    # Combine filtering steps into one chained operation:
    mask = (~df_raw['st'].isin(EXCLUDED_STATES_ABBR) &
            df_raw['mag'].isin(TORNADO_MAGNITUDE_FILTER) &
            (df_raw['yr'] >= START_YEAR) & (df_raw['yr'] <= END_YEAR))
    df = df_raw[mask].copy()
    # Create and project a GeoDataFrame:
    geometry = gpd.points_from_xy(df['slon'], df['slat'],
                                  crs=CRS_LAT_LON)
    temp_gdf = gpd.GeoDataFrame(df, geometry=geometry).to_crs(
        CRS_ALBERS_EQUAL_AREA)
    return temp_gdf


def load_and_filter_states():
    """Load US state boundaries and filter for CONUS."""
    print("Loading state boundary data...")
    states_temp_gdf = gpd.read_file(STATES_DATA_URL)
    states_temp_gdf = (states_temp_gdf[~states_temp_gdf['STUSPS']
                       .isin(EXCLUDED_STATES_ABBR)].copy())
    states_temp_gdf = states_temp_gdf.to_crs(CRS_ALBERS_EQUAL_AREA)
    return states_temp_gdf
Note that the tilde (~) in front of the mask and states_temp_gdf expressions inverts the results, so we select rows where the state abbreviation is not in the excluded list.
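Here's a toy example of that pattern with a few state abbreviations:

import pandas as pd

st = pd.Series(['TX', 'AK', 'OK', 'HI', 'KS'])
excluded = ['AK', 'HI', 'PR', 'VI']

print(st.isin(excluded).tolist())     # [False, True, False, True, False]
print((~st.isin(excluded)).tolist())  # [True, False, True, False, True] -> rows kept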
Running Helper Functions and Generating the Heatmap
The following cell calls the helper functions to load and filter the dataset, then clips the data to the map boundaries, generates the heatmap (with NumPy's histogram2d() function), and defines a continuous colormap for the heatmap. Note that we again call on the previous configuration constants throughout the process.
# --- Data Loading and Preprocessing ---
gdf = load_and_filter_tornado_data()
states_gdf = load_and_filter_states()
# Create bounding box and clip state boundaries & tornado points:
conus_bounds_box = box(CONUS_BOUNDS_MIN_X, CONUS_BOUNDS_MIN_Y,
                       CONUS_BOUNDS_MAX_X, CONUS_BOUNDS_MAX_Y)
clipped_states = gpd.clip(states_gdf, conus_bounds_box)
gdf = gdf[gdf.geometry.within(conus_bounds_box)].copy()
# --- Heatmap Generation ---
print("Generating heatmap bins...")
x_bins = np.arange(CONUS_BOUNDS_MIN_X, CONUS_BOUNDS_MAX_X +
                   HEATMAP_GRID_SIZE, HEATMAP_GRID_SIZE)
y_bins = np.arange(CONUS_BOUNDS_MIN_Y, CONUS_BOUNDS_MAX_Y +
                   HEATMAP_GRID_SIZE, HEATMAP_GRID_SIZE)
print("Computing 2D histogram...")
heatmap, x_edges, y_edges = np.histogram2d(gdf.geometry.x,
                                           gdf.geometry.y,
                                           bins=[x_bins, y_bins])
# Define continuous colormap (e.g., 'hot', 'viridis', 'plasma', etc.):
print("Defining continuous colormap...")
cmap = plt.cm.hot
norm = None
print("Finished execution.")
Because this process can take a few seconds, the print() function keeps us up to date on the program's progress:
Loading and filtering tornado data...
Loading state boundary data...
Generating heatmap bins...
Computing 2D histogram...
Defining continuous colormap...
Finished execution.
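One detail worth flagging before plotting: histogram2d() returns an array whose first axis follows the x bins, while imshow() treats the first axis as rows (y). That's why the plotting code below passes heatmap.T along with origin='lower'. A tiny example with made-up points:

import numpy as np

x = np.array([0.5, 1.5, 1.5])   # one point in the first x bin, two in the second
y = np.array([0.5, 0.5, 0.5])   # all three points in the first y bin
counts, xe, ye = np.histogram2d(x, y, bins=[[0, 1, 2], [0, 1, 2]])
print(counts)    # [[1. 0.] [2. 0.]] -> axis 0 is x, axis 1 is y
print(counts.T)  # transposed so rows run along y, the orientation imshow() expects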
Plotting the Results
The final cell generates and saves the plot. Matplotlib's imshow() method is responsible for plotting the heatmap. For more on imshow(), see this article.
# --- Plotting ---
print("Plotting results...")
fig, ax = plt.subplots(figsize=FIGURE_SIZE)
fig.subplots_adjust(top=0.85) # Make room for titles.
# Plot state boundaries and heatmap:
clipped_states.plot(ax=ax, color='none',
                    edgecolor='white', linewidth=1)
vmax = np.max(heatmap)
img = ax.imshow(heatmap.T,
                extent=[x_edges[0], x_edges[-1],
                        y_edges[0], y_edges[-1]],
                origin='lower',
                cmap=cmap, norm=norm,
                alpha=1.0, vmin=0, vmax=vmax)
# Annotations:
plt.text(TITLE_X, TITLE_Y, 'Tornado Density Heatmap',
         fontsize=22, fontweight='bold', ha='left')
plt.text(x=SUBTITLE_X, y=SUBTITLE_Y, s=(
    f'{START_YEAR}–{END_YEAR} EF Magnitude 3–5 '
    f'{GRID_SIZE_MILES}×{GRID_SIZE_MILES} mile cells'),
    fontsize=15, ha='left')
plt.text(x=SOURCE_X, y=SOURCE_Y,
         s='Data Source: NOAA Storm Prediction Center',
         color='white', fontsize=11,
         fontstyle='italic', ha='left')
# Clean up the axes:
ax.set_xlabel('')
ax.set_ylabel('')
ax.axis('off')
# Add the colorbar:
ticks = np.linspace(0, vmax, 6, dtype=int)
cbar = plt.colorbar(img, ax=ax, shrink=0.6, ticks=ticks)
cbar.set_label('\nTornado Count per Grid Cell', fontsize=15)
cbar.ax.set_yticklabels(list(map(str, ticks)))
print(f"Saving plot as '{SAVE_FILENAME}'...")
plt.savefig(SAVE_FILENAME, bbox_inches='tight', dpi=SAVE_DPI)
print("Plot saved.n")
plt.show()
This produces the following map:

This is an attractive map, and it's more informative than a smooth KDE alternative.
Adding Tornado Ending Locations
Our tornado density heatmaps are based on the starting locations of tornadoes. But tornadoes move after touching down. The average tornado track is about 3 miles long, but the numbers increase for stronger storms: EF-3 tornadoes average 18 miles, and EF-4 tornadoes average 27 miles. Long-track tornadoes are rare, however, comprising less than 2% of all tornadoes.
Because the average tornado track length is less than the 50-mile dimension of our grid cells, we can expect most tracks to cover only one or two cells. Thus, including the tornado ending locations should help us improve our density measurement.
To do this, we'll need to combine the starting and ending lat-lon locations and remove ending points that share the same grid cell as their corresponding starting points. Otherwise, we'll "double-dip" when counting. The code for this is a bit long, so I've stored it in this Gist.
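The Gist handles the details, but the core idea looks roughly like the sketch below. It reuses gdf, the configuration constants, and the bin edges from the earlier cells, and it assumes the SPC columns elat and elon hold the ending coordinates, with zeros marking tornadoes that have no recorded end point. Treat it as an illustration of the approach rather than a drop-in replacement for the Gist.

import numpy as np
import pandas as pd
import geopandas as gpd

# Build projected ending points; 'elat'/'elon' are assumed SPC column names,
# and rows with 0/0 coordinates (no recorded end point) are dropped.
ends = gdf[(gdf['elat'] != 0) & (gdf['elon'] != 0)].copy()
ends = gpd.GeoDataFrame(
    ends.drop(columns='geometry'),
    geometry=gpd.points_from_xy(ends['elon'], ends['elat'], crs=CRS_LAT_LON)
).to_crs(CRS_ALBERS_EQUAL_AREA)

def cell_of(points):
    """Return the (column, row) grid-cell index for each projected point."""
    col = (points.geometry.x - CONUS_BOUNDS_MIN_X) // HEATMAP_GRID_SIZE
    row = (points.geometry.y - CONUS_BOUNDS_MIN_Y) // HEATMAP_GRID_SIZE
    return pd.Series(list(zip(col, row)), index=points.index)

# Keep only ending points that land in a *different* cell than their start,
# so a single tornado is never counted twice in the same cell:
different_cell = cell_of(ends) != cell_of(gdf.loc[ends.index])
combined = pd.concat([gdf, ends[different_cell]])

# Recompute the 2D histogram on the combined start and end points:
heatmap2, _, _ = np.histogram2d(combined.geometry.x, combined.geometry.y,
                                bins=[x_bins, y_bins])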
Here’s a comparison of the “start only” map with the combined starting and ending locations:

The prevailing winds drive most tornadoes toward the east and northeast. You can see the impact in states like Missouri, Mississippi, Alabama, and Tennessee. These areas have brighter cells in the bottom map due to tornadoes starting in one cell and ending in the adjacent cell to the east. Note also that the maximum number of tornadoes in a given cell has increased from 29 (in the upper map) to 33 (in the bottom map).
Recap
We used Python, pandas, GeoPandas, and Matplotlib to project and overlay heatmaps onto geographical maps. Geospatial heatmaps are a highly effective way to visualize regional trends, patterns, hotspots, and outliers in statistical data.
If you enjoy these types of projects, be sure to check out my book, available in bookstores and online.
