From GeoJSON to Network Graph: Analyzing World Country Borders in Python

Using NetworkX for Graph-Based Country Border Analysis

Photo by Maksim Shutov on Unsplash

Python offers a wide range of libraries that let us quickly and easily tackle problems in many research areas. Geospatial data analysis and graph theory are two areas where Python provides a powerful set of libraries. In this article, we’ll conduct a simple analysis of world borders, specifically exploring which countries share borders with one another. We’ll begin with a GeoJSON file containing polygons for all countries worldwide. The ultimate goal is to create a graph representing these borders using NetworkX and then use this graph to perform several analyses.

GeoJSON files represent geographical areas and are widely used in geographical analysis and visualization. The first stage of our analysis involves reading the countries.geojson file and converting it into a GeoDataFrame using GeoPandas. This file was sourced from a GitHub repository and contains polygons representing the different countries of the world.

GeoDataFrame with Comprehensive Country Information (Image created by the author)

As shown above, the GeoDataFrame contains the following columns:

  1. ADMIN: Represents the administrative name of the geographical area, such as the country or region name.
  2. ISO_A3: Stands for the ISO 3166-1 alpha-3 country code, a three-letter code uniquely identifying countries.
  3. ISO_A2: Denotes the ISO 3166-1 alpha-2 country code, a two-letter code also used for country identification.
  4. geometry: Contains the geometrical information that defines the shape of the geographical area, represented as MULTIPOLYGON data.
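As a minimal sketch of this loading step: with the real file, `gpd.read_file("countries.geojson")` would typically be enough. To keep the example runnable without that file, the snippet below builds a tiny, made-up FeatureCollection (the country names and codes are hypothetical) with the same four columns and converts it with `GeoDataFrame.from_features`.

```python
import geopandas as gpd

# With the real dataset this would simply be:
#   gdf = gpd.read_file("countries.geojson")
# The hand-made collection below is a hypothetical stand-in for that file.
toy_collection = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"ADMIN": "Squareland", "ISO_A3": "SQL", "ISO_A2": "SQ"},
            "geometry": {
                "type": "MultiPolygon",
                "coordinates": [[[[0, 0], [2, 0], [2, 2], [0, 2], [0, 0]]]],
            },
        },
        {
            "type": "Feature",
            "properties": {"ADMIN": "Rectland", "ISO_A3": "RCT", "ISO_A2": "RC"},
            "geometry": {
                "type": "MultiPolygon",
                "coordinates": [[[[2, 0], [5, 0], [5, 2], [2, 2], [2, 0]]]],
            },
        },
    ],
}

# GeoJSON coordinates are longitude/latitude in WGS 84 (EPSG:4326)
gdf = gpd.GeoDataFrame.from_features(toy_collection["features"], crs="EPSG:4326")

print(gdf[["ADMIN", "ISO_A3", "ISO_A2"]])
print(gdf.geom_type.tolist())
```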

You can visualize all the multipolygons that make up the GeoDataFrame using the plot method, as shown in the figure below.

Visual Representation of the GeoDataFrame (Image created by the author)
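A call along the following lines would produce such a plot. This is only a sketch on two made-up polygons; the styling arguments are illustrative choices, not necessarily the ones used for the figure, and the `Agg` backend is selected just so the snippet also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook

import geopandas as gpd
from shapely.geometry import MultiPolygon, Polygon

# Hypothetical stand-in for the countries GeoDataFrame
gdf = gpd.GeoDataFrame(
    {"ADMIN": ["Squareland", "Rectland"]},
    geometry=[
        MultiPolygon([Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])]),
        MultiPolygon([Polygon([(2, 0), (5, 0), (5, 2), (2, 2)])]),
    ],
)

# GeoDataFrame.plot draws every geometry and returns the matplotlib Axes
ax = gdf.plot(figsize=(8, 4), color="lightgray", edgecolor="black")
ax.set_title("Toy GeoDataFrame")
```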

The multipolygons in the geometry column belong to the class shapely.geometry.multipolygon.MultiPolygon. These objects have various attributes, one of which is centroid. The centroid attribute provides the geometric center of the MULTIPOLYGON and returns a POINT that represents this center.

We can then use this POINT to extract the latitude and longitude of each MULTIPOLYGON and store the results in two columns of the GeoDataFrame. We perform this calculation because we’ll later use these latitude and longitude values to position the nodes of the graph according to their real geographic locations.
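The centroid step can be sketched as follows on two made-up polygons. The column names `longitude` and `latitude` are assumptions chosen for this example. Note that the centroid is computed in plain planar coordinates; when the GeoDataFrame carries a geographic CRS, GeoPandas emits a warning about this, which is acceptable here because the values are only used to place nodes on a plot.

```python
import geopandas as gpd
from shapely.geometry import MultiPolygon, Polygon

# Hypothetical stand-in for the countries GeoDataFrame
gdf = gpd.GeoDataFrame(
    {"ADMIN": ["Squareland", "Rectland"]},
    geometry=[
        MultiPolygon([Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])]),
        MultiPolygon([Polygon([(2, 0), (5, 0), (5, 2), (2, 2)])]),
    ],
)

# centroid returns one POINT per geometry; x is longitude, y is latitude
centroids = gdf.geometry.centroid
gdf["longitude"] = centroids.x
gdf["latitude"] = centroids.y

print(gdf[["ADMIN", "longitude", "latitude"]])
```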

Now it’s time to build the graph that will represent the borders between the different countries of the world. In this graph, the nodes represent countries, while the edges indicate the existence of a border between them. If two countries share a border, the graph will have an edge connecting their nodes; otherwise, there will be no edge.

The function create_country_network processes the information in the GeoDataFrame and constructs a Graph representing country borders.

First, the function iterates through each row of the GeoDataFrame, where each row corresponds to a different country. It then creates a node for the country, adding latitude and longitude as node attributes.

If the geometry is not valid, the function repairs it using the buffer(0) method. This method fixes invalid geometries by applying a buffer operation with a distance of zero, which resolves problems such as self-intersections or other geometric irregularities in the multipolygon representation.
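To illustrate the repair step in isolation, the sketch below builds a deliberately self-intersecting "bowtie" polygon (a made-up shape, not taken from the dataset) and applies buffer(0) to it. The repaired geometry is valid, although depending on the Shapely and GEOS versions it may not cover exactly the same area as the original outline, so it is worth inspecting repaired geometries when precision matters.

```python
from shapely.geometry import Polygon

# The ring crosses itself at (1, 1), so the polygon is invalid
bowtie = Polygon([(0, 0), (2, 2), (2, 0), (0, 2)])
print(bowtie.is_valid)

# Same pattern as described above: only repair when the geometry is invalid
repaired = bowtie if bowtie.is_valid else bowtie.buffer(0)
print(repaired.is_valid, repaired.geom_type)
```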

After creating the nodes, the next step is to populate the network with the relevant edges. To do this, we iterate through the different pairs of countries, and if the polygons representing both countries intersect, it implies they share a common border, so an edge is created between their nodes.
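Putting the steps above together, a function of this kind could look roughly like the sketch below. It is a reconstruction from the description, not the original code: the column names `ADMIN`, `latitude` and `longitude` are assumed, and the three toy countries are made up. Keep in mind that `intersects` is also true for geometries that touch at a single point, so corner contacts count as borders too.

```python
from itertools import combinations

import geopandas as gpd
import networkx as nx
from shapely.geometry import Polygon


def create_country_network(gdf):
    """One node per row, one edge per pair of intersecting geometries."""
    G = nx.Graph()
    geometries = {}
    for _, row in gdf.iterrows():
        geom = row["geometry"]
        if not geom.is_valid:
            geom = geom.buffer(0)  # repair invalid geometries first
        geometries[row["ADMIN"]] = geom
        G.add_node(row["ADMIN"], latitude=row["latitude"], longitude=row["longitude"])
    # Pairwise check: quadratic in the number of countries, fine for ~250 rows
    for a, b in combinations(geometries, 2):
        if geometries[a].intersects(geometries[b]):
            G.add_edge(a, b)
    return G


# Hypothetical toy data: two adjacent countries and one isolated island
toy = gpd.GeoDataFrame(
    {"ADMIN": ["Squareland", "Rectland", "Islandia"]},
    geometry=[
        Polygon([(0, 0), (2, 0), (2, 2), (0, 2)]),
        Polygon([(2, 0), (5, 0), (5, 2), (2, 2)]),
        Polygon([(10, 10), (11, 10), (11, 11), (10, 11)]),
    ],
)
toy["longitude"] = toy.geometry.centroid.x
toy["latitude"] = toy.geometry.centroid.y

G = create_country_network(toy)
print(list(G.edges))
```

In this toy example only Squareland and Rectland end up connected, while Islandia remains an isolated node, just like the island territories in the real dataset.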

The next step involves visualizing the created network, where nodes represent countries worldwide and edges signify the presence of borders between them.

The function plot_country_network_on_map is responsible for processing the nodes and edges of the graph G and displaying them on a map.

Network of Country Borders (Image created by the author)

The positions of the nodes on the graph are determined by the latitude and longitude coordinates of the countries. In addition, a map has been placed in the background to give clearer context for the created network. This map was generated using the boundary attribute of the GeoDataFrame, which provides the geometrical boundaries of the represented countries.
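A plotting function along these lines might look like the sketch below. The signature, colors and sizes are assumptions made for illustration, and the data is a made-up two-country example. The background comes from `gdf.boundary.plot`, and the node positions are taken from the `longitude` and `latitude` node attributes.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook

import geopandas as gpd
import matplotlib.pyplot as plt
import networkx as nx
from shapely.geometry import Polygon


def plot_country_network_on_map(G, gdf):
    """Draw country outlines, then the border graph on top of them."""
    # (x, y) = (longitude, latitude) so nodes sit at their geographic position
    pos = {n: (d["longitude"], d["latitude"]) for n, d in G.nodes(data=True)}
    fig, ax = plt.subplots(figsize=(10, 6))
    gdf.boundary.plot(ax=ax, color="gray", linewidth=0.8)
    nx.draw_networkx(
        G, pos=pos, ax=ax,
        node_size=40, node_color="tab:red", edge_color="tab:blue", font_size=8,
    )
    ax.set_title("Network of country borders (toy data)")
    return ax


# Hypothetical toy data
gdf = gpd.GeoDataFrame(
    {"ADMIN": ["Squareland", "Rectland"]},
    geometry=[
        Polygon([(0, 0), (2, 0), (2, 2), (0, 2)]),
        Polygon([(2, 0), (5, 0), (5, 2), (2, 2)]),
    ],
)
G = nx.Graph()
G.add_node("Squareland", longitude=1.0, latitude=1.0)
G.add_node("Rectland", longitude=3.5, latitude=1.0)
G.add_edge("Squareland", "Rectland")

ax = plot_country_network_on_map(G, gdf)
```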

It is important to note one detail: in the GeoJSON file used, there are islands that are treated as independent countries, even though they administratively belong to a specific country. This is why you may see numerous points in maritime areas. Keep in mind that the graph is based on the information available in the GeoJSON file from which it was generated. If we were to use a different file, the resulting graph would be different.

The country border network we’ve created can quickly help us answer multiple questions. Below, we’ll outline three insights that can easily be derived by processing the information provided by the network. However, there are many other questions that this network can help us answer.

Insight 1: Examining Borders of a Chosen Nation

In this section, we’ll visually assess the neighbors of a specific country.

The plot_country_borders function enables quick visualization of the borders of a specific country. This function generates a subgraph containing the country provided as input and its neighboring countries. It then visualizes these countries, making it easy to observe the neighbors of a specific nation. In this instance, the chosen country is Mexico, but we can easily change the input to visualize any other country.

Network of Country Borders in Mexico (Image created by the author)

As you can see in the generated image, Mexico shares its border with three countries: the United States, Belize, and Guatemala.
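The subgraph step behind this visualization can be sketched with NetworkX alone. The helper name `country_with_neighbors` is hypothetical, and the edge list is a small hand-written subset of real land borders rather than the graph built from the GeoJSON file.

```python
import networkx as nx

# Hand-written subset of land borders, for illustration only
G = nx.Graph()
G.add_edges_from([
    ("Mexico", "United States of America"),
    ("Mexico", "Guatemala"),
    ("Mexico", "Belize"),
    ("Guatemala", "Belize"),
    ("Guatemala", "Honduras"),
    ("Guatemala", "El Salvador"),
    ("United States of America", "Canada"),
])


def country_with_neighbors(G, country):
    """Subgraph induced by a country and the countries it borders."""
    return G.subgraph([country, *G.neighbors(country)])


sub = country_with_neighbors(G, "Mexico")
print(sorted(sub.nodes))
print(sub.number_of_edges())
```

Because the subgraph is induced, it also keeps borders between the neighbors themselves (here, Guatemala and Belize), not only the edges that touch Mexico.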

Insight 2: Top 10 Countries with the Most Borders

In this section, we’ll analyze which countries have the greatest number of neighboring countries and display the results on the screen. To achieve this, we have implemented the calculate_top_border_countries function. This function counts the neighbors of every node in the network and displays only those with the highest number of neighbors (top 10).

Top 10 Nations with the Most Borders (Image created by the author)

We must reiterate that the results obtained depend on the initial GeoJSON file. In this case, the Siachen Glacier is coded as a separate country, which is why it appears as sharing a border with China.
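In graph terms, the number of neighbors of a country is the degree of its node, so the ranking can be sketched as a sort over `G.degree`. The `top_n` parameter is an assumption of this sketch, and the edge list is again a small hand-written subset of European borders, so the counts below are not the real totals.

```python
import networkx as nx

# Hand-written subset of land borders, for illustration only
G = nx.Graph()
G.add_edges_from([
    ("France", "Spain"), ("France", "Germany"), ("France", "Belgium"),
    ("France", "Italy"), ("France", "Switzerland"),
    ("Germany", "Belgium"), ("Germany", "Switzerland"), ("Germany", "Poland"),
    ("Italy", "Switzerland"), ("Spain", "Portugal"),
])


def calculate_top_border_countries(G, top_n=10):
    """Return (country, number_of_neighbors) pairs, highest degree first."""
    return sorted(G.degree, key=lambda pair: pair[1], reverse=True)[:top_n]


for country, n_borders in calculate_top_border_countries(G, top_n=3):
    print(f"{country}: {n_borders}")
```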

Insight 3: Exploring the Shortest Country-to-Country Routes

We conclude our analysis with a route assessment. In this case, we’ll evaluate the minimum number of borders one must cross when traveling from an origin country to a destination country.

The find_shortest_path_between_countries function calculates the shortest path between an origin country and a destination country. However, it is important to note that this function returns only one of the possible shortest paths. This limitation arises from its use of the shortest_path function from NetworkX, which returns a single shortest path even when several paths of the same length exist.

To access more than one path between two points, including multiple shortest paths, alternatives are available. In the context of the find_shortest_path_between_countries function, one could explore options such as all_shortest_paths, which returns every shortest path between two nodes, or all_simple_paths, which enumerates all paths without repeated nodes regardless of their length. Which one to use depends on the specific requirements of the analysis.

We used the function to find the shortest path between Spain and Poland, and the analysis revealed that the minimum number of border crossings required to travel from Spain to Poland is 3.

Finding the Optimal Route from Spain to Poland (Image created by the author)
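The route computation can be sketched with `nx.shortest_path` and `nx.all_shortest_paths`. The graph below is a small hand-written subset of European land borders, not the full network from the GeoJSON file, so it only illustrates the mechanics. The number of border crossings is the number of edges on the path, that is, the number of nodes minus one.

```python
import networkx as nx

# Hand-written subset of land borders, for illustration only
G = nx.Graph()
G.add_edges_from([
    ("Spain", "Portugal"), ("Spain", "France"),
    ("France", "Germany"), ("France", "Italy"),
    ("Germany", "Poland"), ("Germany", "Austria"), ("Germany", "Czech Republic"),
    ("Italy", "Austria"), ("Czech Republic", "Poland"),
])

# shortest_path returns a single path, even if several are equally short
path = nx.shortest_path(G, source="Spain", target="Poland")
crossings = len(path) - 1
print(path, crossings)

# all_shortest_paths yields every path of minimal length
austria_paths = list(nx.all_shortest_paths(G, source="Spain", target="Austria"))
for p in austria_paths:
    print(p)
```

In this toy graph the route from Spain to Poland is unique, while Spain to Austria has two equally short routes (via Germany or via Italy), which is exactly the case where `shortest_path` alone would hide one of them.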

Python offers a plethora of libraries spanning various domains of knowledge, which can be seamlessly integrated into any data science project. In this instance, we have used libraries dedicated to both geospatial data analysis and graph analysis to create a graph representing the world’s borders. We have then demonstrated use cases for this graph that quickly answer questions, enabling us to conduct geographical analysis with little effort.

Thanks for reading.

Amanda Iglesias