Structure and Relationships: Graph Neural Networks and a PyTorch Implementation

Let’s implement a regression example where the aim is to train a network to predict the value of a node given the values of all other nodes, i.e. each node has a single feature (which is a scalar value). The aim of this example is to leverage the inherent relational information encoded in the graph to accurately predict numerical values for each node. The key thing to note is that we input the numerical values for all nodes except the target node (we mask the target node’s value with 0), then predict the target node’s value. For each data point, we repeat the process for all nodes. Perhaps this comes across as a strange task, but let’s see if we can predict the expected value of any node given the values of the other nodes. The data used is simulation data corresponding to a series of sensors from industry, and the graph structure I have chosen in the example below is based on the actual process structure. I have provided comments in the code to make it easy to follow. You can find a copy of the dataset here (note: this is my own data, generated from simulations).

This code and training procedure are far from optimised; the aim is to illustrate the implementation of GNNs and build an intuition for how they work. One issue with the current approach, which should definitely not be done this way beyond learning purposes, is the masking of the node feature value and predicting it from the neighbours’ features. Currently you would have to loop over each node (not very efficient); a much better way is to stop the model from including its own features in the aggregation step, so that you would not have to do one node at a time. I thought it is easier to build intuition for the model with the current method, but a minimal sketch of the more efficient alternative follows below :)
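As a rough sketch of that more efficient alternative (my own assumption, not part of this article’s implementation): PyTorch Geometric’s GATConv accepts add_self_loops=False, which keeps each node’s own feature out of its aggregation, so every node can be predicted in a single forward pass. Note that with two layers, a node’s value can still leak back to itself over a two-hop round trip if reverse edges exist, so a strict leave-one-out setup needs more care than this sketch.

import torch.nn as nn
import torch.nn.functional as F
from torch_geometric.nn import GATConv

class LeaveOneOutGNN(nn.Module):
    # hypothetical sketch: predicts every node from its neighbours in one pass
    def __init__(self, num_node_features):
        super().__init__()
        # add_self_loops=False keeps a node's own feature out of its aggregation
        self.conv1 = GATConv(num_node_features, 16, add_self_loops=False)
        self.conv2 = GATConv(16, 8, add_self_loops=False)
        self.fc = nn.Linear(8, 1)

    def forward(self, data):
        x = F.relu(self.conv1(data.x, data.edge_index))
        x = F.relu(self.conv2(x, data.edge_index))
        return self.fc(x)  # one prediction per node, in a single pass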

Preprocessing Data

Import the necessary libraries and the sensor data from a CSV file, and normalise all data to the range 0 to 1.

import pandas as pd
import torch
from torch_geometric.data import Data
from torch_geometric.loader import DataLoader
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
import numpy as np

# load and scale the dataset
df = pd.read_csv('SensorDataSynthetic.csv').dropna()
scaler = MinMaxScaler()
df_scaled = pd.DataFrame(scaler.fit_transform(df), columns=df.columns)
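As an illustrative aside (not used further below): since the model is trained and evaluated in this scaled space, the fitted scaler can later map values back to the original sensor units.

# illustrative: map scaled values back to the original sensor units
df_original = pd.DataFrame(scaler.inverse_transform(df_scaled), columns=df.columns)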

Define the connectivity (edge index) between the nodes in the graph using a PyTorch tensor; this provides the system’s graphical topology.

nodes_order = [
    'Sensor1', 'Sensor2', 'Sensor3', 'Sensor4',
    'Sensor5', 'Sensor6', 'Sensor7', 'Sensor8'
]

# define the graph connectivity for the data
edges = torch.tensor([
    [0, 1, 2, 2, 3, 3, 6, 2],  # source nodes
    [1, 2, 3, 4, 5, 6, 2, 7]   # target nodes
], dtype=torch.long)
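One thing worth noting (my own observation, not from the original walkthrough): the edge list above is directed, so Sensor1 (node 0) only ever sends information and never receives any. If symmetric information flow were wanted, PyTorch Geometric can add the reverse of every edge:

# optional: make the graph undirected by adding reverse edges
from torch_geometric.utils import to_undirected
undirected_edges = to_undirected(edges)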

The data imported from the CSV has a tabular structure, but to use it in GNNs it must be transformed into a graph structure. Each row of data (one observation) is represented as one graph, so we iterate through each row to create a graph representation of the data.

A mask is created for each node/sensor to indicate the presence (1) or absence (0) of data, allowing for flexibility in handling missing data. In most systems there may be items with no data available, hence the need for flexibility in handling missing data. Finally, split the data into training and testing sets.

graphs = []

# iterate through each row of data to create a graph for each observation
# some nodes may not have any data; that is not the case here, but the mask
# lets us deal with any nodes that do not have data available
for _, row in df_scaled.iterrows():
    node_features = []
    node_data_mask = []
    for node in nodes_order:
        if node in df_scaled.columns:
            node_features.append([row[node]])
            node_data_mask.append(1)  # mask value of 1 to indicate data is present
        else:
            node_features.append([0.0])  # placeholder feature for missing nodes
            node_data_mask.append(0)  # data not present

    node_features_tensor = torch.tensor(node_features, dtype=torch.float)
    node_data_mask_tensor = torch.tensor(node_data_mask, dtype=torch.float)

    # create a Data object for this row/graph
    # PyTorch Geometric expects edge_index of shape [2, num_edges]
    graph_data = Data(x=node_features_tensor, edge_index=edges, mask=node_data_mask_tensor)
    graphs.append(graph_data)

# split the data into train and test observations
# Split indices
observation_indices = df_scaled.index.tolist()
train_indices, test_indices = train_test_split(observation_indices, test_size=0.05, random_state=42)

# Create training and testing graphs
train_graphs = [graphs[i] for i in train_indices]
test_graphs = [graphs[i] for i in test_indices]
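As a quick sanity check (illustrative, not in the original), each graph should hold one scalar feature per sensor plus the shared edge index:

print(graphs[0])
# e.g. Data(x=[8, 1], edge_index=[2, 8], mask=[8])
print(f'{len(train_graphs)} training graphs, {len(test_graphs)} test graphs')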

Graph Visualisation

The graph structure created above using the edge indices can be visualised using networkx.

import networkx as nx
import matplotlib.pyplot as plt

G = nx.Graph()
for src, dst in edges.t().numpy():
    G.add_edge(nodes_order[src], nodes_order[dst])

plt.figure(figsize=(10, 8))
pos = nx.spring_layout(G)
nx.draw(G, pos, with_labels=True, node_color='lightblue', edge_color='gray', node_size=2000, font_weight='bold')
plt.title('Graph Visualization')
plt.show()

Model Definition

Let’s define the model. The model contains two GAT convolutional layers. The first layer transforms the node features to a 16-dimensional space, and the second GAT layer further reduces this to an 8-dimensional representation.

GNNs are highly prone to overfitting, so regularisation (dropout) is applied after each GAT layer with a user-defined probability to prevent overfitting. The dropout layer essentially randomly zeroes some of the elements of the input tensor during training.

The GAT convolution layer outputs are passed through a fully connected (linear) layer to map the 8-dimensional output to the final node prediction, which in this case is a scalar value per node.

Masking the value of the target node: as mentioned earlier, the aim of this task is to regress the value of the target node based on the values of its neighbours. That is the reason behind masking/replacing the target node’s value with zero.

from torch_geometric.nn import GATConv
import torch.nn.functional as F
import torch.nn as nn

class GNNModel(nn.Module):
    def __init__(self, num_node_features):
        super(GNNModel, self).__init__()
        self.conv1 = GATConv(num_node_features, 16)
        self.conv2 = GATConv(16, 8)
        self.fc = nn.Linear(8, 1)  # output a single value per node

    def forward(self, data, target_node_idx=None):
        x, edge_index = data.x, data.edge_index
        x = x.clone()

        # mask the target node's feature with a value of zero
        # the aim is to predict this value from the features of the neighbours
        if target_node_idx is not None:
            x[target_node_idx] = torch.zeros_like(x[target_node_idx])

        x = F.relu(self.conv1(x, edge_index))
        x = F.dropout(x, p=0.05, training=self.training)
        x = F.relu(self.conv2(x, edge_index))
        x = F.dropout(x, p=0.05, training=self.training)
        x = self.fc(x)

        return x
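A quick shape check (illustrative, not part of the original walkthrough) confirms the model returns one prediction per node:

# illustrative: one forward pass on a single graph with node 3 masked
sanity_model = GNNModel(num_node_features=1)
out = sanity_model(graphs[0], target_node_idx=3)
print(out.shape)  # torch.Size([8, 1]), i.e. one value per node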

Training the model

Initialise the model and define the optimiser, loss function and the hyperparameters, including the learning rate, weight decay (for regularisation), batch size and number of epochs.

model = GNNModel(num_node_features=1)
batch_size = 8  # used below to accumulate gradients over several graphs
optimizer = torch.optim.Adam(model.parameters(), lr=0.0002, weight_decay=1e-6)
criterion = torch.nn.MSELoss()
num_epochs = 200
train_loader = DataLoader(train_graphs, batch_size=1, shuffle=True)
model.train()

The training process is fairly standard: each graph (one data point) is passed through the forward pass of the model, iterating over each node and predicting the target node. The loss from the predictions is accumulated over the defined batch size before updating the GNN through backpropagation.

for epoch in range(num_epochs):
    accumulated_loss = 0
    optimizer.zero_grad()
    loss = 0
    for batch_idx, data in enumerate(train_loader):
        mask = data.mask
        # start from node 1: Sensor1 (node 0) has no incoming edges in the
        # edge list above, so its value cannot be predicted from neighbours
        for i in range(1, data.num_nodes):
            if mask[i] == 1:  # only train on nodes with data
                output = model(data, i)  # predictions with the target node masked
                target = data.x[i]
                prediction = output[i].view(1)
                loss += criterion(prediction, target)
        # update parameters at the end of each set of batches
        if (batch_idx + 1) % batch_size == 0 or (batch_idx + 1) == len(train_loader):
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
            accumulated_loss += loss.item()
            loss = 0

    average_loss = accumulated_loss / len(train_loader)
    print(f'Epoch {epoch+1}, Average Loss: {average_loss}')

Testing the trained model

Using the test dataset, pass each graph through the forward pass of the trained model and predict each node’s value based on its neighbours’ values.

test_loader = DataLoader(test_graphs, batch_size=1, shuffle=True)
model.eval()

actual = []
pred = []

with torch.no_grad():  # no gradients needed for evaluation
    for data in test_loader:
        mask = data.mask
        for i in range(1, data.num_nodes):
            if mask[i] == 1:  # only evaluate nodes with data, as in training
                output = model(data, i)
                prediction = output[i].view(1)
                target = data.x[i]

                actual.append(target)
                pred.append(prediction)
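Before plotting, a quick numerical summary can be useful (an illustrative addition using sklearn, not in the original):

from sklearn.metrics import mean_squared_error, r2_score

y_true = [value.item() for value in actual]
y_pred = [value.item() for value in pred]
print(f'Test MSE: {mean_squared_error(y_true, y_pred):.4f}')
print(f'Test R^2: {r2_score(y_true, y_pred):.4f}')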

Visualising the test results

Using iplot, we can visualise the predicted values of the nodes against the ground-truth values.

import plotly.graph_objects as go
from plotly.offline import iplot

actual_values_float = [value.item() for value in actual]
pred_values_float = [value.item() for value in pred]

scatter_trace = go.Scatter(
    x=actual_values_float,
    y=pred_values_float,
    mode='markers',
    marker=dict(
        size=10,
        opacity=0.5,
        color='rgba(255,255,255,0)',
        line=dict(
            width=2,
            color='rgba(152, 0, 0, .8)',
        )
    ),
    name='Actual vs Predicted'
)

line_trace = go.Scatter(
    x=[min(actual_values_float), max(actual_values_float)],
    y=[min(actual_values_float), max(actual_values_float)],
    mode='lines',
    marker=dict(color='blue'),
    name='Perfect Prediction'
)

data = [scatter_trace, line_trace]

layout = dict(
    title='Actual vs Predicted Values',
    xaxis=dict(title='Actual Values'),
    yaxis=dict(title='Predicted Values'),
    autosize=False,
    width=800,
    height=600
)

fig = dict(data=data, layout=layout)

iplot(fig)

Despite a lack of meaningful tuning of the model architecture or hyperparameters, it has actually done a decent job; we could tune the model further to get improved accuracy.

This brings us to the end of this article. GNNs are relatively newer than other branches of machine learning, and it will be very exciting to see the developments in this field, as well as its application to different problems. Finally, thank you for taking the time to read this article; I hope you found it useful in developing your understanding of GNNs and their mathematical background.

Unless otherwise noted, all images are by the author.
