Introduction to PyTorch: from training loop to prediction
Install PyTorch and other dependencies
Import and explore the dataset
Create the datasets and dataloaders classes
Training, validation and test datasets
Data normalization
Neural network implementation in PyTorch
Utility functions for plotting and accuracy calculation
Model training
Neural network performance evaluation
Create predictions
Really useful Reads
Useful Links (written by me)

Image by author.

In this post we will cover how to implement a logistic regression model using PyTorch in Python.

PyTorch is widely used by the community of data scientists and machine learning engineers around the world, so learning this tool becomes a necessary step in your learning path if you want to build a career in the field of applied AI.

It joins TensorFlow, another very famous deep learning framework developed by Google.

There are no notable fundamental differences between the two, apart from the structure and organization of their APIs, which can be very different.

While both frameworks allow us to create very complex neural networks, PyTorch is often preferred because of its more Pythonic style and the freedom it gives the developer to integrate custom logic into the software.

We will use the Sklearn breast cancer dataset, an open source dataset already used in some of my previous articles, to train a binary classification model.

The goal is to explain how to:

  • go from a pandas dataframe to PyTorch’s Datasets and DataLoaders
  • create a neural network for binary classification in PyTorch
  • create predictions
  • evaluate the performance of our model with utility functions and matplotlib
  • use this network to make predictions

By the end of this article you will have a clearer picture of how to build and train a simple neural network in PyTorch.

Let’s start!

We start our project by creating a virtual environment in a dedicated folder.

Visit this link to learn how to create a virtual environment with Conda.

Once our virtual environment has been created, we can run the command

$ pip install torch -U

in the terminal. This command will install the latest version of PyTorch, which as of this writing is version 2.0.

Starting a notebook, we can check the library version using torch.__version__ after doing import torch.

We can confirm that PyTorch is correctly installed in the environment by importing it and running a small test script, as shown in the official guide.

import torch

x = torch.rand(5, 3)
print(x)

>>> tensor([[0.3890, 0.6087, 0.2300],
            [0.1866, 0.4871, 0.9468],
            [0.2254, 0.7217, 0.4173],
            [0.1243, 0.1482, 0.6797],
            [0.2430, 0.4608, 0.8886]])

If the script executes correctly then we are ready to proceed with the project. Otherwise I suggest the reader refer to the official guide located at https://pytorch.org/get-started/locally/.

Let’s proceed with the installation of the additional dependencies:

  • Sklearn; pip install scikit-learn
  • Pandas; pip install pandas
  • Matplotlib; pip install matplotlib

Libraries like NumPy are automatically installed when you install PyTorch.

Let’s start by importing the installed libraries and the breast cancer dataset from Sklearn with the following code snippet

import torch
import pandas as pd
import numpy as np

from sklearn.datasets import load_breast_cancer

import matplotlib.pyplot as plt

breast_cancer_dataset = load_breast_cancer(as_frame=True, return_X_y=True)

Let’s create a dataframe dedicated to holding our X and y like this

df = breast_cancer_dataset[0]
df['target'] = breast_cancer_dataset[1]
Example of the dataframe. Image by author.

Our goal is to create a model that can predict the target column based on the characteristics in the other columns.

Let’s do a minimum of exploratory analysis to get some awareness of the dataset. We will use the sweetviz library to automatically create an analysis report.

We can install sweetviz with the command pip install sweetviz and create an EDA (exploratory data analysis) report with this piece of code

import sweetviz as sv

eda_report = sv.analyze(df)
# render the report inside the notebook
eda_report.show_notebook()

Sweetviz analyzing our dataset. Image by author.

Sweetviz will create a report right in our notebook for us to explore.

“Association” tab in Sweetviz. Image by author.

We see how several columns are strongly associated with a value of 0 or 1 of our target column.

Being a multidimensional dataset and having variables with different distributions, a neural network is a valid choice to model this data. That said, this dataset can also be modeled by simpler models, such as decision trees.

We will now import two other libraries in order to visualize the dataset: PCA from Sklearn, and Seaborn to plot the multidimensional dataset.

PCA will help us compress the large number of variables into just two, which we will use as the X and Y axes in a Seaborn scatterplot. Seaborn takes an additional parameter called hue to color the dots based on an additional variable. We will use our target.

import seaborn as sns
from sklearn import decomposition

pca = decomposition.PCA(n_components=2)

X = df.drop("target", axis=1).values
y = df['target'].values

vecs = pca.fit_transform(X)
x0 = vecs[:, 0]
x1 = vecs[:, 1]

sns.scatterplot(x=x0, y=x1, hue=y)
plt.title("PCA Projection")
plt.xlabel("PCA 1")
plt.ylabel("PCA 2")

PCA projection of the breast cancer dataset. Image by author.

We see how class 1 data points group together based on common characteristics. It will be the goal of our neural network to classify the rows as target 0 or 1.

PyTorch provides Dataset and DataLoader objects to allow us to efficiently organize and load our data into the neural network.

It would be possible to use pandas directly, but this would have disadvantages because it would make our code less efficient.

The Dataset class allows us to specify the right format for our data and apply the retrieval and transformation logic that is often fundamental (think of the data augmentation applied to images).

Let’s see how to create a PyTorch Dataset object.

from torch.utils.data import Dataset

class BreastCancerDataset(Dataset):
    def __init__(self, X, y):
        # create feature tensors
        self.features = torch.tensor(X, dtype=torch.float32)
        # create label tensors (float, as expected by the binary cross-entropy loss used later)
        self.labels = torch.tensor(y, dtype=torch.float32)

    def __len__(self):
        # we define a method to retrieve the length of the dataset
        return self.features.shape[0]

    def __getitem__(self, idx):
        # necessary override of the __getitem__ method, which lets us index our data
        x = self.features[idx]
        y = self.labels[idx]
        return x, y

This is a class that inherits from Dataset and allows the DataLoader, which we will create shortly, to efficiently retrieve batches of data.

The class takes X and y as input.

Before proceeding to the next steps, it is important to create training, validation and test sets.

These will help us evaluate the performance of our model and understand the quality of the predictions.

For the interested reader, I suggest reading the articles 6 Things You Should Do Before Training Your Model and What is cross-validation in machine learning to better understand why splitting our data into three partitions is an effective method for performance evaluation.

With Sklearn this becomes easy with the train_test_split function.

from sklearn import model_selection

train_ratio = 0.50
validation_ratio = 0.25
test_ratio = 0.25

x_train, x_test, y_train, y_test = model_selection.train_test_split(X, y, test_size=1 - train_ratio)
x_val, x_test, y_val, y_test = model_selection.train_test_split(x_test, y_test, test_size=test_ratio/(test_ratio + validation_ratio))

print(x_train.shape, x_val.shape, x_test.shape)

>>> (284, 30) (142, 30) (143, 30)

With this small snippet of code we created our training, validation and test sets according to controllable splits.
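To make the arithmetic of the two-step split explicit, here is a minimal sketch in plain Python. It assumes that each step rounds the held-out part up and leaves the remainder to the other part, which is consistent with the shapes printed above. The function name split_sizes and the 0.50/0.25/0.25 ratios are illustrative assumptions.

```python
import math

def split_sizes(n_rows, train_ratio, val_ratio, test_ratio):
    """Return (train, val, test) sizes for a two-step split."""
    # step 1: hold out everything that is not training data
    held_out = math.ceil(n_rows * (1 - train_ratio))
    n_train = n_rows - held_out
    # step 2: split the held-out rows between validation and test
    n_test = math.ceil(held_out * test_ratio / (test_ratio + val_ratio))
    n_val = held_out - n_test
    return n_train, n_val, n_test

# the breast cancer dataset has 569 rows
print(split_sizes(569, 0.50, 0.25, 0.25))  # prints (284, 142, 143)
```

Note that the second call to train_test_split only sees the held-out rows, so its test_size is a fraction of that subset and not of the full dataset.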

When doing deep learning, even for a simple task like binary classification, it is always important to normalize our data.

Normalizing means bringing all the values of the various columns in the dataset to the same numerical scale. This helps the neural network converge more effectively and thus make accurate predictions faster.

We are going to use Sklearn’s StandardScaler.

from sklearn import preprocessing

scaler = preprocessing.StandardScaler()

x_train_scaled = scaler.fit_transform(x_train)
x_val_scaled = scaler.transform(x_val)
x_test_scaled = scaler.transform(x_test)

Notice how fit_transform is applied only to the training set, while transform is applied to the other two datasets. This is to avoid data leakage, which occurs when information from our validation or test set unintentionally leaks into our training set. We want our training set to be the only source of learning, unaffected by test data.
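The difference between fitting and transforming can be sketched in plain Python for a single column. This is only an illustration of the idea, not Sklearn's implementation: the helper names fit_scaler and transform and the toy values are assumptions. Like StandardScaler, it uses the population standard deviation.

```python
import statistics

def fit_scaler(column):
    """Learn mean and population standard deviation from training values."""
    return statistics.fmean(column), statistics.pstdev(column)

def transform(column, mean, std):
    """Apply statistics learned on the training set to any split."""
    return [(value - mean) / std for value in column]

train_col = [10.0, 12.0, 14.0]  # toy values standing in for one feature
val_col = [11.0, 20.0]

mean, std = fit_scaler(train_col)           # fit on training data only
train_scaled = transform(train_col, mean, std)
val_scaled = transform(val_col, mean, std)  # reuse the training statistics

print(train_scaled)
print(val_scaled)
```

The scaled training column has mean 0 and standard deviation 1, while the validation column generally does not, because it never contributed to the statistics.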

This data is now ready to be passed to the BreastCancerDataset class.

train_dataset = BreastCancerDataset(x_train_scaled, y_train)
val_dataset = BreastCancerDataset(x_val_scaled, y_val)
test_dataset = BreastCancerDataset(x_test_scaled, y_test)

We import the dataloader and initialize the objects.

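from torch.utils.data import DataLoader

# batch_size=32 and the shuffle settings are assumed values; tune them as needed
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)

val_loader = DataLoader(val_dataset, batch_size=32, shuffle=False)

test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)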

The power of the DataLoader is that it allows us to specify whether to shuffle our data and in what batch size the data should be supplied to the model. The batch size is a hyperparameter of the model and can therefore impact the results of our inferences.
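To see what shuffling and batching amount to, here is a simplified sketch in plain Python. It is not how DataLoader is implemented (the real class also collates samples into tensors and can load data in parallel); the function iterate_batches and the toy dataset are hypothetical.

```python
import random

def iterate_batches(dataset, batch_size, shuffle, seed=0):
    """Yield lists of samples, mimicking the role of a DataLoader."""
    indices = list(range(len(dataset)))
    if shuffle:
        random.Random(seed).shuffle(indices)  # new order, same samples
    for start in range(0, len(indices), batch_size):
        yield [dataset[i] for i in indices[start:start + batch_size]]

# any object with __len__ and __getitem__ works, just like a PyTorch Dataset
toy_dataset = [(f"features_{i}", i % 2) for i in range(10)]
batches = list(iterate_batches(toy_dataset, batch_size=4, shuffle=True))

print([len(batch) for batch in batches])  # prints [4, 4, 2]
```

The last batch is smaller because 10 is not a multiple of 4, which is also the default behavior of DataLoader.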

Creating a model in PyTorch might sound complex, but it really only requires understanding a few basic concepts.

  1. When writing a model in PyTorch, we will use an object-oriented approach, like with datasets. This means that we will create a class like class MyModel which inherits from PyTorch’s nn.Module class.
  2. PyTorch is an autodifferentiation software. This means that when we write a neural network based on the backpropagation algorithm, the calculation of the derivatives of the loss is done automatically behind the scenes. This requires writing some dedicated code which can get confusing the first time around.
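To make concrete what automatic differentiation saves us from, here is a sketch in plain Python (no PyTorch) for a logistic regression with a single feature. It compares the hand-derived gradient of the binary cross-entropy loss with a finite-difference estimate. All names and toy numbers are illustrative assumptions; in PyTorch the same quantity is computed for us when we call backward() on the loss.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bce_loss(w, b, x, y):
    """Binary cross-entropy for one example with one feature."""
    p = sigmoid(w * x + b)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def analytic_grad_w(w, b, x, y):
    """Derivative of the loss with respect to w, derived by hand."""
    return (sigmoid(w * x + b) - y) * x

w, b, x, y = 0.3, -0.1, 2.0, 1.0
eps = 1e-6
# central finite difference as an independent check of the derivative
numeric = (bce_loss(w + eps, b, x, y) - bce_loss(w - eps, b, x, y)) / (2 * eps)
analytic = analytic_grad_w(w, b, x, y)

print(numeric, analytic)
```

The two values agree to several decimal places. With many layers, deriving such expressions by hand quickly becomes impractical, which is why autodifferentiation matters.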

I advise the reader who wants to know the basics of how neural networks work to consult the article Introduction to neural networks: weights, biases and activation

That said, let’s see what the code for writing a logistic regression model looks like.

from torch import nn

class LogisticRegression(nn.Module):
    """
    Our neural network accepts num_features and num_classes.

    num_features: number of features to learn from
    num_classes: number of outputs to expect (in this case 1, since a single probability is enough for a binary 0 or 1 target)
    """

    def __init__(self, num_features, num_classes):
        super().__init__()  # call the __init__ method of nn.Module

        self.num_features = num_features
        self.num_classes = num_classes

        # create a single layer of neurons on which to apply the logistic regression
        self.linear1 = nn.Linear(in_features=num_features, out_features=num_classes)

    def forward(self, x):
        logits = self.linear1(x)  # pass our data through the layer
        probs = torch.sigmoid(logits)  # apply a sigmoid to obtain the probability of belonging to class 1
        return probs  # return probabilities

Our class inherits from nn.Module. This class provides the methods behind the scenes that make the model work.

__init__ method

The __init__ method of a class contains the logic that runs when instantiating a class in Python. Here we pass two arguments: the number of features and the number of classes to predict.

num_features corresponds to the number of columns that make up our dataset minus our target variable, while num_classes corresponds to the number of results that the neural network must return.

In addition to the two arguments and their class variables, we see the super().__init__() line. The super function calls the __init__ method of the parent class. This allows us to have the functionality of nn.Module inside our model.

Still in the init block, we implement a linear layer called self.linear1, which takes as arguments the number of features and the number of results to return.

forward() method

By writing the forward method we tell Python to override the same method inside PyTorch’s nn.Module parent class. In fact, this method is called when performing a forward pass, that is, when our data passes from one layer to another.

forward accepts the input x, which contains the features on which the model will calibrate its performance.

The input passes through the first layer, creating the logits variable. The logits are the neural network calculations that are not yet converted into probabilities by the final activation function, which in this case is a sigmoid. In fact, they are the internal representation of the neural network before being mapped to a function that allows it to be interpreted.

In this case the sigmoid function will map the logits to probabilities between 0 and 1. If the probability is lower than 0.5 (that is, if the logit is lower than 0), then the class will be 0, otherwise it will be 1. The mapping happens in the line probs = torch.sigmoid(logits).
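The relationship between logits, probabilities and classes can be checked with a few lines of plain Python. This is a minimal sketch with a hand-written sigmoid and arbitrary example logits, not the PyTorch code used in the model.

```python
import math

def sigmoid(z):
    """Map a real-valued logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

for logit in [-3.0, 0.0, 3.0]:
    prob = sigmoid(logit)
    predicted_class = 1 if prob > 0.5 else 0
    print(f"logit={logit:+.1f}  prob={prob:.3f}  class={predicted_class}")
```

A logit of exactly 0 maps to a probability of 0.5, so thresholding the probability at 0.5 is equivalent to thresholding the logit at 0.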

Let’s create utility functions to use in the training loop that we will see shortly. These two are used to compute the accuracy at the end of each epoch and to display the performance curves at the end of the training.

def compute_accuracy(model, dataloader):
    """
    This function puts the model in evaluation mode (model.eval()) and calculates the accuracy with respect to the input dataloader
    """
    model = model.eval()
    correct = 0
    total_examples = 0
    for idx, (features, labels) in enumerate(dataloader):
        with torch.no_grad():
            probs = model(features)  # the model returns probabilities
        predictions = torch.where(probs > 0.5, 1, 0)
        lab = labels.view(predictions.shape)
        comparison = lab == predictions

        correct += torch.sum(comparison)
        total_examples += len(comparison)
    return correct / total_examples

def plot_results(train_loss, val_loss, train_acc, val_acc):
    """
    This function takes lists of values and creates side-by-side graphs to show training and validation performance
    """
    fig, ax = plt.subplots(1, 2, figsize=(15, 5))
    ax[0].plot(train_loss, label="train", color="red", linestyle="--", linewidth=2, alpha=0.5)
    ax[0].plot(val_loss, label="val", color="blue", linestyle="--", linewidth=2, alpha=0.5)
    ax[0].set_title("Loss")
    ax[0].legend()
    ax[1].plot(train_acc, label="train", color="red", linestyle="--", linewidth=2, alpha=0.5)
    ax[1].plot(val_acc, label="val", color="blue", linestyle="--", linewidth=2, alpha=0.5)
    ax[1].set_title("Accuracy")
    ax[1].legend()
    plt.show()

Now we come to the part where most deep learning newcomers struggle: the PyTorch training loop.

Let’s look at the code and then comment on it

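import torch.nn.functional as F

model = LogisticRegression(num_features=x_train_scaled.shape[1], num_classes=1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

num_epochs = 10

train_losses, val_losses = [], []
train_accs, val_accs = [], []

for epoch in range(num_epochs):

    model = model.train()
    t_loss_list, v_loss_list = [], []
    for batch_idx, (features, labels) in enumerate(train_loader):

        train_probs = model(features)
        # the loss expects float targets with the same shape as the probabilities
        train_loss = F.binary_cross_entropy(train_probs, labels.view(train_probs.shape).float())

        optimizer.zero_grad()  # reset the gradients of the previous batch
        train_loss.backward()  # compute the gradients of the loss
        optimizer.step()  # update the weights

        if batch_idx % 10 == 0:
            print(
                f"Epoch {epoch+1:02d}/{num_epochs:02d}"
                f" | Batch {batch_idx:02d}/{len(train_loader):02d}"
                f" | Train Loss {train_loss:.3f}"
            )

        t_loss_list.append(train_loss.item())

    model = model.eval()
    for batch_idx, (features, labels) in enumerate(val_loader):
        with torch.no_grad():
            val_probs = model(features)
            val_loss = F.binary_cross_entropy(val_probs, labels.view(val_probs.shape).float())
        v_loss_list.append(val_loss.item())

    # store the average loss of the epoch for the final plots
    train_losses.append(np.mean(t_loss_list))
    val_losses.append(np.mean(v_loss_list))

    train_acc = compute_accuracy(model, train_loader)
    val_acc = compute_accuracy(model, val_loader)

    train_accs.append(float(train_acc))
    val_accs.append(float(val_acc))

    print(
        f"Train accuracy: {train_acc:.2f}"
        f" | Val accuracy: {val_acc:.2f}"
    )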

Unlike TensorFlow, PyTorch requires us to write a training loop in pure Python.

Let’s see the procedure step-by-step:

  1. We instantiate the model and the optimizer
  2. We decide on a number of epochs
  3. We create a for loop that iterates through the epochs
  4. For each epoch, we set the model to training mode with model.train() and cycle through the train_loader
  5. For each batch of the train_loader, we calculate the loss, reset the gradients to 0 with optimizer.zero_grad(), compute the new gradients by calling backward() on the loss and update the weights of the network with optimizer.step()

At this point the training loop is complete, and if you want you can integrate the same logic on the validation dataloader as written in the code.
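The steps above can also be traced without PyTorch. The following sketch trains a logistic regression with one feature in plain Python, with comments marking which line plays the role of each PyTorch call. It is a simplified illustration under stated assumptions: the toy data, the learning rate and the use of a single full batch per epoch are all made up for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# toy data: one feature, label 1 when the feature is positive
features = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
labels = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]

w, b = 0.0, 0.0  # model parameters
lr = 0.1         # learning rate
losses = []

for epoch in range(50):
    grad_w, grad_b = 0.0, 0.0            # plays the role of optimizer.zero_grad()
    epoch_loss = 0.0
    for x, y in zip(features, labels):
        p = sigmoid(w * x + b)           # forward pass
        epoch_loss += -(y * math.log(p) + (1 - y) * math.log(1 - p))
        grad_w += (p - y) * x            # what backward() would compute for w
        grad_b += (p - y)                # what backward() would compute for b
    n = len(features)
    w -= lr * grad_w / n                 # plays the role of optimizer.step()
    b -= lr * grad_b / n
    losses.append(epoch_loss / n)

print(f"first loss {losses[0]:.3f}, last loss {losses[-1]:.3f}")
```

The first loss equals ln(2), about 0.693, because with zero weights every prediction is 0.5, and it decreases as the weight grows in the direction that separates the two classes.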

Here is the result of the training after running this code

Training in progress. Image by author.

We use the previously created utility function to plot loss in training and validation.

plot_results(train_losses, val_losses, train_accs, val_accs)
Performance of the neural network. Image by author.

Our binary classification model quickly converges to high accuracy, and we see how the loss drops at the end of each epoch.

The dataset turns out to be easy to model, and the low number of examples does not help us see a more gradual convergence towards high performance by the network.

I emphasize that it is possible to integrate the TensorBoard software into PyTorch to be able to log performance metrics automatically between the various experiments.

We have reached the end of this guide. Let’s see the code to create predictions for our entire dataset.

# we transform all our features with the scaler
X_scaled_all = scaler.transform(X)

# transform in tensors
X_scaled_all_tensors = torch.tensor(X_scaled_all, dtype=torch.float32)

# we set the model in inference mode and create the predictions
with torch.inference_mode():
    probs = model(X_scaled_all_tensors)  # the model returns probabilities
    predictions = torch.where(probs > 0.5, 1, 0)

df['predictions'] = predictions.numpy().flatten()

Now let’s import the metrics package from Sklearn, which allows us to quickly calculate the confusion matrix and classification report directly on our pandas dataframe.

from sklearn import metrics
from pprint import pprint

pprint(metrics.classification_report(y_pred=df.predictions, y_true=df.target))

Summary of performance on the entire dataset with a classification report. Image by author.

And the confusion matrix, which shows the number of correct answers on the diagonal

metrics.confusion_matrix(y_pred=df.predictions, y_true=df.target)

>>> array([[197, 15],
[ 13, 344]])
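As a quick check on how to read this matrix, the headline metrics can be recomputed by hand in plain Python. The sketch below relies on Sklearn's convention that rows are true classes and columns are predicted classes; the variable names are illustrative.

```python
# confusion matrix reported above: rows are true classes, columns are predictions
cm = [[197, 15],
      [13, 344]]

tn, fp = cm[0]  # true class 0: correctly and wrongly classified
fn, tp = cm[1]  # true class 1: wrongly and correctly classified

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)  # of the predicted 1s, how many are right
recall = tp / (tp + fn)     # of the true 1s, how many were found

print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
```

Keep in mind that these numbers are computed on the entire dataset, including the rows the model was trained on, so they are more optimistic than the test set performance.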

Here is a small function to create a classification line that separates the classes in the PCA graph

def plot_boundary(model):

    # only the first two weights are used, so the line is a rough two-dimensional illustration
    w1 = model.linear1.weight[0][0].detach()
    w2 = model.linear1.weight[0][1].detach()
    b = model.linear1.bias[0].detach()

    x1_min = -1000
    x2_min = (-(w1 * x1_min) - b) / w2

    x1_max = 1000
    x2_max = (-(w1 * x1_max) - b) / w2

    return x1_min, x1_max, x2_min, x2_max

# compute the two end points of the line before plotting
x1_min, x1_max, x2_min, x2_max = plot_boundary(model)

sns.scatterplot(x=x0, y=x1, hue=y)
plt.title("PCA Projection")
plt.xlabel("PCA 1")
plt.ylabel("PCA 2")
plt.plot([x1_min, x1_max], [x2_min, x2_max], color="k", label="Classification", linestyle="--")

And here is how the model separates benign from malignant cells

Classification boundary visualized. Image by author.
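The formula used for the line comes from setting the logit to zero: w1*x1 + w2*x2 + b = 0, which solved for x2 gives x2 = (-(w1*x1) - b) / w2. The following plain Python sketch checks that points on this line receive a probability of 0.5. The weights are hypothetical values for a model with two features, not those of the trained network.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def boundary_x2(x1, w1, w2, b):
    """Solve w1*x1 + w2*x2 + b = 0 for x2."""
    return (-(w1 * x1) - b) / w2

# hypothetical weights for a two-feature model
w1, w2, b = 0.8, -1.5, 0.3

for x1 in [-2.0, 0.0, 2.0]:
    x2 = boundary_x2(x1, w1, w2, b)
    prob = sigmoid(w1 * x1 + w2 * x2 + b)
    print(f"x1={x1:+.1f}  x2={x2:+.3f}  prob={prob:.3f}")
```

Every point on the line sits exactly at the 0.5 threshold, so the two sides of the line correspond to the two predicted classes.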

In this article we have seen how to create a binary classification model with PyTorch, starting from a Pandas dataframe.

We have seen what the training loop looks like, how to evaluate the model, and how to create predictions and visualizations to aid interpretation.

With PyTorch it is possible to create very complex neural networks … just think that Tesla, the manufacturer of electric cars based on AI, uses PyTorch to create its models.

For those who want to start their deep learning journey, learning PyTorch as early as possible becomes a high priority task, since it allows you to build important technologies that can solve complex data-driven problems.


