Entropy based Uncertainty Prediction

This article explores how entropy can be used as a tool for uncertainty estimation in image segmentation tasks. We will walk through what entropy is and how to implement it in Python.

Photo by Michael Dziedzic on Unsplash

While working at Cambridge University as a Research Scientist in Neuroimaging and AI, I faced the challenge of performing image segmentation on complex brain datasets using the latest Deep Learning techniques, especially the nnU-Net. During this endeavour, I observed a major gap: uncertainty estimation was often overlooked. Yet uncertainty is crucial for reliable decision-making.

Before delving into the specifics, feel free to check out my Github repository, which contains all of the code snippets discussed in this article.

In the world of computer vision and machine learning, image segmentation is a central problem. Whether in medical imaging, self-driving cars, or robotics, accurate segmentation is vital for effective decision-making. However, one often overlooked aspect is the measure of uncertainty associated with these segmentations.

Why should we care about uncertainty in image segmentation?

In many real-world applications, an incorrect segmentation could lead to dire consequences. For instance, if a self-driving car misidentifies an object or a medical imaging system incorrectly labels a tumor, the consequences could be catastrophic. Uncertainty estimation gives us a measure of how ‘sure’ the model is about its prediction, allowing for better-informed decisions.

We can also use entropy as a measure of uncertainty to improve the training of our neural networks. This area is known as ‘active learning’. This concept will be explored in further articles, but the main idea is to identify the zones where the models are the most uncertain and focus on them. For example, we could have a CNN performing medical image segmentation on the brain, but performing very poorly on subjects with tumours. Then we could concentrate our efforts on acquiring more labels of this type.

Entropy is a concept borrowed from thermodynamics and information theory, which quantifies the amount of uncertainty or randomness in a system. In the context of machine learning, entropy can be used to measure the uncertainty of model predictions.

Mathematically, for a discrete random variable X with probability mass function P(x), the entropy H(X) is defined as:

H(X) = -∑ P(x) log P(x), where the sum runs over all possible values x.

Or in the continuous case, for a probability density p(x):

H(X) = -∫ p(x) log p(x) dx

The higher the entropy, the greater the uncertainty, and vice versa.

A classic example to fully grasp the concept:

Situation 1: A biased coin

Photo by Jizhidexiaohailang on Unsplash

Imagine a biased coin, which lands on heads with a probability p=0.9 and tails with a probability 1-p=0.1.

Its entropy is:

H = -(0.9 × log₂(0.9) + 0.1 × log₂(0.1)) ≈ 0.47 bits

Situation 2: Balanced coin

Now let’s imagine a balanced coin, which lands on heads and tails with probability p=0.5.

Its entropy is:

H = -(0.5 × log₂(0.5) + 0.5 × log₂(0.5)) = 1 bit

The entropy is larger, which is consistent with what we said before: more uncertainty = more entropy.

Actually, it is interesting to note that p=0.5 corresponds to the maximum entropy:

Entropy visualisation, Image by author

Intuitively, remember that a uniform distribution is the case with maximal entropy. If every outcome is equally probable, this corresponds to maximal uncertainty.
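To make this concrete, here is a small sketch (plain NumPy, independent of the repository code) that computes the entropy of a coin for the two situations above and confirms that p=0.5 gives the maximum:

import numpy as np

def coin_entropy(p):
    """Entropy (in bits) of a coin that lands on heads with probability p."""
    probs = np.array([p, 1 - p])
    return -np.sum(probs * np.log2(probs))

print(coin_entropy(0.9))  # ~0.47 bits, the biased coin
print(coin_entropy(0.5))  # 1.0 bit, the balanced coin

# Scanning a grid of p values, the maximum entropy is reached at p = 0.5
grid = np.linspace(0.01, 0.99, 99)
print(grid[np.argmax([coin_entropy(p) for p in grid])])  # 0.5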

To link this to image segmentation, consider that in deep learning, the final softmax layer usually provides the class probabilities for each pixel. One can easily compute the entropy for each pixel based on these softmax outputs.

But how does it work?

When a model is confident about a specific pixel belonging to a particular class, the softmax layer shows a high probability (~1) for that class and very small probabilities (~0) for the other classes.

Softmax layer, confident case, Image by author

Conversely, when the model is uncertain, the softmax output is more evenly spread across multiple classes.

Softmax layer, uncertain case, Image by author

The probabilities are much more diffuse, close to the uniform case if you remember, because the model cannot decide which class is associated with the pixel.
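As a small illustration (the probability vectors below are made up, not taken from an actual model), the entropy of a confident softmax output is close to 0, while the entropy of a diffuse one is close to its maximum:

import numpy as np

def entropy(probs):
    """Shannon entropy (in bits) of a probability vector."""
    probs = np.asarray(probs)
    return -np.sum(probs * np.log2(probs + 1e-12))

confident = [0.97, 0.01, 0.01, 0.01]  # the model is almost sure of the class
uncertain = [0.30, 0.25, 0.25, 0.20]  # the probabilities are spread out

print(entropy(confident))  # ~0.24 bits: low uncertainty
print(entropy(uncertain))  # ~1.99 bits: close to the maximum of log2(4) = 2 bits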

If you have made it this far, great! You should have a good intuition of how entropy works.

Let’s illustrate this with a hands-on example using medical imaging, specifically T1 brain scans of fetuses. All code and images for this case study are available in my Github repository.

1. Computing Entropy with Python

As we said before, we are working with the softmax output tensor given by our Neural Network. This approach is model-free: it only uses the probabilities of each class.

Let’s clarify something essential about the dimensions of the tensors we are working with.

If you are working with 2D images, the shape of your softmax output should be:

(Classes, Height, Width)

This means that for each pixel (or voxel), we have a vector of size Classes, which gives us the probabilities of that pixel belonging to each of the classes. (For 3D volumes, as in the code below, there is one extra spatial dimension: (Classes, Depth, Height, Width).)

Therefore, the entropy should be computed along the first dimension, i.e. the class axis:


import numpy as np

def compute_entropy_4D(tensor):
    """
    Compute the voxel-wise entropy of a 4D softmax tensor with shape
    (number_of_classes, 256, 256, 256).

    Parameters:
        tensor (np.ndarray): 4D tensor of shape (number_of_classes, 256, 256, 256)

    Returns:
        entropy (np.ndarray): 3D tensor of shape (256, 256, 256) with the entropy value of each voxel.
        total_entropy (float): sum of the entropy over all voxels.
    """
    # First, normalize the tensor along the class axis so that it represents probabilities
    sum_tensor = np.sum(tensor, axis=0, keepdims=True)
    tensor_normalized = tensor / sum_tensor

    # Per-class contributions to the entropy; a small value avoids log(0)
    entropy_elements = -tensor_normalized * np.log2(tensor_normalized + 1e-12)

    # Sum over the class axis to get one entropy value per voxel
    entropy = np.sum(entropy_elements, axis=0)

    # Reorder the spatial axes to match the orientation of the original scans
    entropy = np.transpose(entropy, (2, 1, 0))

    # Total entropy of the whole volume
    total_entropy = np.sum(entropy)

    return entropy, total_entropy
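To make the expected input and output concrete, here is a minimal usage sketch on a small random tensor; the shape, class count, and variable names below are placeholders, not the ones used for the actual fetal scans:

# Dummy softmax output: 4 classes over a 64x64x64 volume
dummy_softmax = np.random.rand(4, 64, 64, 64)
dummy_softmax /= dummy_softmax.sum(axis=0, keepdims=True)  # turn it into a valid probability map

entropy_map, total_entropy = compute_entropy_4D(dummy_softmax)
print(entropy_map.shape)  # (64, 64, 64): one uncertainty value per voxel
print(total_entropy)      # scalar summary of the whole volume's uncertainty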

2. Visualizing Entropy-based Uncertainty

Now let’s visualize the uncertainty by using a heatmap on each slice of our image segmentation.
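Here is a minimal sketch of how such a heatmap can be produced with matplotlib. It assumes entropy_map comes from compute_entropy_4D as above and that segmentation is the predicted label map loaded as a NumPy array; both names are illustrative:

import matplotlib.pyplot as plt

# Pick one slice to display (the middle slice, chosen arbitrarily for illustration)
slice_idx = entropy_map.shape[2] // 2

fig, axes = plt.subplots(1, 2, figsize=(10, 5))
axes[0].imshow(segmentation[:, :, slice_idx], cmap="gray")
axes[0].set_title("Segmentation")
axes[1].imshow(entropy_map[:, :, slice_idx], cmap="hot")
axes[1].set_title("Entropy (uncertainty)")
for ax in axes:
    ax.axis("off")
plt.show()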

T1 scan (left), Segmentation (middle), Entropy (right), Image by author

Let’s have a look at another example:

T1 scan (left), Segmentation (middle), Entropy (right), Image by author

The results look great! Indeed, we can see that this is coherent, since the zones of high entropy are on the contours of the shapes. This is normal, because the model does not really doubt the points at the centre of each zone; it is rather the delimitation or contour that is difficult to identify.

This uncertainty can be used in a lot of different ways:

1. As medical examiners work more and more with AI as a tool, being aware of the uncertainty of the model is crucial. This means that medical examiners could spend more time on the zones where more fine-grained attention is required.

2. In the context of Active Learning or Semi-Supervised Learning, we can leverage entropy-based uncertainty to focus on the examples with maximal uncertainty and improve the efficiency of learning (more about this in coming articles); a minimal sketch of this selection step is shown just below.
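As a rough sketch of that selection step, reusing compute_entropy_4D from above (the softmax_outputs list and the n_to_label budget are illustrative placeholders, not part of the repository code):

# Rank unlabelled subjects by total entropy and pick the most uncertain ones for annotation
scores = []
for i, softmax_output in enumerate(softmax_outputs):
    _, total_entropy = compute_entropy_4D(softmax_output)
    scores.append((total_entropy, i))

# The highest-entropy subjects are the best candidates for new labels
scores.sort(reverse=True)
most_uncertain = [idx for _, idx in scores[:n_to_label]]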

  • Entropy is an extremely powerful concept to measure the randomness or uncertainty of a system.
  • It is possible to leverage entropy in image segmentation. This approach is model-free and only uses the softmax output tensor.
  • Uncertainty estimation is often overlooked, but it is crucial. Good Data Scientists know how to make good models. Great Data Scientists know where their models fail and use this to improve learning.
