Feature Detection, Part 1: Image Derivatives, Gradients, and Sobel Operator

Computer vision is a vast field concerned with analyzing images and videos. While many people think mostly of machine learning models when they hear "computer vision", there are in fact many classical algorithms that, in some cases, perform better than AI!

In computer vision, feature detection involves identifying distinct regions of interest in an image. These results can then be used to create feature descriptors: numerical vectors representing local image regions. After that, the feature descriptors of multiple images of the same scene can be combined to perform image matching and even reconstruct the scene.

In this article, we will draw an analogy from calculus to introduce image derivatives and gradients. This is essential for understanding the logic behind convolutional kernels, and the Sobel operator in particular: a computer vision filter used to detect edges in an image.

Image intensity

Image intensity is one of the main characteristics of an image. Every pixel has three components: R (red), G (green), and B (blue), each taking values between 0 and 255. The higher the value, the brighter the pixel. The intensity of a pixel is simply a weighted average of its R, G, and B components.

In fact, there are several standards that define different weights. Since we are going to work with OpenCV, we will use its formula, which is given below:

$$I = 0.299 \cdot R + 0.587 \cdot G + 0.114 \cdot B$$

Intensity formula

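import cv2
import numpy as np

# OpenCV loads images in BGR channel order
image = cv2.imread('image.png')
B, G, R = cv2.split(image)
# Weighted sum of the channels (the result is a float array)
grayscale_image = 0.299 * R + 0.587 * G + 0.114 * B
grayscale_image = np.clip(grayscale_image, 0, 255).astype('uint8')
intensity = grayscale_image.mean()
print(f"Image intensity: {intensity:.2f}")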

Grayscale images

Images can be represented using different color channels. If the original image is stored in RGB channels, applying the intensity formula above transforms it into grayscale format, which consists of a single channel.

Because the weights in the formula sum to 1, the grayscale image contains intensity values between 0 and 255, the same range as the RGB channels.
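As a quick sanity check of this claim, the sketch below applies the same weights to a few hand-picked RGB triples. The helper name `intensity` is hypothetical and exists only for this illustration; it rounds the result, which is an assumption made here for readability.

```python
import math

# Weights from the intensity formula above, in (R, G, B) order.
WEIGHTS = (0.299, 0.587, 0.114)

def intensity(r, g, b):
    """Weighted average of the three channels, rounded to an integer."""
    return round(WEIGHTS[0] * r + WEIGHTS[1] * g + WEIGHTS[2] * b)

# The weights sum to 1 (up to floating-point error),
# so black maps to 0 and white maps to 255.
print(math.isclose(sum(WEIGHTS), 1.0))
print(intensity(0, 0, 0), intensity(255, 255, 255))

# Pure red, green, and blue contribute very differently to brightness.
print(intensity(255, 0, 0), intensity(0, 255, 0), intensity(0, 0, 255))
```

Green has the largest weight, so a pure green pixel ends up much brighter in grayscale than a pure blue one.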

Big Ben shown in RGB (left) and grayscale (right)

In OpenCV, RGB channels can be converted to grayscale format using the cv2.cvtColor() function, which is more convenient than the manual approach we just saw above.

import cv2

image = cv2.imread('image.png')
# cv2.imread returns channels in BGR order, hence COLOR_BGR2GRAY
grayscale_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
intensity = grayscale_image.mean()
print(f"Image intensity: {intensity:.2f}")

Image derivative

Image derivatives measure how fast the pixel intensity changes across the image. An image can be considered a function of two arguments, I(x, y), where x and y specify the pixel position and I represents the intensity of that pixel.

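Formally, the partial derivatives can be written as:

$$\frac{\partial I}{\partial x} = \lim_{h \to 0} \frac{I(x + h, y) - I(x, y)}{h}, \qquad \frac{\partial I}{\partial y} = \lim_{h \to 0} \frac{I(x, y + h) - I(x, y)}{h}$$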

But since images exist in a discrete space, their derivatives are usually approximated with convolutional kernels:

For the horizontal X-axis: [-1, 0, 1]
For the vertical Y-axis: [-1, 0, 1]ᵀ

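In other words, we can rewrite the equations above in the following form:

$$\frac{\partial I}{\partial x} \approx I(x + 1, y) - I(x - 1, y), \qquad \frac{\partial I}{\partial y} \approx I(x, y + 1) - I(x, y - 1)$$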

To better understand the logic behind the kernels, let us refer to the example below.

Example

Suppose we have a matrix of 5×5 pixels representing a grayscale image patch. The elements of this matrix show the intensity of the pixels.

To calculate the image derivative, we can use convolutional kernels. The idea is simple: we take a pixel in the image together with several pixels in its neighborhood and compute the sum of the element-wise product with a given kernel, which is a fixed matrix (or vector).

In our case, we will use the three-element vector [-1, 0, 1]. From the example above, let us take the pixel at position (1, 1), whose value is -3.

Since the kernel size (in yellow) is 3×1, we need the left and right neighbors of -3 to match its size, so we take the vector [4, -3, 2]. Then, by computing the sum of the element-wise product, we get the value -2:

$$(-1) \cdot 4 + 0 \cdot (-3) + 1 \cdot 2 = -2$$

The value -2 is the derivative at the chosen pixel. Looking closely, we can notice that the derivative at pixel -3 is simply the difference between its right neighbor (2) and its left neighbor (4).
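The same calculation can be sketched in a few lines of plain Python. The function name `apply_kernel_1d` is an assumption made for illustration; it slides the kernel over a row without padding, so the output is two elements shorter than the input.

```python
def apply_kernel_1d(row, kernel=(-1, 0, 1)):
    """Slide a 1D kernel over a row (no padding) and return the responses."""
    k = len(kernel)
    return [
        sum(w * v for w, v in zip(kernel, row[i:i + k]))
        for i in range(len(row) - k + 1)
    ]

# Neighborhood of the pixel with value -3 from the example above.
print(apply_kernel_1d([4, -3, 2]))  # [-2]

# A longer, hypothetical row: each output is "right neighbor minus left neighbor".
print(apply_kernel_1d([1, 2, 4, 7, 11]))  # [3, 5, 7]
```

Because the middle weight is 0, the value of the pixel itself never influences its derivative.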

Based on the derivative value, we can make some observations:

If the derivative has a large absolute value in a given image region, the intensity changes drastically there. Otherwise, there are no noticeable changes in brightness.
If the derivative is positive, the image region becomes brighter from left to right; if it is negative, the region becomes darker from left to right.

By analogy with linear algebra, kernels can be considered linear operators on images that transform local image regions.

You may notice that after applying a convolution filter to the original 5×5 image, the result is 3×3. This is expected: we cannot apply the convolution in the same way to border pixels, because the kernel would go out of bounds.

To preserve the image dimensions, padding is usually applied. It consists of temporarily extending or interpolating the image borders, or filling them with zeros, so that the convolution can be calculated for border pixels as well.

By default, libraries like OpenCV automatically pad the borders to guarantee the same dimensions for input and output images.
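To make the effect of padding concrete, here is a minimal pure-Python sketch with two simple strategies: zero filling and edge replication. The helper names and the sample row are hypothetical, and the two modes are illustrative rather than a reproduction of OpenCV's own border handling.

```python
def pad_row(row, mode="zero"):
    """Extend a row by one element on each side."""
    if mode == "zero":
        return [0] + list(row) + [0]
    if mode == "replicate":
        return [row[0]] + list(row) + [row[-1]]
    raise ValueError(f"unknown padding mode: {mode}")

def derivative_row(row):
    """Apply the [-1, 0, 1] kernel at every position that has both neighbors."""
    return [row[i + 1] - row[i - 1] for i in range(1, len(row) - 1)]

row = [1, 2, 4, 7, 11]

# Without padding, the two border pixels are lost.
print(derivative_row(row))

# With padding, the output has the same length as the input.
print(derivative_row(pad_row(row, "zero")))
print(derivative_row(pad_row(row, "replicate")))
```

Note how zero filling produces a strong negative response at the right border even though the row keeps getting brighter. This kind of artificial edge is one reason why extending the borders with existing pixel values is often preferred over zeros.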

Image gradient

An image gradient shows how fast the intensity (brightness) changes at a given pixel in both directions (X and Y).

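Formally, the image gradient can be written as a vector of image derivatives with respect to the X- and Y-axes:

$$\nabla I = \left( \frac{\partial I}{\partial x}, \frac{\partial I}{\partial y} \right) = (G_x, G_y)$$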

Gradient magnitude

The gradient magnitude is the norm of the gradient vector and can be found using the formula below:

$$|G| = \sqrt{G_x^2 + G_y^2}$$

Gradient orientation

Using the computed Gx and Gy, it is also possible to calculate the angle of the gradient vector:

$$\theta = \arctan\left(\frac{G_y}{G_x}\right)$$
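Both quantities can be computed with the Python standard library. The values of Gx and Gy below are hypothetical and chosen only to keep the arithmetic easy to follow. This sketch uses `math.atan2` instead of a plain arctangent so that the case Gx = 0 is handled and the quadrant of the vector is preserved.

```python
import math

def gradient_magnitude(gx, gy):
    """Euclidean norm of the gradient vector (gx, gy)."""
    return math.hypot(gx, gy)

def gradient_orientation_degrees(gx, gy):
    """Angle of the gradient vector in degrees, measured from the X-axis."""
    return math.degrees(math.atan2(gy, gx))

gx, gy = 3.0, 4.0  # hypothetical derivative values at one pixel

print(gradient_magnitude(gx, gy))  # 5.0
print(round(gradient_orientation_degrees(gx, gy), 2))
```

Keep in mind that in image coordinates the Y-axis usually points downward, so the angle is measured relative to that convention.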

Example

Let us look at how we can manually calculate gradients based on the example above. For that, we will need the 3×3 matrices computed after the convolution kernels were applied.

If we take the top-left pixel, we can read its Gx and Gy values from these two matrices and then easily calculate the gradient magnitude and orientation:

For the whole 3×3 matrix, we get the following visualization of gradients:

Sobel operator

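Having learned the basics of image derivatives and gradients, it is now time to look at the Sobel operator, which is used to approximate them. Compared to the previous kernels of sizes 3×1 and 1×3, the Sobel operator is defined by a pair of 3×3 kernels (one for each axis):

$$K_x = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}, \qquad K_y = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix}$$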

This gives the Sobel operator an advantage: the previous kernels measured only 1D changes, ignoring the other rows and columns in the neighborhood, whereas the Sobel operator takes more information about the local region into account.

Another advantage is that Sobel is more robust to noise. Let us look at the image patch below. If we calculate the derivative at the red element in the middle, which lies on the border between dark (2) and bright (7) pixels, we should get 5. The problem is that there is a noisy pixel with the value 10.

If we apply the horizontal 1D kernel at the red element, it gives significant weight to the pixel value 10, which is a clear outlier. The Sobel operator is more robust: it takes 10 into account together with the pixels of value 7 around it. In a sense, the Sobel operator applies smoothing.
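The effect can be reproduced on a small hand-made patch. The patch values below are hypothetical (a dark column of 2s, a bright column of 7s, and one noisy pixel of 10), and the kernel is the standard horizontal 3×3 Sobel kernel. The division by 4, the sum of the positive kernel weights, is only there to make the two results comparable.

```python
# Standard horizontal 3x3 Sobel kernel.
SOBEL_X = (
    (-1, 0, 1),
    (-2, 0, 2),
    (-1, 0, 1),
)

def correlate_3x3(patch, kernel):
    """Sum of the element-wise product of a 3x3 patch and a 3x3 kernel."""
    return sum(
        kernel[r][c] * patch[r][c]
        for r in range(3)
        for c in range(3)
    )

# Hypothetical patch: the expected derivative at the center is 7 - 2 = 5,
# but the right neighbor of the center is a noisy pixel with value 10.
patch = (
    (2, 4, 7),
    (2, 4, 10),
    (2, 4, 7),
)

simple = patch[1][2] - patch[1][0]     # 1D kernel [-1, 0, 1] on the middle row
sobel = correlate_3x3(patch, SOBEL_X)  # weighted over all three rows

print(simple)     # 8
print(sobel / 4)  # 6.5
```

The 1D kernel sees only the noisy row and reports 8, while the normalized Sobel response of 6.5 is pulled closer to the expected value of 5 by the two clean rows.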

The Sobel and Scharr operators are commonly used to detect edges: zones where the pixel intensity (and its gradient) changes drastically.

OpenCV

To apply the Sobel operator, it is sufficient to use the OpenCV function cv2.Sobel. Let us look at its parameters:

derivative_x = cv2.Sobel(image, cv2.CV_64F, 1, 0)
derivative_y = cv2.Sobel(image, cv2.CV_64F, 0, 1)

The first parameter is the input image as a NumPy array.
The second parameter (cv2.CV_64F) is the data depth of the output image. The problem is that, in general, operators can produce output images containing values outside the interval 0–255. That is why we need to specify the type of pixel values we want the output image to have.
The third and fourth parameters represent the order of the derivative in the x direction and the y direction, respectively. In our case, we only want the first derivative in the x direction and in the y direction, so we pass the values (1, 0) and (0, 1).
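To see why the output depth matters, the sketch below mimics in plain Python what happens when signed derivative values are squeezed into an 8-bit unsigned range. The helper `saturate_uint8` and the sample row are hypothetical; the helper only imitates saturation arithmetic and is not an OpenCV function.

```python
def saturate_uint8(value):
    """Clamp a number to the 0..255 range of an 8-bit unsigned pixel."""
    return max(0, min(255, int(round(value))))

# Hypothetical row: a bright-to-dark edge followed by a dark-to-bright edge.
row = [200, 200, 50, 50, 250, 250]

# Derivative with the [-1, 0, 1] kernel (no padding).
derivative = [row[i + 1] - row[i - 1] for i in range(1, len(row) - 1)]
clipped = [saturate_uint8(v) for v in derivative]

print(derivative)  # signed values keep both edges
print(clipped)     # the bright-to-dark edge disappears

# Large responses are cut off at the top of the range as well.
print(saturate_uint8(800))
```

With a signed floating-point depth, both the sign and the full size of each response survive, which is what the gradient magnitude and orientation need.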

Let us look at the following example, where we are given a Sudoku input image:

Let us apply the Sobel filter:

import cv2
import matplotlib.pyplot as plt

image = cv2.imread("data/input/sudoku.png")

image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
derivative_x = cv2.Sobel(image, cv2.CV_64F, 1, 0)
derivative_y = cv2.Sobel(image, cv2.CV_64F, 0, 1)

derivative_combined = cv2.addWeighted(derivative_x, 0.5, derivative_y, 0.5, 0)

min_value = min(derivative_x.min(), derivative_y.min(), derivative_combined.min())
max_value = max(derivative_x.max(), derivative_y.max(), derivative_combined.max())

print(f"Value range: ({min_value:.2f}, {max_value:.2f})")

fig, axes = plt.subplots(1, 3, figsize=(16, 6), constrained_layout=True)

axes[0].imshow(derivative_x, cmap='gray', vmin=min_value, vmax=max_value)
axes[0].set_title("Horizontal derivative")
axes[0].axis('off')

image_1 = axes[1].imshow(derivative_y, cmap='gray', vmin=min_value, vmax=max_value)
axes[1].set_title("Vertical derivative")
axes[1].axis('off')

image_2 = axes[2].imshow(derivative_combined, cmap='gray', vmin=min_value, vmax=max_value)
axes[2].set_title("Combined derivative")
axes[2].axis('off')

color_bar = fig.colorbar(image_2, ax=axes.ravel().tolist(), orientation='vertical', fraction=0.025, pad=0.04)

plt.savefig("data/output/sudoku.png")

plt.show()

As a result, we can see that the horizontal and vertical derivatives detect the lines very well! Moreover, combining them allows us to detect both kinds of features:

Scharr operator

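Another popular alternative to the Sobel kernel is the Scharr operator:

$$K_x = \begin{bmatrix} -3 & 0 & 3 \\ -10 & 0 & 10 \\ -3 & 0 & 3 \end{bmatrix}, \qquad K_y = \begin{bmatrix} -3 & -10 & -3 \\ 0 & 0 & 0 \\ 3 & 10 & 3 \end{bmatrix}$$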

Although its structure is very similar to that of the Sobel operator, the Scharr kernel achieves higher accuracy in edge detection tasks. It has several important mathematical properties that we will not consider in this article.

OpenCV

Using the Scharr filter in OpenCV is very similar to what we saw above with the Sobel filter. The only difference is the function name (the other parameters are the same):

derivative_x = cv2.Scharr(image, cv2.CV_64F, 1, 0)
derivative_y = cv2.Scharr(image, cv2.CV_64F, 0, 1)

Here is the result we get with the Scharr filter:

In this case, it is difficult to notice differences between the results of the two operators. However, the color map shows that the range of values produced by the Scharr operator (-800, +800) is much larger than it was for Sobel (-200, +200). This is expected since the Scharr kernel has larger constants.
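The difference in value ranges follows directly from the kernel weights. As a rough sketch, the snippet below computes the largest response each standard 3×3 horizontal kernel could produce on an 8-bit image; the helper name `max_response` is hypothetical.

```python
# Standard horizontal 3x3 kernels.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SCHARR_X = ((-3, 0, 3), (-10, 0, 10), (-3, 0, 3))

def max_response(kernel, max_pixel=255):
    """Largest output: max_pixel under positive weights, 0 under negative ones."""
    return max_pixel * sum(w for row in kernel for w in row if w > 0)

print(max_response(SOBEL_X))   # 1020
print(max_response(SCHARR_X))  # 4080
print(max_response(SCHARR_X) / max_response(SOBEL_X))  # 4.0
```

The factor of 4 between the two theoretical limits matches the ratio between the ranges observed above for the Scharr and Sobel outputs.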

It is also a good example of why we need to use a wider output type such as cv2.CV_64F. Otherwise, the values would have been clipped to the standard range between 0 and 255, and we would have lost useful information about the gradients.

Conclusion

By applying calculus fundamentals to computer vision, we have studied essential image properties that allow us to detect intensity peaks in images. This knowledge is useful since feature detection is a common task in image analysis, especially when there are constraints on image processing or when machine learning algorithms are not used.

We have also looked at an example using OpenCV to see how edge detection works with the Sobel and Scharr operators. In the following articles, we will study more advanced algorithms for feature detection and examine OpenCV examples.
