Getting Started with NumPy and OpenCV for Computer Vision

Start Coding for Computer Vision with Python

Photo by Dan Smedley on Unsplash

Motivation

We human beings perceive our surroundings through our visual system. The eyes, brain, and limbs work together to perceive the environment and act accordingly. An intelligent system is one that can perform tasks that would require some level of intelligence if done by a human. So, to perform intelligent tasks, an artificial vision system is one of the essential components for a computer. Normally, a camera and the images it captures are used to gather the information needed to do the job. Computer vision and image processing techniques let a computer perform tasks similar to those done by humans, such as image recognition and object tracking.

In computer vision, the camera works like the human eye to capture the image, and the processor works like the brain to process the captured image and produce meaningful results. But there is a basic difference between humans and computers. The human brain works automatically, and its intelligence is innate. The computer, on the contrary, has no intelligence without human instruction (a program). Computer vision is the way to provide suitable instructions so that the computer can work in a manner comparable to the human vision system. However, its capability is limited.

In the upcoming sections, we will discuss the basic idea of how an image is formed and how it can be manipulated using Python.

How an Image Is Formed and Displayed

An image is nothing but a combination of pixels with different color intensities. The terms ‘pixel’ and ‘color intensity’ may be unfamiliar to you. Don’t worry. They will be crystal clear if you read the article to the end.

A pixel is the smallest unit (element) of a digital image. The figure below shows the details.

Image by Author

A display is made of pixels. In the figure above, there are 25 columns and 25 rows, and each small square is considered a pixel. The grid therefore holds 25 x 25 = 625 pixels, so it represents a display with 625 pixels. If we light the pixels with different color intensities (brightness), they form a digital image.

How does the computer store the image in memory?

If we look at the image carefully, we can compare it with a 2D matrix. A matrix has rows and columns, and its elements can be addressed by their indices. The matrix structure is similar to an array, and the computer stores the image as an array in memory.

Each array element holds the intensity value of a color. Generally, the intensity value ranges from 0 to 255. For demonstration purposes, I have included an array representation of an image.

Sample Array Representation of a Grayscale Image (Image by Author)

Grayscale and Coloured Image

A grayscale image is a black-and-white image. It is formed with only one color channel. A pixel value near 0 represents darkness, and the pixel becomes brighter with higher intensity values. The highest value is 255, which represents white. A 2D array is sufficient to hold a grayscale image, as the previous figure shows.
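As a minimal sketch of that idea, a grayscale image can be written down directly as a 2D NumPy array of 8-bit values. The pixel values and the 3 x 4 size below are made up for illustration and are not taken from the figure.

```python
import numpy as np

# A tiny 3 x 4 grayscale "image": one 8-bit intensity per pixel.
# 0 is black, 255 is white, values in between are shades of gray.
gray = np.array(
    [[0, 64, 128, 255],
     [32, 96, 160, 224],
     [255, 128, 64, 0]],
    dtype=np.uint8,
)

print(gray.shape)              # (rows, columns)
print(gray.dtype)              # 8-bit unsigned integers hold 0..255
print(gray.min(), gray.max())  # darkest and brightest pixel values
```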

Coloured images cannot be formed with only one color; there may be hundreds of thousands of color combinations. There are three primary color channels: red (R), green (G), and blue (B). Each color channel is stored in a 2D array that holds its intensity values, and the final image is the combination of these three color channels.

RGB Color Channel (Image by Author)

This color model has 256 x 256 x 256 = 16,777,216 possible color combinations. You can visualize the combinations here.
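The channel idea and the arithmetic above can be checked with a short sketch. The image size and the channel values here are arbitrary assumptions chosen for illustration.

```python
import numpy as np

height, width = 2, 3

# One 2D array per channel, each holding 8-bit intensity values.
red = np.full((height, width), 255, dtype=np.uint8)    # high intensity
green = np.full((height, width), 128, dtype=np.uint8)  # medium intensity
blue = np.zeros((height, width), dtype=np.uint8)       # low intensity

# Stack the three 2D channels along a third axis to get one color image.
rgb = np.dstack([red, green, blue])
print(rgb.shape)   # (height, width, channels)
print(rgb[0, 0])   # the three channel values of the top-left pixel

# Each channel has 256 levels, so the number of combinations is 256^3.
n_colors = 256 ** 3
print(n_colors)
```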

But in computer memory, the image is stored differently.

Image Stored in Computer Memory (Image by Author)

The computer does not know about RGB channels as such. It only knows intensity values. In the example, the red channel is stored with high intensity values, and the green and blue channels are stored with medium and low intensity values, respectively.

NumPy Basics to Work with Python

NumPy is a fundamental Python package for scientific computing. It is built mainly around an array object, but its functionality is not limited to arrays: the library also supports a wide range of numerical and logical operations on numbers [1]. You can find the official NumPy documentation here.

Let’s start our journey. First things first.

  • Importing the NumPy library.

It’s time to work with NumPy. As we know, NumPy works with arrays. So, let’s try to create our first 2D array of all zeros.
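A minimal sketch of the import and the all-zeros array might look like this. The variable name a and the 3 x 4 shape are assumptions, since the original code cell is not available.

```python
import numpy as np

# Create a 2D array with 3 rows and 4 columns, filled with zeros.
# The shape is passed as a (rows, columns) tuple.
a = np.zeros((3, 4))

print(a)
print(a.shape, a.dtype)  # np.zeros uses 64-bit floats by default
```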

It’s as simple as that. We can also create a NumPy array of all ones, as follows.
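A sketch of the all-ones case, again with an assumed 3 x 4 shape:

```python
import numpy as np

# Create a 3 x 4 array in which every element is 1.0.
b = np.ones((3, 4))

print(b)
```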

Interestingly, NumPy also provides a method to fill an array with any value. The simple syntax array.fill(value) does the job.
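For example, assuming an all-ones array b with an illustrative 3 x 4 shape, fill() overwrites every element in place:

```python
import numpy as np

b = np.ones((3, 4))

# fill() modifies the array in place and returns None,
# so it is called as a statement rather than assigned.
b.fill(3)

print(b)
```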

The array ‘b’, which held all ones, is now filled with 3.

  • The Role of the Seed in Random Number Generation

Just have a look at the following coding examples.
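The original code cells are not available, so the following is only a sketch of the kind of comparison they make. The seed value 42, the value range, and the array size are arbitrary assumptions.

```python
import numpy as np

# Cell 1: seed the generator, then draw five random integers.
np.random.seed(42)
first = np.random.randint(0, 100, size=5)

# Re-seeding with the same value reproduces exactly the same numbers.
np.random.seed(42)
second = np.random.randint(0, 100, size=5)

# Cells 2 and 3: no seeding before the draw, so the result depends on
# the current generator state and changes from one draw to the next.
third = np.random.randint(0, 100, size=5)

print(first)
print(second)
print(third)
print(np.array_equal(first, second))  # same seed, same numbers
```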

In the first code cell, we used np.random.seed(seed_value), but we did not use any seeding in the other two code cells. There is a major difference between random number generation with and without seeding. With seeding, the generated random numbers stay the same for a given seed value. Without a seed value, on the other hand, the random numbers change on every execution.

  • Basic operations (max, min, mean, reshape, etc.) with NumPy

NumPy makes life easier by providing numerous functions for mathematical operations. The methods array_name.min(), array_name.max(), and array_name.mean() return an array’s minimum, maximum, and mean values. Coding example:
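A small sketch with made-up values:

```python
import numpy as np

arr = np.array([[4, 9, 1],
                [7, 2, 8]])

print(arr.min())   # smallest element of the whole array
print(arr.max())   # largest element of the whole array
print(arr.mean())  # arithmetic mean of all six elements
```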

The indices of the minimum and maximum values can be extracted with array_name.argmin() and array_name.argmax(). For a multi-dimensional array, both return an index into the flattened array by default. Example:
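A sketch with illustrative values. The call to np.unravel_index is an addition that is not mentioned in the article; it converts the flat index back into a (row, column) pair.

```python
import numpy as np

arr = np.array([[4, 9, 1],
                [7, 2, 8]])

flat_max = arr.argmax()  # position of 9 in the flattened array
flat_min = arr.argmin()  # position of 1 in the flattened array
print(flat_max, flat_min)

# Convert the flat index of the maximum into (row, column) form.
row, col = np.unravel_index(flat_max, arr.shape)
print(row, col)
```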

Array reshaping is one of the essential operations in NumPy. array_name.reshape(row_no, column_no) is the syntax for reshaping an array. While reshaping, we must be careful about the number of array elements before and after reshaping. In both cases, the total number of elements must be the same.
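The element-count rule can be seen in a short sketch. The sizes are chosen only for illustration.

```python
import numpy as np

arr = np.arange(12)            # 12 elements: 0, 1, ..., 11

reshaped = arr.reshape(3, 4)   # 3 * 4 = 12, so this works
print(reshaped)

# 5 * 5 = 25 does not match 12 elements, so NumPy raises a ValueError.
try:
    arr.reshape(5, 5)
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print("Cannot reshape:", err)
```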

  • Array Indexing and Slicing

Each array element can be addressed by its row and column number. Let’s generate another array with 10 rows and 10 columns.

Suppose we want to find the first value of the array. It can be extracted by passing the row and column index (0, 0).
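A sketch of both steps. The article does not say how its 10 x 10 array was filled, so the values 0 to 99 used here are an assumption.

```python
import numpy as np

# A 10 x 10 array holding the values 0..99, row by row.
arr = np.arange(100).reshape(10, 10)
print(arr)

# Indexing is zero-based: [0, 0] is the first row, first column.
first_value = arr[0, 0]
print(first_value)
```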

Specific rows and columns can be sliced with the syntax array_name[row_no, :] and array_name[:, column_no].
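For instance, assuming a 10 x 10 array filled with 0 to 99 (an illustrative choice), one row and one column can be pulled out like this:

```python
import numpy as np

arr = np.arange(100).reshape(10, 10)

row_2 = arr[2, :]  # every column of the row with index 2
col_3 = arr[:, 3]  # every row of the column with index 3

print(row_2)
print(col_3)
```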

Let’s try to slice the central elements of the array.
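The article does not state which bounds it used for the central region, so the 4 x 4 block below (rows and columns 3 to 6) is only one possible choice.

```python
import numpy as np

arr = np.arange(100).reshape(10, 10)

# A slice start:stop includes start and excludes stop,
# so 3:7 selects the indices 3, 4, 5 and 6.
center = arr[3:7, 3:7]

print(center)
print(center.shape)
```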

OpenCV Basics

OpenCV is an open-source library for computer vision, originally developed by Intel, with Python bindings [2]. I will discuss a few uses of OpenCV, though its scope is vast. You can find the official documentation here.

I have used the following image for demonstration purposes.

Image by jackouille21 from Pixabay

  • Importing the OpenCV and Matplotlib libraries

Matplotlib is a visualization library. It helps us display the image.

  • Loading the image with OpenCV and visualizing it with Matplotlib

We have read the image with OpenCV and visualized it with the Matplotlib library. The colors have changed because OpenCV reads the image in BGR format instead of RGB, while Matplotlib expects the image in RGB format. So, we need to convert the image from BGR to RGB.

  • Converting the image from BGR to RGB format

Now the image looks correct.

  • Converting image to grayscale

We can easily convert the image from BGR to grayscale with cv2.COLOR_BGR2GRAY, as follows.

The image above does not look gray even though it has been converted to grayscale. It was visualized with Matplotlib, which by default uses a color map other than grayscale. To visualize it properly, we need to specify the grayscale color map in Matplotlib. Let’s do that.
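The color map is selected with the cmap argument of imshow. The sketch below uses a synthetic left-to-right gradient in place of the converted photo, which is an assumption for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic grayscale image: a horizontal gradient from black to white.
gray = np.tile(np.arange(256, dtype=np.uint8), (64, 1))

fig, ax = plt.subplots()
# cmap="gray" maps low values to black and high values to white;
# vmin and vmax pin the scale to the full 8-bit range.
im = ax.imshow(gray, cmap="gray", vmin=0, vmax=255)
ax.set_title("Grayscale image with cmap='gray'")
# plt.show()  # uncomment when running interactively
```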

Rotating an image is also a simple task with OpenCV; the cv2.rotate() function does it. Clockwise and counterclockwise 90-degree rotations and a 180-degree rotation are shown below.

We can resize the image by passing the width and height in pixels to the cv2.resize() function.

Sometimes we need to draw on an existing image. For example, we may need to draw a bounding box around an object in an image to identify it. Let’s draw a rectangle on the flower. The cv2.rectangle() function does this. It takes parameters such as the image on which to draw the rectangle, the coordinates of the upper-left corner (pt1) and the lower-right corner (pt2), the color, and the thickness of the boundary line. A coding example is given below.

There are other drawing functions, such as cv2.line(), cv2.circle(), cv2.ellipse(), and cv2.putText(). The full official documentation is available here [3].

Play with NumPy

We will change the intensity values of an image. I will try to keep it simple. So, consider the grayscale image shown previously. Find the shape of the image.

It shows that the image is a 2D array of size 1200 x 1920. In the basic NumPy operations, we learned how to slice an array.

Using that concept, we took the slice [400:800, 750:1350] of the grayscale image array and replaced its intensity values with 255. Finally, we visualized it and got the image above.
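The shape check and the slice assignment described above can be sketched as follows. A uniform gray array of the same 1200 x 1920 size stands in for the real photo, which is an assumption; the slice bounds are the ones quoted in the text.

```python
import numpy as np

# Stand-in for the grayscale photo: 1200 rows, 1920 columns, mid-gray.
gray = np.full((1200, 1920), 100, dtype=np.uint8)
print(gray.shape)

# Set rows 400..799 and columns 750..1349 to white (255).
gray[400:800, 750:1350] = 255

# The white block covers 400 * 600 pixels.
white_pixels = int((gray == 255).sum())
print(white_pixels)
```

Displaying gray with Matplotlib and a grayscale color map then shows a white rectangle in the middle of the image.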

Conclusion

Computer vision is one of the promising fields in modern computer science. I always emphasize the basic knowledge of any domain. I have discussed only the basics of computer vision and shown some hands-on coding. The concepts are simple but can play a significant role for beginners in computer vision.

This is the first article of the computer vision series. Stay connected to read the upcoming articles.

[N.B. Instructor Jose Portilla’s course helped me gather this knowledge.]
