In the world of neural networks, padding refers to the technique of adding extra values, usually zeros, around the edges of a data matrix. This technique is commonly used in convolutional neural networks (CNNs), which are widely applied to computer vision and image processing tasks. In this blog post, we'll explore why padding is used in CNNs and how it works.
The basic idea behind a convolutional neural network is to learn spatially invariant features from raw input data. The first layer in a CNN is typically a convolutional layer, which applies a set of filters to the input to extract features. These filters are usually small, such as 3×3 or 5×5, and they are applied at multiple spatial locations across the input.
One issue with this approach is that the filters cannot be fully applied at the edges of the input, since there are not enough pixels there to cover the filter. This leads to a loss of information at the edges of the image and can also make it difficult for the network to learn features located near the edges.
To address this problem, padding is used to add extra values around the edges of the input. By adding zeros, the input is expanded so that the filters can be applied all the way to the edges of the image. This preserves information at the borders and also makes it easier for the network to learn features located near them.
Padding is a relatively simple process that involves adding extra rows and columns of zeros around the edges of the input. There are two main types of padding: valid padding and same padding.
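As a concrete sketch (plain Python, no framework needed), zero padding of width p simply wraps the input matrix in p extra rows and columns of zeros:

```python
def zero_pad(matrix, p):
    """Surround a 2-D list with p rows and columns of zeros."""
    width = len(matrix[0])
    zero_row = [0] * (width + 2 * p)
    body = [[0] * p + list(row) + [0] * p for row in matrix]
    return [zero_row[:] for _ in range(p)] + body + [zero_row[:] for _ in range(p)]

image = [[1, 2],
         [3, 4]]
print(zero_pad(image, 1))
# [[0, 0, 0, 0], [0, 1, 2, 0], [0, 3, 4, 0], [0, 0, 0, 0]]
```

A 2×2 input padded with p=1 becomes 4×4, which is exactly what lets a filter slide over the original border pixels.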
Valid padding, also known as no padding, refers to not adding any extra values around the edges of the input. This means the output of the convolutional layer is smaller than the input, since the filters cannot be applied at the edges.
If the input is an n×n image and the filter size is f×f, then the output size of the convolutional layer with valid padding (no padding) is (n-f+1)×(n-f+1). For example, if the image size is 5×5 and the filter size is 3×3, the feature map is (5-3+1)×(5-3+1), i.e., 3×3. The feature map (3×3) is therefore smaller than the original image (5×5).
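The valid-padding formula is easy to check in code, assuming a stride of 1:

```python
def valid_output_size(n, f):
    """Side length of the feature map for an n x n input and f x f filter, no padding."""
    return n - f + 1

print(valid_output_size(5, 3))   # 3
print(valid_output_size(28, 3))  # 26
```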
Same padding, on the other hand, adds just enough extra values around the edges of the input so that the output of the convolutional layer is the same size as the input. This is typically achieved by adding p=(f-1)/2 rows and columns of zeros around the edges, where f is the filter size.
Again, for an n×n image and an f×f filter, the feature map produced with 'same' padding is (n+2p-f+1)×(n+2p-f+1), where p is the padding added. For example, if the image size is 5×5 and the filter size is 3×3, then p=(3-1)/2=1 and the feature-map size is (5+2(1)-3+1)×(5+2(1)-3+1), i.e., 5×5, the same size as the original image.
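The same-padding arithmetic can be sketched the same way (assuming an odd filter size and stride 1, so (f-1)/2 is a whole number):

```python
def same_padding(f):
    """Padding needed so the output matches the input (odd f, stride 1)."""
    return (f - 1) // 2

def same_output_size(n, f):
    p = same_padding(f)
    return n + 2 * p - f + 1

print(same_padding(3))         # 1
print(same_output_size(5, 3))  # 5
```

For any odd f, the padding cancels the shrinkage exactly, so the output side length always equals n.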
Now that we understand what padding is and how it works, let's look into why zero is used for padding. Why not 1 or some other number?
A normalized image has a pixel range of 0 to 1, and one that is not normalized has a pixel range of 0 to 255. In both cases 0 is the minimum value, which makes it the natural choice: padding with the minimum pixel value surrounds the image with a black border.
But what if the values range from -0.5 to 0.5, or we use tanh as the activation function, which maps values to the range -1 to 1? In such cases the minimum value would be -0.5 or -1 respectively, and padding with 0 produces a gray boundary instead of a black one.
However, it is not always possible to know in advance what range the activation values will take. So the best available option that satisfies most cases is zero padding. Although we may get a gray boundary instead of a black one in the two cases mentioned above, that is not a big loss. In most cases we use ReLU as the activation function, which maps negative values to 0.
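That last point can be checked directly: after a ReLU, every negative activation becomes 0, so a zero-valued border blends in with the other inactive regions of the feature map:

```python
def relu(x):
    """Rectified linear unit: negative values map to 0, the rest pass through."""
    return max(0.0, x)

print([relu(v) for v in [-1.0, -0.5, 0.0, 0.5, 1.0]])
# [0.0, 0.0, 0.0, 0.5, 1.0]
```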
Now that we are clear about the concept of padding, let's build a model using valid and same padding respectively.
Importing Libraries
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Conv2D, Flatten
Valid Padding
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), padding='valid', activation='relu', input_shape=(28, 28, 1)))
model.add(Conv2D(32, kernel_size=(3, 3), padding='valid', activation='relu'))
model.add(Conv2D(32, kernel_size=(3, 3), padding='valid', activation='relu'))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(10, activation='softmax'))
Same Padding
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), padding='same', activation='relu', input_shape=(28, 28, 1)))
model.add(Conv2D(32, kernel_size=(3, 3), padding='same', activation='relu'))
model.add(Conv2D(32, kernel_size=(3, 3), padding='same', activation='relu'))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(10, activation='softmax'))
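Without even running TensorFlow, the formulas above predict the feature-map sizes the two models will report in `model.summary()` for a 28×28 input and three 3×3 convolutions:

```python
def conv_out(n, f, padding):
    """Output side length of one conv layer (stride 1, odd f)."""
    p = (f - 1) // 2 if padding == "same" else 0
    return n + 2 * p - f + 1

valid_sizes, same_sizes = [], []
v = s = 28
for _ in range(3):  # three stacked conv layers, as in the models above
    v = conv_out(v, 3, "valid")
    s = conv_out(s, 3, "same")
    valid_sizes.append(v)
    same_sizes.append(s)

print(valid_sizes)  # [26, 24, 22]
print(same_sizes)   # [28, 28, 28]
```

With valid padding each layer trims a 1-pixel border off every side, so the feature maps shrink from 28×28 down to 22×22, while same padding keeps them at 28×28 throughout. This is why same padding is preferred when stacking many convolutional layers.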
Conclusion
In conclusion, padding is a key technique in convolutional neural networks. It preserves spatial dimensions in convolutional layers and prevents the loss of information at the edges of the input during convolution.