This text is inspired by Andrej Karpathy. I highly recommend working through the playlist below, because it is probably the most step-by-step, spelled-out explanation of backpropagation and the training of neural networks. Back Propagation...
Deriving optimal initial variance of weight matrices in neural network layers with ReLU activation function

Initialization techniques are one of the prerequisites for successfully training a deep learning architecture. Traditionally, weight initialization methods...
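As a rough preview of what the derivation is after (a minimal sketch of my own, not taken from the article): for a ReLU layer, drawing each weight from a zero-mean distribution with variance 2 / fan_in keeps the scale of the activations roughly constant from layer to layer, which is the He/Kaiming result. The layer sizes, sample count, and NumPy usage below are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

fan_in, fan_out = 512, 512
x = rng.standard_normal((10_000, fan_in))  # inputs with unit second moment

# Naive initialization: unit-variance weights.
w_naive = rng.standard_normal((fan_in, fan_out))
h_naive = np.maximum(0.0, x @ w_naive)     # ReLU activations

# He/Kaiming initialization: Var(W) = 2 / fan_in.
w_he = rng.standard_normal((fan_in, fan_out)) * np.sqrt(2.0 / fan_in)
h_he = np.maximum(0.0, x @ w_he)

print(f"input  E[x^2]:              {np.mean(x**2):.3f}")        # ~1
print(f"naive  E[relu(xW)^2]:       {np.mean(h_naive**2):.3f}")   # ~fan_in / 2, signal blows up
print(f"He     E[relu(xW)^2]:       {np.mean(h_he**2):.3f}")      # ~1, scale preserved
```

With unit-scale inputs, the pre-activation variance is fan_in * Var(W), and the ReLU halves the second moment, so choosing Var(W) = 2 / fan_in brings the post-ReLU second moment back to roughly 1; stacking many such layers then neither explodes nor collapses the signal.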