
Optimization in Deep Learning

Optimizers are essential components of machine learning algorithms, responsible for adjusting a model's parameters to minimize the loss function. The loss function measures how well the model's predictions match the true values; the optimizer searches for the set of weights that makes this loss as small as possible, and thus the model's performance as good as possible. In this blog post, we will dive into the basics of some popular optimizers used in machine learning: Stochastic Gradient Descent (SGD), SGD with Momentum, and Adam. We will also visualize how each of these optimizers performs in finding the minimum of a given loss curve. ...
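As a concrete taste of the update rules the post compares, here is a minimal sketch (not taken from the post) of a single step of SGD, SGD with momentum, and Adam on a toy quadratic loss; the learning rate and the momentum/Adam hyperparameters are illustrative assumptions.

```python
import numpy as np

# Toy loss L(w) = w**2, so grad L(w) = 2 * w. All values illustrative.
def grad(w):
    return 2.0 * w

w_sgd = w_mom = w_adam = 5.0   # shared starting point
lr = 0.1

# SGD: step directly against the gradient.
w_sgd -= lr * grad(w_sgd)

# SGD with momentum: accumulate an exponentially decaying velocity.
v, beta = 0.0, 0.9
v = beta * v + grad(w_mom)
w_mom -= lr * v

# Adam: track running means of the gradient and its square,
# with bias correction for the zero-initialized moments.
m, s, b1, b2, eps, t = 0.0, 0.0, 0.9, 0.999, 1e-8, 1
g = grad(w_adam)
m = b1 * m + (1 - b1) * g
s = b2 * s + (1 - b2) * g ** 2
m_hat, s_hat = m / (1 - b1 ** t), s / (1 - b2 ** t)
w_adam -= lr * m_hat / (np.sqrt(s_hat) + eps)

print(w_sgd, w_mom, w_adam)  # 4.0, 4.0, ~4.9 (Adam's first step is ~lr)
```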

November 23, 2024 · 11 min · Anil Paudel

Initialization in Deep Learning

When training a neural network, after defining the model architecture, a crucial step is to initialize the weights properly. Proper initialization is essential for stable and efficient training: done wrong, it leads to exploding or vanishing weights and gradients. That means the weights of the model either blow up toward infinity or shrink to 0 (literally to 0, because computers cannot represent arbitrarily small floating-point numbers), which makes training deep neural networks very challenging. ...
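To see the effect the post describes, here is a minimal sketch (not taken from the post) that pushes a batch through 50 tanh layers under three weight scales; the layer width, depth, and scales are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
fan_in, depth = 256, 50
x = rng.standard_normal((64, fan_in))

for name, std in [("too small (0.01)", 0.01),
                  ("too large (1.0)", 1.0),
                  ("Xavier 1/sqrt(n)", fan_in ** -0.5)]:
    h = x
    for _ in range(depth):
        W = rng.standard_normal((fan_in, fan_in)) * std
        h = np.tanh(h @ W)
    # Too-small weights shrink the activations toward 0; too-large ones
    # saturate tanh near +/-1 (killing gradients); Xavier stays moderate.
    print(f"{name}: activation std after {depth} layers = {h.std():.3e}")
```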

November 21, 2024 · 8 min · Anil Paudel

Why Does Sigmoid Fail in Deep Neural Networks?

Deep neural networks rely on the backpropagation algorithm to train effectively. A crucial component of backpropagation is the calculation of gradients, which dictate how much the weights in the network should be updated. However, when the sigmoid activation function is used in hidden layers, it can lead to the vanishing gradient problem. In this post, we’ll focus on how gradients are calculated and why sigmoid makes them vanish in deep networks. ...
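The core of the argument can be seen in one line of calculus: the sigmoid derivative sigma'(x) = sigma(x) * (1 - sigma(x)) never exceeds 0.25, and backpropagation multiplies one such factor per layer. A minimal sketch (not taken from the post), with depths chosen for illustration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)  # peaks at 0.25 when x = 0

# Backprop chains one sigmoid-derivative factor per hidden layer, so
# even in the best case the gradient shrinks at least 4x per layer.
print(sigmoid_grad(0.0))         # 0.25, the maximum possible value
for depth in (5, 10, 30):
    print(depth, 0.25 ** depth)  # upper bound on the chained factors
```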

November 19, 2024 · 6 min · Anil Paudel