2. Backpropagation: Introduction
• Backpropagation is a supervised learning algorithm used to train artificial neural networks. It works by adjusting the network's internal parameters (its weights) based on the error (or loss) between the predicted and actual outputs.
• Introduced in the 1970s, backpropagation fine-tunes the weights of a neural network with respect to the error obtained in the previous iteration, or epoch, and it is the standard method of training artificial neural networks.
• It is a key part of the gradient descent optimization process, in which the model learns by minimizing the loss function through iterative updates to the network's parameters.
3. • You can think of backpropagation as a feedback system: after each round of training, or epoch, the network reviews its performance.
• It calculates the difference between its output and the correct answer, known as the error, and then adjusts its internal parameters, or weights, to reduce this error the next time.
• This method is essential for tuning the neural network's accuracy and is a foundational strategy for learning to make better predictions or decisions.
4. How Does Backpropagation Work?
• Now that you know what backpropagation is, let's dive into how it works. Below is an illustration of the backpropagation algorithm applied to a neural network with:
• Two inputs, X1 and X2
• Two hidden layers, whose neurons are labelled N1X and N2X, where X takes the values 1, 2, and 3
• One output layer
6. Backpropagation Algorithm:
There are four main steps in the backpropagation algorithm:
• Forward pass
• Error calculation
• Backward pass
• Weights update
7. Forward pass
This is the first step of the backpropagation process, and it’s
illustrated below:
• The data (inputs X1 and X2) is fed to the input layer
• Then, each input is multiplied by its corresponding weight,
and the results are passed to the neurons N1X and N2X of the
hidden layers.
• Those neurons apply an activation function to the weighted
inputs they receive, and the result passes to the next layer.
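To make the forward pass concrete, here is a minimal NumPy sketch. It assumes a smaller network than the one in the illustration (two inputs, one hidden layer of two neurons, one output) and sigmoid activations; the input values and weights are made up for illustration:

import numpy as np

def sigmoid(z):
    # Activation function: squashes the weighted sum into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# Inputs X1 and X2 (made-up values)
x = np.array([0.5, 0.8])

# Assumed weights: input -> hidden (2x2) and hidden -> output (2,)
W_hidden = np.array([[0.1, 0.4],
                     [0.2, 0.3]])
w_output = np.array([0.5, 0.6])

# Forward pass: weighted sum, then activation, layer by layer
hidden_out = sigmoid(W_hidden @ x)        # activations of the hidden neurons
output = sigmoid(w_output @ hidden_out)   # final output (o/p)
print(output)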
8. Error calculation
• The process continues until the output layer generates the
final output (o/p).
• The output of the network is then compared to the ground
truth (desired output), and the difference is calculated,
resulting in an error value.
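The slides do not fix a particular loss function; one common choice for a single output o and desired output t is the squared error:

E = \frac{1}{2}(t - o)^2

The factor of 1/2 is there only so that it cancels neatly when the derivative is taken in the backward pass.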
9. Backward pass
• This is the actual backpropagation step, and it cannot be performed without the forward pass and error calculation steps above.
• Backpropagation refers to the process of propagating the error (or loss) backward through the network to update the weights. This is done using the chain rule of calculus to calculate the gradient of the loss with respect to each weight and bias. The gradients are used to determine how the weights should be adjusted to minimize the error.
Here is how it works:
• The error value obtained previously is used to calculate the gradient of the loss function.
• The gradient of the error is propagated back through the network, starting from the output layer to the
hidden layers.
• As the error gradient propagates back, the weights (represented by the lines connecting the nodes) are
updated according to their contribution to the error. This involves taking the derivative of the error
with respect to each weight, which indicates how much a change in the weight would change the error.
• The learning rate determines the size of the weight updates. A smaller learning rate means that the weights are updated by a smaller amount, and vice versa.
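Written out with the chain rule for a weight w feeding the output neuron (assuming the squared error above and a sigmoid output, where net is the neuron's weighted sum, o its activation, and h the activation arriving through w):

\frac{\partial E}{\partial w} = \frac{\partial E}{\partial o} \cdot \frac{\partial o}{\partial net} \cdot \frac{\partial net}{\partial w} = (o - t) \cdot o(1 - o) \cdot h

For weights deeper in the network, the same rule is applied again, multiplying in one extra factor for each layer the error has to travel back through.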
10. Weights update
• The weights are updated in the opposite direction of the gradient, which gives the method its name, "gradient descent." The aim is to reduce the error in the next forward pass (the update rule is written out below).
• This process of forward pass, error calculation, backward pass, and weights update continues for multiple epochs, until the network's performance reaches a satisfactory level or stops improving significantly.
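In symbols, each weight takes a small step against its own gradient, with the learning rate η controlling the step size:

w_{new} = w_{old} - \eta \, \frac{\partial E}{\partial w}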
11. • Iterative Process:
Backpropagation is performed iteratively over many training examples (or mini-batches, in the case of mini-batch gradient descent), and the weights are updated after each pass. This helps the model converge toward optimal weights that minimize the error on the training data.
12. What is a Gradient?
• In the context of neural networks, a gradient is the partial derivative of the error (or loss) with respect to each weight in the network. It tells us how much a small change in a weight will affect the network's output and, ultimately, the error.
• In simple terms, the gradient tells us how sensitive the error is to changes in a weight. If a weight's gradient is large, changing that weight slightly will cause a large change in the error; if the gradient is small, changing the weight will have a smaller effect on the error.
• Geometrically, the gradient represents the slope, or steepness, of the loss function at a particular point.
• In backpropagation, the gradient tells us how to adjust the weights to minimize the error and improve the network's predictions.
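Formally, for a network whose weights are w_1, ..., w_n and whose loss is E, the gradient collects all of these partial derivatives into a single vector:

\nabla E = \left( \frac{\partial E}{\partial w_1}, \frac{\partial E}{\partial w_2}, \ldots, \frac{\partial E}{\partial w_n} \right)

Each component measures how sensitive the error is to its particular weight.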
13. 2. Why Do We Calculate Gradients in Backpropagation?
• Backpropagation is the process used to train neural networks. During training, we want to
minimize the error (or loss) between the predicted output and the true output. To do this,
we need to know how to adjust the weights in the network to reduce the error. The
gradient provides this information.
• In neural networks, we use gradient descent to update the weights. The gradient tells us
the direction and magnitude of the change in the weights that will reduce the error.
• Steps Involved in Backpropagation:
1. Forward Pass: We send input data through the network to compute the predicted output.
2. Error Calculation: We calculate how far the predicted output is from the actual target output (e.g., using mean squared error).
3. Backward Pass (Backpropagation): We calculate the gradients (partial derivatives) of the error with respect to each weight in the network.
4. Weight Update: We update the weights using gradient descent based on the gradients.
14. 3. How Gradients Work in Neural Networks:
In a neural network, there are typically the following types of weights:
• Weights between the input layer and the hidden layer(s)
• Weights between the hidden layer(s) and the output layer
• The process of backpropagation calculates the gradient for each of these weights, so we
can update them to reduce the error.
Example of Gradient in Neural Networks:
• Let’s consider a simple network with:
• 1 input layer (with input x)
• 1 hidden layer (with 1 neuron)
• 1 output layer (with 1 neuron)
• We start by calculating the output for a given input x, then compare it to the true output
to calculate the error.
• The gradient helps us understand how much the weights in the network contribute to the
error. By calculating gradients for each weight, we know how much to adjust each weight
to minimize the error.
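For that 1-1-1 network, write the hidden activation as h = σ(w_1 x) and the output as o = σ(w_2 h); assuming sigmoid activations and the squared error E = ½(t − o)² (assumptions made for the sake of this sketch, consistent with the later examples), the chain rule gives:

\frac{\partial E}{\partial w_2} = (o - t)\, o(1 - o)\, h
\qquad
\frac{\partial E}{\partial w_1} = (o - t)\, o(1 - o)\, w_2\, h(1 - h)\, x

The second gradient is simply the first one extended by two more chain-rule factors, because w_1 sits one layer further from the error.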
15. Summary of Steps in Backpropagation
1. Forward Propagation:
• Compute activations for each layer using the current weights and biases.
2. Loss Calculation:
• Compute the error between the predicted output and the true output.
3. Backpropagate the Error:
• Compute the gradients of the error with respect to each weight, starting from the output layer and moving backward to the input layer.
4. Update Weights:
• Adjust the weights and biases using gradient descent, where each weight is updated based on the calculated gradient.
5. Repeat the Process:
• Repeat these steps for multiple training examples (or batches) until the network converges to an optimal set of weights.
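Putting the five steps together, here is a minimal runnable NumPy sketch: a 2-input, 2-hidden-neuron, 1-output network with sigmoid activations and squared error. The training example, initial weights, and learning rate are made up for illustration, and biases are omitted for brevity:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy training example (made up): inputs and desired output
x = np.array([0.5, 0.8])
t = 0.2

# Assumed initial weights: input -> hidden (2x2), hidden -> output (2,)
W1 = np.array([[0.1, 0.4],
               [0.2, 0.3]])
w2 = np.array([0.5, 0.6])
lr = 0.5  # learning rate

for epoch in range(200):
    # 1. Forward propagation
    h = sigmoid(W1 @ x)                    # hidden activations
    o = sigmoid(w2 @ h)                    # network output

    # 2. Loss calculation (squared error)
    E = 0.5 * (t - o) ** 2

    # 3. Backpropagate the error (chain rule)
    delta_o = (o - t) * o * (1 - o)        # dE/dnet at the output neuron
    grad_w2 = delta_o * h                  # gradient for hidden -> output weights
    delta_h = delta_o * w2 * h * (1 - h)   # dE/dnet at each hidden neuron
    grad_W1 = np.outer(delta_h, x)         # gradient for input -> hidden weights

    # 4. Update weights with gradient descent
    w2 -= lr * grad_w2
    W1 -= lr * grad_W1

# 5. Repeat: the loop above repeats the process for a fixed number of epochs
print(f"output {o:.4f}  target {t}  error {E:.6f}")

After enough epochs the output drifts toward the target and the error shrinks, which is the convergence behaviour described above.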
16. Why is Backpropagation Important?
• Efficiency:
Backpropagation makes it computationally feasible to train large neural
networks. Without backpropagation, calculating gradients manually for each
weight in a deep network would be difficult.
• Optimization:
By efficiently updating the weights, backpropagation allows the network to
minimize the loss function and improve its ability to make accurate predictions.
• Foundation of Deep Learning:
Backpropagation is a key algorithm behind deep learning and is used to train
most types of neural networks, including feedforward networks, convolutional
neural networks (CNNs), and recurrent neural networks (RNNs).
17. Example 1: Back Propagation
• Let's walk through a simple mathematical example of the
backpropagation algorithm step-by-step, using a small neural
network. We'll focus on a network with:
• 1 input layer
• 1 hidden layer
• 1 output layer
• This example will show the forward pass, how to calculate the error,
and then the backpropagation process to update the weights.
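The detailed slides of this example are not reproduced here, so the following is only a minimal sketch of the same idea: one forward pass, the error, the gradients, and one weight update for a 1-1-1 network with sigmoid activations, squared error, and made-up numbers (not the values used in the original example):

import numpy as np

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Made-up values for illustration
x, t, lr = 1.0, 0.0, 0.5       # input, target output, learning rate
w1, w2 = 0.6, 0.9              # input -> hidden and hidden -> output weights

# Forward pass
h = sigmoid(w1 * x)            # hidden activation
o = sigmoid(w2 * h)            # network prediction

# Error calculation
E = 0.5 * (t - o) ** 2

# Backward pass (chain rule, as in the gradient formulas above)
grad_w2 = (o - t) * o * (1 - o) * h
grad_w1 = (o - t) * o * (1 - o) * w2 * h * (1 - h) * x

# Weight update
w2 -= lr * grad_w2
w1 -= lr * grad_w1

print(f"prediction={o:.4f}  error={E:.4f}  new w1={w1:.4f}  new w2={w2:.4f}")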
25. • Summary
In this simple example, we've:
• Calculated the network's prediction.
• Computed the error.
• Backpropagated the error to find gradients.
• Updated the weights to reduce the error.
• Through this process, the network learns and improves its predictions
over time by adjusting its weights using backpropagation.
26. Example 2 of backpropagation
https://www.youtube.com/watch?v=tUoUdOdTkRw&t=137s
30. Step 3: Calculating the gradients (backward pass, or the backpropagation step)
In this step, we are going to find the gradients needed to reduce the error from the previous step.
32. Step 4: Update the weights. Find the new values for w45, w35, w13, w24, w14, and w23.
36. Error step: If we are not satisfied with this error, calculate the gradients and update the weights.
37. Example 3 of backpropagation (video link)
• https://www.youtube.com/watch?v=n2L1J5JYgUk
38. Example 4 of backpropagation
• Let’s walk through an example of backpropagation in machine
learning. Assume the neurons use the sigmoid activation function for
the forward and backward pass. The target output is 0.5, and the
learning rate is 1.
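The worked slides that follow this one are not reproduced here, so the sketch below only fixes what this slide states (sigmoid activation, target output 0.5, learning rate 1) and assumes a single output neuron with two made-up inputs and initial weights:

import numpy as np

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

t, lr = 0.5, 1.0                    # given: target output and learning rate
x = np.array([0.6, 0.2])            # assumed inputs
w = np.array([0.4, 0.7])            # assumed weights into the output neuron

o = sigmoid(w @ x)                  # forward pass
E = 0.5 * (t - o) ** 2              # error against the 0.5 target
grad = (o - t) * o * (1 - o) * x    # backward pass (chain rule)
w = w - lr * grad                   # update with learning rate 1
print(o, E, w)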
49. Challenges with Backpropagation
• While backpropagation is powerful, it does face some challenges:
1. Vanishing Gradient Problem: In deep networks, the gradients can become very small during backpropagation, making it difficult for the network to learn. This is common when using activation functions like sigmoid or tanh.
2. Exploding Gradients: The gradients can also become excessively large, causing the network to diverge during training.
3. Overfitting: If the network is too complex, it might memorize the training data instead of learning general patterns.
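The vanishing-gradient problem can be seen directly from the sigmoid derivative: since

\sigma'(z) = \sigma(z)\,(1 - \sigma(z)) \le \frac{1}{4}

every layer the error travels back through multiplies the gradient by at most 1/4 (times the corresponding weight), so unless the weights are large, the gradient shrinks rapidly as it propagates toward the input layer of a deep network.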
50. Conclusion
• Backpropagation is the engine that drives neural network learning. By
propagating errors backward and adjusting the weights and biases,
neural networks can gradually improve their predictions.
• Though it has some limitations like vanishing gradients, many
techniques, such as using ReLU activation or optimizing learning rates,
have been developed to address these issues.