Unit 1: Introduction to Neural Networks and Deep Learning
1. Artificial Neural Networks (ANNs):
- Definition: Inspired by biological neural networks, ANNs are computing systems composed of
layers of interconnected nodes (neurons).
- Structure:
- Input Layer: Receives raw data.
- Hidden Layer(s): Perform transformations and extract patterns from the data.
- Output Layer: Produces the final predictions or outputs.
- Computation: Each neuron computes a weighted sum of its inputs, adds a bias, and passes the
result through an activation function.
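This computation can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation; the function name and the choice of sigmoid as the activation are for demonstration only.

```python
import math

def neuron(inputs, weights, bias):
    # Weighted sum of inputs plus bias...
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # ...passed through an activation function (sigmoid here)
    return 1.0 / (1.0 + math.exp(-z))

# Example: two inputs, two weights, one bias
out = neuron([0.5, -1.0], [0.8, 0.2], 0.1)  # z = 0.4 - 0.2 + 0.1 = 0.3
```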
Advantages:
1. Parallel Processing: Ability to perform multiple tasks simultaneously.
2. Fault Tolerance: Can still function even with partial damage to the network.
3. Memory Distribution: Data is stored across the network, not centrally.
4. Handles Incomplete Data: Can make predictions with missing information.
Disadvantages:
1. Uncertain Structure: No fixed rules for designing network architectures.
2. Lack of Transparency: Limited insight into the internal workings ("black box").
3. Hardware Dependence: Requires high computational power.
4. Time-consuming Training: Requires substantial time to train complex networks.
2. Deep Learning:
- Definition: A specialized branch of machine learning using neural networks with multiple hidden
layers to learn complex features and patterns.
- Characteristics:
- Automatically extracts features from raw data.
- Can handle large datasets with high-dimensional features.
- Applications:
- Image recognition, speech processing, natural language understanding, and more.
Architectures:
1. Deep Neural Networks (DNNs): Networks with multiple hidden layers for complex non-linear
relationships.
2. Deep Belief Networks (DBNs): Consist of stacked layers of Restricted Boltzmann Machines.
3. Recurrent Neural Networks (RNNs): Suitable for sequential data like time-series and text.
3. Components of Neural Networks:
1. Weights and Biases: Weights determine the strength of each connection. A bias shifts the
neuron's weighted input, effectively moving the activation threshold.
2. Activation Functions:
- Linear: Simple linear relationships, not suitable for complex patterns.
- ReLU: Rectified Linear Unit, widely used in hidden layers.
- Sigmoid: Produces outputs between 0 and 1, used for binary classification.
- Tanh: Produces outputs between -1 and 1, often used in hidden layers.
- Softmax: Used for multi-class classification.
3. Error and Loss Functions: Measure the deviation between predicted and actual outcomes.
4. Learning Rate (alpha): Controls the step size of weight updates during training; too large a
rate can overshoot the minimum, while too small a rate slows convergence.
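The activation functions listed above can be written directly from their definitions. A minimal sketch in plain Python (the max-shift in softmax is a standard numerical-stability trick, not part of the mathematical definition):

```python
import math

def relu(z):
    # Rectified Linear Unit: zero for negative inputs, identity otherwise
    return max(0.0, z)

def sigmoid(z):
    # Squashes any real input into (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def tanh(z):
    # Squashes any real input into (-1, 1)
    return math.tanh(z)

def softmax(zs):
    # Converts a list of scores into probabilities that sum to 1
    exps = [math.exp(z - max(zs)) for z in zs]  # shift by max for stability
    total = sum(exps)
    return [e / total for e in exps]
```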
4. Learning Rules in ANNs:
1. Hebbian Learning Rule: Increases the connection strength between neurons that activate
together.
2. Delta Rule: Minimizes error through gradient descent.
3. Perceptron Rule: Adjusts weights for binary classifications.
4. Competitive Learning: Nodes compete to represent the input, and the winner is activated.
5. Out-Star Learning: Adjusts weights to match target outputs in layered networks.
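Two of the rules above reduce to one-line weight updates. The following is an illustrative sketch (function names, learning rate, and the 0/1 step activation are assumptions for the example, not part of any standard library):

```python
def hebbian_update(w, x, y, lr=0.1):
    # Hebbian rule: strengthen weights between co-active neurons (dw = lr * x * y)
    return [wi + lr * xi * y for wi, xi in zip(w, x)]

def perceptron_update(w, bias, x, target, lr=0.1):
    # Perceptron rule: adjust weights in proportion to the classification error
    pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + bias > 0 else 0
    err = target - pred
    w = [wi + lr * err * xi for wi, xi in zip(w, x)]
    return w, bias + lr * err

hebb = hebbian_update([0.0, 0.0], [1.0, 1.0], 1.0)
perc_w, perc_b = perceptron_update([0.0, 0.0], 0.0, [1.0, 1.0], 1)
```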
5. Perceptron and Backpropagation:
1. Perceptron:
- A single-layer neural network for binary classification.
- Steps:
1. Calculate weighted sum.
2. Apply an activation function.
- Limitations: Can only classify linearly separable data.
2. Backpropagation:
- A method to train multi-layer networks using gradient descent.
- Two stages:
- Forward Pass: Compute outputs using current weights.
- Backward Pass: Adjust weights based on error gradients.
- Iterative process until the error converges to a minimum.
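The forward/backward cycle can be demonstrated on a toy network with one input, one hidden neuron, and one output neuron, all scalar. This is a pedagogical sketch only; the initial weights, learning rate, and iteration count are arbitrary choices for the example:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy network: one input -> one hidden neuron -> one output neuron
x, target = 1.0, 0.0
w1, b1, w2, b2 = 0.5, 0.0, -0.3, 0.0
lr = 0.5

for _ in range(200):
    # Forward pass: compute outputs using current weights
    h = sigmoid(w1 * x + b1)
    y = sigmoid(w2 * h + b2)
    # Backward pass: gradients of the squared error E = (y - target)^2
    # propagated via the chain rule
    delta2 = 2 * (y - target) * y * (1 - y)   # output-layer error term
    delta1 = delta2 * w2 * h * (1 - h)        # hidden-layer error term
    # Gradient-descent weight updates
    w2 -= lr * delta2 * h
    b2 -= lr * delta2
    w1 -= lr * delta1 * x
    b1 -= lr * delta1

# After repeated iterations the output converges toward the target
```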
6. Applications of Neural Networks:
1. Facial Recognition: Using Convolutional Neural Networks (CNNs) for robust surveillance and
authentication.
2. Stock Market Prediction: Multilayer Perceptrons analyze historical stock data for forecasts.
3. Healthcare:
- CNNs for medical image analysis.
- Recurrent Neural Networks (RNNs) for voice and patient data recognition.
- Generative networks for drug discovery.
4. Weather Forecasting: Combines CNNs and RNNs to predict weather patterns.
5. Aerospace and Defense:
- Fault detection, autopilot systems, and underwater mine detection.
6. Social Media: Analyzing user behavior and preferences for targeted content.
7. Signature Verification: Ensuring document authenticity through ANN analysis.
7. Activation and Loss Functions:
- Activation Functions: Add non-linearity to model outputs and allow networks to learn complex
mappings.
- Loss Functions: Define objectives to minimize errors (e.g., MSE, Binary Cross-Entropy, and
Categorical Cross-Entropy).
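Two of the named loss functions, written out from their definitions as a minimal sketch (the small `eps` guards the logarithm against exact 0 or 1 predictions and is a common convention, not part of the definition):

```python
import math

def mse(y_true, y_pred):
    # Mean squared error: average squared deviation between targets and predictions
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    # Binary cross-entropy for predicted probabilities in (0, 1)
    return -sum(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
                for t, p in zip(y_true, y_pred)) / len(y_true)
```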
8. Summary of Key Takeaways:
- Neural networks mimic the brain's functioning to solve problems.
- Deep learning extends traditional ANNs with additional layers and capabilities.
- Training involves defining architectures, updating weights, and minimizing errors.
- Applications are vast, spanning industries like healthcare, defense, and finance.