LinearRegressionProject/
├── data/                            # Store dataset files here
│
├── models/
│   ├── linear_regression.py         # OLS, Batch GD, SGD, Mini-Batch GD
│   ├── regularization.py            # Ridge (L2), Lasso (L1)
│   ├── scaling.py                   # Standardization, Normalization
│   └── encoding.py                  # One-Hot and Label Encoding
│
├── metrics/
│   └── evaluation.py                # Evaluation metrics: R², MSE
│
├── notebook/
│   └── LinearRegressionDemo.ipynb   # Jupyter Notebook for step-by-step usage
│
├── main.py                          # Script to train/test model and run pipeline
└── README.md                        # Project overview, instructions, and structure

This project is a complete implementation of Linear Regression from scratch using NumPy, designed to be clean, modular, and educational. It covers everything from preprocessing and encoding to regularization and different gradient descent techniques, all without using any machine learning libraries like scikit-learn for the model itself.

🚀 Optimization Methods
- OLS (Ordinary Least Squares)
- Batch Gradient Descent
- Stochastic Gradient Descent
- Mini-batch Gradient Descent (see the sketch below)
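
To make the list concrete, here is a minimal NumPy sketch of OLS and batch gradient descent. The function names and signatures are illustrative, not the actual API of `linear_regression.py`:

```python
import numpy as np

def ols_fit(X, y):
    """Closed-form OLS: least squares on a bias-augmented design matrix."""
    X_b = np.c_[np.ones(len(X)), X]          # prepend a column of ones for the intercept
    theta, *_ = np.linalg.lstsq(X_b, y, rcond=None)
    return theta                             # theta[0] is the intercept

def batch_gd_fit(X, y, lr=0.01, n_iters=1000):
    """Batch gradient descent on the MSE loss, using the full dataset per update."""
    n_samples, n_features = X.shape
    w, b = np.zeros(n_features), 0.0
    for _ in range(n_iters):
        error = X @ w + b - y                        # residuals over all samples
        w -= lr * (2 / n_samples) * (X.T @ error)    # dMSE/dw
        b -= lr * (2 / n_samples) * error.sum()      # dMSE/db
    return w, b
```

SGD and mini-batch GD use the same update rule but estimate the gradient from a single sample or a small random batch per step, trading gradient accuracy for cheaper, more frequent updates.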

🧩 Regularization
- L1 Regularization (Lasso)
- L2 Regularization (Ridge), both sketched below
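
Conceptually, both penalties just add a term to the loss and its gradient. A sketch, assuming a precomputed MSE weight gradient like `grad_w` above (the helper name is hypothetical):

```python
import numpy as np

def penalized_gradient(w, grad_w, alpha, penalty="l2"):
    """Add a regularization term to an existing MSE weight gradient."""
    if penalty == "l2":                     # Ridge: alpha * ||w||^2 adds 2 * alpha * w
        return grad_w + 2 * alpha * w
    if penalty == "l1":                     # Lasso: alpha * ||w||_1 adds alpha * sign(w)
        return grad_w + alpha * np.sign(w)
    return grad_w                           # no penalty
```

The `sign(w)` term is what pushes Lasso weights toward exactly zero (sparse models), while the Ridge term only shrinks weights proportionally.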

🧠 Preprocessing
- Standardization
- Normalization (see the sketch below)
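
Both scalers are a few lines of NumPy. A sketch that ignores constant-feature edge cases (division by zero when a feature never varies):

```python
import numpy as np

def standardize(X):
    """Z-score scaling: zero mean, unit variance per feature."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def normalize(X):
    """Min-max scaling: map each feature into the [0, 1] range."""
    x_min, x_max = X.min(axis=0), X.max(axis=0)
    return (X - x_min) / (x_max - x_min)
```

Scaling matters most for the gradient descent variants: features on very different scales make the loss surface poorly conditioned and slow convergence.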

🏷️ Encoding
- One-Hot Encoding
- Label Encoding (see the sketch below)
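
A minimal sketch of both encoders (function names are illustrative, not necessarily those in `encoding.py`):

```python
import numpy as np

def label_encode(values):
    """Map each unique category to an integer index."""
    mapping = {c: i for i, c in enumerate(sorted(set(values)))}
    return np.array([mapping[v] for v in values]), mapping

def one_hot_encode(values):
    """One binary column per category, built on top of label encoding."""
    labels, mapping = label_encode(values)
    one_hot = np.zeros((len(labels), len(mapping)))
    one_hot[np.arange(len(labels)), labels] = 1.0
    return one_hot, mapping
```

One-hot encoding avoids the spurious ordering that integer labels impose, which matters for a linear model that treats its inputs as numeric quantities.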

📊 Evaluation Metrics
- R² Score
- Mean Squared Error (MSE), both sketched below
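
For reference, both metrics in NumPy (a sketch; `evaluation.py` may differ in naming and details):

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error: the average squared residual."""
    return np.mean((y_true - y_pred) ** 2)

def r2_score(y_true, y_pred):
    """R² = 1 - SS_res / SS_tot; 1.0 is a perfect fit."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1 - ss_res / ss_tot
```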

This project is built for learning and experimentation. Instead of relying on external ML libraries, every step, from optimization to regularization, is written from scratch using only NumPy. It's a great resource if you're learning:
- The math behind regression
- How regularization affects optimization
- How preprocessing affects learning
- How to structure ML code modularly