Start machine learning programming in 5 simple steps
By Renjith M P
https://www.linkedin.com/in/renjith-m-p-bbb67860/
To start with machine learning, we need to follow five basic steps.
Steps
1. Choose a use case / problem statement :- Define your objective.
2. Prepare data to train the system :- For any machine learning project, you first need data to train the system with.
3. Choose a programming language and useful libraries for machine learning :- You need a programming language, and libraries that support it, to implement your solution.
4. Training and prediction implementation :- Implement your solution using the programming language that you have selected.
5. Evaluate the result accuracy :- Validate the results. Based on the accuracy, we either accept the model or fine-tune its parameters and repeat until we get a satisfactory result.
Warning:
Target audience : Basic knowledge of Python (running Python scripts, installing packages, etc.) is required to follow this course.
Let's get into action. We will choose a use case and implement machine learning for it.
1. Choose a use case / problem statement
Use case : Predict the species of an iris flower from the lengths and widths of its sepals and petals.
The three species are Iris setosa, Iris versicolor and Iris virginica.
2. Prepare data to train the system
We will be using the iris flower data set (https://en.wikipedia.org/wiki/Iris_flower_data_set),
which consists of 150 rows. Each row has 5 columns:
1. sepal length
2. sepal width
3. petal length
4. petal width
5. species of iris plant
Out of the 150 rows, 120 rows will be used to train the model and the remaining 30 rows will be used to validate the accuracy of its predictions.
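As a quick check on the shape described above, the sketch below inspects the copy of the iris data that ships with scikit-learn. Using `load_iris` is an assumption made for illustration (the tutorial itself reads a CSV file in step 4.2); it simply avoids a download.

```python
import numpy as np
from sklearn.datasets import load_iris

# Assumption: use scikit-learn's bundled copy of the iris data set
# rather than the CSV file that the tutorial loads in step 4.2.
iris = load_iris()

n_rows, n_features = iris.data.shape                # four measurements per flower
species = [str(s) for s in iris.target_names]       # labels for the fifth column
rows_per_species = np.bincount(iris.target).tolist()

print(n_rows, "rows,", n_features, "measurement columns + 1 species column")
print(iris.feature_names)
print(dict(zip(species, rows_per_species)))
```

The bundled data has the same 150 rows, split evenly across the three species.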
3. Choose a programming language and libraries for machine learning
There are quite a few options available; the best known are R and Python.
My choice is Python. Whereas R is geared mainly towards statistical analysis, Python is a general-purpose language and platform that you can use both for research and development and for building production systems.
Ecosystem & Libraries
Machine learning needs plenty of numeric computation, data mining, algorithms and plotting.
Python offers several ecosystems and libraries for these tasks. One of the most commonly used is the SciPy ecosystem, a collection of open source software for scientific computing in Python that includes many packages.
The packages from the SciPy ecosystem that we are going to use are listed below.
Package : Description
NumPy : The fundamental package for numerical computation. It defines the numerical array and matrix types and basic operations on them.
Matplotlib : A mature and popular plotting package that provides publication-quality 2D plotting as well as rudimentary 3D plotting.
SciPy library : One of the components of the SciPy stack, providing many numerical routines.
pandas : Provides high-performance, easy-to-use data structures.
scikit-learn (sklearn) : Simple and efficient tools for data mining and data analysis. Accessible to everybody and reusable in various contexts. Built on NumPy, SciPy and Matplotlib.
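Before going further, it can help to confirm that these packages are importable. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical and not part of the original tutorial.

```python
import importlib.util

# Import names of the packages listed above
# (scikit-learn is imported under the name "sklearn").
REQUIRED = ["numpy", "matplotlib", "scipy", "pandas", "sklearn"]

def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```

Anything reported as missing can be installed with pip, for example `pip install numpy matplotlib scipy pandas scikit-learn` (note that the pip name is `scikit-learn`, not `sklearn`).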
4. Training, Prediction and validation implementation
4.1. Import libraries (before importing, make sure they are installed using pip/pip3)
import pandas
import matplotlib.pyplot as plt
from sklearn import model_selection
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
4.2. Load data to train the model
# Load dataset
url = "https://raw.githubusercontent.com/renjithmp/machinelearning/master/python/usecases/1_irisflowers/iris.csv"
names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']
dataset = pandas.read_csv(url, names=names)
# Print important information about the dataset
print(dataset.shape)                    # (rows, columns)
print(dataset.head(20))                 # first 20 rows
print(dataset.describe())               # summary statistics per column
print(dataset.groupby('class').size())  # number of rows per species
# Visualize the data
dataset.plot(kind='box', subplots=True, layout=(2, 2), sharex=False, sharey=False)
plt.show()
Explanation :
dataset.plot() function :-
A box plot (a.k.a. box-and-whisker diagram) is a standardized way of displaying the distribution of data based on the five-number summary: minimum, first quartile, median, third quartile, and maximum. In the simplest box plot, the central rectangle spans the first quartile to the third quartile (the interquartile range, or IQR), a segment inside the rectangle shows the median, and "whiskers" above and below the box show the locations of the minimum and maximum. Note that Matplotlib's default whiskers stop at the last data point within 1.5 x IQR of the box and draw anything beyond that as individual outlier points.
In machine learning, it is important to analyse the data from different angles before modelling. Visualizing it with plots makes this much easier than reading the data in tabular form.
For our use case, the call produces four box plots: sepal length, sepal width, petal length and petal width.
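The five-number summary behind a box plot can also be computed directly. The sketch below uses NumPy on a small set of hypothetical measurements (the values and the helper name `five_number_summary` are made up for illustration and are not taken from the iris data).

```python
import numpy as np

# Hypothetical sepal-length measurements in cm, for illustration only.
values = np.array([4.3, 4.9, 5.0, 5.1, 5.8, 6.3, 6.4, 7.0, 7.9])

def five_number_summary(data):
    """Return minimum, first quartile, median, third quartile and maximum."""
    q1, median, q3 = np.percentile(data, [25, 50, 75])
    return {
        "min": float(np.min(data)),
        "q1": float(q1),
        "median": float(median),
        "q3": float(q3),
        "max": float(np.max(data)),
    }

summary = five_number_summary(values)
iqr = summary["q3"] - summary["q1"]   # height of the box in the plot
print(summary)
print("IQR:", iqr)
```

For these nine values the box would span 5.0 to 6.4, with the median line at 5.8.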
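4.3. Split the data for training and validation
Explanation :-
X_train : training inputs (120 rows of sepal and petal lengths and widths)
Y_train : training outputs (the class of plant for each of the 120 rows)
X_validation : validation inputs (30 rows of sepal and petal lengths and widths)
Y_validation : validation outputs (the class of plant for each of the 30 rows)
These four arrays are produced by the train_test_split() call in the code listing that follows, with validation_size=0.20 (20% of 150 rows = 30 rows).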
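The 120/30 split can be verified with a short, self-contained sketch. It assumes scikit-learn's bundled iris data in place of the tutorial's CSV file; the split parameters (20% validation, seed 7) are the ones used in this tutorial.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Assumption: the bundled iris data stands in for the CSV used in the tutorial.
X, Y = load_iris(return_X_y=True)

# Hold back 20% of the 150 rows for validation; the seed makes the split repeatable.
X_train, X_validation, Y_train, Y_validation = train_test_split(
    X, Y, test_size=0.20, random_state=7
)

print(X_train.shape, Y_train.shape)             # training inputs and outputs
print(X_validation.shape, Y_validation.shape)   # validation inputs and outputs
```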
4.4. Train a few models using the training data
Let's use X_train and Y_train to train a few models.
# Step 4.3: separate inputs from outputs and hold back 20% of the rows for validation
array = dataset.values
X = array[:, 0:4]
Y = array[:, 4]
validation_size = 0.20
seed = 7
scoring = 'accuracy'
X_train, X_validation, Y_train, Y_validation = model_selection.train_test_split(X, Y, test_size=validation_size, random_state=seed)
# Step 4.4: candidate models to compare
models = []
models.append(('LR', LogisticRegression()))
models.append(('LDA', LinearDiscriminantAnalysis()))
models.append(('KNN', KNeighborsClassifier()))
models.append(('CART', DecisionTreeClassifier()))
models.append(('NB', GaussianNB()))
models.append(('SVM', SVC()))
Explanations of the algorithms can be found at scikit-learn.org, e.g.
https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html
I am not covering them here, as they need a much deeper explanation. For now, keep in mind that a model is something that can learn by itself from the training data and then predict the output for future inputs.
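To make "learn, then predict" concrete, here is a minimal sketch with one of the models from the list. It assumes scikit-learn's bundled iris data rather than the tutorial's CSV, and the measurements of the new flower are hypothetical.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Assumption: bundled iris data instead of the tutorial's CSV file.
iris = load_iris()
X_train, X_validation, Y_train, Y_validation = train_test_split(
    iris.data, iris.target, test_size=0.20, random_state=7
)

model = KNeighborsClassifier()
model.fit(X_train, Y_train)            # "learn" from the training rows

# Hypothetical new flower: sepal length, sepal width, petal length, petal width (cm).
new_flower = [[5.1, 3.5, 1.4, 0.2]]
predicted_index = model.predict(new_flower)[0]
predicted_species = str(iris.target_names[predicted_index])
print(predicted_species)
```

Every scikit-learn classifier in the list exposes the same `fit` and `predict` methods, which is what allows the tutorial to loop over them.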
Explanation :
KFold :- A utility that divides the data set into folds, optionally shuffling the rows first. Here we divide the training data into 10 equal parts.
cross_val_score :- This is the most important step. We feed the model the training data (X_train as input and Y_train as the corresponding output). For each of the 10 folds, the function trains the model on the other 9 folds and reports its accuracy on the held-out fold.
Take the mean and standard deviation of the 10 fold scores to estimate the accuracy over the entire training set.
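The fold mechanics can be seen in isolation with the sketch below. It again assumes the bundled iris data, and uses KNN only as an example model; `shuffle=True` is set explicitly because recent scikit-learn releases reject a `random_state` on an unshuffled `KFold`.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold, cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Assumption: bundled iris data instead of the tutorial's CSV file.
X, Y = load_iris(return_X_y=True)
X_train, _, Y_train, _ = train_test_split(X, Y, test_size=0.20, random_state=7)

kfold = KFold(n_splits=10, shuffle=True, random_state=7)

# Each of the 10 folds holds out a different 12 of the 120 training rows.
fold_sizes = [(len(train_idx), len(test_idx)) for train_idx, test_idx in kfold.split(X_train)]
print(fold_sizes[0])

# One accuracy score per fold; the mean and spread summarize the model.
scores = cross_val_score(KNeighborsClassifier(), X_train, Y_train, cv=kfold, scoring="accuracy")
print(len(scores), "scores, mean %.3f, std %.3f" % (scores.mean(), scores.std()))
```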
# Score every candidate model with 10-fold cross-validation on the training data
results = []
names = []
for name, model in models:
    # Recent scikit-learn releases require shuffle=True when random_state is set
    kfold = model_selection.KFold(n_splits=10, shuffle=True, random_state=seed)
    cv_results = model_selection.cross_val_score(model, X_train, Y_train, cv=kfold, scoring=scoring)
    results.append(cv_results)
    names.append(name)
    msg = "%s: %f (%f)" % (name, cv_results.mean(), cv_results.std())
    print(msg)
4.5. Choose the model that appears most accurate
As you can see, we have evaluated six different models (six different algorithms) on the training data. In the run reported here, the results (cv_results.mean()) show that KNeighborsClassifier() gives the most accurate results (0.98, or 98%).
4.6. Predict and validate the results using the validation data set
Let's choose KNN and predict the output for the validation data.
knn = KNeighborsClassifier()
knn.fit(X_train, Y_train)
predictions = knn.predict(X_validation)
print(accuracy_score(Y_validation, predictions))
5. Publish results
The accuracy_score() function reports the accuracy of the predictions. In our use case we see an accuracy of 0.90 (90%).
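A single accuracy figure hides which species get confused with each other. The sketch below shows one way to look closer, using `confusion_matrix` and `classification_report`, which the tutorial imports but does not call. It assumes the bundled iris data rather than the CSV, so the exact numbers may differ from those quoted above.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Assumption: bundled iris data instead of the tutorial's CSV file.
iris = load_iris()
X_train, X_validation, Y_train, Y_validation = train_test_split(
    iris.data, iris.target, test_size=0.20, random_state=7
)

knn = KNeighborsClassifier()
knn.fit(X_train, Y_train)
predictions = knn.predict(X_validation)

accuracy = accuracy_score(Y_validation, predictions)
# Rows are the true species, columns are the predicted species.
matrix = confusion_matrix(Y_validation, predictions, labels=[0, 1, 2])

print("accuracy:", accuracy)
print(matrix)
print(classification_report(
    Y_validation, predictions,
    labels=[0, 1, 2], target_names=list(iris.target_names), zero_division=0,
))
```

The diagonal of the matrix counts correct predictions, so accuracy equals the diagonal sum divided by the 30 validation rows.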
You can find the source code here:
https://github.com/renjithmp/machinelearning/blob/master/python/usecases/1_irisflowers/flowerclassprediction.py
Reference
Jason Brownlee article
https://machinelearningmastery.com/machine-learning-in-python-step-by-step/
Scikit-learn
https://scikit-learn.org
