Name: Monteloyola, Charles Ian S.
Date: 06 / 28 / 2022
CS158 – OL614 | E2-2 | Prof. Mary Jane Samonte
Classification: Decision Tree and Naïve Bayes Model
Weekend (Example)   Weather   Parents   Money   Decision (Category)
W1                  Sunny     Yes       Rich    Cinema
W2                  Sunny     No        Rich    Tennis
W3                  Windy     Yes       Rich    Cinema
W4                  Rainy     Yes       Poor    Cinema
W5                  Rainy     No        Rich    Stay in
W6                  Rainy     Yes       Poor    Cinema
W7                  Windy     No        Poor    Cinema
W8                  Windy     No        Rich    Shopping
W9                  Windy     Yes       Rich    Cinema
W10                 Sunny     No        Rich    Tennis
Table 1. Decision: CATEGORY
a. Create a decision tree.
b. Create a model based on Naïve Bayes.
c. If your student number ends in a prime number, determine the decision for:
Weather: Rainy, Parents: Yes, Money: Rich
If your student number ends in a non-prime number, determine the decision for:
Weather: Sunny, Parents: Yes, Money: Poor
A. Decision Tree
Step 1: The Entropy Equation
Class counts: Cinema = 6, Tennis = 2, Stay in = 1, Shopping = 1
H(Category) = H(6/10, 2/10, 1/10, 1/10)
= -(6/10 log2 6/10) - (2/10 log2 2/10) - (1/10 log2 1/10) - (1/10 log2 1/10)
= 0.442 + 0.464 + 0.332 + 0.332
H(Category) ≈ 1.571
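As a check on the arithmetic, here is a small Python sketch of the entropy computation (the `entropy` helper is an illustrative name; the class counts come from Table 1):

```python
import math

def entropy(counts):
    """Shannon entropy (in bits) of a class distribution given raw counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

# Class counts from Table 1: Cinema = 6, Tennis = 2, Stay in = 1, Shopping = 1
h_category = entropy([6, 2, 1, 1])
print(round(h_category, 3))  # 1.571
```

The `if c > 0` guard skips empty classes, matching the convention 0 · log2 0 = 0 used throughout the hand derivation.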
Step 2: Conditional Entropy
Compare H(Category|Weather), H(Category|Parents), and H(Category|Money).

H(Category|Weather)
= 3/10 · H(1/3, 2/3, 0/3, 0/3) + 4/10 · H(3/4, 0/4, 0/4, 1/4) + 3/10 · H(2/3, 0/3, 1/3, 0/3)
= 3/10 ((-1/3 log2 1/3) - (2/3 log2 2/3)) + 4/10 ((-3/4 log2 3/4) - (1/4 log2 1/4)) + 3/10 ((-2/3 log2 2/3) - (1/3 log2 1/3))
= 3/10 (0.918) + 4/10 (0.811) + 3/10 (0.918)
= 0.275 + 0.325 + 0.275
H(Category|Weather) ≈ 0.88
H(Category|Parents)
= 5/10 · H(5/5, 0/5, 0/5, 0/5) + 5/10 · H(1/5, 2/5, 1/5, 1/5)
= 5/10 ((-5/5 log2 5/5)) + 5/10 ((-1/5 log2 1/5) - (2/5 log2 2/5) - (1/5 log2 1/5) - (1/5 log2 1/5))
= 5/10 (0) + 5/10 (1.922)
= 0 + 0.961
H(Category|Parents) ≈ 0.96
H(Category|Money)
= 7/10 · H(3/7, 2/7, 1/7, 1/7) + 3/10 · H(3/3, 0/3, 0/3, 0/3)
= 7/10 ((-3/7 log2 3/7) - (2/7 log2 2/7) - (1/7 log2 1/7) - (1/7 log2 1/7)) + 3/10 ((-3/3 log2 3/3))
= 7/10 (1.842) + 3/10 (0)
= 1.290 + 0
H(Category|Money) ≈ 1.29
Step 3: Information Gain
H(Category|Weather) ≈ 0.88 | H(Category|Parents) ≈ 0.96 | H(Category|Money) ≈ 1.29
I(Category; Weather) = 1.571 - 0.88 = 0.69
I(Category; Parents) = 1.571 - 0.96 = 0.61
I(Category; Money) = 1.571 - 1.29 = 0.28
Max(0.69, 0.61, 0.28) = 0.69, so Weather gives the largest information gain and becomes the root of the tree.
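The whole attribute-selection step can be verified with a short Python sketch over the Table 1 rows (`info_gain` is an illustrative helper name; the exact gains differ slightly from the hand-rounded values because no intermediate rounding occurs):

```python
import math
from collections import Counter

# Table 1 rows: (Weather, Parents, Money, Decision)
DATA = [
    ("Sunny", "Yes", "Rich", "Cinema"), ("Sunny", "No", "Rich", "Tennis"),
    ("Windy", "Yes", "Rich", "Cinema"), ("Rainy", "Yes", "Poor", "Cinema"),
    ("Rainy", "No", "Rich", "Stay in"), ("Rainy", "Yes", "Poor", "Cinema"),
    ("Windy", "No", "Poor", "Cinema"), ("Windy", "No", "Rich", "Shopping"),
    ("Windy", "Yes", "Rich", "Cinema"), ("Sunny", "No", "Rich", "Tennis"),
]

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, attr):
    """I(Category; attr) = H(Category) - H(Category | attr)."""
    cond = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r[3] for r in rows if r[attr] == value]
        cond += len(subset) / len(rows) * entropy(subset)
    return entropy([r[3] for r in rows]) - cond

for name, idx in [("Weather", 0), ("Parents", 1), ("Money", 2)]:
    print(name, round(info_gain(DATA, idx), 3))
# Weather 0.695, Parents 0.61, Money 0.281 -> Weather is chosen as the root
```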
Step 4: Building the Decision Tree
Sunny branch: H(1/3, 2/3, 0/3, 0/3) = 0.918

H(Category|Parents) for the Sunny branch
= 1/3 · H(1/1, 0/1, 0/1, 0/1) + 2/3 · H(0/2, 2/2, 0/2, 0/2)
= 1/3 ((-1/1 log2 1/1)) + 2/3 ((-2/2 log2 2/2))
= 1/3 (0) + 2/3 (0) = 0
I(Category; Parents) = 0.918 - 0 = 0.918

H(Category|Money) for the Sunny branch (all three Sunny examples are Rich)
= 3/3 · H(1/3, 2/3, 0/3, 0/3) + 0/3 · H(0, 0, 0, 0)
= 3/3 ((-1/3 log2 1/3) - (2/3 log2 2/3)) = 0.918
I(Category; Money) = 0.918 - 0.918 = 0

Max(0.918, 0) = 0.918, so Parents is chosen for the Sunny branch (Yes → Cinema; No → Tennis).
Windy branch: H(3/4, 0/4, 0/4, 1/4) = 0.811

H(Category|Parents) for the Windy branch
= 2/4 · H(2/2, 0/2, 0/2, 0/2) + 2/4 · H(1/2, 0/2, 0/2, 1/2)
= 2/4 ((-2/2 log2 2/2)) + 2/4 ((-1/2 log2 1/2) - (1/2 log2 1/2))
= 2/4 (0) + 2/4 (1) = 0.5
I(Category; Parents) = 0.811 - 0.5 = 0.311

H(Category|Money) for the Windy branch
= 3/4 · H(2/3, 0/3, 0/3, 1/3) + 1/4 · H(1/1, 0/1, 0/1, 0/1)
= 3/4 ((-2/3 log2 2/3) - (1/3 log2 1/3)) + 1/4 ((-1/1 log2 1/1))
= 3/4 (0.918) + 1/4 (0) = 0.689
I(Category; Money) = 0.811 - 0.689 = 0.122

Max(0.311, 0.122) = 0.311, so Parents is chosen for the Windy branch; its remaining No child (one Cinema, one Shopping) is then split on Money (Rich → Shopping, Poor → Cinema).
Rainy branch: H(2/3, 0/3, 1/3, 0/3) = 0.918. Parents separates this branch perfectly (Yes → Cinema, Cinema; No → Stay in), so H(Category|Parents) = 0 and I(Category; Parents) = 0.918; Parents is chosen again.

Final Decision Tree Result:
Weather?
- Sunny: Parents? Yes → Cinema | No → Tennis
- Windy: Parents? Yes → Cinema | No → Money? Rich → Shopping | Poor → Cinema
- Rainy: Parents? Yes → Cinema | No → Stay in
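A minimal Python sketch encodes the resulting tree as a nested dict and walks it to classify an example (the `TREE` layout and `predict` helper are illustrative, not part of the assignment; the Rainy branch is assumed to split on Parents, following the same pattern as the other branches):

```python
# Nested-dict encoding of the decision tree: internal nodes hold the
# attribute to test; leaves are plain decision strings.
TREE = {"attr": "Weather", "branches": {
    "Sunny": {"attr": "Parents", "branches": {"Yes": "Cinema", "No": "Tennis"}},
    "Windy": {"attr": "Parents", "branches": {
        "Yes": "Cinema",
        "No": {"attr": "Money", "branches": {"Rich": "Shopping", "Poor": "Cinema"}},
    }},
    "Rainy": {"attr": "Parents", "branches": {"Yes": "Cinema", "No": "Stay in"}},
}}

def predict(node, example):
    """Follow attribute tests until a leaf (a plain string) is reached."""
    while isinstance(node, dict):
        node = node["branches"][example[node["attr"]]]
    return node

# Part C query (student number ending in a non-prime number)
print(predict(TREE, {"Weather": "Sunny", "Parents": "Yes", "Money": "Poor"}))  # Cinema
```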
B. Bayesian Model & C. Decision Determination
My student number is 2021105140, which ends in a non-prime number (0).
Therefore, the given data is:
Weather = Sunny
Parents = Yes
Money = Poor
Step 1: Compute the Prior Probability of Each Class
P(C1) = P (Decision = Cinema) = 6/10 = 0.6
P (C2) = P (Decision = Tennis) = 2/10 = 0.2
P (C3) = P (Decision = Stay In) = 1/10 = 0.1
P (C4) = P (Decision = Shopping) = 1/10 = 0.1
Step 2: Compute the Conditional Probability of Each Attribute Value Given Each Class
P(Weather = Sunny | Decision = Cinema) = 1/6 = 0.167
P(Weather = Sunny | Decision = Tennis) = 2/2 = 1
P(Weather = Sunny | Decision = Stay In) = 0/1 = 0
P(Weather = Sunny | Decision = Shopping) = 0/1 = 0
P(Parents = Yes | Decision = Cinema) = 5/6 = 0.833
P(Parents = Yes | Decision = Tennis) = 0/2 = 0
P(Parents = Yes | Decision = Stay In) = 0/1 = 0
P(Parents = Yes | Decision = Shopping) = 0/1 = 0
P(Parents = No | Decision = Cinema) = 1/6 = 0.167
P(Parents = No | Decision = Tennis) = 2/2 = 1
P(Parents = No | Decision = Stay In) = 1/1 = 1
P(Parents = No | Decision = Shopping) = 1/1 = 1
(The Parents = No probabilities are not needed for this query, since the given data has Parents = Yes.)
P(Money = Poor | Decision = Cinema) = 3/6 = 0.5
P(Money = Poor | Decision = Tennis) = 0/2 = 0
P(Money = Poor | Decision = Stay In) = 0/1 = 0
P(Money = Poor | Decision = Shopping) = 0/1 = 0
Step 3: Compute P(X | Ci) for Each Class
P(X | Decision = Cinema) = P(Sunny | Cinema) × P(Yes | Cinema) × P(Poor | Cinema) = 0.167 × 0.833 × 0.5 = 0.070
P(X | Decision = Tennis) = 1 × 0 × 0 = 0
P(X | Decision = Stay In) = 0 × 0 × 0 = 0
P(X | Decision = Shopping) = 0 × 0 × 0 = 0
Step 4: Find the Class Ci that Maximizes P(X | Ci) × P(Ci)
P(X | Decision = Cinema) × P(Decision = Cinema) = 0.070 × 0.6 = 0.042
All other classes score 0, so Cinema gives the maximum.
Prediction: the decision is Cinema.
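The entire Naïve Bayes calculation can be reproduced with a short Python sketch over the Table 1 rows (`nb_score` is an illustrative helper name; like the hand computation, it uses raw counts with no Laplace smoothing):

```python
# Table 1 rows: (Weather, Parents, Money, Decision)
DATA = [
    ("Sunny", "Yes", "Rich", "Cinema"), ("Sunny", "No", "Rich", "Tennis"),
    ("Windy", "Yes", "Rich", "Cinema"), ("Rainy", "Yes", "Poor", "Cinema"),
    ("Rainy", "No", "Rich", "Stay in"), ("Rainy", "Yes", "Poor", "Cinema"),
    ("Windy", "No", "Poor", "Cinema"), ("Windy", "No", "Rich", "Shopping"),
    ("Windy", "Yes", "Rich", "Cinema"), ("Sunny", "No", "Rich", "Tennis"),
]
CLASSES = ["Cinema", "Tennis", "Stay in", "Shopping"]

def nb_score(query, target):
    """P(Ci) * product of P(x_j | Ci), estimated by counting (no smoothing)."""
    in_class = [r for r in DATA if r[3] == target]
    score = len(in_class) / len(DATA)  # prior P(Ci)
    for idx, value in query.items():
        score *= sum(1 for r in in_class if r[idx] == value) / len(in_class)
    return score

# Query: Weather = Sunny, Parents = Yes, Money = Poor
query = {0: "Sunny", 1: "Yes", 2: "Poor"}
scores = {c: nb_score(query, c) for c in CLASSES}
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))  # Cinema 0.042
```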