Neural Networks & Deep Learning
CSE & IT Department
ECE School
Shiraz University
ONLY GOD
Overview of NN Classifier
2021
Pattern Classification

• Pattern classification is one type of pattern recognition
• Each input vector $\vec{x}$ either belongs or does not belong to a particular class:

$\vec{x} \in \text{class} \;\Rightarrow\; \vec{x} \in \text{Class 1}, \qquad \vec{x} \notin \text{class} \;\Rightarrow\; \vec{x} \in \text{Class 2}$

• Set of training data: $\{\langle \vec{s}^{\,p};\, \vec{t}^{\,p} \rangle,\ p = 1, \dots, P\}$
• Data representation: binary $\{1, 0\}$ or bipolar $\{1, -1\}$, e.g.

$\langle \vec{s}^{\,1};\, \vec{t}^{\,1} \rangle = \langle -1, -1, 1, 1;\ 1 \rangle$
$\langle \vec{s}^{\,2};\, \vec{t}^{\,2} \rangle = \langle 1, -1, 1, -1;\ -1 \rangle$

• In bipolar representation: $\vec{x}^{T} \vec{x} = n$, where $n$ is the dimension of $\vec{x}$
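As a quick illustration (a minimal sketch, not from the slides), the bipolar self-product property can be checked in NumPy:

```python
import numpy as np

# Bipolar training vector from the slide: s1 = (-1, -1, 1, 1)
s1 = np.array([-1, -1, 1, 1])

# Every bipolar component squares to 1, so x.T @ x equals
# the dimension n of the vector.
assert s1 @ s1 == s1.size  # 4 == 4
```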
Ex. of 2-class Patterns

• Classifying airplanes given their masses and speeds
• Construct an NN that classifies any airplane as a bomber or a fighter
Mass  Speed  Class
1.0   0.1    Bomber
2.0   0.2    Bomber
0.1   0.3    Fighter
2.0   0.3    Bomber
0.2   0.4    Fighter
3.0   0.4    Bomber
0.1   0.5    Fighter
1.5   0.5    Bomber
0.5   0.6    Fighter
1.6   0.7    Fighter
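To make the example concrete, the table can be encoded as arrays (a sketch; the bipolar labels fighter = +1, bomber = −1 are an assumed convention, matching the bipolar activation used on the later slides):

```python
import numpy as np

# (mass, speed) feature vectors, one row per airplane in the table
X = np.array([[1.0, 0.1], [2.0, 0.2], [0.1, 0.3], [2.0, 0.3], [0.2, 0.4],
              [3.0, 0.4], [0.1, 0.5], [1.5, 0.5], [0.5, 0.6], [1.6, 0.7]])

# Bipolar targets (assumed: fighter = +1, bomber = -1)
t = np.array([-1, -1, 1, -1, 1, -1, 1, -1, 1, 1])
```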
NN Classifier for 2-class Example

• Two inputs: mass and speed
• One option: one output neuron for each class
  • Activation 1: yes
  • Activation 0: no
• Simpler option: just one output neuron
  • Activation 1: fighter
  • Activation 0: bomber
• Try the simplest network: a single-layer net
• Replace the threshold (θ) by a bias (b)
2-class NN Classifier for 2D Data

• Using a single-layer NN with one output neuron for two classes

$y_{in} = b + \sum_{i=1}^{n} x_i w_i = b + \vec{w}^{T} \vec{x}$

$y = f(y_{in}) = \begin{cases} 1 & \text{if } y_{in} \ge 0 \\ -1 & \text{if } y_{in} < 0 \end{cases}$

Decision boundary: $b + \sum_{i=1}^{n} x_i w_i = 0$
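A minimal sketch of this neuron in code (the weight and bias values are placeholders, not from the slides):

```python
import numpy as np

def classify(x, w, b):
    """Single-layer neuron with bipolar step activation."""
    y_in = b + w @ x            # net input: b + sum_i x_i * w_i
    return 1 if y_in >= 0 else -1

# Hypothetical weights; any (w, b) defines some linear decision boundary
w = np.array([-1.0, 2.0])
b = 0.5
print(classify(np.array([1.0, 0.1]), w, b))   # y_in = -0.3, so output -1
```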
Bias in NN Classifier

The role of bias:

$y_{in} = x_1 w_1 + x_2 w_2 + b$

$y = f(y_{in}) = \begin{cases} 1 & \text{if } x_1 w_1 + x_2 w_2 + b \ge 0 \\ -1 & \text{if } x_1 w_1 + x_2 w_2 + b < 0 \end{cases}$

Decision line: $x_1 w_1 + x_2 w_2 + b = 0 \;\Rightarrow\; x_2 = -\dfrac{w_1}{w_2} x_1 - \dfrac{b}{w_2}$
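The slope and intercept of the decision line follow directly from the weights; a small sketch (the weight values are again placeholders):

```python
# Rearranging x1*w1 + x2*w2 + b = 0 gives x2 = -(w1/w2)*x1 - b/w2
w1, w2, b = 1.0, 1.0, -1.0   # hypothetical values
slope = -w1 / w2             # -1.0
intercept = -b / w2          #  1.0  ->  decision line: x2 = -x1 + 1
```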
Linear Separability and Decision Hyper-planes

• If, for a classification problem, there exist weights such that all positive training patterns lie on one side of the decision boundary and all negative patterns lie on the other side, then the problem is linearly separable
• For two inputs, the decision boundary is a 1D straight line in the 2D input space
• For n inputs, the decision boundary is an (n − 1)-D hyper-plane in the n-D input space
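One way to test a candidate set of weights against this definition (a sketch, reusing the bipolar target convention from the earlier slides):

```python
import numpy as np

def separates(X, t, w, b):
    """True if (w, b) puts every pattern on the side matching its bipolar target."""
    y = np.where(X @ w + b >= 0, 1, -1)
    return np.array_equal(y, t)
```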
Decision Boundary for AND & OR

• For simple logic-gate problems, the decision boundaries between classes are linear
• Decision boundary: $x_1 w_1 + x_2 w_2 - \theta = 0$
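For instance (a sketch with one common choice of weights, by no means the only one), single neurons realize AND and OR on bipolar inputs:

```python
step = lambda y_in: 1 if y_in >= 0 else -1

for x1, x2 in [(-1, -1), (-1, 1), (1, -1), (1, 1)]:
    and_out = step(x1 * 1 + x2 * 1 - 1)   # w1 = w2 = 1, theta = 1  -> AND
    or_out  = step(x1 * 1 + x2 * 1 + 1)   # w1 = w2 = 1, theta = -1 -> OR
    print(x1, x2, and_out, or_out)
```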
Decision Boundary for XOR

• XOR is not linearly separable: no single straight line separates its positive patterns from its negative ones
• There are two obvious remedies:
  • Either change the activation function so that it yields more than one decision boundary
  • Or use a more complex (multi-layer) network that is able to generate more complex decision boundaries
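A sketch of the second remedy (the weights are one hand-picked solution, assuming bipolar encoding): a hidden layer computes AND and OR, and the output neuron combines them as "OR but not AND", which is exactly XOR.

```python
step = lambda y_in: 1 if y_in >= 0 else -1

def xor(x1, x2):
    h1 = step(x1 + x2 - 1)        # hidden neuron 1: AND
    h2 = step(x1 + x2 + 1)        # hidden neuron 2: OR
    return step(-h1 + h2 - 1)     # output: OR and not AND -> XOR

for x in [(-1, -1), (-1, 1), (1, -1), (1, 1)]:
    print(x, xor(*x))              # -1, 1, 1, -1
```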
Characteristics of NN Classifier

Decision boundary is not unique
• If a problem is linearly separable, there are many different decision boundaries separating the positive patterns from the negative ones
Characteristics of NN Classifier

The weights and bias are not unique
• For each decision boundary, there are many choices of $w_i$ and $b$ that give exactly the same boundary

Training data (bipolar AND):

s1   s2   t
 1    1    1
 1   -1   -1
-1    1   -1
-1   -1   -1

Decision line: $x_2 = -x_1 + 1$

Comparing with $x_2 = -\dfrac{w_1}{w_2} x_1 - \dfrac{b}{w_2}$ gives $w_1 = w_2 = -b$, e.g. $w_1 = w_2 = 1,\ b = -1$.
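In particular, any positive rescaling of $(w_1, w_2, b)$ leaves the boundary and every classification unchanged; a quick check (sketch):

```python
import numpy as np

X = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]])
label = lambda w, b: np.where(X @ w + b >= 0, 1, -1)

w, b = np.array([1.0, 1.0]), -1.0
# Scaling by any c > 0 yields the identical decision boundary and outputs
assert np.array_equal(label(w, b), label(3.0 * w, 3.0 * b))
```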
General Decision Boundaries

In general, we want NNs:
• To deal with input patterns that are not binary
• To form complex decision boundaries
• To classify inputs into many classes
• Also, to produce outputs for input patterns that the network was not originally set up to classify
  • Shown with question marks in the figure
  • The classes assigned to them may be incorrect
Memorization and Generalization

Two important aspects of a network's operation:
• Memorization:
  • The network must learn decision surfaces from a set of training patterns so that these training patterns are classified correctly (are memorized)
• Generalization:
  • The network must also be able to correctly classify test patterns (sufficiently similar to the training patterns) that it has never seen before (to generalize)
• A good NN memorizes well and also generalizes well
Memorization and Generalization

• Sometimes training data may contain errors:
  • Noise in the experimental determination of input values
  • Incorrect classifications
• In this case, learning the training data perfectly may make generalization worse
• There is an important trade-off between memorization and generalization that arises quite generally
Generalization in Classification

• An NN is to learn a classification decision boundary
• The aim is to generalize such that it can classify new inputs appropriately
• If the training data contains noise, it is not necessary for the whole training set to be classified accurately, since doing so is likely to reduce generalization ability
Generalization in Function Approximation

• An NN is to recover a function for which only noisy data samples exist
• The NN is expected:
  • To give a better representation of the underlying function if its output curve does not pass through all the data points
  • To allow a larger error on the training data, as this is likely to lead to better generalization
Pattern Representation in Classification

Binary vs. bipolar representation:
In a simple net, the form of data representation may turn a solvable problem into an unsolvable one.

Binary: $\begin{cases} 1 & \text{positive} \\ 0 & \text{negative} \end{cases}$ \qquad Bipolar: $\begin{cases} +1 & \text{positive} \\ 0 & \text{missing} \\ -1 & \text{negative} \end{cases}$

• Binary representation is not as good as bipolar for generalization
• With bipolar input, missing data can be distinguished from mistaken data
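A small sketch of the two encodings and the missing-value convention:

```python
import numpy as np

def binary_to_bipolar(x):
    """Map binary {1, 0} to bipolar {1, -1}."""
    return 2 * np.asarray(x) - 1

x_binary  = np.array([1, 0, 1, 0])
x_bipolar = binary_to_bipolar(x_binary)     # [ 1, -1,  1, -1]

# In bipolar form, 0 is free to mark a missing component,
# which binary coding cannot distinguish from a negative value
x_missing = np.array([1, 0, -1, -1])        # second component unknown
```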
