1. INDIAN INSTITUTE OF TECHNOLOGY ROORKEE
Introduction to Neural Networks and Recent Advances
ASHA RANI
2. Perceptron learning algorithm
repeat until convergence (or for some # of iterations):
    for each training example (f1, f2, …, fn, label):
        prediction = b + Σ wi*fi
        if prediction * label ≤ 0: // they don't agree
            for each wi:
                wi = wi + fi*label
            b = b + label
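The pseudocode above can be sketched directly in Python. This is a minimal illustration, assuming labels are encoded as +1/-1 and a fixed number of epochs stands in for "repeat until convergence"; the function and variable names are my own, not from the slides.

```python
def perceptron_train(examples, n_features, epochs=10):
    """examples: list of (features, label) pairs with label in {+1, -1}."""
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):  # "repeat until convergence (or some # of iterations)"
        for features, label in examples:
            # prediction = b + sum_i wi * fi
            prediction = sum(wi * fi for wi, fi in zip(w, features)) + b
            if prediction * label <= 0:  # they don't agree
                w = [wi + fi * label for wi, fi in zip(w, features)]
                b = b + label
    return w, b

# Usage: learn logical AND of two binary features (labels encoded as +1/-1).
data = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), +1)]
w, b = perceptron_train(data, n_features=2)
```

Because AND is linearly separable, the mistake-driven updates settle on a weight vector and bias that classify all four points correctly.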
Why is it called the “perceptron” learning algorithm if
what it learns is a line? Why not “line learning”
algorithm?
4. Our nervous system: the computer science view
The human brain is a large collection of interconnected neurons.
A NEURON is a brain cell:
– they collect, process, and disseminate electrical signals
– they are connected via synapses
– they FIRE depending on the conditions of the neighboring neurons
5. A neuron/perceptron
[Figure: inputs x1, x2, x3, x4 with weights w1, w2, w3, w4 feeding an activation function that produces output y]
How is this a linear classifier (i.e. perceptron)?
6. Hard threshold = linear classifier
[Figure: inputs x1, x2, …, xm with weights w1, w2, …, wm feeding a hard-threshold activation that produces the output]
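A neuron with a hard-threshold activation is exactly a linear classifier: it fires when the weighted sum of its inputs is positive, so its decision boundary is the hyperplane w·x + b = 0. A minimal sketch (function names are my own):

```python
def hard_threshold(z):
    # Hard-threshold activation: the neuron fires iff its weighted input is positive.
    return 1 if z > 0 else 0

def neuron_output(x, w, b):
    # Weighted sum of the inputs, then the activation function.
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return hard_threshold(z)

# With weights (1, 1) and bias -1.5, this neuron computes logical AND:
# its decision boundary is the line x1 + x2 = 1.5 in feature space.
```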
7. Neural Networks
Neural Networks try to mimic the structure and function of our
nervous system
People like biologically motivated approaches
9. The weight w is the strength of the signal sent from node A to node B.
If A fires and w is positive, then A stimulates B.
If A fires and w is negative, then A inhibits B.
[Figure: Node A (perceptron) connected to Node B (perceptron) by an edge with weight w]
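The stimulate/inhibit distinction is just a statement about signs: A's contribution to B's weighted input is A's output times w. A tiny sketch (names hypothetical):

```python
def weighted_input(inputs, weights):
    # B's total incoming signal: each neighbor's output times its connection weight.
    return sum(o * w for o, w in zip(inputs, weights))

# B receives input only from A here; A fires means A's output is 1.
# Positive w: A firing raises B's input (stimulation).
assert weighted_input([1], [0.7]) > weighted_input([0], [0.7])
# Negative w: A firing lowers B's input (inhibition).
assert weighted_input([1], [-0.7]) < weighted_input([0], [-0.7])
```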
44. What does the decision boundary look like?
[Figure: a two-layer network with inputs x1, x2 and output = x1 xor x2; the hidden nodes make linear splits of the feature space, and the output node combines these linear spaces]
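XOR is not linearly separable, so no single perceptron can compute it, but two hidden hard-threshold units can each make one linear split and an output unit can combine them. The weights below are one standard hand-picked solution, given here as an illustration (they are not from the slides): h1 fires on x1 OR x2, h2 fires on x1 AND x2, and the output fires on h1 AND NOT h2.

```python
def step(z):
    # Hard-threshold activation.
    return 1 if z > 0 else 0

def xor_net(x1, x2):
    # Hidden layer: two linear splits of the feature space.
    h1 = step(x1 + x2 - 0.5)   # fires when x1 OR x2
    h2 = step(x1 + x2 - 1.5)   # fires when x1 AND x2
    # Output layer combines the two half-spaces: h1 AND NOT h2.
    return step(h1 - h2 - 0.5)

# xor_net reproduces x1 xor x2 on all four binary inputs.
```

The decision region is the intersection of the two half-spaces: inside the OR region but outside the AND region, which is exactly the XOR band between the two parallel lines.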
53. NN decision boundaries
For decision trees, as the tree gets larger, the model gets more complex.
The same is true for neural networks: more hidden nodes = more complexity.
Adding more layers adds even more complexity (and much more quickly).
Good rule of thumb:
    number of 2-layer hidden nodes ≤ (number of examples) / (number of dimensions)
For example, with 1,000 training examples and 20 feature dimensions, keep the hidden layer to at most 50 nodes.
Editor's Notes
#10: Allows us to model more than just 0/1 outputs.
Differentiable! (this will come into play in a minute)