ANN 3 - Perceptron
TRANSCRIPT
-
Neural Networks Lecture 3
Chapter 01
Rosenblatt's Perceptron
Dr. Hala Mousher Ebied, Scientific Computing Department
Faculty of Computers & Information Sciences, Ain Shams University
-
Agenda

Single-layer feedforward networks and classification problems (McCulloch-Pitts neuron model)
Decision surface
Rosenblatt's perceptron learning rule (the perceptron convergence theorem)
Illustrative Example 1; Illustrative Example 2 (AND)
-
Introduction

1943: McCulloch and Pitts proposed the McCulloch-Pitts neuron model.
1949: Hebb published his book The Organization of Behavior, in which the Hebbian learning rule was proposed.
1958: Rosenblatt introduced the simple single-layer networks now called perceptrons.
1969: Minsky and Papert's book Perceptrons demonstrated the limitation of single-layer perceptrons, and almost the whole field went into hibernation.
1982: Hopfield published a series of papers on Hopfield networks.
1982: Kohonen developed the Self-Organising Maps that now bear his name.
1986: The back-propagation learning algorithm for multi-layer perceptrons was rediscovered and the whole field took off again.
1990s: The sub-field of Radial Basis Function Networks was developed.
2000s: The power of ensembles of neural networks and Support Vector Machines becomes apparent.
-
going back to
The McCulloch-Pitts neuron model (1943)
Threshold Logic Units (TLU)

v_k = w_k1 x1 + w_k2 x2
y_k = φ(v_k) = 1 if v_k ≥ θ; 0 if v_k < θ
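To make the TLU concrete, here is a minimal Python sketch (the function name and the particular weight/threshold values are our own illustrative choices, not from the slides); it reproduces the AND computation worked out on a later slide.

```python
# A minimal threshold logic unit: outputs 1 when the weighted sum of the
# inputs reaches the threshold theta, otherwise 0.
def tlu(x1, x2, w1, w2, theta):
    v = w1 * x1 + w2 * x2
    return 1 if v >= theta else 0

# AND with weights w1 = w2 = 1 and threshold theta = 1.5:
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", tlu(x1, x2, 1, 1, 1.5))
```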
-
going back to
Classification

The goal of pattern classification is to assign a physical object or event to one of a set of pre-specified classes (or categories).

Examples:
Boolean functions
Pixel patterns
-
The AND problem
x1 x2 y
1 1 1
1 0 0
0 1 0
0 0 0
Let w11 = w12 = 1 and try the threshold θ = 1.5:

with x1=0 and x2=0: v = 0 < 1.5, so y = 0
with x1=1 and x2=0: v = 1(1) + 1(0) = 1 < 1.5, so y = 0
with x1=0 and x2=1: v = 1(0) + 1(1) = 1 < 1.5, so y = 0
with x1=1 and x2=1: v = 1(1) + 1(1) = 2 ≥ 1.5, so y = 1

So θ = 1.5 realizes AND.
-
The OR problem
x1 x2 y
1 1 1
1 0 1
0 1 1
0 0 0

Let w11 = w12 = 1 and try the threshold θ = 0.5:

with x1=0 and x2=0: v = 0 < 0.5, so y = 0
with x1=1 and x2=0: v = 1(1) + 1(0) = 1 ≥ 0.5, so y = 1
with x1=0 and x2=1: v = 1(0) + 1(1) = 1 ≥ 0.5, so y = 1
with x1=1 and x2=1: v = 1(1) + 1(1) = 2 ≥ 0.5, so y = 1

So θ = 0.5 realizes OR.
-
The XOR problem
x1 x2 y
1 1 0
1 0 1
0 1 1
0 0 0
Let w11 = w12 = 1, θ = ???

No threshold works: the cases (1,0) and (0,1) require θ ≤ 1, while (1,1) requires θ > 2, a contradiction. (A brute-force check is sketched below.)
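The impossibility can also be checked mechanically. A small sketch (our own, not from the slides) brute-forces weight/threshold candidates over a grid and finds none that reproduces XOR:

```python
import itertools

XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

def tlu(x1, x2, w1, w2, theta):
    return 1 if w1 * x1 + w2 * x2 >= theta else 0

# Search a grid of candidate weights and thresholds.
grid = [i / 2 for i in range(-4, 5)]   # -2.0, -1.5, ..., 2.0
hits = [(w1, w2, th) for w1, w2, th in itertools.product(grid, repeat=3)
        if all(tlu(a, b, w1, w2, th) == y for (a, b), y in XOR.items())]
print(hits)   # [] -- no single TLU on this grid realizes XOR
```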
-
The graph for AND, OR, and XOR problems

[Figure: the four input points of each problem plotted in the (x1, x2) plane, labeled Y=0 and Y=1.]
-
What's Next

Single-layer feedforward networks and classification problems (McCulloch-Pitts neuron model)
Decision surface
Rosenblatt's perceptron learning rule (the perceptron convergence theorem)
Illustrative Example 1; Illustrative Example 2 (AND)
-
Decision Surface (hyperplane)

For m input variables x1, x2, ..., xm, the decision surface is the surface at which the output of the node is precisely equal to the threshold (bias), defined by

Σ_{i=1}^{m} w_i x_i − θ = 0

On one side of this surface the output of the decision unit, y, will be 0, and on the other side it will be 1.
-
Linearly separable

A function f: {0,1}^n → {0,1} is linearly separable if the space of input vectors yielding 1 can be separated from those yielding 0 by a linear surface (hyperplane) in n dimensions.
Example: in the two-dimensional case, a hyperplane is a straight line.

[Figure: a linearly separable point set next to a non-linearly separable one.]
-
Decision Surfaces in 2-D

Two classification regions are separated by the decision boundary line L:

w1 x1 + w2 x2 − θ = 0

This line is shifted away from the origin according to the bias θ. If the bias is 0, the decision surface passes through the origin.
This line is perpendicular to the weight vector W = [w1, w2]^T.
Adding a bias allows the neuron to solve problems where the two sets of input vectors are not located on different sides of the origin.
-
Decision Surfaces in 2-D, cont.

In 2-D, the surface is:

w1 x1 + w2 x2 − θ = 0

which we can write as:

x2 = −(w1/w2) x1 + θ/w2

which is the equation of a line with gradient −w1/w2 and intercept θ/w2.
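As a quick sketch (our own helper, not from the slides), the algebra above translates directly into code:

```python
def boundary_line(w1, w2, theta):
    """Gradient and x2-intercept of w1*x1 + w2*x2 - theta = 0 (w2 != 0)."""
    return -w1 / w2, theta / w2

# First example on the next slide: w1 = 1, w2 = 2, boundary x1 + 2*x2 - 2 = 0
print(boundary_line(1, 2, 2))   # (-0.5, 1.0)
```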
-
Examples of Decision Surfaces in 2-D:

Let w1 = 1, w2 = 2, bias = −2.
The decision boundary: x1 + 2x2 − 2 = 0

Let w1 = −2, w2 = 1, bias = −2.
The decision boundary: −2x1 + x2 − 2 = 0

[Figure: two plots over the (x1, x2) plane (axes −3 ... 3), each showing the boundary line separating the Class 1 region from the Class 0 region.]

Depending on the values of the weights, the decision boundary line will separate the possible inputs into two categories.
-
Example

Slope = (6 − 0)/(0 − 6) = −1
Intercept at the x2-axis = 6
Then the hyperplane (straight line) is x2 = −x1 + 6, which is: x1 + x2 − 6 = 0
-
Linear Separability for n Dimensions

So by varying the weights and the threshold, we can realize any linear separation of the input space into a region that yields output 1 and another region that yields output 0.
As we have seen, a two-dimensional input space can be divided by any straight line.
A three-dimensional input space can be divided by any two-dimensional plane.
In general, an n-dimensional input space can be divided by an (n−1)-dimensional plane, or hyperplane.
Of course, for n > 3 this is hard to visualize.
-
What's Next
Rosenblatt's Perceptron Learning Rule
-
The Perceptron
by Frank Rosenblatt (1958, 1962)

It is the simplest form of a neural network.
It is a single-layer network whose synaptic weights are adjustable.
It is limited to performing pattern classification with only two classes.
The two classes must be linearly separable; the patterns must lie on opposite sides of a hyperplane.
-
The Perceptron Architecture

A perceptron is a nonlinear neuron which uses the hard-limit transfer function (signum activation function) and returns +1 or −1:

v = Σ_{j=1}^{m} w_j x_j + b = Σ_{j=0}^{m} w_j x_j = w^T x

y_k = φ(v) = sgn(v) = +1 if v ≥ 0; −1 if v < 0

Weights are adapted using an error-correction rule; the training technique used is called the perceptron learning rule.
-
The Perceptron Convergence Algorithm

The learning algorithm finds a weight vector w such that:

w^T x > 0 for every input vector x belonging to class C1
w^T x ≤ 0 for every input vector x belonging to class C2

The training problem is then to find a weight vector w such that the previous two inequalities are satisfied. This is achieved by updating the weights as follows:

w(n+1) = w(n) + η(n) x(n)  if w^T(n) x(n) ≤ 0 and x(n) belongs to C1
w(n+1) = w(n) − η(n) x(n)  if w^T(n) x(n) > 0 and x(n) belongs to C2
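A minimal sketch of this two-case update (the function name and the use of plain Python lists are our own assumptions):

```python
def perceptron_update(w, x, in_class1, eta=1.0):
    """One update step: nudge w only when x falls on the wrong side."""
    wtx = sum(wi * xi for wi, xi in zip(w, x))
    if wtx <= 0 and in_class1:        # x in C1 but classified as C2
        return [wi + eta * xi for wi, xi in zip(w, x)]
    if wtx > 0 and not in_class1:     # x in C2 but classified as C1
        return [wi - eta * xi for wi, xi in zip(w, x)]
    return w                          # correctly classified: no change
```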
-
The Perceptron Convergence Theorem
Rosenblatt (1958, 1962)

If a problem is linearly separable, then the perceptron will converge after some n iterations, i.e. the perceptron will learn in a finite number of steps.
Rosenblatt designed a learning law for weight adaptation in the McCulloch-Pitts model.
-
Perceptron parameters and learning rule

The training set consists of the pairs {X, t}:
X is an input vector, and
t is the desired target vector.

Iterative process:
present a training example X,
compute the network output y,
compare the output y with the target t,
and adjust the weights (add Δw).

Learning rule: specifies how to change the weights w of the network as a function of the inputs X, output y, and target t.
-
Deriving the error-correction learning rule
-- how to adjust the weight vector?

The net input signal is the sum of all inputs after passing through the synapses:

net = Σ_{j=0}^{m} w_j x_j

This can be viewed as computing the inner product of the vectors w_i and X:

net_i = ||w_i|| ||x|| cos(α)

where α is the angle between the two vectors.

[Figure: weight vector w_i and an input vector x in the (x1, x2) plane.]
-
Deriving the error-correction learning rule, cont.

Case 1. If Error = t − y = 0, then make no change: Δw = 0.

Case 2. If Target t = 1, Output y = −1, and Error = t − y = 2, then

w_new = w + ηx (move w in the direction of x)

The current weight vector incorrectly classifies the vector X, so the input vector X is added to the weight vector w. This makes the weight vector point closer to the input vector, increasing the chance that the input vector will be classified as a +1 in the future.
-
Deriving the error-correction learning rule, cont.

Case 3. If Target t = −1, Output y = 1, and Error = t − y = −2, then

w_new = w − ηx (move w away from the direction of x)

The current weight vector incorrectly classifies the vector X, so the input vector X is subtracted from the weight vector w. This makes the weight vector point farther away from the input vector, increasing the chance that the input vector will be classified as a −1 in the future.
-
Deriving the error-correction learning rule, cont.

The perceptron learning rule can be written more succinctly in terms of the error e = t − y and the change to be made to the weight vector, Δw:

CASE 1. If e = 0, then make the change Δw equal to 0.
CASE 2. If e = 2, then make the change Δw proportional to x.
CASE 3. If e = −2, then make the change Δw proportional to −x.

We can see that the sign of x's contribution is the same as the sign of the error e. Thus, all three cases can be written as a single expression:

Δw = η(t − y) x
w(n+1) = w(n) + Δw(n) = w(n) + η(t − y) x

with w(0) = random values, where n denotes the time-step in applying the algorithm. The parameter η is called the learning rate; it determines the magnitude of the weight updates Δw.
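The single-expression form is one line of code. A sketch (the function name is our own):

```python
def error_correction_update(w, x, t, y, eta):
    """w(n+1) = w(n) + eta*(t - y)*x, with target t and output y in {-1, +1}."""
    return [wi + eta * (t - y) * xi for wi, xi in zip(w, x)]
```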
-
Learning Rate

η governs the rate at which the training rule converges toward the correct solution.
Typically, η is a positive constant less than unity.
-
Perceptron Convergence Algorithm

Variables and parameters:
X(n) = input vector of dim (m+1) by 1 = [+1, x1, x2, ..., xm]^T
W(n) = weight vector of dim (m+1) by 1 = [b, w1, w2, ..., wm]^T
b = bias
y(n) = actual response
t(n) = desired (target) response
η = learning rate parameter, a positive constant less than unity
-
Perceptron Convergence Algorithm, cont.

1. Start with a randomly chosen weight vector w(0).
2. Repeat:
3. For each training vector pair (Xi, ti), evaluate the output yi when Xi is the input.
4. If yi ≠ ti, then form a new weight vector wi+1 = wi + Δwi = wi + η(t − y)Xi; else do nothing.
5. Until the stopping criterion (convergence) is reached.

Convergence: an epoch doesn't produce an error, i.e. all training vectors are classified correctly (y = t) for the same w(n); equivalently, the error is zero for all training patterns and no weights change in step 4. (A runnable sketch of this loop follows.)
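The whole algorithm fits in a few lines of plain Python. This is a sketch under our own naming and data-layout assumptions (inputs carry the bias input +1; targets are bipolar, in {−1, +1}):

```python
import random

def sgn(v):
    return 1 if v >= 0 else -1

def train_perceptron(X, t, eta=0.1, max_epochs=100):
    """X: input vectors (bias input +1 included), t: targets in {-1, +1}."""
    w = [random.uniform(-1, 1) for _ in X[0]]     # step 1: random w(0)
    for _ in range(max_epochs):                   # step 2: repeat
        changed = False
        for xi, ti in zip(X, t):                  # step 3: each pair (Xi, ti)
            yi = sgn(sum(wj * xj for wj, xj in zip(w, xi)))
            if yi != ti:                          # step 4: update on error
                w = [wj + eta * (ti - yi) * xj for wj, xj in zip(w, xi)]
                changed = True
        if not changed:                           # step 5: convergence
            break
    return w

# Usage: the bipolar AND problem of Example 2, bias input prepended.
X = [(1, 1, 1), (1, 1, -1), (1, -1, 1), (1, -1, -1)]
t = [1, -1, -1, -1]
print(train_perceptron(X, t))
```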
-
Perceptron learning scheme
-
Illustrative Example
-
Introduction

We will begin with a simple test problem to show how the perceptron learning rule should work. To simplify the problem, we begin with a network without a bias, having two inputs and one output. The network then has just two parameters, w11 and w12, to adjust, as shown in the following figure.

By removing the bias we are left with a network whose decision boundary must pass through the origin.
-
Example 1

The input/target pairs for our test problem are

X1 = [1, 2]^T, d1 = 1
X2 = [−1, 2]^T, d2 = −1
X3 = [0, −1]^T, d3 = −1

which are to be used in the training together with the following learning rate and initial weights:

η = 0.5; w1 = [1.0, −0.8]^T

Show how the learning proceeds.
-
Iteration 1. Input is X1 and the desired output is d1:

net1 = w1^T X1 = [1.0  −0.8][1, 2]^T = −0.6
y1 = sgn(−0.6) = −1

Since d1 = 1, the error signal is e = 1 − (−1) = 2.
The correction (update of the weights) in this step is necessary.

The updated weight vector is

w2 = w1 + 0.5(1 − (−1)) X1 = [1.0, −0.8]^T + [1, 2]^T = [2.0, 1.2]^T

This operation is illustrated in the adjacent figure.
-
Iteration 2. Input is X2 and the desired output is d2:

net2 = w2^T X2 = [2.0  1.2][−1, 2]^T = 0.4
y2 = sgn(0.4) = 1

Since d2 = −1, the error signal is e = −1 − 1 = −2.
The correction (update of the weights) in this step is necessary.

The updated weight vector is

w3 = w2 + 0.5(−1 − 1) X2 = [2.0, 1.2]^T − [−1, 2]^T = [3.0, −0.8]^T

This operation is illustrated in the adjacent figure.
-
Iteration 3. Input is X3 and the desired output is d3:

net3 = w3^T X3 = [3.0  −0.8][0, −1]^T = 0.8
y3 = sgn(0.8) = 1

Since d3 = −1, the error signal is e = −1 − 1 = −2.
The correction (update of the weights) in this step is necessary.

The updated weight vector is

w4 = w3 + 0.5(−1 − 1) X3 = [3.0, −0.8]^T − [0, −1]^T = [3.0, 0.2]^T

This operation is illustrated in the adjacent figure.
-
Check

The weights changed in iterations 1, 2, and 3. Do not stop.
-
Iteration 4:
net4 = w4^T X1 = [3.0  0.2][1, 2]^T = 3.4
y4 = sgn(3.4) = 1 = d1

Iteration 5:
net5 = w4^T X2 = [3.0  0.2][−1, 2]^T = −2.6
y5 = sgn(−2.6) = −1 = d2

Iteration 6:
net6 = w4^T X3 = [3.0  0.2][0, −1]^T = −0.2
y6 = sgn(−0.2) = −1 = d3

No weights change in iterations 4, 5, and 6.
The stopping criterion is satisfied.
-
The diagram to the left shows that the perceptron has finally learned to classify the three vectors properly. If we present any of the input vectors to the neuron, it will output the correct class for that input vector.
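For reference, a short sketch that replays Example 1 exactly (the values come from the slides; the loop and names are our own):

```python
def sgn(v):
    return 1 if v >= 0 else -1

X = [(1, 2), (-1, 2), (0, -1)]
d = [1, -1, -1]
w = [1.0, -0.8]
eta = 0.5

for step in range(6):                       # iterations 1..6, cycling X1..X3
    x, t = X[step % 3], d[step % 3]
    y = sgn(w[0] * x[0] + w[1] * x[1])
    if y != t:
        w = [wi + eta * (t - y) * xi for wi, xi in zip(w, x)]
    print(f"iteration {step + 1}: w = {[round(wi, 2) for wi in w]}")
# w becomes [2.0, 1.2], [3.0, -0.8], [3.0, 0.2], then stays fixed.
```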
-
Your turn (Homework 1)

Consider the following set of training vectors X1, X2, and X3:

X1 = [1, −2, 0, −1]^T; X2 = [0, 1.5, −0.5, −1]^T; X3 = [−1, 1, 0.5, −1]^T

with learning rate equal to 0.1, and the desired responses and weights initialized as:

d1 = −1; d2 = −1; d3 = 1; w1 = [1, −1, 0, 0.5]^T

Show how the learning proceeds.
-
Example 2
AND Problem
-
Example 2

Consider the AND problem with the bipolar inputs

x1 = [1, 1]^T; x2 = [1, −1]^T; x3 = [−1, 1]^T; x4 = [−1, −1]^T

with bias b = 0.5 and the desired responses, learning rate, and weights initialized as:

d1 = 1; d2 = −1; d3 = −1; d4 = −1; η = 0.1; w = [0.3, 0.7]^T

Show how the learning proceeds.
-
Consider the augmented input vectors (bias input +1 as the first component) and the augmented weight vector:

X1 = [1, 1, 1]^T; X2 = [1, 1, −1]^T; X3 = [1, −1, 1]^T; X4 = [1, −1, −1]^T

w0 = [0.5, 0.3, 0.7]^T
-
Iteration 1. Input X1 is presented with its desired output d1:

net1 = w0^T X1 = [0.5  0.3  0.7][1, 1, 1]^T = 1.5
y1 = sgn(1.5) = 1

Since d1 = 1, the error signal is e1 = 1 − 1 = 0.
The correction (update of the weights) in this step is not necessary:

w1^T = w0^T
-
Iteration 2. Input X2 is presented with its desired output d2:

net2 = w1^T X2 = [0.5  0.3  0.7][1, 1, −1]^T = 0.1
y2 = sgn(0.1) = 1

Since d2 = −1, the error signal is e2 = −1 − 1 = −2.
The correction in this step is necessary.

The updated weight vector is

w2 = w1 + 0.1(−1 − 1) X2 = [0.5, 0.3, 0.7]^T + (−0.2)[1, 1, −1]^T = [0.3, 0.1, 0.9]^T
-
Iteration 3. Input X3 is presented with its desired output d3:

net3 = w2^T X3 = [0.3  0.1  0.9][1, −1, 1]^T = 1.1
y3 = sgn(1.1) = 1

Since d3 = −1, the error signal is e3 = −1 − 1 = −2.
The correction in this step is necessary.

The updated weight vector is

w3 = w2 + 0.1(−1 − 1) X3 = [0.3, 0.1, 0.9]^T + (−0.2)[1, −1, 1]^T = [0.1, 0.3, 0.7]^T
-
Iteration 4. Input X4 is presented with its desired output d4:

net4 = w3^T X4 = [0.1  0.3  0.7][1, −1, −1]^T = −0.9
y4 = sgn(−0.9) = −1

Since d4 = −1, the error signal is e4 = −1 + 1 = 0.
The correction in this step is not necessary:

w4^T = w3^T
-
Check

The weights changed in iterations 2 and 3. The stopping criterion is not satisfied; continue with epoch 2.
-
Iteration 5:
net5 = w4^T X1 = [0.1  0.3  0.7][1, 1, 1]^T = 1.1
y5 = sgn(1.1) = 1 = d1, so w5^T = w4^T

Iteration 6:
net6 = w5^T X2 = [0.1  0.3  0.7][1, 1, −1]^T = −0.3
y6 = sgn(−0.3) = −1 = d2, so w6^T = w5^T

Iteration 7:
net7 = w6^T X3 = [0.1  0.3  0.7][1, −1, 1]^T = 0.5
y7 = sgn(0.5) = 1 ≠ d3 ✗
-
Iteration 7. Input X3 is presented with its desired output d3:

net7 = w6^T X3 = [0.1  0.3  0.7][1, −1, 1]^T = 0.5
y7 = sgn(0.5) = 1

Since d3 = −1, the error signal is e7 = −1 − 1 = −2.
The correction in this step is necessary.

The updated weight vector is

w7 = w6 + 0.1(−1 − 1) X3 = [0.1, 0.3, 0.7]^T + (−0.2)[1, −1, 1]^T = [−0.1, 0.5, 0.5]^T
-
Iteration 8. Now all the patterns will give the correct response. Try it.
We can draw the solution found using the line:

0.5x1 + 0.5x2 − 0.1 = 0
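A short sketch replaying Example 2 end-to-end (the values come from the slides; the loop and names are our own):

```python
def sgn(v):
    return 1 if v >= 0 else -1

X = [(1, 1, 1), (1, 1, -1), (1, -1, 1), (1, -1, -1)]   # [+1, x1, x2]
d = [1, -1, -1, -1]
w = [0.5, 0.3, 0.7]                                     # [b, w1, w2] = w0
eta = 0.1

for step in range(8):                        # iterations 1..8, cycling X1..X4
    x, t = X[step % 4], d[step % 4]
    y = sgn(sum(wi * xi for wi, xi in zip(w, x)))
    if y != t:
        w = [wi + eta * (t - y) * xi for wi, xi in zip(w, x)]
    print(f"iteration {step + 1}: w = {[round(wi, 2) for wi in w]}")
# Ends at w = [-0.1, 0.5, 0.5]: the line 0.5*x1 + 0.5*x2 - 0.1 = 0.
```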
-
Perceptron Limitations
-
Perceptron Limitations

The perceptron learning rule is guaranteed to converge to a solution in a finite number of steps, so long as a solution exists.
The perceptron can be used to classify input vectors that can be separated by a linear boundary.
Unfortunately, many problems are not linearly separable. The classic example is the XOR gate. This problem is illustrated graphically as follows:

[Figure: linearly inseparable problems, e.g. XOR.]
-
Rosenblatt had investigated more complex networks, which he felt would overcome the limitations of the basic perceptron, but he was never able to effectively extend the perceptron rule to such networks.

In a later lecture we will introduce multilayer perceptrons, which can solve arbitrary classification problems, and we will describe the backpropagation algorithm, which can be used to train them.
End of Perceptron
-
Homework 1, cont.

Problems:
1.1, 1.2, and 1.3
Computer Experiment:
1.6