Artificial Neural Networks, by Ildar Nurgaliev


TRANSCRIPT

Page 1: Artificial neural network

Artificial Neural Networks

Ildar Nurgaliev

Page 2: Artificial neural network

Introduction

Warren McCulloch and Walter Pitts (1943) created a computational model for neural networks based on mathematics and algorithms. They called this model threshold logic.

Neural networks, as used in artificial intelligence, have traditionally been viewed as simplified models of neural processing in the brain.

Page 3: Artificial neural network

Biological Network and Neural System

Page 4: Artificial neural network

Biological Network and Neural System

Page 5: Artificial neural network

Biological Network and Neural System

Page 6: Artificial neural network

Perceptron

Page 7: Artificial neural network

Perceptron

Page 8: Artificial neural network

Perceptron

Page 9: Artificial neural network

Perceptron

Page 10: Artificial neural network

Bias: Activation Weight

Page 11: Artificial neural network

Perceptron and Boolean Functions

x1   x2   x1 ⋀ x2
-1   -1   -1
-1    1   -1
 1   -1   -1
 1    1    1

sign(w0 + w1x1 + w2x2) = x1 ⋀ x2

Conjunction


Page 13: Artificial neural network

Perceptron and Boolean Functions

x1   x2   x1 ⋀ x2
-1   -1   -1
-1    1   -1
 1   -1   -1
 1    1    1

sign(w0 + w1x1 + w2x2) = x1 ⋀ x2

w0 = -1
w1 = 1
w2 = 1

Conjunction

Page 14: Artificial neural network

Perceptron and Boolean Functions

x1   x2   x1 ⋁ x2
-1   -1   -1
-1    1    1
 1   -1    1
 1    1    1

sign(w0 + w1x1 + w2x2) = x1 ⋁ x2

Disjunction

Page 15: Artificial neural network

Perceptron and Boolean Functions

x1   x2   x1 ⋁ x2
-1   -1   -1
-1    1    1
 1   -1    1
 1    1    1

sign(w0 + w1x1 + w2x2) = x1 ⋁ x2

Disjunction

w0 = 1
w1 = 1
w2 = 1
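To make the two tables above concrete, here is a minimal sketch in plain Python (the helper names are illustrative) of a sign-threshold perceptron computing the conjunction and the disjunction with the weights given on these slides.

```python
def sign(v):
    # threshold activation: maps the weighted sum to {-1, +1}
    return 1 if v > 0 else -1

def perceptron(w0, w1, w2, x1, x2):
    # sign(w0 + w1x1 + w2x2), as on the slides
    return sign(w0 + w1 * x1 + w2 * x2)

inputs = [(-1, -1), (-1, 1), (1, -1), (1, 1)]

# Conjunction: w0 = -1, w1 = 1, w2 = 1
print([perceptron(-1, 1, 1, x1, x2) for x1, x2 in inputs])  # [-1, -1, -1, 1]

# Disjunction: w0 = 1, w1 = 1, w2 = 1
print([perceptron(1, 1, 1, x1, x2) for x1, x2 in inputs])   # [-1, 1, 1, 1]
```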

Page 16: Artificial neural network

Geometric Interpretation


Page 18: Artificial neural network

Perceptron Learning

Ensemble-Teacher Learning (the teacher supplies the correct answer for each training example)

Perceptron

Page 19: Artificial neural network

Perceptron Learning

Ensemble-Teacher Learning

Perceptron

x1   x2   x1 ⋀ x2
-1   -1   -1
-1    1   -1
 1   -1   -1
 1    1    1

sign(w0 + w1x1 + w2x2) = x1 ⋀ x2

Page 20: Artificial neural network

Ensemble-Teacher Learning

[Figure: a perceptron with weights 0.5, 0.5, 0.5 evaluated on a training example]


Page 23: Artificial neural network

Ensemble-Teacher Learning

[Figure: perceptron with weights 0.5, 0.5, 0.5; the correct answer is -1, the network answers 1]

Right answer: a
Net answer: y
Direction of learning: d = a - y = -2
Change of weight: Δwi = ε·d·xi·|wi|

Page 24: Artificial neural network

Ensemble-Teacher Learning

[Figure: the same perceptron after one weight-update step]

Right answer: a
Net answer: y
Direction of learning: d = a - y = -2
Change of weight: Δwi = ε·d·xi·|wi|
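As a concrete illustration of the update rule above, here is a minimal sketch in Python of one learning step. The rule Δwi = ε·d·xi·|wi| with d = a - y is taken from the slides; the particular input, target and learning rate are illustrative assumptions.

```python
def sign(v):
    return 1 if v > 0 else -1

eps = 0.3              # learning rate (assumed value)
w = [0.5, 0.5, 0.5]    # w0 (bias weight), w1, w2, as in the figure
x = [1, 1, -1]         # x0 = 1 (bias input), x1, x2 (assumed example)
a = -1                 # the teacher's correct answer for this example

y = sign(sum(wi * xi for wi, xi in zip(w, x)))  # the network answers 1
d = a - y                                       # direction of learning: -2

# change every weight by delta_wi = eps * d * xi * |wi|
w = [wi + eps * d * xi * abs(wi) for wi, xi in zip(w, x)]
print(d, w)  # -2 [0.2, 0.2, 0.8] (up to floating-point rounding)
```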

Page 25: Artificial neural network

XOR-function

Doesn’t work?

x1   x2   x1 ⊕ x2
-1   -1   -1
-1    1    1
 1   -1    1
 1    1   -1

Page 26: Artificial neural network

XOR-function

Doesn’t work?

x1   x2   x1 ⊕ x2
-1   -1   -1
-1    1    1
 1   -1    1
 1    1   -1

Solution

Page 27: Artificial neural network

Multilayer Perceptron

Page 28: Artificial neural network

Multilayer Perceptron

Input layer Hidden layer Output layer
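As an illustration of the "solution" above, here is a minimal sketch in Python of a two-layer sign-perceptron computing x1 ⊕ x2 in the ±1 encoding of the earlier slides. The hidden-layer weights are chosen by hand for the example rather than learned.

```python
def sign(v):
    return 1 if v > 0 else -1

def neuron(w0, w1, w2, x1, x2):
    # one sign-threshold unit: sign(w0 + w1x1 + w2x2)
    return sign(w0 + w1 * x1 + w2 * x2)

def xor_mlp(x1, x2):
    h1 = neuron(1, 1, 1, x1, x2)     # hidden unit 1: x1 OR x2
    h2 = neuron(1, -1, -1, x1, x2)   # hidden unit 2: NOT (x1 AND x2)
    return neuron(-1, 1, 1, h1, h2)  # output unit: h1 AND h2

for x1, x2 in [(-1, -1), (-1, 1), (1, -1), (1, 1)]:
    print(x1, x2, xor_mlp(x1, x2))   # outputs -1, 1, 1, -1
```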

Page 29: Artificial neural network

Learning as a Function Minimization

Given:
X = (X1, ..., Xk) input vectors, Xi ∈ R^n
A = (A1, ..., Ak) correct output vectors, Ai ∈ R^m
(X, A) the learning set
W a vector containing all the weights
N(W, X) the network's function
Y = N(W, X) the network's response, Y ∈ R^m
D(Y, A) = ∑_{j=1}^{m} (Y[j] - A[j])² the error function
Di(Y) = D(Y, Ai) the error function on the i-th example
Ei(W) = Di(N(W, Xi)) the network's error on the i-th example
E(W) = ∑_{i=1}^{k} Ei(W) the network's error on the whole set

Goal:
Find a vector W such that E(W) ➝ min (learning on the whole set)
Find a vector W such that Ei(W) ➝ min (learning on a particular example)
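These definitions translate almost directly into code. A minimal sketch in Python, where `network` stands for N(W, X) and is left as a placeholder assumption since no architecture is fixed at this point:

```python
def D(Y, A):
    # D(Y, A) = sum over j of (Y[j] - A[j])^2
    return sum((y - a) ** 2 for y, a in zip(Y, A))

def E_i(network, W, X_i, A_i):
    # E_i(W) = D(N(W, X_i), A_i): the error on the i-th example
    return D(network(W, X_i), A_i)

def E(network, W, X, A):
    # E(W) = sum over i of E_i(W): the error on the whole learning set
    return sum(E_i(network, W, X_i, A_i) for X_i, A_i in zip(X, A))
```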

Page 30: Artificial neural network

Gradient Descent Method

Page 31: Artificial neural network

Gradient Descent Method

Algorithm for a single variable:
1. Initialize x1 with a random value from R
2. i = 1
3. x_{i+1} = x_i - ε·f′(x_i)
4. i++
5. If f(x_i) - f(x_{i+1}) > c, go to step 3

Algorithm for gradient descent over W ∈ R^n:
1. Initialize W1 with a random value from R^n
2. i = 1
3. W_{i+1} = W_i - ε·∇f(W_i)
4. i++
5. If f(W_i) - f(W_{i+1}) > c, go to step 3
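A minimal sketch in Python of the multivariable version of this loop; the quadratic test function, the step size ε and the stopping threshold c are illustrative assumptions.

```python
def gradient_descent(f, grad_f, W, eps=0.1, c=1e-9, max_iters=10000):
    # repeat W <- W - eps * grad f(W) while the improvement stays above c
    for _ in range(max_iters):
        W_next = [w - eps * g for w, g in zip(W, grad_f(W))]
        if f(W) - f(W_next) <= c:
            return W_next
        W = W_next
    return W

# illustrative example: minimize f(W) = (w0 - 1)^2 + (w1 + 2)^2
f = lambda W: (W[0] - 1) ** 2 + (W[1] + 2) ** 2
grad_f = lambda W: [2 * (W[0] - 1), 2 * (W[1] + 2)]
print(gradient_descent(f, grad_f, [0.0, 0.0]))  # approximately [1, -2]
```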

Page 32: Artificial neural network

Backpropagation Method

Page 33: Artificial neural network

Backpropagation Method

Given:
X = (X1, ..., Xk) input vectors, Xi ∈ R^n
A = (A1, ..., Ak) correct output vectors, Ai ∈ R^m
(X, A) the learning set
W a vector containing all the weights
N(W, X) the network's function
Y = N(W, X) the network's response, Y ∈ R^m
D(Y, A) = ∑_{j=1}^{m} (Y[j] - A[j])² the error function
Di(Y) = D(Y, Ai) the error function on the i-th example
Ei(W) = Di(N(W, Xi)) the network's error on the i-th example
E(W) = ∑_{i=1}^{k} Ei(W) the network's error on the whole set

Goal:
Find a vector W such that E(W) ➝ min (learning on the whole set)
Find a vector W such that Ei(W) ➝ min (learning on a particular example)

Page 34: Artificial neural network

Backpropagation Method

Dk(y1, y2) = (y1 - a1)² + (y2 - a2)²

Goal: decrease this function using gradient descent in order to increase the accuracy of the network.

Page 35: Artificial neural network

Backpropagation Method

Dk(y1, y2) = (y1 - a1)² + (y2 - a2)²

Calculate the partial derivatives:
∂Dk/∂y1 = 2(y1 - a1), ∂Dk/∂y2 = 2(y2 - a2)

Page 36: Artificial neural network

Backpropagation Method

Dk(y1, y2) = (y1 - a1)² + (y2 - a2)²
∂Dk/∂y1 = 2(y1 - a1), ∂Dk/∂y2 = 2(y2 - a2)

But y1 is itself a function. Consider it as a function of the weights:
y1 = y1(w01, w11, w21) = f(S1), where S1 is the weighted sum of the neuron's inputs.

Now we are able to calculate its partial derivatives.

Page 37: Artificial neural network

Backpropagation Method

Dk(y1, y2) = (y1 - a1)² + (y2 - a2)²
∂Dk/∂y1 = 2(y1 - a1), ∂Dk/∂y2 = 2(y2 - a2)

y1 = y1(w01, w11, w21) = f(S1)

For example: ∂y1/∂w21 = f′(S1)·x2

Page 38: Artificial neural network

Backpropagation Method

Dk(y1, y2) = (y1 - a1)² + (y2 - a2)²
∂Dk/∂y1 = 2(y1 - a1), ∂Dk/∂y2 = 2(y2 - a2)

y1 = y1(w01, w11, w21) = f(S1)
y2 = y2(w02, w12, w22) = f(S2)
∂y1/∂w21 = f′(S1)·x2, and the derivative of y1 with respect to any of y2's weights is 0.

Ek(W) = Dk(y1(w01, w11, w21), y2(w02, w12, w22))

And now we are able to calculate the partial derivative of Ek with respect to each weight.

Page 39: Artificial neural network

Backpropagation Method

Dk(y1, ..., yn) = (y1 - a1)² + ... + (yn - an)²

The same steps in the general case.

Page 40: Artificial neural network

Backpropagation Method

Dk(y1, ..., yn) = (y1 - a1)² + ... + (yn - an)²
yi = f(Si)

Now calculate Si and yi, and each derivative with respect to wji.

Page 41: Artificial neural network

Backpropagation Method

Dk(y1, ..., yn) = (y1 - a1)² + ... + (yn - an)²
yi = f(Si)

Page 42: Artificial neural network

Backpropagation Method

Dk(y1, ..., yn) = (y1 - a1)² + ... + (yn - an)²
yi = f(Si)

Formula for the derivative of Ek with respect to each weight of an output neuron:
∂Ek/∂wji = 2(yi - ai)·f′(Si)·xj

Page 43: Artificial neural network

Backpropagation Method

yi = yi(x1, ..., xm)    xj = xj(v0j, ..., vrj)

If Dk could be written as Dk = Dk(x1, ..., xm), it would mean that
∂Dk/∂vlj = (∂Dk/∂xj)·(∂xj/∂vlj)

Page 44: Artificial neural network

Backpropagation Method

yi = yi(x1, ..., xm)    xj = xj(v0j, ..., vrj)

If Dk could be written as Dk = Dk(x1, ..., xm), it would mean that
∂Dk/∂vlj = (∂Dk/∂xj)·(∂xj/∂vlj)

The only factor we don't know is ∂Dk/∂xj, so let's calculate it!

Page 45: Artificial neural network

Backpropagation Method

Dk(y1, y2) = (y1 - a1)² + (y2 - a2)²
∂Dk/∂y1 = 2(y1 - a1), ∂Dk/∂y2 = 2(y2 - a2)

y1 = y1(w01, w11, w21) = f(S1)

Now consider f as a function of xi and calculate its derivative: ∂y1/∂xi = f′(S1)·wi1

Page 46: Artificial neural network

Backpropagation Method

Dk(y1, y2) = (y1 - a1)² + (y2 - a2)²
∂Dk/∂y1 = 2(y1 - a1), ∂Dk/∂y2 = 2(y2 - a2)

y2 = y2(w02, w12, w22) = f(S2)

Now consider f as a function of xi and calculate its derivative: ∂y2/∂xi = f′(S2)·wi2

Page 47: Artificial neural network

Backpropagation Method

Dk(y1, y2) = (y1 - a1)² + (y2 - a2)²
∂Dk/∂y1 = 2(y1 - a1), ∂Dk/∂y2 = 2(y2 - a2)

y1 = y1(w01, w11, w21) = f(S1)

Now we are able to calculate the derivative of Dk with respect to x1:
∂Dk/∂x1 = 2(y1 - a1)·f′(S1)·w11 + 2(y2 - a2)·f′(S2)·w12

Page 48: Artificial neural network

Backpropagation Method

Now carry out the same steps in the general case.

Page 49: Artificial neural network

Backpropagation Method
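Putting the derivation together, here is a minimal sketch in Python of one backpropagation step for a network with a single hidden layer, following the chain-rule formulas above. The sigmoid activation, the network size and the learning rate are illustrative assumptions; the slides do not fix them.

```python
import math
import random

def f(s):
    # activation function (a sigmoid is assumed; the slides leave f unspecified)
    return 1.0 / (1.0 + math.exp(-s))

def f_prime(s):
    return f(s) * (1.0 - f(s))

def backprop_step(inp, target, V, W, eps=0.5):
    # V[j][l]: weight l of hidden neuron j; W[i][j]: weight j of output neuron i.
    # Index 0 of every weight vector is the bias weight.
    x_in = [1.0] + list(inp)                                   # bias + inputs
    S_hid = [sum(vl * xl for vl, xl in zip(Vj, x_in)) for Vj in V]
    x = [1.0] + [f(s) for s in S_hid]                          # bias + hidden outputs x_j
    S_out = [sum(wj * xj for wj, xj in zip(Wi, x)) for Wi in W]
    y = [f(s) for s in S_out]                                  # network outputs y_i

    # output layer: dDk/dw_ji = 2(y_i - a_i) * f'(S_i) * x_j
    dD_dy = [2.0 * (yi - ai) for yi, ai in zip(y, target)]
    grad_W = [[dD_dy[i] * f_prime(S_out[i]) * x[j] for j in range(len(W[i]))]
              for i in range(len(W))]

    # hidden layer: dDk/dx_j = sum_i 2(y_i - a_i) * f'(S_i) * w_ji,
    # then dDk/dv_jl = dDk/dx_j * f'(S_j) * x_in[l]
    dD_dx = [sum(dD_dy[i] * f_prime(S_out[i]) * W[i][j + 1] for i in range(len(W)))
             for j in range(len(V))]
    grad_V = [[dD_dx[j] * f_prime(S_hid[j]) * x_in[l] for l in range(len(V[j]))]
              for j in range(len(V))]

    # gradient-descent update of every weight
    for i in range(len(W)):
        for j in range(len(W[i])):
            W[i][j] -= eps * grad_W[i][j]
    for j in range(len(V)):
        for l in range(len(V[j])):
            V[j][l] -= eps * grad_V[j][l]
    return y

# illustrative usage: 2 inputs, 2 hidden neurons, 1 output, learning one example
random.seed(0)
V = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
W = [[random.uniform(-1, 1) for _ in range(3)]]
for _ in range(2000):
    backprop_step([0.0, 1.0], [1.0], V, W)
print(backprop_step([0.0, 1.0], [1.0], V, W))  # output approaches the target 1.0
```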

Page 50: Artificial neural network

Offline Learning

Train the ANN on an example set. Use the ANN on the real data.

But what if the real data is not i.i.d.?

Page 51: Artificial neural network

Online learning

Learn one instance at a time:

1. Receive an instance
2. Predict the outcome
3. Obtain the real outcome

However, in practice it is not always possible to obtain the real outcome.
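A minimal sketch in Python of this loop; the model object with predict/update methods and the data stream are illustrative assumptions, not anything fixed by the slides.

```python
def online_learning(model, stream):
    # learn one instance at a time: receive, predict, observe, update
    mistakes = 0
    for instance, real_outcome in stream:
        prediction = model.predict(instance)   # predict before seeing the answer
        if real_outcome is None:
            continue                           # the real outcome may be unavailable
        if prediction != real_outcome:
            mistakes += 1
        model.update(instance, real_outcome)   # learn from this single instance
    return mistakes
```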

Page 52: Artificial neural network

Spiking Neural Network

● Third generation of ANN models
● Adds the concept of time to the neuron model
● One SNN neuron can replace hundreds of hidden neurons in a conventional ANN
● Requires huge computational power

Page 53: Artificial neural network

TrueNorth by IBM

1 million neurons
256 million synapses
Power: <100 mW