
CHEE825 Fall 2005 J. McLellan

Nonlinear Empirical Models

Neural Network Models of Process Behaviour

• generally modeling input-output behaviour
• empirical models - no attempt to model physical structure
• estimated from plant data

Neural Networks...

• structure motivated by physiological structure of brain
• individual nodes or cells - “neurons” - sometimes called “perceptrons”
• neuron characteristics - notion of “firing” or threshold behaviour

Stages of Neural Network Model Development

• data collection - training set, validation set
• specification / initialization - structure of network, initial values
• “learning” or training - estimation of parameters
• validation - ability to predict new data set collected under same conditions

Data Collection

• expected range and point of operation
• size of input perturbation signal
• type of input perturbation signal
  - random input sequence?
  - number of levels (two or more?)
• validation data set

Model Structure

• numbers and types of nodes
• input, “hidden”, output
• depends on type of neural network
  - e.g., Feedforward Neural Network
  - e.g., Recurrent Neural Network
• types of neuron functions - threshold behaviour - e.g., sigmoid function, ordinary differential equation

“Learning” (Training)

• estimation of network parameters - weights, thresholds and bias terms
• nonlinear optimization problem
• objective function - typically sum of squares of output prediction error
• optimization algorithm - gradient-based method or variation

Validation

• use estimated NN model to predict outputs for new data set
• if prediction unacceptable, “re-train” NN model with modifications - e.g., number of neurons
• diagnostics
  - sum of squares of prediction error
  - R² - coefficient of determination
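The two validation diagnostics can be computed in a few lines. This is a minimal sketch; the function names and the validation numbers are made up for illustration and are not from the course material.

```python
def sum_squared_error(y_obs, y_pred):
    """Sum of squares of prediction error over a data set."""
    return sum((y - yhat) ** 2 for y, yhat in zip(y_obs, y_pred))


def r_squared(y_obs, y_pred):
    """Coefficient of determination: 1 - SSE / total sum of squares about the mean."""
    mean_y = sum(y_obs) / len(y_obs)
    ss_total = sum((y - mean_y) ** 2 for y in y_obs)
    return 1.0 - sum_squared_error(y_obs, y_pred) / ss_total


# Hypothetical validation outputs and model predictions (illustrative numbers only).
y_validation = [1.0, 2.0, 3.0, 4.0]
y_predicted = [1.1, 1.9, 3.2, 3.8]
sse = sum_squared_error(y_validation, y_predicted)
r2 = r_squared(y_validation, y_predicted)
```

Both quantities should be evaluated on the validation set, not on the training set, since a highly parameterized network can fit the training data closely and still predict new data poorly.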

Feedforward Neural Networks

• signals flow forward from input through hidden nodes to output - no internal feedback
• input nodes - receive external inputs (e.g., controls) and scale to [0,1] range
• hidden nodes - collect weighted sums of inputs from other nodes and act on the sum with a nonlinear function


Feedforward Neural Networks (FNN)

• output nodes - similar to hidden nodes BUT they produce signals leaving the network (outputs)

• FNN has one input layer, one output layer, and can have many hidden layers

FNN - Neuron Model

• ith neuron in layer l+1

$$ y_i^{l+1} = f\left( \sum_{j=1}^{N_l} w_{ij}^{l+1}\, y_j^{l} + \theta_i^{l+1} \right) $$

where
  - $y_i^{l+1}$ - state of neuron
  - $f$ - activation function
  - $w_{ij}^{l+1}$ - weight
  - $\theta_i^{l+1}$ - threshold value
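The neuron model, in which each neuron in layer l+1 applies the activation function to a weighted sum of the layer-l states plus its threshold value, maps directly onto a short function. The sketch below assumes a sigmoid activation; the names `layer_forward`, `weights`, and `thresholds`, and all numerical values, are illustrative assumptions.

```python
import math


def sigmoid(x):
    """Sigmoidal activation function, 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(y_prev, weights, thresholds, f=sigmoid):
    """States of layer l+1 given the states y_prev of layer l.

    weights[i][j] is the weight on the output of neuron j in layer l
    entering neuron i in layer l+1; thresholds[i] is the threshold of neuron i.
    """
    return [
        f(sum(w_ij * y_j for w_ij, y_j in zip(w_i, y_prev)) + theta_i)
        for w_i, theta_i in zip(weights, thresholds)
    ]


# Assumed example: 2 neurons in layer l feeding 2 neurons in layer l+1.
y_l = [0.5, 1.0]
w = [[1.0, -1.0],
     [0.0, 2.0]]
theta = [0.5, 0.0]
y_next = layer_forward(y_l, w, theta)
```

Stacking calls to `layer_forward`, one per layer, gives the forward calculation for a whole feedforward network.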

FNN parameters

• weights $w_{ij}^{l+1}$ - weight on output from jth neuron in layer l entering neuron i in layer l+1
• threshold - determines value of function when inputs to neuron are zero
• bias - provision for additional constants to be added

FNN Activation Function

• typically sigmoidal function

$$ f(x) = \frac{1}{1 + e^{-x}} $$
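A property of the sigmoid $f(x) = 1/(1+e^{-x})$ that is convenient when gradients are needed for training is that its derivative can be written in terms of the function value itself. This short derivation is added for reference and is not part of the original slides:

$$ f'(x) = \frac{e^{-x}}{\left(1 + e^{-x}\right)^{2}} = f(x)\,\bigl(1 - f(x)\bigr) $$

The function rises smoothly from 0 (as $x \to -\infty$) through $f(0) = 1/2$ to 1 (as $x \to \infty$), which gives the soft threshold or “firing” behaviour, and its slope is largest at $x = 0$, where $f'(0) = 1/4$.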

FNN Structure

[Figure: feedforward network with an input layer, a hidden layer, and an output layer]

Mathematical Basis

• approximation of functions
• e.g., Cybenko, 1989 - Mathematics of Control, Signals, and Systems
• approximation to an arbitrary degree of accuracy given a sufficiently large number of sigmoidal nodes

Training FNN’s

• calculate sum of squares of output prediction error

$$ E = \sum_j \left( y_j - \hat{y}_j \right)^2 $$

• take current iterates of parameters, propagate forward through the network, and calculate E
• update estimates of weights working backwards - “backpropagation”
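The training loop (forward calculation of E, then weight updates worked backwards from the output) can be sketched for a small network. This is a minimal illustration, not the course's implementation: it assumes one hidden layer, a single sigmoid output node, plain gradient descent with a fixed step size, and a made-up toy data set. All names and numbers are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(params, u):
    """One hidden layer, one output; both layers use the sigmoid."""
    W1, th1, W2, th2 = params
    hidden = [sigmoid(sum(w * x for w, x in zip(row, u)) + t) for row, t in zip(W1, th1)]
    y_hat = sigmoid(sum(w * h for w, h in zip(W2, hidden)) + th2)
    return hidden, y_hat


def error_and_gradient(params, data):
    """E = sum (y - y_hat)^2 and dE/d(parameter), by backpropagation."""
    W1, th1, W2, th2 = params
    gW1 = [[0.0] * len(row) for row in W1]
    gth1 = [0.0] * len(th1)
    gW2 = [0.0] * len(W2)
    gth2 = 0.0
    E = 0.0
    for u, y in data:
        hidden, y_hat = forward(params, u)            # forward pass
        E += (y - y_hat) ** 2
        # Output node: dE/d(net input), using f' = f (1 - f) for the sigmoid.
        delta_out = -2.0 * (y - y_hat) * y_hat * (1.0 - y_hat)
        gth2 += delta_out
        for i, h in enumerate(hidden):
            gW2[i] += delta_out * h
            # Hidden node i: error signal passed backwards through W2[i].
            delta_h = delta_out * W2[i] * h * (1.0 - h)
            gth1[i] += delta_h
            for j, x in enumerate(u):
                gW1[i][j] += delta_h * x
    return E, (gW1, gth1, gW2, gth2)


def gradient_step(params, grads, rate):
    """Adjust every parameter in proportion to -dE/d(parameter)."""
    W1, th1, W2, th2 = params
    gW1, gth1, gW2, gth2 = grads
    return (
        [[w - rate * g for w, g in zip(r, gr)] for r, gr in zip(W1, gW1)],
        [t - rate * g for t, g in zip(th1, gth1)],
        [w - rate * g for w, g in zip(W2, gW2)],
        th2 - rate * gth2,
    )


# Assumed toy data: two inputs already scaled to [0, 1], target y = u1 * u2.
random.seed(0)
data = [((p, q), p * q) for p in (0.0, 0.5, 1.0) for q in (0.0, 0.5, 1.0)]
params = (
    [[random.uniform(-1, 1) for _ in range(2)] for _ in range(3)],  # hidden weights
    [random.uniform(-1, 1) for _ in range(3)],                      # hidden thresholds
    [random.uniform(-1, 1) for _ in range(3)],                      # output weights
    0.0,                                                            # output threshold
)
E_start, _ = error_and_gradient(params, data)
for _ in range(200):
    E, grads = error_and_gradient(params, data)
    params = gradient_step(params, grads, rate=0.05)
E_end, _ = error_and_gradient(params, data)
```

Comparing `E_start` with `E_end` shows whether the iterations reduced the sum of squares. A practical implementation would add a convergence test and would usually prefer a more robust optimizer than fixed-step gradient descent.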

Estimation

• typically using a gradient-based optimization method
• make adjustments proportional to $\partial E / \partial w_{ij}^{l+1}$
• issues - highly over-parameterized models - potential for singularity
• e.g., Levenberg-Marquardt algorithm
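For reference, the two kinds of update mentioned here can be written out. The step size $\eta$, the damping parameter $\lambda$, the stacked parameter vector $w$, the residual vector $e = y - \hat{y}(w)$, and the Jacobian $J = \partial \hat{y} / \partial w$ are notation introduced for this note, not taken from the slides.

Gradient descent adjusts each weight in proportion to the gradient of E:

$$ w_{ij}^{l+1} \leftarrow w_{ij}^{l+1} - \eta\, \frac{\partial E}{\partial w_{ij}^{l+1}} $$

The Levenberg-Marquardt step for the sum-of-squares objective $E = e^{T} e$ is

$$ \Delta w = \left( J^{T} J + \lambda I \right)^{-1} J^{T} e $$

Since $\partial E / \partial w = -2 J^{T} e$, a large $\lambda$ gives a short step in the gradient-descent direction, while $\lambda \to 0$ gives the Gauss-Newton step. The term $\lambda I$ keeps the matrix invertible when $J^{T} J$ is singular or nearly so, which is the situation that arises with highly over-parameterized models.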

How to use FNN for modeling dynamic behaviour?

• structure of FNN suggests static model
• represent dynamic model as nonlinear difference equation
• essentially a NARMAX model

Linear discrete time transfer function

• transfer function

$$ y_{k+1} = G(z^{-1})\, u_k = \frac{1 + b z^{-1}}{1 - a z^{-1}}\, u_k $$

• equivalent difference equation

$$ y_{k+1} = a\, y_k + u_k + b\, u_{k-1} $$
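Reading the difference equation as y[k+1] = a*y[k] + u[k] + b*u[k-1], the response to any input sequence can be simulated by direct recursion. The following is a small sketch; the coefficient values and the function name are assumptions chosen only for illustration.

```python
def simulate_first_order(a, b, u, y0=0.0, u_prev=0.0):
    """Simulate y[k+1] = a*y[k] + u[k] + b*u[k-1].

    Returns [y[0], y[1], ..., y[len(u)]]; u_prev is the input before k = 0.
    """
    y = [y0]
    for u_k in u:
        y.append(a * y[-1] + u_k + b * u_prev)
        u_prev = u_k
    return y


# Assumed coefficients (|a| < 1 gives a stable response).
a, b = 0.5, 0.25
step_response = simulate_first_order(a, b, [1.0] * 50)
steady_state_gain = (1.0 + b) / (1.0 - a)   # transfer function evaluated at z = 1
```

For a unit step input the simulated output settles at the steady-state gain, which is a quick consistency check between the transfer function and the difference equation.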

FNN Structure - 1st order linear example

[Figure: feedforward network with input layer, hidden layer, and output layer; inputs $y_k$, $u_k$, $u_{k-1}$; output $y_{k+1}$]

FNN model for 1st order linear example

• essentially modelling algebraic relationship between past and present inputs and outputs
• nonlinear activation function not required
• weights required - correspond to coefficients in discrete transfer function
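The correspondence between weights and transfer-function coefficients can be checked numerically. The sketch below assumes the first-order example has the form y[k+1] = a*y[k] + u[k] + b*u[k-1], generates noise-free data from assumed "true" values of a and b, and fits a single neuron with identity (linear) activation by ordinary least squares. The helper names and all numerical values are hypothetical.

```python
import random


def solve_linear(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [list(row) + [r] for row, r in zip(A, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def fit_linear_neuron(rows, targets):
    """Least-squares weights for one neuron with identity activation (normal equations)."""
    n = len(rows[0])
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    Atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(n)]
    return solve_linear(AtA, Atb)


# Noise-free data from an assumed process with a = 0.5, b = 0.25.
random.seed(1)
a_true, b_true = 0.5, 0.25
u = [random.uniform(-1.0, 1.0) for _ in range(200)]
y = [0.0, u[0]]                      # y[1] = a*y[0] + u[0] + b*u[-1], with y[0] = u[-1] = 0
for k in range(1, len(u)):
    y.append(a_true * y[k] + u[k] + b_true * u[k - 1])

# Network inputs (y[k], u[k], u[k-1]) and target y[k+1].
rows = [(y[k], u[k], u[k - 1]) for k in range(1, len(u))]
targets = [y[k + 1] for k in range(1, len(u))]
w_y, w_u, w_u_lag = fit_linear_neuron(rows, targets)
```

With noise-free data the fitted weights reproduce a, 1, and b, so for a linear process the network reduces to the difference equation itself.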

Applications of FNN’s

• process modeling - bioreactors, pulp and paper
• nonlinear control
• data reconciliation
• fault detection
• some industrial applications - many academic (simulation) studies

“Typical dimensions”

• Dayal et al., 1994 - 3-state jacketed CSTR as a basis
• 700 data points in training set
• 6 inputs, 1 hidden layer with 6 nodes, 1 output


Advantages of Neural Net Models

• limited process knowledge required - but be careful (e.g., Dayal et al. paper)

• flexible - can model difficult relationships directly (e.g., inverse of a nonlinear control problem)

Disadvantages

• potential for large computational requirements - implications for real-time application
• highly over-parameterized
• limited insight into process structure
• amount of data required
• limited to range of data collection

Recurrent Neural Networks

• neurons contain differential equation model - 1st order linear + nonlinearity
• contain feedback and feedforward components
• can represent continuous dynamics
• e.g., You and Nikolaou, 1993

Nonlinear Empirical Model Representations

• Volterra Series (continuous and discrete)
• Nonlinear Auto-Regressive Moving Average with Exogenous Inputs (NARMAX)
• Cascade Models

Volterra Series Models

• higher-order convolution models
• continuous

$$
\begin{aligned}
y(t) = {} & \int_{-\infty}^{\infty} h_1(\tau)\, u(t-\tau)\, d\tau \\
& + \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} h_2(\tau_1, \tau_2)\, u(t-\tau_1)\, u(t-\tau_2)\, d\tau_1\, d\tau_2 \\
& + \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} h_3(\tau_1, \tau_2, \tau_3)\, u(t-\tau_1)\, u(t-\tau_2)\, u(t-\tau_3)\, d\tau_1\, d\tau_2\, d\tau_3 \\
& + \cdots
\end{aligned}
$$

Volterra Series Model

• discrete time

$$
\begin{aligned}
y(k) = {} & \sum_{j=1}^{\infty} h_1(j)\, u(k-j) \\
& + \sum_{j_1=1}^{\infty} \sum_{j_2=1}^{\infty} h_2(j_1, j_2)\, u(k-j_1)\, u(k-j_2) \\
& + \sum_{j_1=1}^{\infty} \sum_{j_2=1}^{\infty} \sum_{j_3=1}^{\infty} h_3(j_1, j_2, j_3)\, u(k-j_1)\, u(k-j_2)\, u(k-j_3) \\
& + \cdots
\end{aligned}
$$
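In practice the discrete series is truncated, both in order and in memory length. The sketch below evaluates a second-order model, i.e. the single sum over h1(j)*u(k-j) plus the double sum over h2(j1, j2)*u(k-j1)*u(k-j2), with lags running from 1 to an assumed finite memory M. The kernel values, the memory length, and the function name are illustrative assumptions only.

```python
def volterra2(h1, h2, u, k):
    """Truncated second-order discrete Volterra output y(k).

    h1[j-1] holds h1(j) and h2[j1-1][j2-1] holds h2(j1, j2) for lags 1..M;
    inputs before time 0 are taken as zero.
    """
    M = len(h1)

    def lagged(j):
        return u[k - j] if k - j >= 0 else 0.0

    first = sum(h1[j - 1] * lagged(j) for j in range(1, M + 1))
    second = sum(
        h2[j1 - 1][j2 - 1] * lagged(j1) * lagged(j2)
        for j1 in range(1, M + 1)
        for j2 in range(1, M + 1)
    )
    return first + second


# Assumed kernels with memory M = 2.
h1 = [1.0, 0.5]
h2 = [[0.2, 0.1],
      [0.1, 0.0]]
u = [1.0, 2.0, 0.0]
y2 = volterra2(h1, h2, u, k=2)
```

The number of second-order kernel values grows as M squared (and as M cubed for third order), which is one reason applications typically stop at second order.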

Volterra Series models...

• can be estimated directly from data or derived from state space models
• causality - limits of sum or integration
• functions $h_i$ - referred to as the ith order kernel
• applications - typically second-order (e.g., Pearson et al., 1994 - binder)

NARMAX models

• nonlinear difference equation models
• typical form

$$ y(k+1) = f\bigl( y(k),\, y(k-1),\, \ldots,\, u(k),\, u(k-1),\, \ldots \bigr) $$

• dependence on lagged y’s - autoregressive
• dependence on lagged u’s - moving average

NARMAX examples

• with products, cross-products

$$ y(k+1) = a_1\, y(k)\, y(k-1) + a_2\, y(k)\, u(k-1) + a_3\, u(k)\, u(k) $$

• 2nd order Volterra model
  - as NARMAX model in u only, with second order terms
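Taking the example with products and cross-products to be y(k+1) = a1*y(k)*y(k-1) + a2*y(k)*u(k-1) + a3*u(k)*u(k), the model can be simulated by direct recursion. This is a sketch only; the coefficient values, the input sequence, and the function name are assumptions for illustration.

```python
def simulate_narmax(a1, a2, a3, u, y0=0.0):
    """Simulate y(k+1) = a1*y(k)*y(k-1) + a2*y(k)*u(k-1) + a3*u(k)*u(k).

    Values before time 0, i.e. y(-1) and u(-1), are taken as zero.
    """
    y = [y0]
    for k in range(len(u)):
        y_prev = y[k - 1] if k >= 1 else 0.0
        u_prev = u[k - 1] if k >= 1 else 0.0
        y.append(a1 * y[k] * y_prev + a2 * y[k] * u_prev + a3 * u[k] * u[k])
    return y


# Assumed coefficients and input sequence.
y = simulate_narmax(a1=0.2, a2=0.3, a3=0.5, u=[1.0, 1.0, -1.0])
```

Although the model is nonlinear in the lagged signals, it is linear in the coefficients a1, a2, a3, so for a fixed set of product terms the coefficients can be estimated by ordinary linear least squares.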

Nonlinear Cascade Models

• made from serial and parallel arrangements of static nonlinear and linear dynamic elements
• e.g., 1st order linear dynamic element fed into a “squaring” element
  - obtain products of lagged inputs
  - cf. second order Volterra term
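The link between this cascade and a second-order Volterra term can be verified numerically. The sketch assumes the linear element is x(k+1) = a*x(k) + u(k) with x(0) = 0, so its impulse response is h(j) = a**(j-1) for j >= 1, and that the static element squares x. Squaring the convolution sum then gives a double sum with kernel h2(j1, j2) = h(j1)*h(j2). The value of a and the input sequence are illustrative assumptions.

```python
def wiener_cascade(a, u):
    """First-order linear element x(k+1) = a*x(k) + u(k), x(0) = 0, followed by squaring."""
    x = [0.0]
    for u_k in u:
        x.append(a * x[-1] + u_k)
    return [x_k ** 2 for x_k in x]


def volterra_second_order_term(a, u, k):
    """Double sum of h(j1)*h(j2)*u(k-j1)*u(k-j2) with h(j) = a**(j-1), j = 1..k."""
    return sum(
        a ** (j1 - 1) * a ** (j2 - 1) * u[k - j1] * u[k - j2]
        for j1 in range(1, k + 1)
        for j2 in range(1, k + 1)
    )


# Assumed pole location and input sequence.
a = 0.5
u = [1.0, -2.0, 0.5, 3.0]
z = wiener_cascade(a, u)
z_volterra = [volterra_second_order_term(a, u, k) for k in range(len(u) + 1)]
```

The two sequences agree term by term, which shows how the cascade generates products of lagged inputs while needing only the single parameter a instead of a full second-order kernel.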