Source: courses.ece.ubc.ca/574/ident2.pdf

Alternatives to Least-Squares

• Need a method that gives consistent estimates in presence of coloured noise

• Generalized Least-Squares

• Instrumental Variable Method

• Maximum Likelihood Identification

Adaptive Control Lecture Notes – © Guy A. Dumont, 1997-2005


Generalized Least Squares

• Noise model

w(t) = e(t)/C(q−1)

where {e(t)} = N(0, σ) and C(q−1) is monic and of degree n.

• System described as

A(q−1)y(t) = B(q−1)u(t) + [1/C(q−1)]e(t)


Generalized Least Squares

• Defining the filtered sequences

ȳ(t) = C(q−1)y(t)

ū(t) = C(q−1)u(t)

• the system becomes

A(q−1)ȳ(t) = B(q−1)ū(t) + e(t)


Generalized Least Squares

• If C(q−1) is known, then least-squares gives consistent estimates of A and B, given ȳ and ū

• The problem, however, is that in practice C(q−1) is not known, so {ȳ} and {ū} cannot be obtained.

• An iterative method proposed by Clarke (1967) solves that problem


Generalized Least Squares

1. Set C(q−1) = 1

2. Compute the filtered sequences {ȳ(t)} and {ū(t)} for t = 1, . . . , N

3. Use the least-squares method to estimate A and B from ȳ and ū

4. Compute the residuals

w(t) = A(q−1)y(t)−B(q−1)u(t)

5. Use the least-squares method to estimate C from

C(q−1)w(t) = ε(t)

i.e.

w(t) = −c1w(t− 1)− c2w(t− 2)− · · ·+ ε(t)

where {ε(t)} is white

6. If converged, then stop, otherwise repeat from step 2.
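The iterative scheme above can be sketched in a few lines of NumPy. The course material uses MATLAB; this Python sketch, including the function name `gls`, the equal orders for A, B and C, and the fixed iteration count, is my own illustration rather than Clarke's original code:

```python
import numpy as np

def gls(y, u, n, n_iter=5):
    """Iterative generalized least-squares (after Clarke 1967), a sketch.

    y, u : output/input sequences; n : common order of A, B and C.
    Returns the last estimates of the A-, B- and C-coefficients.
    """
    N = len(y)
    yf, uf = y.copy(), u.copy()          # step 1: C = 1, so filtered = raw
    for _ in range(n_iter):
        # steps 2-3: least squares on the filtered data
        X = np.array([np.r_[-yf[t-n:t][::-1], uf[t-n:t][::-1]]
                      for t in range(n, N)])
        th = np.linalg.lstsq(X, yf[n:], rcond=None)[0]
        a, b = th[:n], th[n:]
        # step 4: residuals w = A y - B u on the unfiltered data
        w = np.array([y[t] + a @ y[t-n:t][::-1] - b @ u[t-n:t][::-1]
                      for t in range(n, N)])
        # step 5: AR fit  w(t) = -c1 w(t-1) - ... + eps(t)
        Xw = np.array([-w[t-n:t][::-1] for t in range(n, len(w))])
        c = np.linalg.lstsq(Xw, w[n:], rcond=None)[0]
        # step 2 for the next pass: filter data through C(q^-1)
        yf = np.array([y[t] + (c @ y[t-n:t][::-1] if t >= n else 0.0)
                       for t in range(N)])
        uf = np.array([u[t] + (c @ u[t-n:t][::-1] if t >= n else 0.0)
                       for t in range(N)])
    return a, b, c
```

In practice the loop would be stopped by the convergence tests mentioned on the next slide rather than by a fixed count.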


Generalized Least Squares

• For convergence, the loss function and/or the whiteness of the residuals can be tested.

• An advantage of GLS is that it not only may give consistent estimates of the deterministic part of the system, but also gives a representation of the noise that the LS method does not provide.

• The consistency of GLS depends on the signal-to-noise ratio, the probability of consistent estimation increasing with the S/N.

• There is, however, no guarantee of obtaining consistent estimates.


Instrumental Variable Method (IV) 4

• As seen previously, the LS estimate

θ = [XTX]−1XTY

is unbiased if W is independent of X.

• Assume that a matrix V is available, which is correlated with X but not with W, and such that V TX is positive definite, i.e.

E[V TX] is nonsingular

E[V TW ] = 0

4 T. Söderström and P.G. Stoica, Instrumental Variable Methods for System Identification, Springer-Verlag, Berlin, 1983


Instrumental Variable Method (IV)

• Then,

V TY = V TXθ + V TW

and θ is estimated by

θ = [V TX]−1V TY

• V is called the instrumental variable matrix.


Instrumental Variable Method (IV)

• Ideally V is the noise-free process output, and the IV estimate θ is consistent.

• There are many possible ways to construct the instrumental variable. For instance, it may be built using an initial least-squares estimate (A1, B1) to generate a noise-free model output ŷ:

A1(q−1)ŷ(t) = B1(q−1)u(t)

and the kth row of V is given by

vTk = [−ŷ(k − 1), . . . ,−ŷ(k − n), u(k − 1), . . . , u(k − n)]
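This construction can be sketched in NumPy (again Python rather than the course's MATLAB; the function name `iv_estimate` and the zero initial conditions of the simulated model output are my own choices):

```python
import numpy as np

def iv_estimate(y, u, n):
    """Instrumental-variable estimate theta = (V'X)^-1 V'Y, a sketch.

    Instruments are built from the simulated output yhat of an initial
    least-squares model, as on the slide. n is the model order.
    """
    N = len(y)
    def regressor(sig):
        return np.array([np.r_[-sig[t-n:t][::-1], u[t-n:t][::-1]]
                         for t in range(n, N)])
    X, Y = regressor(y), y[n:]
    # initial least-squares estimate (A1, B1)
    th0 = np.linalg.lstsq(X, Y, rcond=None)[0]
    a1, b1 = th0[:n], th0[n:]
    # simulate the noise-free model output  A1 yhat = B1 u
    yhat = np.zeros(N)
    for t in range(n, N):
        yhat[t] = -a1 @ yhat[t-n:t][::-1] + b1 @ u[t-n:t][::-1]
    V = regressor(yhat)          # rows v_k built from yhat and u
    return np.linalg.solve(V.T @ X, V.T @ Y)
```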


Instrumental Variable Method (IV)

• Consistent estimation cannot be guaranteed in general.

• A two-tier method in its off-line version, the IV method is more useful in its recursive form.

• Use of instrumental variables in closed loop:

– Often the instrumental variable is constructed from the input sequence. This cannot be done in closed loop, as the input is formed from old outputs, and hence is correlated with the noise w, unless w is white. In that situation, the following choices are available.


Instrumental Variable Method (IV)

• Instrumental variables in closed-loop

– Delayed inputs and outputs. If the noise w is assumed to be a moving average of order n, then choosing v(t) = x(t− d) with d > n gives an instrument uncorrelated with the noise. This, however, will only work with a time-varying regulator.

– Reference signals. Building the instruments from the setpoint will satisfy the noise independence condition. However, the setpoint must be a sufficiently rich signal for the estimates to converge.

– External signal. This in effect relates to the closed-loop identifiability condition (covered a bit later...). A typical external signal is a white noise independent of w(t).


Maximum-likelihood identification

The maximum-likelihood method considers the ARMAX model below, where u is the input, y the output, and e is zero-mean white noise with standard deviation σ:

A(q−1)y(t) = B(q−1)u(t− k) + C(q−1)e(t)

where

A(q−1) = 1 + a1q−1 + · · ·+ anq−n

B(q−1) = b1q−1 + · · ·+ bnq−n

C(q−1) = 1 + c1q−1 + · · ·+ cnq−n

The parameters of A, B, C as well as σ are unknown.


Maximum-likelihood identification

Defining

θT = [ a1 · · · an b1 · · · bn c1 · · · cn ]

xT (t) = [ −y(t− 1) · · · u(t− k) · · · e(t− 1) · · · ]

the ARMAX model can be written as

y(t) = xT (t)θ + e(t)

Unfortunately, one cannot use the least-squares method on this model since the sequence e(t) is unknown.

In the case of known parameters, the past values of e(t) can be reconstructed exactly from the sequence:

ε(t) = [A(q−1)y(t)−B(q−1)u(t− k)]/C(q−1)
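With known parameters this reconstruction is a simple recursion, since C is monic; a NumPy sketch (no extra delay, i.e. B(q−1) applied directly to u(t), and zero initial conditions, both my own simplifications):

```python
import numpy as np

def residuals(y, u, a, b, c):
    """Reconstruct the noise sequence from known ARMAX parameters, a sketch.

    Solves C(q^-1) eps(t) = A(q^-1) y(t) - B(q^-1) u(t) recursively,
    with zero initial conditions; a, b, c hold a1..an, b1..bn, c1..cn.
    """
    n, eps = len(a), np.zeros(len(y))
    for t in range(len(y)):
        v = y[t]
        for i in range(1, n + 1):
            if t - i >= 0:
                # A y  minus  B u  minus the past-eps part of C eps
                v += a[i-1]*y[t-i] - b[i-1]*u[t-i] - c[i-1]*eps[t-i]
        eps[t] = v
    return eps
```

On data actually generated by the model (and matching initial conditions), this recovers e(t) exactly.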


Maximum-likelihood identification

Defining the performance index

V = (1/2) Σt=1..N ε2(t)

the maximum-likelihood method is then summarized by the following steps.

• Minimize V with respect to θ, using for instance a Newton-Raphson algorithm. Note that ε is linear in the parameters of A and B but not in those of C. We then have to use some iterative procedure

θi+1 = θi − αi[V ′′(θi)]−1V ′(θi)

• Initial estimate θ0 is usually obtained from a least-squares estimate

• Estimate the noise variance as

σ2 = (2/N)V (θ)


Properties of the Maximum-likelihood Estimate (MLE)

• If the model order is sufficient, the MLE is consistent: the estimate converges to the true θ as N →∞.

• The MLE is asymptotically normal with mean θ and standard deviation σθ.

• The MLE is asymptotically efficient, i.e. there is no other unbiased estimator giving a smaller σθ.


Properties of the Maximum-likelihood Estimate (MLE)

• The Cramér-Rao inequality says that there is a lower limit on the precision of an unbiased estimate, given by

cov θ ≥ Mθ^−1

where Mθ is the Fisher information matrix, Mθ = −E[(log L)θθ].

• For the MLE,

σθ^2 = Mθ^−1, i.e. σθ^2 = σ^2 Vθθ^−1

• If σ is estimated, then

σθ^2 = (2V/N) Vθθ^−1


Identification In Practice

1. Specify a model structure

2. Compute the best model in this structure

3. Evaluate the properties of this model

4. Test a new structure, go to step 1

5. Stop when satisfactory model is obtained

MATLAB System Identification Toolbox

• Most used package

• Graphical User Interface

• Automates all the steps

• Easy to use

• Familiarize yourself with it by running the examples


RECURSIVE IDENTIFICATION

• There are many situations when it is preferable to perform the identification on-line, such as in adaptive control.

• Identification methods then need to be implemented in a recursive fashion, i.e. the parameter estimate at time t should be computed as a function of the estimate at time t− 1 and of the incoming information at time t.

• Recursive least-squares

• Recursive instrumental variables

• Recursive extended least-squares and recursive maximum likelihood


Recursive Least-Squares (RLS)

We have seen that, with t observations available, the least-squares estimate is

θ(t) = [XT (t)X(t)]−1XT (t)Y (t)

with

Y T (t) = [ y(1) · · · y(t) ]

X(t) = [ xT (1) ; · · · ; xT (t) ]   (stacked rows)

Assume one additional observation becomes available; the problem is then to find θ(t + 1) as a function of θ(t), y(t + 1) and u(t + 1).


Recursive Least-Squares (RLS)

Defining X(t + 1) and Y (t + 1) as

X(t + 1) = [ X(t) ; xT (t + 1) ]    Y (t + 1) = [ Y (t) ; y(t + 1) ]

and defining P (t) and P (t + 1) as

P (t) = [XT (t)X(t)]−1 P (t + 1) = [XT (t + 1)X(t + 1)]−1

one can write

P (t + 1) = [XT (t)X(t) + x(t + 1)xT (t + 1)]−1

θ(t + 1) = P (t + 1)[XT (t)Y (t) + x(t + 1)y(t + 1)]


Matrix Inversion Lemma

Let A, D and [D−1 + CA−1B] be nonsingular square matrices. Then A + BDC is invertible and

(A + BDC)−1 = A−1 −A−1B(D−1 + CA−1B)−1CA−1

Proof. The simplest way to prove it is by direct multiplication:

(A + BDC)(A−1 −A−1B(D−1 + CA−1B)−1CA−1)

= I + BDCA−1 −B(D−1 + CA−1B)−1CA−1 −BDCA−1B(D−1 + CA−1B)−1CA−1

= I + BDCA−1 −BD(D−1 + CA−1B)(D−1 + CA−1B)−1CA−1

= I
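The identity is easy to sanity-check numerically; a NumPy sketch on random, deliberately well-conditioned matrices (dimensions and seed are arbitrary choices):

```python
import numpy as np

# Numerical check of the matrix inversion lemma on random matrices.
rng = np.random.default_rng(0)
n, m = 4, 2
A = rng.standard_normal((n, n)) + 5*np.eye(n)   # keep A well-conditioned
B = rng.standard_normal((n, m))
C = rng.standard_normal((m, n))
D = rng.standard_normal((m, m)) + 5*np.eye(m)

Ai, Di = np.linalg.inv(A), np.linalg.inv(D)
lhs = np.linalg.inv(A + B @ D @ C)
rhs = Ai - Ai @ B @ np.linalg.inv(Di + C @ Ai @ B) @ C @ Ai
assert np.allclose(lhs, rhs)   # the two sides agree to machine precision
```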


Matrix Inversion Lemma

An alternative form, useful for deriving recursive least-squares is obtainedwhen B and C are n× 1 and 1× n (i.e. column and row vectors):

(A + BC)−1 = A−1 −A−1BCA−1/(1 + CA−1B)

Now, consider

P (t + 1) = [XT (t)X(t) + x(t + 1)xT (t + 1)]−1

and use the matrix-inversion lemma with

A = XT (t)X(t) B = x(t + 1) C = xT (t + 1)


Recursive Least-Squares (RLS)

Some simple matrix manipulations then give the recursive least-squares algorithm:

θ(t + 1) = θ(t) + K(t + 1)[y(t + 1)− xT (t + 1)θ(t)]

K(t + 1) = P (t)x(t + 1)/[1 + xT (t + 1)P (t)x(t + 1)]

P (t + 1) = P (t)− P (t)x(t + 1)xT (t + 1)P (t)/[1 + xT (t + 1)P (t)x(t + 1)]

Note that K(t + 1) can also be expressed as

K(t + 1) = P (t + 1)x(t + 1)


Recursive Least-Squares (RLS)

• The recursive least-squares algorithm is the exact mathematical equivalent of the batch least-squares.

• Once initialized, no matrix inversion is needed

• Matrices stay the same size all the time

• Computationally very efficient

• P is proportional to the covariance matrix of the estimate, and is thus called the covariance matrix.

• The algorithm has to be initialized with θ(0) and P (0). Generally, P (0) is initialized as αI, where I is the identity matrix and α is a large positive number. The larger α, the less confidence is put in the initial estimate θ(0).


RLS and Kalman Filter

There are some very strong connections between the recursive least-squares algorithm and the Kalman filter. Indeed, the RLS algorithm has the structure of a Kalman filter:

θ(t + 1) = θ(t) + K(t + 1)[y(t + 1)− xT (t + 1)θ(t)]
 (new)      (old)           (correction)

where K(t + 1) is the Kalman gain.


RLS

The following MATLAB code is a straightforward implementation of the RLS algorithm:

function [thetaest,P]=rls(y,x,thetaest,P)
% RLS
% y,x: current measurement and regressor
% thetaest, P: parameter estimates and covariance matrix
K = P*x/(1+x'*P*x);                      % gain
P = P - (P*x*x'*P)/(1+x'*P*x);           % covariance matrix update
thetaest = thetaest + K*(y-x'*thetaest); % parameter estimate update
% end
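For readers not working in MATLAB, the same update in Python/NumPy, with a small driver (the noiseless system y(t) = 0.8y(t−1) + 0.4u(t−1), the regressor convention x = [y(t−1), u(t−1)] without the sign flip used elsewhere in the notes, and P(0) = 1000 I are all illustrative choices of mine):

```python
import numpy as np

def rls_step(y, x, theta, P):
    """One RLS update; a Python port of the MATLAB routine above (a sketch)."""
    denom = 1.0 + x @ P @ x
    K = P @ x / denom                        # gain
    P = P - np.outer(P @ x, x @ P) / denom   # covariance update
    theta = theta + K * (y - x @ theta)      # parameter update
    return theta, P

# driver: identify y(t) = 0.8 y(t-1) + 0.4 u(t-1) from noiseless data
rng = np.random.default_rng(0)
u = rng.standard_normal(300)
y = np.zeros(300)
for t in range(1, 300):
    y[t] = 0.8*y[t-1] + 0.4*u[t-1]

theta, P = np.zeros(2), 1000*np.eye(2)   # large P(0): low initial confidence
for t in range(1, 300):
    x = np.array([y[t-1], u[t-1]])
    theta, P = rls_step(y[t], x, theta, P)
```

After 300 noiseless samples the estimate is essentially exact; the small residual error comes from the prior P(0).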


Recursive Extended Least-Squares and Recursive Maximum-Likelihood

Because the prediction error is not linear in the C-parameters, it is not possible to derive an exact recursive maximum-likelihood method as for the least-squares method.

The ARMAX model

A(q−1)y(t) = B(q−1)u(t) + C(q−1)e(t)

can be written as

y(t) = xT (t)θ + e(t)

with

θ = [a1, . . . , an, b1, . . . , bn, c1, . . . , cn]T

x(t) = [−y(t− 1), . . . ,−y(t− n), u(t− 1), . . . , u(t− n), e(t− 1), . . . , e(t− n)]T


Recursive Extended Least-Squares and Approximate Maximum-Likelihood

• If e(t) were known, RLS could be used to estimate θ; however, it is unknown and thus has to be estimated.

• This can be done in two ways, using either the prediction error or the residual.

• The first case corresponds to the RELS method, the second to the AML method.


Recursive Extended Least-Squares and Approximate Maximum-Likelihood

• The one-step-ahead prediction error is defined as

ε(t) = y(t)− ŷ(t | t− 1) = y(t)− xT (t)θ(t− 1)

x(t) = [−y(t− 1), . . . , u(t− 1), . . . , ε(t− 1), . . . , ε(t− n)]T

• The residual is defined as

η(t) = y(t)− ŷ(t | t) = y(t)− xT (t)θ(t)

x(t) = [−y(t− 1), . . . , u(t− 1), . . . , η(t− 1), . . . , η(t− n)]T
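A minimal NumPy sketch of RELS follows (the function name, the initialization P(0) = 1000 I, and the unit-delay structure are my own choices; AML would differ only in storing the a-posteriori residual η(t) = y(t) − xT(t)θ(t), computed with the updated θ, instead of ε(t)):

```python
import numpy as np

def rels(y, u, n, lam=1.0):
    """Recursive extended least-squares, a sketch (order n, no extra delay)."""
    d = 3*n                                   # params of A, B and C
    theta, P = np.zeros(d), 1000*np.eye(d)
    eps = np.zeros(len(y))                    # past prediction errors
    for t in range(n, len(y)):
        x = np.r_[-y[t-n:t][::-1], u[t-n:t][::-1], eps[t-n:t][::-1]]
        e = y[t] - x @ theta                  # a-priori prediction error
        K = P @ x / (lam + x @ P @ x)
        theta = theta + K * e
        P = (P - np.outer(K, x @ P)) / lam
        eps[t] = e                            # stored for the next regressor
    return theta, eps
```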


Recursive Extended Least-Squares and Approximate Maximum-Likelihood

• Sometimes ε(t) and η(t) are also referred to as the a priori and a posteriori prediction errors.

• Because it uses the latest estimate θ(t), as opposed to θ(t− 1) for ε(t), η(t) is a better estimate, especially during transients.

• Note however that if θ(t) converges as t −→∞, then η(t) −→ ε(t).


Recursive Extended Least-Squares and Approximate Maximum-Likelihood

The two schemes are then described by

θ(t + 1) = θ(t) + K(t + 1)[y(t + 1)− xT (t + 1)θ(t)]

K(t + 1) = P (t)x(t + 1)/[1 + xT (t + 1)P (t)x(t + 1)]

P (t + 1) = P (t)− P (t)x(t + 1)xT (t + 1)P (t)/[1 + xT (t + 1)P (t)x(t + 1)]

but differ by their definition of x(t).


Recursive Extended Least-Squares and Approximate Maximum-Likelihood

• The RELS algorithm uses the prediction error. This algorithm is called RELS, Extended Matrix or RML1 in the literature. It has generally good convergence properties, and has been proved consistent for moving-average and first-order autoregressive processes. However, counterexamples to general convergence exist; see for example Ljung (1975).

• The AML algorithm uses the residual. The AML has better convergence properties than the RML, and indeed convergence can be proven under rather unrestrictive conditions.


Recursive Maximum-Likelihood

• Yet another approach.

• The ML can also be interpreted in terms of data filtering. Consider the performance index

V (t) = (1/2) Σi=1..t ε2(i)

with ε(t) = y(t)− xT (t)θ(t− 1)


Recursive Maximum-Likelihood

• Because V is a nonlinear function of C, it has to be approximated by a Taylor series truncated after the second term. The resulting scheme is then:

θ(t + 1) = θ(t) + K(t + 1)[y(t + 1)− xT (t + 1)θ(t)]

K(t + 1) = P (t)xf (t + 1)/[1 + xfT (t + 1)P (t)xf (t + 1)]

P (t + 1) = P (t)− P (t)xf (t + 1)xfT (t + 1)P (t)/[1 + xfT (t + 1)P (t)xf (t + 1)]

where xf denotes the regressor filtered through 1/C(q−1).


Properties of AML and RML

Definition 1. A discrete transfer function H is said to be strictly positive real if it is stable and

Re H(ejw) > 0 for all −π < w ≤ π

on the unit circle.

This condition can be checked by replacing z by (1 + jw)/(1− jw) and extracting the real part of the resulting expression.
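Rather than doing the bilinear substitution by hand, the SPR condition for the case that matters below, H = 1/C(q−1) − 1/2, can be checked numerically on a frequency grid; a NumPy sketch (function name and grid size are my own choices; stability of C is assumed, not checked):

```python
import numpy as np

def is_strictly_positive_real(c, n_grid=2000):
    """Grid test of the SPR condition for H(z) = 1/C(z^-1) - 1/2, a sketch.

    c holds c1..cn of a monic C, assumed stable. Checks Re H(e^{jw}) > 0
    on a dense frequency grid rather than symbolically.
    """
    w = np.linspace(0, np.pi, n_grid)
    z = np.exp(1j * w)
    C = 1 + sum(ci * z**(-(i + 1)) for i, ci in enumerate(c))
    return bool(np.all(np.real(1.0 / C - 0.5) > 0))
```

For example, any stable first-order C passes this test, while C(q−1) = 1 − 1.6q−1 + 0.8q−2 (stable, but strongly resonant) fails it.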

For the convergence of AML, the following theorem is then available.


Properties of AML and RML

Theorem (Ljung & Söderström, 1983). Assume both process and model are described by an ARMAX model with model order ≥ process order. Then, if

1. {u(t)} is sufficiently rich

2. 1/C(q−1) − 1/2 is strictly positive real

θ(t) will converge such that

E[ε(t, θ)− e(t)]2 = 0

If model and process have the same order, this implies

θ(t) −→ θ as t −→∞


A Unified Algorithm

Looking at all the previous algorithms, it is obvious that they all have the same form, with only different parameters. They can all be represented by a recursive prediction-error method (RPEM):

θ(t + 1) = θ(t) + K(t + 1)ε(t + 1)

K(t + 1) = P (t)z(t + 1)/[1 + xT (t + 1)P (t)z(t + 1)]

P (t + 1) = P (t)− P (t)z(t + 1)xT (t + 1)P (t)/[1 + xT (t + 1)P (t)z(t + 1)]


Tracking Time-Varying Parameters

All previous methods use the least-squares criterion

V (t) = (1/t) Σi=1..t [y(i)− xT (i)θ]2

and thus identify the average behaviour of the process. When the parameters are time varying, it is desirable to base the identification on the most recent data rather than on old data that is no longer representative of the process. This can be achieved by exponential discounting of old data, using the criterion

V (t) = (1/t) Σi=1..t λ^(t−i)[y(i)− xT (i)θ]2

where 0 < λ ≤ 1 is called the forgetting factor.


Tracking Time-Varying Parameters

The new criterion can also be written

V (t) = λV (t− 1) + [y(t)− xT (t)θ]2

Then, it can be shown (Goodwin and Payne, 1977) that the RLS scheme becomes

θ(t + 1) = θ(t) + K(t + 1)[y(t + 1)− xT (t + 1)θ(t)]

K(t + 1) = P (t)x(t + 1)/[λ + xT (t + 1)P (t)x(t + 1)]

P (t + 1) = (1/λ)( P (t)− P (t)x(t + 1)xT (t + 1)P (t)/[λ + xT (t + 1)P (t)x(t + 1)] )

In choosing λ, one has to compromise between fast tracking and the long-term quality of the estimates. The use of forgetting may give rise to problems.
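One such update as a NumPy sketch (the choice λ = 0.98 and the driver below, in which the true parameter jumps half-way through, are illustrative):

```python
import numpy as np

def rls_forget_step(y, x, theta, P, lam=0.98):
    """One RLS update with exponential forgetting factor lam, a sketch."""
    denom = lam + x @ P @ x
    K = P @ x / denom
    theta = theta + K * (y - x @ theta)
    P = (P - np.outer(P @ x, x @ P) / denom) / lam
    return theta, P
```

Run on y(t) = θ(t)x(t) where θ jumps from 1 to 2 after 200 samples, the estimate tracks the new value within a few time constants of 1/(1 − λ), instead of averaging the two regimes as plain RLS would.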


Tracking Time-Varying Parameters

The smaller λ is, the faster the algorithm can track, but the more the estimates will vary, even when the true parameters are time-invariant.

A small λ may also cause blowup of the covariance matrix P, since in the absence of excitation the covariance matrix update equation essentially becomes

P (t + 1) = (1/λ)P (t)

in which case P grows exponentially, leading to wild fluctuations in the parameter estimates.

One way around this is to vary the forgetting factor according to the prediction error ε, as in

λ(t) = 1− kε2(t)

Then, in case of low excitation, ε will be small and λ will be close to 1. In case of large prediction errors, λ will decrease.


Exponential Forgetting and Resetting Algorithm

The following scheme, due to Salgado, Goodwin and Middleton5, is recommended:

ε(t + 1) = y(t + 1)− xT (t + 1)θ(t)

θ(t + 1) = θ(t) + αP (t)x(t + 1)ε(t + 1)/[λ + xT (t + 1)P (t)x(t + 1)]

P (t + 1) = (1/λ)( P (t)− P (t)x(t + 1)xT (t + 1)P (t)/[λ + xT (t + 1)P (t)x(t + 1)] ) + βI − γP (t)2

where I is the identity matrix, and α, β and γ are constants.

5 M.E. Salgado, G.C. Goodwin, and R.H. Middleton, "Exponential Forgetting and Resetting", International Journal of Control, vol. 47, no. 2, pp. 477–485, 1988.
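A NumPy sketch of one EFRA update, with the constants used in the example on the next slide as defaults (the function name and the scalar tracking test are my own illustrations):

```python
import numpy as np

def efra_step(y, x, theta, P, alpha=0.5, beta=0.005, gamma=0.005, lam=0.95):
    """One EFRA update (after Salgado, Goodwin & Middleton, 1988), a sketch."""
    denom = lam + x @ P @ x
    eps = y - x @ theta                       # a-priori prediction error
    theta = theta + alpha * (P @ x) / denom * eps
    # forgetting term, plus the resetting terms beta*I - gamma*P^2
    P = (P - np.outer(P @ x, x @ P) / denom) / lam \
        + beta * np.eye(len(x)) - gamma * (P @ P)
    return theta, P
```

Unlike plain forgetting, the βI term keeps P from collapsing and the γP² term keeps it from blowing up, so the covariance remains bounded even without excitation.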


Exponential Forgetting and Resetting Algorithm

With the EFRA, the covariance matrix is bounded on both sides:

σminI ≤ P (t) ≤ σmaxI, ∀t

where

σmin ≈ β/(α− η)    σmax ≈ η/γ + β/η

with

η = (1− λ)/λ

With α = 0.5, β = γ = 0.005 and λ = 0.95, σmin ≈ 0.01 and σmax ≈ 10.
