summary estim

7/29/2019 Summary Estim


Estimation: Bias, Variance and Mean Square Error

Let $\theta$ denote the thing that we are trying to estimate.

Let $\hat{\theta}$ denote the result of an estimation based on one data set.

Each data set used will generate a different $\hat{\theta}$.

Bias, $b(\hat{\theta}) = E[\hat{\theta}] - \theta$ = difference between the true value and the average of all possible estimates.

Variance, $\sigma^2 = E[(\hat{\theta} - E[\hat{\theta}])^2]$ = measure of the spread of the estimates about the mean of all estimates.

Mean Square Error (m.s.e.) = $E[(\hat{\theta} - \theta)^2] = b^2 + \sigma^2$
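The decomposition of the mean square error into squared bias plus variance can be checked numerically. The following is a minimal sketch, assuming Gaussian data with unit variance and the sample mean as the estimator; the function names, the number of data sets and the sign convention (bias = average estimate minus true value) are illustrative choices, not taken from the slides.

```python
import random

def sample_mean(data):
    """Estimator under test: the arithmetic mean of one data set."""
    return sum(data) / len(data)

def estimator_stats(estimator, true_value, n_sets=2000, n_points=10, seed=0):
    """Monte Carlo bias, variance and m.s.e. of `estimator` over many data sets."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_sets):
        # Each data set generates a different estimate.
        data = [rng.gauss(true_value, 1.0) for _ in range(n_points)]
        estimates.append(estimator(data))
    mean_est = sum(estimates) / n_sets
    bias = mean_est - true_value
    variance = sum((e - mean_est) ** 2 for e in estimates) / n_sets
    mse = sum((e - true_value) ** 2 for e in estimates) / n_sets
    return bias, variance, mse

bias, variance, mse = estimator_stats(sample_mean, true_value=3.0)
print(bias, variance, mse, bias ** 2 + variance)
```

The last two printed numbers agree to rounding error, because the identity holds for the sample moments as well as for the expectations.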



Estimation: Some definitions

An estimate is consistent if, when we use more data to form the estimate, the mean square error is reduced.

If we have two ways of estimating the same thing, we say that the estimator that leads to the smaller mean square error is more efficient than the other estimator.

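[Figure: scatter of estimates (marked x) in the (a, b) plane, where $\theta$ = (a, b), showing the mean of all estimates, the true value, and the bias as the offset between them.]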



Examples we did in class

Bias and variance of an estimate of the mean.

Bias of the estimate of the variance of a set of measurements:

$\hat{\sigma}^2 = \frac{1}{N}\sum_{n=1}^{N}(X_n - \hat{\mu})^2$ (with $\hat{\mu}$ the estimate of the mean)

assuming that the measurements were independent random variables.

Two methods of estimating Rxx(τ) from T sec. of data.

1. Dividing by the integration time: T − |τ|

Estimation was unbiased but had very high variance, particularly when |τ| was close to T.

2. Dividing by total time: T

Estimation was biased (asymptotically unbiased). This was equivalent to multiplying the Method 1 estimate by a triangular window (T − |τ|)/T.

This window attenuated the high variance estimates.
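The relation between the two autocorrelation estimates can be illustrated in discrete time. This is a sketch under the assumption of N samples and integer lags k, so that the divisors T − |τ| and T become N − k and N; the function name and the test signal are hypothetical.

```python
import random

def autocorr_estimates(x, max_lag):
    """Return (unbiased, biased) autocorrelation estimates for lags 0..max_lag."""
    n = len(x)
    unbiased, biased = [], []
    for k in range(max_lag + 1):
        lagged_sum = sum(x[i] * x[i + k] for i in range(n - k))
        unbiased.append(lagged_sum / (n - k))  # Method 1: divide by N - k
        biased.append(lagged_sum / n)          # Method 2: divide by N
    return unbiased, biased

rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(200)]
r_unbiased, r_biased = autocorr_estimates(x, max_lag=199)

# Method 2 equals Method 1 times the triangular window (N - k)/N.
print(r_unbiased[199], r_biased[199])
```

At the largest lag the unbiased estimate is built from a single product, which is why its variance is so high; the triangular window scales that value down by 1/N.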



Power Spectral Density Estimation

Definition:

$S_{xx}(f) = \lim_{T\to\infty} E\left[\frac{X_T^* X_T}{T}\right] = \int_{-\infty}^{+\infty} R_{xx}(\tau)\, e^{-j2\pi f\tau}\, d\tau$

Estimation:

1. Could Fourier Transform the Autocorrelation function estimate (not computationally efficient).

2. Could use the frequency domain definition directly.

Raw Estimate = $\hat{S}_{xx}(f) = \frac{X_T^* X_T}{T}$

Extremely poor variance characteristics.

We showed that variance was $S_{xx}^2(f)$, unaffected by T, the length of data used.
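The claim that the variance of the raw estimate does not improve with record length can be checked with a small simulation. The sketch below assumes unit-variance Gaussian white noise sampled at fs = 1 (so the true two-sided PSD is 1), a rectangular window, and a single mid-band frequency bin; all names are illustrative.

```python
import cmath
import math
import random

def raw_psd_at_bin(x, k, fs=1.0):
    """Raw PSD estimate |X_T|^2 / T at frequency k*fs/N (rectangular window)."""
    n = len(x)
    dt = 1.0 / fs
    xk = sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
    return abs(xk * dt) ** 2 / (n * dt)

def relative_spread(n, trials=500, seed=2):
    """Standard deviation divided by mean of the raw estimate over many records."""
    rng = random.Random(seed)
    vals = []
    for _ in range(trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        vals.append(raw_psd_at_bin(x, k=n // 4))
    mean = sum(vals) / trials
    var = sum((v - mean) ** 2 for v in vals) / trials
    return math.sqrt(var) / mean

spread_short = relative_spread(64)
spread_long = relative_spread(256)
print(spread_short, spread_long)
```

Both ratios come out close to 1, i.e. the standard deviation of the raw estimate is about as large as the quantity being estimated, whether the record has 64 or 256 points.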



Power Spectral Density Estimation (Continued)

Smoothed estimate from segment averaging.

1. Break signal up into NSEG segments, Tr seconds long.
2. For each segment:
   1. Apply a window to smooth the transition at the ends of segments.
   2. Fourier Transform the windowed segment to get XTr(f).
   3. Calculate a raw power spectral density estimate: |XTr|²/Tr.
3. Average the results from each segment to get the smoothed estimate:

$\tilde{S}_{xx}(f) = \frac{1}{NSEG}\sum_{i=1}^{NSEG} \hat{S}_{xx,i}(f)$

[Figure: time history x(t) versus time, divided into segments of length Tr, each multiplied by a window w(t).]
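The segment-averaging steps can be sketched as follows. This is an illustrative implementation, not the code used in class: it assumes non-overlapping segments, a Hann window, a correction for the power lost to the window (the `win_power` factor), and a direct O(N²) DFT so that it needs only the standard library.

```python
import cmath
import math
import random

def dft(x):
    """Direct discrete Fourier transform (slow, but dependency-free)."""
    n = len(x)
    return [sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
            for k in range(n)]

def smoothed_psd(x, fs, seg_len):
    """Two-sided PSD from non-overlapping, Hann-windowed segment averaging."""
    dt = 1.0 / fs
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / seg_len) for i in range(seg_len)]
    win_power = sum(w * w for w in window) / seg_len  # mean-square of the window
    nseg = len(x) // seg_len
    psd = [0.0] * seg_len
    for s in range(nseg):
        seg = x[s * seg_len:(s + 1) * seg_len]
        xk = dft([w * v for w, v in zip(window, seg)])
        for k in range(seg_len):
            raw = abs(xk[k] * dt) ** 2 / (seg_len * dt)  # |X_Tr|^2 / Tr
            psd[k] += raw / (win_power * nseg)           # average over segments
    return psd, nseg

rng = random.Random(3)
x = [rng.gauss(0.0, 1.0) for _ in range(32 * 64)]
psd, nseg = smoothed_psd(x, fs=1.0, seg_len=32)
mean_level = sum(psd) / len(psd)
print(nseg, mean_level)
```

For unit-variance white noise at fs = 1 the average level of the estimate should be near 1, with the scatter between frequency bins shrinking as more segments are averaged.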



Power Spectral Density (PSD) Estimation (Continued)

We argued that the distribution of the smoothed PSD was related to that of a Chi-squared random variable ($\chi^2_\nu$) with $\nu$ = 2.NSEG degrees of freedom, if Tr was large enough so we could ignore bias errors.

Therefore:

$\mathrm{Variance}\left[\frac{2 \cdot NSEG \cdot \tilde{S}_{xx}}{S_{xx}}\right] = \mathrm{Variance}[\chi^2_\nu] = 2\nu = 2(2 \cdot NSEG) = 4 \cdot NSEG$

and rearranging we showed that:

$\mathrm{Variance}[\tilde{S}_{xx}] = \frac{S_{xx}^2}{NSEG}$

Therefore, we can control variance by averaging more segments.

Note: shorter segments mean larger bias, so for a fixed T seconds of data, there is a trade-off between Segment Length (Tr), which controls the bias, and Number of Segments (NSEG), which controls the variance: T = Tr.NSEG.
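The trade-off for a fixed record length can be tabulated directly from T = Tr.NSEG and Variance = Sxx²/NSEG. The helper below is a hypothetical sketch; it assumes non-overlapping segments and takes the frequency resolution to be 1/Tr.

```python
import math

def segment_tradeoff(total_time, segment_time):
    """Resolution and normalized random error for non-overlapping segments."""
    nseg = int(total_time // segment_time)   # T = Tr * NSEG
    resolution_hz = 1.0 / segment_time       # finer resolution means lower bias
    normalized_std = 1.0 / math.sqrt(nseg)   # sqrt(Variance) / Sxx = 1/sqrt(NSEG)
    dof = 2 * nseg                           # chi-squared degrees of freedom
    return nseg, resolution_hz, normalized_std, dof

for tr in (1.0, 4.0, 16.0):
    print(tr, segment_tradeoff(64.0, tr))
```

With 64 seconds of data, going from Tr = 1 s to Tr = 16 s improves the resolution from 1 Hz to 0.0625 Hz but raises the normalized standard deviation from 12.5% to 50%.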



Cross Spectral Density (CSD) Definition

$S_{xy}(f) = \lim_{T\to\infty} E\left[\frac{X_T^* Y_T}{T}\right] = \int_{-\infty}^{+\infty} R_{xy}(\tau)\, e^{-j2\pi f\tau}\, d\tau$

Estimation

1. Could Fourier Transform the Cross-correlation function estimate (not computationally efficient).
2. Could use the frequency domain definition directly.

Raw Estimate = $\hat{S}_{xy}(f) = \frac{X_T^* Y_T}{T}$

As with PSD, this has extremely poor variance characteristics, so
1. divide the time histories into segments,
2. generate a raw estimate from each segment, and
3. average to reduce variance and produce a smoothed estimate.



Issues with Cross Spectral Density Estimates

1. Reduce bias by choosing the segment length (Tr) as large as possible. (Bias greatest where the phase changes rapidly.)

2. Reduce variance by averaging many segments.

3. Might require a large amount of averaging to reduce noise effects:

y(t) = h(t)*x(t) + n(t), with x and n uncorrelated and h the impulse response
Sxy(f) = H(f).Sxx(f) + Sxn(f)

To make Sxn(f) → 0 may require a lot of averaging if the signal-to-noise ratio on the output [SNRy = |H|²Sxx/Snn] is low.

4. Time delays between x and y cause problems if the time delay (to) is greater than a small fraction of the segment length (Tr). (See class notes.)



Coherence Function Estimation: Substitute in Smoothed Estimates of Spectral Densities

Coherence takes values in the range 0 to 1.

Definition: $\gamma_{xy}^2 = \frac{|S_{xy}|^2}{S_{xx} S_{yy}}$; Estimate: $\tilde{\gamma}_{xy}^2 = \frac{|\tilde{S}_{xy}|^2}{\tilde{S}_{xx} \tilde{S}_{yy}}$

Note that substituting in raw spectral density estimates results in an estimate of 1. (See class notes for proof.) A result where the coherence = 1 at all frequencies from measured signals should be treated with a high degree of suspicion.

Estimate highly sensitive to bias in spectral density estimates, particularly bad where the phase of the cross spectral density changes rapidly (at maxima and minima in |Sxy|).

COHERENCE → 0 because of:
NOISE
NONLINEARITY
BIAS ERRORS IN ESTIMATION
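The warning about raw estimates can be demonstrated with two completely unrelated signals. The sketch below assumes independent white-noise records, no window, and non-overlapping segments; the common scale factors in the spectral densities cancel in the ratio and are omitted. Names are illustrative.

```python
import cmath
import math
import random

def dft(x):
    """Direct discrete Fourier transform (slow, but dependency-free)."""
    n = len(x)
    return [sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
            for k in range(n)]

def coherence(x, y, seg_len):
    """Coherence estimate from spectra averaged over non-overlapping segments."""
    nseg = len(x) // seg_len
    sxx = [0.0] * seg_len
    syy = [0.0] * seg_len
    sxy = [0j] * seg_len
    for s in range(nseg):
        xs = dft(x[s * seg_len:(s + 1) * seg_len])
        ys = dft(y[s * seg_len:(s + 1) * seg_len])
        for k in range(seg_len):
            sxx[k] += abs(xs[k]) ** 2
            syy[k] += abs(ys[k]) ** 2
            sxy[k] += xs[k].conjugate() * ys[k]
    return [abs(sxy[k]) ** 2 / (sxx[k] * syy[k]) for k in range(seg_len)]

rng = random.Random(4)
x = [rng.gauss(0.0, 1.0) for _ in range(256)]
y = [rng.gauss(0.0, 1.0) for _ in range(256)]  # independent of x

coh_raw = coherence(x, y, seg_len=256)  # one segment: raw estimates
coh_avg = coherence(x, y, seg_len=16)   # 16 segments averaged
print(min(coh_raw), max(coh_raw), sum(coh_avg) / len(coh_avg))
```

With a single segment the estimate is 1 at every frequency even though the signals are unrelated; with 16 segments averaged it drops to a small value.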



Example: System with Some Nonlinearities (cubic stiffness) and Noisy Measurements

[Figure not recovered. Annotations on the figure:]
Nonlinear mode.
Nonlinearity causes spread of energy here, around 3x and 5x this frequency.
Poor SNR on output causing this.
Nonlinearity causes broad dips in the coherence function. If you drive the system harder these regions become wider.
Dip due to bias errors.



Example: Linear System with Noisy Measurements

[Figure not recovered. Panels: High SNR, Tr = 512/fs; High SNR, Tr = 2048/fs; Low SNR on output, Tr = 512/fs. Annotations on the figure:]
Dips mainly due to bias, and thus get smaller as resolution increases.
SNRy also affecting coherence here.
Less averaging compared to N = 512 case: fewer segments → greater variance.
Dip filled in with noise.
Bias greatest where phase change is fastest.



H1(f) and H2(f): Effects of Noise

If the system is linear and there is no noise:
H(f) = Sxy(f)/Sxx(f) = Syy(f)/Syx(f)

Case with Noise: Assuming here that spectral density estimation errors are small (Tr and NSEG large). Subscripts xm and ym denote the measured input and output; nx and ny denote the noise on each.

H1 = Sxmym/Sxmxm = [Sxy(f)/Sxx(f)]/[1 + Snxnx/Sxx]
   = H(f)/[1 + Snxnx/Sxx]

Noise on the input adversely affects this estimate of H: |H1| < |H|.

H2 = Symym/S*xmym
   = [Syy(f)/S*xy(f)].[1 + Snyny/Syy]
   = H(f).[1 + Snyny/Syy]

Noise on the output adversely affects this estimate of H: |H2| > |H|.
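The two bias factors are easy to evaluate at a single frequency. The function below is a small sketch of the expressions for H1 and H2, assuming the noise spectra are known numbers and the noise is uncorrelated with the signals; the numerical values are made up for illustration.

```python
def h1_h2(h, sxx, s_nx, s_ny):
    """Limiting values of H1 and H2 with input noise s_nx and output noise s_ny."""
    syy = abs(h) ** 2 * sxx        # noise-free output PSD for a linear system
    h1 = h / (1.0 + s_nx / sxx)    # biased low by noise on the input
    h2 = h * (1.0 + s_ny / syy)    # biased high by noise on the output
    return h1, h2

h = 2.0 - 1.0j
h1, h2 = h1_h2(h, sxx=1.0, s_nx=0.1, s_ny=0.5)
print(abs(h1), abs(h), abs(h2))
```

Both estimates keep the phase of H, since the bias factors are real; only the magnitude is affected, and the true |H| is bracketed by |H1| and |H2|.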



Estimation of H

Note that, e.g., $E[H_1] \neq E[\tilde{S}_{xy}]/E[\tilde{S}_{xx}]$

$E[H_1] = E[\tilde{S}_{xy}/\tilde{S}_{xx}]$

Frequency response function estimates (both H1 and H2) are extremely sensitive to bias errors, which are worse at peaks and troughs.

Require large segment sizes to overcome bias, but this means fewer segments to average, thus higher variance.

Note: A low coherence function does not necessarily imply a poor frequency response function estimate. If the only reason the coherence function is low is noise on the response (input), then the H1 (H2) frequency response estimate should be accurate, provided sufficient averaging was done to reduce the variance of the estimates.



Some notes

1. Systems with feedback and noise may result in erroneous frequency response function estimates.

The problem is caused by the noise n(t) affecting both the output and the input signals, so the input and output noise are correlated: you cannot set Sxn(f) to zero in this example.

Sometimes, in these cases, we use a third signal, r(t), that is highly correlated to the noise-free output and uncorrelated with the noise.

Estimate of H = Sry/Srx.

A good choice for r(t) here would be d(t).

[Figure: block diagram of a feedback loop containing H(f) and G(f), with signals d(t), x(t) and y(t), additive noise n(t), and a summing junction with + and − inputs; d and n uncorrelated.]



Some notes

2. Multi-input systems with correlated inputs may cause problems in frequency response function estimation.

y(t) = h(t)*x1(t) + g(t)*x2(t); x1(t) = u(t) + n(t); x2(t) = u(t) + m(t)
Sx1y = H.Sx1x1 + G.Sx1x2 and you can show that Sx1x2 = Suu ≠ 0.

So, Sx1y/Sx1x1 = H + G.(Suu/[Suu + Snn]) ≠ H

Similarly: Sx2y/Sx2x2 = H.(Suu/[Suu + Smm]) + G ≠ G

Can sometimes use PARTIAL COHERENCE techniques to overcome this problem.

[Figure: block diagram in which u(t) plus n(t) gives x1(t), the input to H(f), and u(t) plus m(t) gives x2(t), the input to G(f); the outputs of H(f) and G(f) sum to give y(t). u, m and n uncorrelated.]



Partial Spectra and Coherence

We just touched on this in class. Basically we try to sequentially remove the influence of correlated components in a series of signals. If you are interested in more details, see Bendat and Piersol, Random Data: Analysis and Measurement Procedures, John Wiley.

Applied to the previous example, we would try to remove the effect of correlated inputs by modifying x2(t) and y(t), to generate x21(t) and y1(t), and then examine Sx21y1/Sx21x21:
x21(t) = x2(t) - l12(t)*x1(t), where L12(f) = Sx1x2/Sx1x1
y1(t) = y(t) - l1y(t)*x1(t), where L1y(f) = Sx1y/Sx1x1 = [H + G.L12]

For the previous example you can show that:

Sx21y1 = G.[Smm + L12.Snn], where L12 = Suu/[Suu + Snn]
Sx21x21 = [Smm + L12.Snn]
Therefore Sx21y1/Sx21x21 = G.

We can also generate a partial coherence function from these modified signals:

$\gamma^2_{x2 \cdot x1,\, y \cdot x1} = |S_{x21y1}|^2 / [S_{x21x21}\, S_{y1y1}]$
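The algebra for the two-input example can be verified numerically at a single frequency. The sketch below works with the true spectra rather than estimates, assumes u, n and m are mutually uncorrelated so that Sx1x2 = Suu is real, and uses made-up values for H, G and the spectral levels.

```python
def two_input_estimates(h, g, suu, snn, smm):
    """Naive and conditioned frequency response estimates at one frequency."""
    sx1x1 = suu + snn
    sx2x2 = suu + smm
    sx1x2 = suu                    # only u(t) is common to x1 and x2
    sx1y = h * sx1x1 + g * sx1x2
    sx2y = h * sx1x2 + g * sx2x2
    naive_h = sx1y / sx1x1         # contaminated by G
    naive_g = sx2y / sx2x2         # contaminated by H
    # Conditioned spectra: remove the part linearly related to x1.
    l12 = sx1x2 / sx1x1
    s_x21_y1 = sx2y - l12 * sx1y
    s_x21_x21 = sx2x2 - l12 * sx1x2
    partial_g = s_x21_y1 / s_x21_x21
    return naive_h, naive_g, partial_g, s_x21_x21

h, g = 1.0 + 2.0j, 0.5 - 1.0j
naive_h, naive_g, partial_g, s_x21_x21 = two_input_estimates(
    h, g, suu=2.0, snn=1.0, smm=0.5)
print(naive_h, naive_g, partial_g)
```

The conditioned ratio returns G, while both naive ratios are offset by the leakage terms; the conditioned input spectrum equals Smm + L12.Snn.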



Sines + Noise

The power spectral density of a sinusoid of amplitude A and frequency f1 is:

$\frac{A^2}{4}\delta(f - f_1) + \frac{A^2}{4}\delta(f + f_1)$

But by using windows Tr seconds long, the delta functions become sinc functions with maximum height affected by window size Tr.

If Tr is too small the sinc functions will be buried in the noise. But as Tr is increased the sinc functions begin to emerge from the noise.

So if you suspect a peak in your spectrum is due to a sine wave, increase the window size (frequency resolution) and see if the peak gets larger, as you would expect if it were truly a sine wave.
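The growth of a sine peak with window length can be checked without any noise. The sketch below assumes a rectangular window, a sine that completes a whole number of cycles in the segment (so the peak falls exactly on a frequency bin), and fs = 1; under those assumptions the raw two-sided PSD at the peak is A².Tr/4.

```python
import cmath
import math

def raw_psd_peak(amplitude, cycles, n, fs=1.0):
    """Raw PSD |X_Tr|^2 / Tr at the bin of a sine with `cycles` periods in n samples."""
    dt = 1.0 / fs
    x = [amplitude * math.sin(2 * math.pi * cycles * i / n) for i in range(n)]
    xk = sum(x[i] * cmath.exp(-2j * math.pi * cycles * i / n) for i in range(n))
    return abs(xk * dt) ** 2 / (n * dt)

peak_short = raw_psd_peak(1.0, cycles=8, n=128)
peak_long = raw_psd_peak(1.0, cycles=16, n=256)  # same frequency, Tr doubled
print(peak_short, peak_long)
```

Doubling Tr doubles the peak height, whereas the level of a broadband noise floor would stay the same; that difference is what makes the sine emerge.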



Sines in noise

[Figure: PSD (V²/Hz) versus Frequency (Hertz) for Tr = N samples, with N = 4,096, N = 8,192 and N = 16,384; the sinc function emerges from the noise as Tr increases.]

Variation in estimated PSD due to lack of averaging. Tr larger → NSEG smaller, larger variance.



Calibration of PSD in MatLab

The units should be (signal units)²/Hz.

The PSD you see is one-half of a two-sided PSD.

Even though you give the PSD program the sampling rate, the output is incorrectly scaled by a factor of fs.

To get total power of the signal you should be able to integrate the estimated power spectral density and get the mean square value of the time history (Parseval's Theorem). However, if you calculate

$\left\{P(0) + P(fs/2) + 2\sum_{k=1}^{(N/2)-1} P(k \cdot fs/N)\right\}$ (frequency resolution)

where N is the number of points in a segment and P(f) is the estimate of Sxx(f), it is off by a factor of fs.

You need to divide by fs to get the correct result.
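The Parseval check can be written out for a correctly scaled estimate. This sketch does not call MatLab's PSD routine; it assumes a single segment with a rectangular window, an even number of points N, and a two-sided PSD scaled as |Xk|²/(N.fs), and then applies the summation over 0 to fs/2 described above.

```python
import cmath
import math
import random

def two_sided_psd(x, fs):
    """Correctly scaled two-sided PSD values P(k*fs/N) for k = 0 .. N/2."""
    n = len(x)
    out = []
    for k in range(n // 2 + 1):
        xk = sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        out.append(abs(xk) ** 2 / (n * fs))
    return out

def total_power(p, fs, n):
    """{P(0) + P(fs/2) + 2*sum of P(k*fs/N), k = 1..N/2-1} * (frequency resolution)."""
    df = fs / n
    return (p[0] + p[n // 2] + 2.0 * sum(p[1:n // 2])) * df

rng = random.Random(5)
fs, n = 100.0, 64
x = [rng.gauss(0.0, 2.0) for _ in range(n)]
p = two_sided_psd(x, fs)
power_from_psd = total_power(p, fs, n)
mean_square = sum(v * v for v in x) / n
print(power_from_psd, mean_square)
```

If the division by fs inside `two_sided_psd` is left out, the integrated power comes out too large by exactly a factor of fs, which is a quick way to detect a calibration problem.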



Calibration (continued)

Power Spectral Density:Recall that for fs/2



Calibration Continued: Energy Spectral Density

We sometimes have segments that contain a single transient (tap testing of structures) and we average the raw spectra from each segment to remove noise effects.

If we choose different Tr, i.e., allow a shorter or longer time between successive transients (the transient should have died away in the segment), the PSD will change because of the divide by Tr.

To overcome this problem we estimate an Energy Spectral Density (ESD), i.e., remove the divide by Tr in the raw PSD estimate.

Raw ESD estimate = |XTr(f)|² ≈ Δ².|Xk|² (Volts/Hz)², where Δ = 1/fs is the sampling interval

[You also need to be careful with window choice here so as not to distort the transient.]

[Figure: transient time history versus time (s), with the segment length Tr marked.]



Calibration Continued: Power Spectrum

Power Spectrum

Segment averaging is often applied to periodic signals that are corrupted with noise. As resolution increases the noise floor in the power spectrum decreases.

Recall: ck = Xk/N, if you synchronize, don't alias and there is no noise.

Raw PWR estimate = |Xk|²/N²
= Raw PSD estimate . (frequency resolution)
= (Δ.|Xk|²/N).(fs/N) Volts², where Δ = 1/fs is the sampling interval

So effectively at each point in the PSD you have integrated the power spectral density over a bin of width fs/N Hertz.
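The power spectrum scaling can be checked on a noise-free, synchronized cosine. The sketch below assumes an integer number of cycles in the segment and a rectangular window; the amplitude and lengths are arbitrary illustrative values.

```python
import cmath
import math

def power_spectrum(x):
    """Raw power spectrum |Xk|^2 / N^2, i.e. |ck|^2 for a synchronized periodic signal."""
    n = len(x)
    return [abs(sum(x[i] * cmath.exp(-2j * math.pi * k * i / n)
                    for i in range(n))) ** 2 / n ** 2
            for k in range(n)]

amplitude, n, cycles = 3.0, 64, 5
x = [amplitude * math.cos(2 * math.pi * cycles * i / n) for i in range(n)]
pwr = power_spectrum(x)
print(pwr[cycles], pwr[n - cycles], sum(pwr))
```

Each of the two lines (positive and negative frequency) carries (A/2)² = 2.25, and the sum over all bins is the mean square value A²/2 = 4.5, in Volts² rather than Volts²/Hz.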
