Optimal solutions to geodetic inverse problems in statistical and numerical aspects

Jianqing Cai
Geodätisches Institut, Universität Stuttgart

Session 6: Theoretische Geodäsie
Geodätische Woche 2010, 05.–07. Oktober 2010, Messe Köln




Main Topics

1. Geodetic data analysis

2. Fixed effects (G-M, Ridge, BLE, α-weighted homBLE)

3. Regularizations

4. Mixed model (Combination method)

5. Conclusions and further studies

1. Geodetic data analysis

Numerical aspect: inverse and ill-posed problems
• Applied mathematicians are often more interested in existence, uniqueness and construction, given an infinite number of noise-free data, and in stability, given data contaminated by a deterministic disturbance.
• Approach: Tykhonov-Phillips regularization, with numerical methods such as the L-Curve (Hansen 1992) or the Cp-Plot (Mallows 1973).

Statistical aspect: estimation and inference
• The data are modeled as stochastic.
• Standard statistical concepts, questions and considerations, such as bias, variance, mean square error, identifiability, consistency, efficiency and various forms of optimality, can be applied.
• Ill-posedness is related to the rank defect of the model, etc.
• Biased and unbiased estimations.

Many of these methods were developed independently in separate disciplines. Some clear differences in terminology, philosophy and numerical implementation remain, due to tradition and a lack of interdisciplinary communication.

2. The special linear Gauss-Markov model

Special Gauss-Markov model: y = Aξ + e

1st moments: E{y} = Aξ, A ∈ R^(n×m), rk A = m   (1)

2nd moments: D{y} = Σ_y ∈ R^(n×n), Σ_y positive definite, rk Σ_y = n   (2)

Theorem (ξ̂ as Σ_y-BLUUE of ξ): Let ξ̂ = Ly be Σ_y-BLUUE of ξ in the special linear Gauss-Markov model (1), (2), i.e. e'Σ_y^-1 e = min. Then

ξ̂ = Ly = (A'Σ_y^-1 A)^-1 A'Σ_y^-1 y,

subject to the related dispersion matrix

D{ξ̂} = (A'Σ_y^-1 A)^-1,

and, since ξ̂ is unbiased,

MSE{ξ̂} := E{(ξ̂ − ξ)(ξ̂ − ξ)'} = D{ξ̂}.
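The BLUUE formulas of this slide can be sketched numerically. A minimal illustration with simulated data (the dimensions, noise level and true parameter vector are arbitrary choices, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated special linear Gauss-Markov model: y = A xi + e, D{y} = Sigma_y
n, m = 20, 3
A = rng.standard_normal((n, m))                 # full column rank a.s.
xi_true = np.array([1.0, -2.0, 0.5])
Sigma_y = 0.1 * np.eye(n)                       # positive definite, rk = n
e = rng.multivariate_normal(np.zeros(n), Sigma_y)
y = A @ xi_true + e

# BLUUE: xi_hat = (A' Sigma_y^-1 A)^-1 A' Sigma_y^-1 y
Si = np.linalg.inv(Sigma_y)
N = A.T @ Si @ A                                # normal matrix
xi_hat = np.linalg.solve(N, A.T @ Si @ y)

# Dispersion matrix D{xi_hat} = (A' Sigma_y^-1 A)^-1; unbiasedness gives MSE = D
D_xi = np.linalg.inv(N)
```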

Best Linear Estimation (BLE) (C. R. Rao, 1972)

For the G-M model (y, Aξ, Σ_y = σ²P^-1), let L'y be an estimator of ξ. The Mean Square Error (MSE) of L'y is

E(L'y − ξ)(L'y − ξ)' = σ² L'P^-1 L + (L'A − I_m) ξξ' (A'L − I_m)',

which consists of two parts, variance and bias. As it stands this is not a suitable criterion for minimization, since it involves the unknown ξ. Three possibilities:

• Choose an a priori value of σ^-1 ξ, say γ, and substitute σ²γγ' for ξξ';
• Consider ξ as a random variable with a priori mean dispersion E(ξξ') = σ²V;
• In the criterion

F = L'P^-1 L + (L'A − I_m) V (A'L − I_m)',

the choice of V, which is n.n.d. and of rank greater than one, represents the relative weight between variance and bias.

Minimizing F yields the BLE of ξ:

ξ̂ = V A'(A V A' + P^-1)^-1 y = (A'PA + V^-1)^-1 A'Py.

The open problem: evaluating the regularization parameter

Ever since Tykhonov (1963) and Phillips (1962) introduced the hybrid minimum norm approximation solution (HAPS) of a linear improperly posed problem, the evaluation of the regularization factor λ has remained an open problem.

In most applications of Tykhonov-Phillips type regularization the weighting factor λ is determined by heuristic methods, such as the L-Curve (Hansen 1992) or the Cp-Plot (Mallows 1973). In the literature, optimization techniques have also been applied.

α-weighted hybrid minimum variance-minimum bias estimation (hom α-BLE)

α-weighted S-homBLE and A-optimal design of the regularization parameter λ

According to Grafarend and Schaffrin (1993), updated by Cai (2004), a homogeneously linear α-weighted hybrid minimum variance-minimum bias estimation (α,S-homBLE) is based upon the weighted sum of two norms of type

||MSE{ξ̂}||²_(α,S) := tr D{Ly} + (1/α) tr[(I_m − LA) S (I_m − LA)'],

namely

tr L Σ_y L'  (the average variance)

and

(1/α) tr[(I_m − LA) S (I_m − LA)']  (the α-weighted, S-modified average bias).

The hybrid norm ||MSE{ξ̂}||²_(α,S) establishes the Lagrangean

L(L) := tr L Σ_y L' + (1/α) tr[(I_m − LA) S (I_m − LA)'] = min over L

for ξ̂ as α,S-homBLE of ξ (Theorem 1).

Theorem 1 (α,S-homBLE, also called: ridge estimator)

Linear Gauss-Markov model: E{y} = Aξ, A ∈ R^(n×m), rk A = m; D{y} = Σ_y ∈ R^(n×n), positive definite, rk Σ_y = n.

α,S-homBLE:

ξ̂ = (A'Σ_y^-1 A + α S^-1)^-1 A'Σ_y^-1 y   (if S^-1 exists)

dispersion matrix:

D{ξ̂} = (A'Σ_y^-1 A + α S^-1)^-1 A'Σ_y^-1 A (A'Σ_y^-1 A + α S^-1)^-1

bias vector:

β := E{ξ̂} − ξ = −[I_m − (A'Σ_y^-1 A + α S^-1)^-1 A'Σ_y^-1 A] ξ
  = −α (A'Σ_y^-1 A + α S^-1)^-1 S^-1 ξ

Mean Square Error matrix:

MSE{ξ̂} := E{(ξ̂ − ξ)(ξ̂ − ξ)'} = D{ξ̂} + ββ'
  = (A'Σ_y^-1 A + α S^-1)^-1 A'Σ_y^-1 A (A'Σ_y^-1 A + α S^-1)^-1
    + α² [(A'Σ_y^-1 A + α S^-1)^-1 S^-1] ξξ' [S^-1 (A'Σ_y^-1 A + α S^-1)^-1]
  = (A'Σ_y^-1 A + α S^-1)^-1 [A'Σ_y^-1 A + α² S^-1 ξξ' S^-1] (A'Σ_y^-1 A + α S^-1)^-1.

Figure 1. The relationship between the variance, the squared bias and the weighting factor α: the curves show tr MSE{ξ̂(α)}, Variance(ξ̂(α)) and Bias-Squared(ξ̂(α)), with the BLUUE limit and the optimum α_opt marked. The variance term decreases as α increases, while the squared bias increases with α.

The geodetic inverse problem:

Exact or strict multicollinearity means

|A'Σ_y^-1 A| = 0.

Weak multicollinearity means

|A'Σ_y^-1 A| ≈ 0.

Use the condition number for diagnostics:

k = (λ_max / λ_min)^(1/2).
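The condition-number diagnostic can be sketched as follows (hypothetical toy design matrices; Σ_y = I is assumed for brevity, so the normal matrix is simply A'A):

```python
import numpy as np

# Condition number k = sqrt(lambda_max / lambda_min) of the normal matrix.
A_good = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
A_bad = np.array([[1.0, 1.0], [1.0, 1.0001], [1.0, 0.9999]])  # near-collinear columns

def cond_number(A):
    lam = np.linalg.eigvalsh(A.T @ A)           # eigenvalues of the normal matrix
    return np.sqrt(lam.max() / lam.min())

k_good = cond_number(A_good)                    # well-conditioned problem
k_bad = cond_number(A_bad)                      # weak multicollinearity blows k up
```

A large k signals that |A'Σ_y^-1 A| ≈ 0 and that regularization is needed.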

The weighting factor α can alternatively be determined by an A-optimal design of type

(1) tr D{ξ̂} = min, or

(2) β'β = min, or

(3) tr MSE{ξ̂} = min.

Here we focus on the third case, the most meaningful one: minimize the trace of the Mean Square Error matrix, i.e. compute tr MSE{ξ̂} of ξ̂ as α-weighted S-homBLE and find

α̂ = arg{tr MSE{ξ̂} = min}.

Theorem 2 (A-optimal design of α)

Let the average hybrid α-weighted variance-bias norm tr MSE{ξ̂} of ξ̂ (α,S-homBLE) with respect to the linear Gauss-Markov model be given by

tr MSE{ξ̂} = tr (A'Σ_y^-1 A + α S^-1)^-1 A'Σ_y^-1 A (A'Σ_y^-1 A + α S^-1)^-1
           + α² ξ' S^-1 (A'Σ_y^-1 A + α S^-1)^-2 S^-1 ξ.

Then α̂ follows by A-optimal design in the sense of

tr MSE{ξ̂} = min

if and only if

α̂ = tr[A'Σ_y^-1 A (A'Σ_y^-1 A + α̂ S^-1)^-2 S^-1 (A'Σ_y^-1 A + α̂ S^-1)^-1]
    / [ξ' S^-1 (A'Σ_y^-1 A + α̂ S^-1)^-2 S^-1 (A'Σ_y^-1 A + α̂ S^-1)^-1 A'Σ_y^-1 A ξ].
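The A-optimal condition of Theorem 2 is an implicit (fixed-point) equation in α̂. A numerical sketch with simulated data: scan tr MSE{ξ̂} over a grid and check the fixed-point equation at the minimizer. The true ξ is assumed known (as in the theorem's criterion), S = I is assumed, and the fixed-point expression below is a reconstruction from the tr MSE formula, not taken verbatim from the slides:

```python
import numpy as np

rng = np.random.default_rng(2)

n, m = 30, 4
A = rng.standard_normal((n, m))
xi = np.array([1.0, -0.5, 0.8, 0.3])
N = A.T @ A / 0.1                    # A' Sigma_y^-1 A with Sigma_y = 0.1 I_n
Si = np.eye(m)                       # S^-1 with bias weight S = I_m

def tr_mse(a):
    Nai = np.linalg.inv(N + a * Si)
    beta = -a * Nai @ Si @ xi        # bias vector of the alpha,S-homBLE
    return np.trace(Nai @ N @ Nai) + beta @ beta

# Grid search for the A-optimal weight
grid = np.linspace(1e-4, 20.0, 4000)
vals = [tr_mse(a) for a in grid]
a_hat = grid[int(np.argmin(vals))]

# Right-hand side of the fixed-point equation, evaluated at a_hat
Nai = np.linalg.inv(N + a_hat * Si)
rhs = np.trace(N @ Nai @ Nai @ Si @ Nai) / (xi @ Si @ Nai @ Nai @ Si @ Nai @ N @ xi)
```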

Tykhonov-Phillips regularization:

Tykhonov-Phillips regularization is defined as the solution to the problem

min over ξ: ||Aξ − y||²_W + λ ||ξ||²,

which yields the normal equations system

ξ̂ = (A'WA + λI)^-1 A'Wy.

Ridge regression (Hoerl and Kennard, 1970a, b):

Ridge regression is defined as biased estimation for nonorthogonal problems with a Lagrangian function

minimize F = ξ̂'ξ̂ + (1/k)(ξ̂ − ξ̂_0)'A'A(ξ̂ − ξ̂_0),

or an equivalent statement

F_1 = (y − Aξ)'(y − Aξ) + k (ξ'ξ − R²),

which yields the normal equations system

ξ̂ = (A'A + k I_p)^-1 A'y = [I_p + k (A'A)^-1]^-1 ξ̂_0,

where ξ̂_0 is the least-squares estimate.
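The two normal-equation forms above coincide, which a short sketch can verify numerically (simulated data; W = I and the value of λ = k are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(3)

# Tykhonov-Phillips with W = I and lam = k equals the ridge estimator,
# and both equal the filtered least-squares form [I + k (A'A)^-1]^-1 xi_ls.
n, m, lam = 15, 3, 0.7
A = rng.standard_normal((n, m))
y = rng.standard_normal(n)
W = np.eye(n)

xi_tp = np.linalg.solve(A.T @ W @ A + lam * np.eye(m), A.T @ W @ y)
xi_ridge = np.linalg.solve(A.T @ A + lam * np.eye(m), A.T @ y)

xi_ls = np.linalg.solve(A.T @ A, A.T @ y)       # least-squares estimate xi_0
xi_filt = np.linalg.solve(np.eye(m) + lam * np.linalg.inv(A.T @ A), xi_ls)
```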

Generalized Tykhonov-Phillips regularization:

Generalized Tykhonov-Phillips regularization is defined as the solution to the problem

min over ξ: ||Aξ − y||²_W + λ ||ξ − ξ_0||²_R.

What is the solution of the generalized regularization? Is it simply

ξ̂ = (A'WA + λR)^-1 A'Wy ?

Rewrite the objective function as

min over ξ: ||A(ξ − ξ_0) − (y − Aξ_0)||²_W + λ ||ξ − ξ_0||²_R.

Then

ξ̂ − ξ_0 = (A'WA + λR)^-1 A'W (y − Aξ_0)
        = (A'WA + λR)^-1 A'Wy − [I − λ(A'WA + λR)^-1 R] ξ_0,

so that

ξ̂ = (A'WA + λR)^-1 A'Wy + λ (A'WA + λR)^-1 R ξ_0

yields the right solution

ξ̂ = (A'WA + λR)^-1 (A'Wy + λR ξ_0).
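The shift-and-solve derivation can be checked numerically in a few lines (simulated data; the positive definite W and R below are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(4)

# Generalized Tykhonov-Phillips: solving the shifted problem for xi - xi0
# must agree with the one-step form (A'WA + lam R)^-1 (A'Wy + lam R xi0).
n, m, lam = 15, 3, 0.4
A = rng.standard_normal((n, m))
y = rng.standard_normal(n)
xi0 = np.array([0.3, -0.1, 0.2])
W = np.diag(rng.uniform(0.5, 2.0, n))           # p.d. weight matrix
R = np.eye(m) + 0.1 * np.ones((m, m))           # p.d. regularization metric
Nlam = A.T @ W @ A + lam * R

# Solve for the increment xi - xi0, then undo the shift
dxi = np.linalg.solve(Nlam, A.T @ W @ (y - A @ xi0))
xi_shifted = xi0 + dxi

# Direct one-step form
xi_direct = np.linalg.solve(Nlam, A.T @ W @ y + lam * R @ xi0)
```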

Comparison of the determination of the regularization factor λ by A-optimal design and the ridge parameter k in ridge regression

Multiple linear regression model: y = Aξ + e, A ∈ R^(n×m), rk A = m, E{e} = 0, D{e} = σ²I_n, σ² unknown.

Hoerl, Kennard and Baldwin (1975) have suggested that if A'A = I_m, then a minimum mean square error (MSE) is obtained with the ridge parameter

k = m σ² / (ξ'ξ).

This is just the special case of our general solution by A-optimal design of Corollary 3 under unit weight P and A'A = I_m, yielding

k = σ² tr[A'A (A'A + k I_m)^-3] / [ξ' (A'A + k I_m)^-2 A'A (A'A + k I_m)^-1 ξ]
  = σ² tr[I_m (I_m + k I_m)^-3] / [ξ' (I_m + k I_m)^-3 ξ]
  = σ² m (1 + k)^-3 / [ξ'ξ (1 + k)^-3]
  = m σ² / (ξ'ξ).
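For the orthonormal case A'A = I_m the trace of the MSE has the closed form (m σ² + k² ξ'ξ)/(1 + k)², so the Hoerl-Kennard-Baldwin value can be verified by a grid scan (the design matrix, σ² and ξ below are arbitrary choices):

```python
import numpy as np

# k = m sigma^2 / (xi' xi) minimizes tr MSE when A'A = I_m.
m, sigma2 = 4, 0.25
A = np.linalg.qr(np.random.default_rng(5).standard_normal((12, m)))[0]  # A'A = I_m
xi = np.array([1.0, -0.5, 0.8, 0.3])

def tr_mse(k):
    # With A'A = I_m: variance = m sigma2 / (1+k)^2, bias^2 = k^2 xi'xi / (1+k)^2
    return (m * sigma2 + k**2 * (xi @ xi)) / (1.0 + k) ** 2

k_opt = m * sigma2 / (xi @ xi)                  # analytic optimum
grid = np.linspace(0.0, 5.0, 100001)
k_grid = grid[int(np.argmin(tr_mse(grid)))]     # numerical optimum
```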

3. Mixed model (Combination method)

Theil & Goldberger (1961) and Theil (1963), Toutenburg (1982) and Rao & Toutenburg (1999): mixed estimator with additional information as stochastic linear restrictions.

Linear Gauss-Markov model: y = Aξ + e, E{e} = 0, D{e} = Σ, A ∈ R^(n×m), rk A = m, rk Σ = n.

The additional information: ξ_P with E{ξ_P} = E{ξ̂} = ξ.

Stochastic linear restriction: ξ_P = I ξ + e_P, E{e_P} = 0, D{e_P} = Σ_P.

Mixed model:

[y; ξ_P] = [A; I] ξ + [e; e_P],  E{[e; e_P]} = 0,  D{[e; e_P]} = [Σ, 0; 0, Σ_P].

The BLUUE estimator of the original G-M model:

ξ̂_0 = (A'Σ^-1 A)^-1 A'Σ^-1 y.

The BLUUE estimator of the mixed model:

ξ̂(P) = (A'Σ^-1 A + Σ_P^-1)^-1 (A'Σ^-1 y + Σ_P^-1 ξ_P)
      = ξ̂_0 + (A'Σ^-1 A + Σ_P^-1)^-1 Σ_P^-1 (ξ_P − ξ̂_0).

The dispersion matrix:

Σ_ξ̂(P) = (A'Σ^-1 A + Σ_P^-1)^-1,

and

Σ_ξ̂_0 − Σ_ξ̂(P) = (A'Σ^-1 A)^-1 − (A'Σ^-1 A + Σ_P^-1)^-1
               = (A'Σ^-1 A)^-1 Σ_P^-1 (A'Σ^-1 A + Σ_P^-1)^-1 ≥ 0.

ξ̂(P): unbiased with smaller dispersion.

The use of stochastic restrictions leads to a gain in efficiency.
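A minimal sketch of the mixed (Theil-Goldberger) estimator with simulated data (dimensions, Σ, Σ_P and the true ξ are arbitrary choices), checking the update form and the dispersion gain:

```python
import numpy as np

rng = np.random.default_rng(6)

# G-M model y = A xi + e combined with the stochastic restriction xi_P = xi + e_P.
n, m = 25, 3
A = rng.standard_normal((n, m))
Sigma = 0.2 * np.eye(n)
Sigma_P = 0.5 * np.eye(m)
xi_true = np.array([2.0, -1.0, 0.5])
y = A @ xi_true + rng.multivariate_normal(np.zeros(n), Sigma)
xi_P = xi_true + rng.multivariate_normal(np.zeros(m), Sigma_P)

Si, SPi = np.linalg.inv(Sigma), np.linalg.inv(Sigma_P)
N = A.T @ Si @ A

xi0_hat = np.linalg.solve(N, A.T @ Si @ y)                    # BLUUE, G-M only
xi_mixed = np.linalg.solve(N + SPi, A.T @ Si @ y + SPi @ xi_P)

# Update form: xi0_hat + (N + Sigma_P^-1)^-1 Sigma_P^-1 (xi_P - xi0_hat)
xi_upd = xi0_hat + np.linalg.solve(N + SPi, SPi @ (xi_P - xi0_hat))

D_blue = np.linalg.inv(N)
D_mixed = np.linalg.inv(N + SPi)                              # smaller dispersion
```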

The light constraint solutions (Reigber, 1989)

Light constraint with a priori information: ξ_P = 0, i.e. ξ ~ (0, Σ_P): E{ξ_P} = 0, D{ξ_P} = Σ_P.

Light constraint model:

[y; 0] = [A; I] ξ + [e; e_P],  E{[e; e_P]} = 0,  D{[e; e_P]} = [Σ, 0; 0, Σ_P].

BLUUE estimator of the light constraint model:

ξ̂(L) = (A'Σ^-1 A + Σ_P^-1)^-1 (A'Σ^-1 y + Σ_P^-1 · 0) = (A'Σ^-1 A + Σ_P^-1)^-1 A'Σ^-1 y,

with the dispersion matrix:

Σ_ξ̂(L) = (A'Σ^-1 A + Σ_P^-1)^-1.

The objective function of the light constraint solution (with e_P = −ξ, since ξ_P = 0):

e'Σ^-1 e + e_P'Σ_P^-1 e_P = e'Σ^-1 e + ξ'Σ_P^-1 ξ = min.

The same as the objective function of the estimate with weighting parameters!

Difference between the estimators of the mixed model and the light constraint model:

ξ̂(P) − ξ̂(L) = (A'Σ^-1 A + Σ_P^-1)^-1 Σ_P^-1 ξ_P.
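The difference between the two estimators can be sketched numerically (simulated data; the nonzero ξ_P is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(7)

# Light-constraint solution (pseudo-observations xi_P = 0) versus the mixed
# estimator: both share the normal matrix N + Sigma_P^-1; the difference is
# the (N + Sigma_P^-1)^-1 Sigma_P^-1 xi_P term.
n, m = 25, 3
A = rng.standard_normal((n, m))
Si = np.linalg.inv(0.2 * np.eye(n))
SPi = np.linalg.inv(0.5 * np.eye(m))
y = rng.standard_normal(n)
xi_P = np.array([0.4, -0.2, 0.1])               # non-trivial a priori information

N = A.T @ Si @ A
xi_light = np.linalg.solve(N + SPi, A.T @ Si @ y)              # xi_P = 0 case
xi_mixed = np.linalg.solve(N + SPi, A.T @ Si @ y + SPi @ xi_P)

diff = xi_mixed - xi_light
diff_direct = np.linalg.solve(N + SPi, SPi @ xi_P)             # closed form
```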

Combination and Regularization methods

In order to solve the PGP by spectral domain stabilization, we use a priori information in terms of spherical harmonic coefficients. Augmenting the minimization of the squared residuals r = Aξ − y by a parameter component ξ − ξ_0,

min over ξ: ||Aξ − y||²_(Σ_y^-1) + λ ||ξ − ξ_0||²_R,

yields the normal equations system

ξ̂ = (A'Σ_y^-1 A + λR)^-1 (A'Σ_y^-1 y + λR ξ_0).

The parameter λ denotes the regularization parameter. It balances the residual norm ||Aξ − y|| against the (reduced) parameter norm ||ξ − ξ_0||.

This accounts for both regularization and combination, which is just the so-called generalized Tykhonov regularization in the case of α = 1!

• When ξ_0 = 0, i.e. the a priori information consists of null pseudo-observables, this is the case of regularization.
• Data combination in the spectral domain is achieved by incorporating non-trivial a priori information ξ_0 ≠ 0, yielding the mixed estimator with additional information as stochastic linear restrictions.

Biased and unbiased estimations in different aspects

Numerical analysis aspect (biased):

• Standard Tykhonov-Phillips regularization:
  min over ξ: ||Aξ − y||²_W + λ ||ξ||²_R,  ξ̂ = (A'WA + λR)^-1 A'Wy.

• Generalized Tykhonov-Phillips regularization:
  min over ξ: ||Aξ − y||²_W + λ ||ξ − ξ_0||²_R,  ξ̂ = (A'WA + λR)^-1 (A'Wy + λR ξ_0).

Statistical aspect:

• Ridge regression, BLE and light constraint solution (biased):
  ||MSE{ξ̂}||² := ||E{(ξ̂ − ξ)(ξ̂ − ξ)'}||² = min,  ξ̂ = (A'Σ_y^-1 A + V^-1)^-1 A'Σ_y^-1 y.

• BLUUE estimator of the mixed model (unbiased):
  e'Σ_y^-1 e + e_P'Σ_P^-1 e_P = min,  ξ̂(P) = (A'Σ_y^-1 A + Σ_P^-1)^-1 (A'Σ_y^-1 y + Σ_P^-1 ξ_P).

This answers the question of the relationship between these biased and unbiased solutions and estimators in numerical and statistical aspects.

4. Conclusions and further studies

Development of a rigorous approach to minimum MSE adjustment in a Gauss-Markov model, i.e. α-weighted S-homBLE;

Derivation of a new method of determining the optimal regularization parameter in uniform Tykhonov-Phillips regularization (α-weighted S-homBLE) by A-optimal design in the general case;

It was, therefore, possible to translate the previous results for the α-weighted S-homBLE to the case of Tykhonov-Phillips regularization with remarkable success;

The optimal ridge parameter k in ridge regression as developed by Hoerl and Kennard in the 1970s is just a special case of our general solution by A-optimal design;

This accounts for both regularization and combination, which is just the so-called generalized Tykhonov regularization!

In order to develop and promote the generality of inversion methods, it is necessary to study this kind of problem from the following aspects:

1) Statistical or deterministic regularization;

2) Ridge estimation;

3) Best linear estimation;

4) Mixed model;

5) Biased or unbiased estimations;

6) The criterion in the derivation of the inversion solution: the mean square error of the estimates,

MSE{ξ̂} := E{(ξ̂ − ξ)(ξ̂ − ξ)'} = D{ξ̂} + ββ'   (Gauss's second approach),

instead of Gauss's first approach,

e'Σ_y^-1 e = min;

7) Optimal solution.

Historical remark:

Laplace (1810) distinguishes between errors of observations and errors of estimates, and points out that a theory of estimation should be based on a measure of deviation between the estimate and the true value.

Gauss (1823) finally accepted Laplace's criticism and indicates that if he were to rewrite the 1809 proof (LS), he would use the expected mean square error as the optimality criterion.

This means that estimation theory should be based on minimization of the error of estimation!


Thank you!