TRANSCRIPT
Recall: Dot product on R2:
u · v = (u1, u2) · (v1, v2) = u1v1 + u2v2,
u · u = u1² + u2² = ||u||².
Geometric Meaning:
u · v = ||u|| ||v|| cos θ.
[Figure: vectors u and v drawn from a common point, with angle θ between them.]
Reason: The side opposite θ is given by u − v.
||u − v||² = (u − v) · (u − v)
= u · u − v · u − u · v + v · v = ||u||² + ||v||² − 2 u · v.
By the cosine law: c² = a² + b² − 2ab cos θ, i.e.
||u − v||² = ||u||² + ||v||² − 2||u|| ||v|| cos θ.
Comparing the two equalities, we get:
u · v = ||u|| ||v|| cos θ.
Inner Product → Generalization of dot product.
Direct generalization to Rn:
u · v := u1v1 + . . . + unvn = ∑_{i=1}^n ui vi.
Using matrix notation:
∑_{i=1}^n ui vi = [u1 . . . un] (v1, . . . , vn)ᵀ = uᵀv = vᵀu.
This is called the (standard) inner product on Rn.
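Extra: a quick numerical check of the equivalent forms above, using NumPy (the sample vectors are arbitrary, not from the lecture):

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, -1.0, 2.0])

# Three equivalent forms of the standard inner product on R^n:
print(np.dot(u, v))   # sum of u_i v_i  ->  8.0
print(u @ v)          # u^T v
print(v @ u)          # v^T u (same value, by symmetry)
```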
Thm 1 (P.359): Let u,v,w ∈ Rn and c ∈ R. Then:
(i) u · v = v · u;
(ii) (u + v) · w = u · w + v · w;
(iii) (cu) · v = c(u · v);
(iv) u · u ≥ 0, and u · u = 0 iff u = 0.
Note: (iv) is sometimes called the “positive-definite” property.
• A general inner product is defined using the above 4 properties.
• For a complex inner product, we need to add a complex conjugate to (i).
Def: The length (or norm) of v is defined as:
||v|| := √(v · v).
Vectors with ||v|| = 1 are called unit vectors.
Def: The distance between u,v is defined as:
dist(u,v) := ||u − v||.
Def: The angle between u,v is defined as:
∠(u,v) := cos⁻¹( (u · v) / (||u|| · ||v||) ).
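Extra: a minimal NumPy sketch of these three definitions (the sample vectors u, v are arbitrary):

```python
import numpy as np

u = np.array([3.0, 4.0])
v = np.array([1.0, 0.0])

print(np.linalg.norm(u))        # ||u|| = sqrt(u . u) = 5.0
print(np.linalg.norm(u - v))    # dist(u, v) = ||u - v||
cos_theta = (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))
print(np.arccos(cos_theta))     # angle(u, v) in radians: arccos(0.6) ~ 0.927
```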
Extra: The General Inner Product Space
Let V be a vector space over R or C.
Def: An inner product on V is a real/complex-valued function of two vector variables <u,v> such that:
(a) <u,v> = \overline{<v,u>}; (conjugate symmetric)
(b) <u + v,w> = <u,w> + <v,w>;
(c) <cu,v> = c <u,v>; (linear in the first vector variable)
(d) <u,u> ≥ 0, and <u,u> = 0 iff u = 0. (positive-definite property)
Def: A real/complex vector space V equipped with an inner product is called an inner product space.
Note: (i) An inner product is conjugate linear in the second vector variable:
<u, cv1 + dv2> = \overline{c} <u,v1> + \overline{d} <u,v2>.
(ii) If we replace (a) by <u,v> “=” <v,u> (without the conjugate), consider:
<iu, iu> “=” i² <u,u> = − <u,u>,
which is incompatible with (d).
• When working with a complex inner product space, we must take the complex conjugate when interchanging u, v.
Examples of (general) inner product spaces:
1. The dot product on Cn: (∗: conjugate transpose)
<u,v> := u1\overline{v1} + . . . + un\overline{vn} = v∗u.
2. A non-standard inner product on R2:
<u,v> := u1v1 − u1v2 − u2v1 + 2u2v2 = vᵀ [1 −1; −1 2] u.
3. An inner product on the matrix space Mm×n:
<A,B> := tr(B∗A) = ∑_{j=1}^m ∑_{k=1}^n ajk \overline{bjk}.
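Extra: a NumPy check of Examples 1 and 3 (the sample values are arbitrary; note that np.vdot conjugates its first argument, which matches <u,v> = v∗u):

```python
import numpy as np

# Complex dot product <u, v> = v* u  (v*: conjugate transpose)
u = np.array([1 + 1j, 2.0])
v = np.array([1j, 1 - 1j])
print(np.vdot(v, u))          # vdot conjugates its FIRST argument
print(np.conj(v) @ u)         # same value: sum of u_k conj(v_k)

# Trace inner product on matrices: <A, B> = tr(B* A) = sum a_jk conj(b_jk)
A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])
print(np.trace(B.conj().T @ A))   # 5.0
print(np.sum(A * np.conj(B)))     # elementwise formula, also 5.0
```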
4. Consider the vector space V of continuous real/complex-valued functions defined on the interval [a, b]. Then the following is an inner product on V:
<f, g> := (1/(b − a)) ∫_a^b f(t) \overline{g(t)} dt.
[In the real case, the norm ||f|| will give the “root-mean-square” (r.m.s.) of the area bounded by the curve of f and the t-axis over the interval [a, b].]
Schwarz’s inequality:
(a1b1 + . . . + anbn)² ≤ (a1² + . . . + an²)(b1² + . . . + bn²).
Pf: Since the left side is a sum of squares, the following equation cannot have two distinct real solutions:
(a1x + b1)² + . . . + (anx + bn)² = 0,
i.e. (a1² + . . . + an²)x² + 2(a1b1 + . . . + anbn)x + (b1² + . . . + bn²) = 0.
So ∆ ≤ 0, and this gives the inequality.
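Extra: for n = 2 the discriminant is exactly −4(a1b2 − a2b1)² ≤ 0 (Lagrange's identity); a small SymPy verification, not part of the lecture:

```python
import sympy as sp

a1, a2, b1, b2 = sp.symbols('a1 a2 b1 b2', real=True)

# Coefficients of (a1 x + b1)^2 + (a2 x + b2)^2 = A x^2 + B x + C
A = a1**2 + a2**2
B = 2*(a1*b1 + a2*b2)
C = b1**2 + b2**2

# Discriminant equals -4 (a1 b2 - a2 b1)^2, hence <= 0
print(sp.simplify(B**2 - 4*A*C + 4*(a1*b2 - a2*b1)**2))  # prints 0
```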
The Cauchy-Schwarz Inequality:
|u · v| ≤ ||u|| · ||v||,
and equality holds if, and only if, {u,v} is l.d.
Proof: The case u = 0 is trivial. When u ≠ 0, we may replace u by the unit vector (1/||u||)u.
Consider w = v − (v · u)u.
[Figure: v decomposed as (v · u)u along u plus w, with w ⊥ u.]
Obviously, w · w = ||w||² ≥ 0 ⇒ Cauchy-Schwarz Inequality.
Set k = v · u:
0 ≤ (v − ku) · (v − ku)
= v · v − 2k(v · u) + k²(u · u) = ||v||² − k².
Note that, in terms of the original u, k = (v · u)/||u||, so:
k² = (v · u)²/||u||² ≤ ||v||² ⇒ (u · v)² ≤ ||u||² · ||v||².
Taking positive square roots, we obtain the result.
Thm: (Triangle Inequality) For u,v ∈ Rn:
||u+ v|| ≤ ||u||+ ||v||,
and equality holds iff one of the vectors is a non-negative scalar multiple of the other.
Proof: Consider ||u + v||².
(u + v) · (u + v) = ||u||² + 2(u · v) + ||v||²
≤ ||u||² + 2||u|| · ||v|| + ||v||² (by Cauchy-Schwarz)
= (||u|| + ||v||)².
Taking square roots, we obtain the inequality.
Orthogonality: Pythagoras’ Theorem in vector form:
[Figure: right triangle with perpendicular sides u, v and hypotenuse u + v.]
||u + v||² = ||u||² + ||v||².
But in general we have:
||u + v||² = ||u||² + 2(u · v) + ||v||²,
so we need u · v = 0.
Def: Let u, v be two vectors in Rn. When u · v = 0, we say that u is orthogonal to v, denoted by u ⊥ v.
• This generalizes the concept of perpendicularity.
• 0 is the only vector that is orthogonal to every vector v in Rn.
Example: In R2, we have: (3, 4) ⊥ (4, −3).
Thm 2 (P.362): u and v are orthogonal iff
||u + v||² = ||u||² + ||v||².
Common Orthogonality:
Def: Let S be a set of vectors in Rn. If u is orthogonal to every vector in S, we will say “u is orthogonal to S”, denoted by u ⊥ S.
i.e. We can regard u as a “common perpendicular” to S.
Examples: (i) 0 ⊥ Rn.
(ii) In R2, let S = x-axis. Then e2 ⊥ S.
(iii) In R3, let S = x-axis. Then both e2 ⊥ S and e3 ⊥ S.
Exercise: Let u,v ⊥ S. Show that:
(i) (au+ bv) ⊥ S for any numbers a, b; (ii) u ⊥ SpanS.
***
Orthogonal Complement:
Def: Let S be a set of vectors in Rn. We define:
S⊥ := {u ∈ Rn | u ⊥ S},
called the orthogonal complement of S in Rn.
i.e. S⊥ collects all the “common perpendiculars” to S.
Examples: (i) {0}⊥ = Rn, (Rn)⊥ = {0}.
(ii) In R2, let S = x-axis. Then S⊥ = y-axis.
(iii) In R3, take S = {e1}. Then S⊥ = yz-plane.
Thm: S⊥ is always a subspace of Rn.
Checking: (i) 0 ⊥ v for every v ∈ S. So 0 ∈ S⊥.
(ii) Pick any u1,u2 ∈ S⊥. For any scalars a, b ∈ R, consider:
(au1 + bu2) · v = a(u1 · v) + b(u2 · v) = a · 0 + b · 0 = 0,
whenever v ∈ S. So au1 + bu2 ∈ S⊥ (cf. previous exercise).
Note: S itself need not be a subspace.
Thm: (a) S⊥ = (SpanS)⊥. (b) SpanS ⊆ (S⊥)⊥.
Pf: (a) S⊥ ⊇ (SpanS)⊥ is easy to see, since any vector u ⊥ SpanS must also satisfy u ⊥ S.
Now, pick any u ∈ S⊥. For every v ∈ SpanS, write:
v = c1v1 + . . .+ cpvp, vi ∈ S, i = 1, . . . , p.
Then since u ⊥ S:
u · v = c1(u · v1) + . . .+ cp(u · vp) = 0,
and hence u ∈ (SpanS)⊥, so “S⊥ ⊆ (SpanS)⊥” is proved.
(b) Pick a vector w ∈ SpanS; write it as a l.c.:
w = c1v1 + . . .+ cpvp.
For any u ∈ S⊥:
w · u = c1(v1 · u) + . . .+ cp(vp · u) = 0 ⇒ w ∈ (S⊥)⊥.
Thm 3 (P.363): Let A be an m× n matrix. Then:
(RowA)⊥ = NulA and (ColA)⊥ = NulAᵀ.
Pf: Let r1, . . . , rm be the rows of A. The product Ax can be rewritten as:
Ax = [r1; . . . ; rm] x = (r1 · x, . . . , rm · x)ᵀ.
So x ∈ NulA ⇔ x ∈ {r1ᵀ, . . . , rmᵀ}⊥ ⇔ x ∈ (RowA)⊥.
Hence (RowA)⊥ = NulA. Applying the result to Aᵀ, we obtain:
(ColA)⊥ = (RowAᵀ)⊥ = NulAᵀ.
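Extra: a numerical illustration of Thm 3 (the matrix A is an arbitrary sample; scipy.linalg.null_space returns an orthonormal basis of NulA):

```python
import numpy as np
from scipy.linalg import null_space

A = np.array([[1.0, 1.0, -1.0, -1.0],
              [1.0, 2.0,  3.0,  4.0]])

N = null_space(A)              # columns form an orthonormal basis of Nul A
print(np.allclose(A @ N, 0))   # True: every row of A is orthogonal to Nul A
```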
Orthogonal sets and Orthonormal sets
Def: A set S is called orthogonal if any two vectors in S are always orthogonal to each other.
Def: A set S is called orthonormal if (i) S is orthogonal, and (ii) each vector in S is of unit length.
Example: Orthonormal set:
{ (1/√11)(3, 1, 1), (1/√6)(−1, 2, 1), (1/√66)(−1, −4, 7) }.
Thm 4 (P.366): An orthogonal set S of non-zero vectors is always linearly independent.
Pf: Let S = {u1,u2, . . . ,up} and consider the relation:
c1u1 + c2u2 + . . .+ cpup = 0.
Take inner product with u1, then:
c1(u1 · u1) + c2(u1 · u2) + . . . + cp(u1 · up) = 0 · u1,
c1||u1||² + c2 · 0 + . . . + cp · 0 = 0.
As ||u1|| ≠ 0, we must have c1 = 0. Similarly for the other ci.
So S must be l.i.
The method of proof of previous Thm 4 gives:
Thm 5 (P.367): Let S = {u1, . . . ,up} be an orthogonal set of non-zero vectors and let v ∈ SpanS. Then:
v = ((v · u1)/||u1||²) u1 + . . . + ((v · up)/||up||²) up.
Pf: Let c1, . . . , cp be such that v = c1u1 + . . . + cpup. Taking the inner product with u1, we have:
v · u1 = c1(u1 · u1) + . . . + cp(up · u1) = c1||u1||².
So c1 = (v · u1)/||u1||². Similarly for the other ci.
Thm 5′: Let S = {u1, . . . ,up} be an orthonormal set. Then for any v ∈ SpanS, we have:
v = (v · u1)u1 + . . .+ (v · up)up.
Remark: This generalizes our familiar expression in R3:
v = (v · i)i+ (v · j)j+ (v · k)k.
Example: Express v as a l.c. of the vectors in S:
v = (1, 2, 3), S = {(3, 1, 1), (−1, 2, 1), (−1, −4, 7)}.
New method: Compute c1, c2, c3 directly:
c1 = ((1, 2, 3) · (3, 1, 1)) / ||(3, 1, 1)||²,
c2 = ((1, 2, 3) · (−1, 2, 1)) / ||(−1, 2, 1)||²,
c3 = ((1, 2, 3) · (−1, −4, 7)) / ||(−1, −4, 7)||²
⇒ c1 = 8/11, c2 = 6/6 = 1, c3 = 12/66 = 2/11.
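Extra: the same coefficients checked in NumPy, using the vectors from this example:

```python
import numpy as np

v  = np.array([1.0, 2.0, 3.0])
us = [np.array([3.0, 1.0, 1.0]),
      np.array([-1.0, 2.0, 1.0]),
      np.array([-1.0, -4.0, 7.0])]

cs = [(v @ u) / (u @ u) for u in us]    # c_i = (v . u_i) / ||u_i||^2
print(cs)                               # [0.7272... = 8/11, 1.0, 0.1818... = 2/11]
print(np.allclose(sum(c * u for c, u in zip(cs, us)), v))   # True
```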
Exercise: Determine if v ∈ Span {u1,u2}.
v = (3, 2, −5), u1 = (1, 2, 2), u2 = (−2, 2, −1).
***
Orthogonal basis and Orthonormal basis
Def: A basis for a subspace W is called an orthogonal basis if it is an orthogonal set.
Def: A basis for a subspace W is called an orthonormal basis if it is an orthonormal set.
Examples: (i) {e1, . . . , en} is an orthonormal basis for Rn.
(ii) S = {(3, 4), (4, −3)} is an orthogonal basis for R2.
S′ = {(3/5, 4/5), (4/5, −3/5)} is an orthonormal basis for R2.
(iii) The following set S:
S = {(3, 1, 1), (−1, 2, 1), (−1, −4, 7)}
is an orthogonal basis for R3.
(iv) The columns of an n × n orthogonal matrix A will form an orthonormal basis for Rn.
Orthogonal matrix: a square matrix with AᵀA = In.
Checking: Write A = [v1 . . . vn].
(i, j)-th entry of AᵀA = viᵀvj = vi · vj.
(i, j)-th entry of In = 1 if i = j, 0 if i ≠ j.
The above checking also works for a non-square matrix:
Thm 6 (P.371): The n columns of an m × n matrix U are orthonormal iff UᵀU = In.
But for square matrices: AB = I ⇒ BA = I. So:
(iv)′ The rows of an n × n orthogonal matrix A (written in column form) also form an orthonormal basis for Rn.
Matrices having orthonormal columns are very special:
Thm 7 (P.371): Let T : Rn → Rm be a linear transformation given by an m × n standard matrix U with orthonormal columns. Then for any x,y ∈ Rn:
a. ||Ux|| = ||x|| (preserving length)
b. (Ux) · (Uy) = x · y (preserving inner product)
c. (Ux) · (Uy) = 0 iff x · y = 0 (preserving orthogonality)
Pf: Direct verifications using UᵀU = In.
These results are not true for merely orthogonal (non-unit) columns.
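Extra: a NumPy check of Thm 6 and Thm 7, building U from two columns of the earlier orthonormal set (the test vectors x, y are arbitrary):

```python
import numpy as np

# Columns taken from the orthonormal set on an earlier slide
U = np.column_stack([np.array([3.0, 1.0, 1.0]) / np.sqrt(11),
                     np.array([-1.0, 2.0, 1.0]) / np.sqrt(6)])

print(np.allclose(U.T @ U, np.eye(2)))   # True: orthonormal columns (Thm 6)

x = np.array([1.0, -2.0])                # arbitrary test vectors
y = np.array([0.5, 3.0])
print(np.isclose(np.linalg.norm(U @ x), np.linalg.norm(x)))   # a. preserves length
print(np.isclose((U @ x) @ (U @ y), x @ y))                   # b. preserves inner product
```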
Recall:
Let S = {u1, . . . ,up} be orthogonal. When v ∈ W = SpanS, we have:
v = ((v · u1)/||u1||²) u1 + . . . + ((v · up)/||up||²) up.
What happens if v ∉ W?
• LHS ≠ RHS, as RHS is always a vector in W.
• v′ = RHS is still computable.
What is the relation between v and v′?
LHS = v, RHS = v′ = ∑_{i=1}^p ((v · ui)/||ui||²) ui.
Take the inner product of the RHS with uj:
v′ · uj = ( ∑_{i=1}^p ((v · ui)/||ui||²) ui ) · uj
= ∑_{i=1}^p ((v · ui)/||ui||²)(ui · uj) = ((v · uj)/||uj||²)(uj · uj) = v · uj,
which is the same as LHS · uj.
In other words, (v − v′) · uj = 0 for j = 1, . . . , p.
Thm: The vector z = v − v′ is orthogonal to every vector in SpanS, i.e. z ∈ (SpanS)⊥.
[Figure: v above the subspace W; its projection v′ lies in W and z = v − v′ is perpendicular to W.]
Def: Let {u1, . . . ,up} be an orthogonal basis for W. For each v in Rn, the following vector in W:
projWv := ((v · u1)/||u1||²) u1 + . . . + ((v · up)/||up||²) up,
is called the orthogonal projection of v onto W .
Remark: {u1, . . . ,up} must be orthogonal, otherwise the RHS will not give us the correct vector v′.
Note: v = projWv ⇔ v ∈ W.
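Extra: a minimal NumPy sketch of this projection formula, assuming an orthogonal basis is given (the basis reuses the orthogonal pair (3,1,1), (−1,2,1) from earlier slides; v is an arbitrary sample):

```python
import numpy as np

def proj(v, basis):
    # Orthogonal projection onto W = Span(basis);
    # `basis` must be orthogonal (see the Remark above).
    return sum(((v @ u) / (u @ u)) * u for u in basis)

u1 = np.array([3.0, 1.0, 1.0])
u2 = np.array([-1.0, 2.0, 1.0])
v  = np.array([1.0, 2.0, 3.0])

p = proj(v, [u1, u2])
z = v - p
print(p)
print(np.isclose(z @ u1, 0), np.isclose(z @ u2, 0))   # z is orthogonal to W
```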
Example: In R3, consider S = {e1, e2}. Then W = SpanS is the xy-plane. For any vector v ∈ R3:
[Figure: v = (x, y, z) projected onto the xy-plane; the components ((v · e1)/||e1||²) e1 and ((v · e2)/||e2||²) e2 sum to projWv.]
projWv = (x, y, 0).
Exercise: Consider R3 and W = Span {u1,u2}. Find projWv:
v = (1, 0, 1), u1 = (2, −2, 1), u2 = (2, 1, −2).
***
Def: The decomposition:
v = projWv + (v − projWv), (v − projWv) ∈ W⊥,
is called the orthogonal decomposition of v w.r.t. W .
[Figure: v decomposed as w = projWv in W plus z = v − projWv in W⊥.]
Thm 8 (P.376): Orthogonal decomposition w.r.t. W is the unique way to write v = w + z with w ∈ W and z ∈ W⊥.
Exercise: Find the orthogonal projection of v onto W = NulA.
A = [1 1 −1 −1], v = (1, 2, 3, 4).
***
Thm 9 (P.378): Let v ∈ Rn and let w ∈ W . Then we have:
||v − projWv|| ≤ ||v −w||,
and equality holds only when w = projWv.
Pf: We can rewrite v −w as:
v −w = (v − projWv) + (projWv −w).
[Figure: right-angled triangle with vertices v, projWv and w (both in W); legs v − projWv and projWv − w, hypotenuse v − w.]
We can apply “Pythagoras' Theorem” to the right-angled triangle.
||v − w||² = ||v − projWv||² + ||projWv − w||²
≥ ||v − projWv||²,
and equality holds iff ||projWv −w|| = 0 iff w = projWv.
Because of the inequality:
||v − projWv|| ≤ ||v −w||,
projWv is sometimes called the best approximation of v by vectors in W.
Def: The distance of v to W is defined as:
dist(v,W ) := ||v − projWv||.
Obviously, v ∈ W iff dist(v,W ) = 0.
Exercise: Let W = Span {u1,u2,u3}. Find dist(v,W ):
u1 = (1, −1, 1, −1), u2 = (1, 1, −1, −1), u3 = (1, 1, 1, 1) and v = (2, 4, 6, 4).
Sol: Remember to check that {u1,u2,u3} is orthogonal.
***
Extension of Orthogonal Set
Let S = {u1, . . . ,up} be an orthogonal basis for W = SpanS. When W ≠ Rn, we can find a vector v ∉ W, and then:
z = v − projWv ≠ 0.
This vector z is in W⊥, i.e. it will satisfy:
z · w = 0 for every w ∈ W.
Hence the following set will again be orthogonal:
S ∪ {z} = {u1, . . . ,up, z}.
Thm: Span(S ∪ {v}) = Span(S ∪ {z}).
In other words, we can extend an orthogonal set S by adding the vector z.
• S1 = {u1} orthogonal, v2 ∉ SpanS1, then compute z2.
⇒ S2 = {u1, z2} is again orthogonal, and Span {u1,v2} = Span {u1, z2}.
• S2 = {u1,u2} orthogonal, v3 ∉ SpanS2, compute z3.
⇒ S3 = {u1,u2, z3} is again orthogonal, and Span {u1,u2,v3} = Span {u1,u2, z3}.
...
This is called the Gram-Schmidt orthogonalization process.
Thm 11 (P.383): Let {x1, . . . ,xp} be l.i. Define u1 = x1 and:
u2 = x2 − ((x2 · u1)/||u1||²) u1,
u3 = x3 − ((x3 · u2)/||u2||²) u2 − ((x3 · u1)/||u1||²) u1,
...
up = xp − ∑_{i=1}^{p−1} ((xp · ui)/||ui||²) ui.
Then {u1, . . . ,up} will be orthogonal and for 1 ≤ k ≤ p:
Span {x1, . . . ,xk} = Span {u1, . . . ,uk}.
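Extra: a minimal NumPy sketch of the process, assuming the input vectors are l.i.; run on the example below, it reproduces the same ui:

```python
import numpy as np

def gram_schmidt(xs):
    # u_k = x_k minus its projection onto the earlier u's (Thm 11)
    us = []
    for x in xs:
        u = x - sum(((x @ w) / (w @ w)) * w for w in us)
        us.append(u)
    return us

xs = [np.array([1.0, 1.0, 0.0]),
      np.array([2.0, 0.0, -1.0]),
      np.array([1.0, 1.0, 1.0])]
for u in gram_schmidt(xs):
    print(u)   # (1, 1, 0), (1, -1, -1), (1/3, -1/3, 2/3)
```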
Notes: (i) We must use {ui} to compute projWk xk+1, since the formula:
projWk xk+1 = ∑_{i=1}^k ((xk+1 · ui)/||ui||²) ui,
is only valid for an orthogonal set {ui}.
(ii) If we obtain uk = 0 for some k, i.e. xk = projWk−1 xk, we have:
xk ∈ Span {x1, . . . ,xk−1},
so {x1, . . . ,xk} will be l.d. instead.
(iii) All the ui will be non-zero vectors as {xi} is l.i.
Example: Apply the Gram-Schmidt Process to {x1,x2,x3}:
x1 = (1, 1, 0), x2 = (2, 0, −1), x3 = (1, 1, 1).
Solution: Take u1 = x1. Then:
u2 = x2 − ((x2 · u1)/||u1||²) u1 = x2 − (2/2) u1 = (1, −1, −1),
u3 = x3 − ((x3 · u2)/||u2||²) u2 − ((x3 · u1)/||u1||²) u1 = x3 − (−1/3) u2 − (2/2) u1 = (1/3, −1/3, 2/3).
Example: Apply the Gram-Schmidt Process to {x1,x3,x2}:
x1 = (1, 1, 0), x3 = (1, 1, 1), x2 = (2, 0, −1).
Solution: Take u′1 = x1. Then:
u′2 = x3 − ((x3 · u′1)/||u′1||²) u′1 = x3 − (2/2) u′1 = (0, 0, 1),
u′3 = x2 − ((x2 · u′2)/||u′2||²) u′2 − ((x2 · u′1)/||u′1||²) u′1 = x2 − (−1/1) u′2 − (2/2) u′1 = (1, −1, 0).
Exercise: Find an orthogonal basis for ColA:
A = [1 3 1 2; 3 4 −2 1; 1 1 1 1; 1 2 2 2].
Sol: First find a basis for ColA (e.g. pivot columns of A).
Then apply Gram-Schmidt Process.
***
Approximation Problems: Solve Ax = b.
Due to the presence of errors, a consistent system may appear as an inconsistent system:
x1 + x2 = 1
x1 − x2 = 0
2x1 + 2x2 = 2
→
x1 + x2 = 1.01
x1 − x2 = 0.01
2x1 + 2x2 = 2.01
Also in practice, exact solutions are usually not necessary.
• How to obtain a “good” approximate “solution” for the above inconsistent system?
Least squares solution: How to measure the “goodness” of x0 as an approximate solution to the system:
Ax = b?
• Minimize the difference ||x− x0||
Problem: But x is unknown...
Another way of approximation:
x0 ≈ x “ ⇒ ” Ax0 ≈ Ax = b.
Analysis: Find x0 such that:
Ax0 = b0,
and b0 is as close to b as possible.
• b0 must be in ColA.
• ||b − b0||² is a sum of squares → least squares solution.
Best approximation property of orthogonal projection:
||b− projWb|| ≤ ||b−w|| for every w in W = ColA.
Should take b0 = projWb.
Example: Find the least squares solution of the inconsistent system:
x1 + x2 = 1.01
x1 − x2 = 0.01
2x1 + 2x2 = 2.01
To compute projWb, we need an orthogonal basis for W = ColA first.
A basis for ColA is: {(1, 1, 2), (1, −1, 2)}.
Then by the Gram-Schmidt Process, we get an orthogonal basis for W = ColA:
{(1, 1, 2), (1, −1, 2)} −→ {(1, 1, 2), (1, −5, 2)}.
Compute b0 = projWb:
b0 = (((1.01, 0.01, 2.01) · (1, 1, 2)) / ||(1, 1, 2)||²) (1, 1, 2) + (((1.01, 0.01, 2.01) · (1, −5, 2)) / ||(1, −5, 2)||²) (1, −5, 2).
Hence:
b0 = (1.006, 0.01, 2.012).
Since b0 ∈ ColA, the system Ax0 = b0 must be consistent.
Solving Ax0 = b0:
[1 1 | 1.006; 1 −1 | 0.01; 2 2 | 2.012] → [1 0 | 0.508; 0 1 | 0.498; 0 0 | 0].
Thus we have the following least squares solution:
x0 = (0.508, 0.498).
But we have the following result:
(ColA)⊥ = NulAᵀ.
Then, since we take b0 = projColAb:
(b − b0) ∈ (ColA)⊥ ⇔ (b − b0) ∈ NulAᵀ
⇔ Aᵀ(b − b0) = 0
⇔ Aᵀb0 = Aᵀb.
So, if x0 is a least squares solution (i.e. Ax0 = b0), we have:
Aᵀ(Ax0) = Aᵀb.
The above is usually called the normal equation of Ax = b.
Thm 13 (P.389): The least squares solutions of Ax = b are the solutions of the normal equation AᵀAx = Aᵀb.
In the following case, the least squares solution will be unique:
Thm 14 (P.391): Let A be an m × n matrix with rankA = n. Then the n × n matrix AᵀA is invertible.
Example: Find again the least squares solution:
x1 + x2 = 1.01
x1 − x2 = 0.01
2x1 + 2x2 = 2.01
Solution: Solve the normal equation. Compute:
AᵀA = [1 1 2; 1 −1 2] [1 1; 1 −1; 2 2] = [6 4; 4 6],
Aᵀb = [1 1 2; 1 −1 2] (1.01, 0.01, 2.01)ᵀ = (5.04, 5.02)ᵀ.
So the normal equation is:
[6 4; 4 6] (x1, x2)ᵀ = (5.04, 5.02)ᵀ ⇒ (x1, x2)ᵀ = (0.508, 0.498)ᵀ.
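Extra: the computation can be checked in NumPy; solving the normal equation and calling the built-in least squares routine give the same answer:

```python
import numpy as np

A = np.array([[1.0,  1.0],
              [1.0, -1.0],
              [2.0,  2.0]])
b = np.array([1.01, 0.01, 2.01])

x0 = np.linalg.solve(A.T @ A, A.T @ b)        # normal equation A^T A x = A^T b
print(x0)                                     # [0.508 0.498]
print(np.linalg.lstsq(A, b, rcond=None)[0])   # built-in least squares: same answer
```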
Least Squares Problems
Linear Regression: Fitting data (xi, yi) with a straight line.
[Figure: data points in the xy-plane with a fitted straight line; red vertical intervals mark the errors.]
To “minimize” the differences indicated by the red intervals.
When a straight line y = c + mx can pass through all the points, it will of course “best fit” the data. This requires:
c + mx1 = y1
...
c + mxn = yn
↔ [1 x1; . . . ; 1 xn] (c, m)ᵀ = (y1, . . . , yn)ᵀ
being consistent.
But in general the above system Ax = b is inconsistent.
Measurement of closeness: square sum of y-distances.
|y1 − (mx1 + c)|² + . . . + |yn − (mxn + c)|².
Note that this is expressed as ||b − b0||², where:
b = (y1, . . . , yn), b0 = (c + mx1, . . . , c + mxn).
b0 ∈ ColA since Ax = b0 is consistent.
→ Use normal equation!
Example: Find a straight line that best fits the points:
(2, 1), (5, 2), (7, 3), (8, 3),
in the sense of minimizing the square-sum of y-distances.
Sol: The (inconsistent) system is:
[1 2; 1 5; 1 7; 1 8] (c, m)ᵀ = (1, 2, 3, 3)ᵀ.
We are going to find its least squares solution.
Compute:
AᵀA = [1 1 1 1; 2 5 7 8] [1 2; 1 5; 1 7; 1 8] = [4 22; 22 142],
Aᵀb = [1 1 1 1; 2 5 7 8] (1, 2, 3, 3)ᵀ = (9, 57)ᵀ.
So the normal equation is:
[4 22; 22 142] (c, m)ᵀ = (9, 57)ᵀ,
which has a unique solution of (2/7, 5/14).
The “best fit” straight line will be:
y = 2/7 + (5/14)x.
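Extra: a NumPy check of this line fit via the normal equation:

```python
import numpy as np

xs = np.array([2.0, 5.0, 7.0, 8.0])
ys = np.array([1.0, 2.0, 3.0, 3.0])

A = np.column_stack([np.ones_like(xs), xs])   # columns: 1 and x
c, m = np.linalg.solve(A.T @ A, A.T @ ys)     # normal equation
print(c, m)   # 0.28571... = 2/7 and 0.35714... = 5/14
```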
Polynomial Curve Fitting:
Example: Find a polynomial curve of degree at most 2 which best fits the following data:
(2, 1), (5, 2), (7, 3), (8, 3),
in the sense of least squares.
Sol: Consider the general form of the fitting curve:
y = a0 · 1 + a1 · x + a2 · x².
The curve cannot pass through all the 4 points, as:
a0 · 1 + a1 · 2 + a2 · 2² = 1
a0 · 1 + a1 · 5 + a2 · 5² = 2
a0 · 1 + a1 · 7 + a2 · 7² = 3
a0 · 1 + a1 · 8 + a2 · 8² = 3
is inconsistent.
Again, use normal equation.
The corresponding normal equation AᵀAx = Aᵀb is:
[4 22 142; 22 142 988; 142 988 7138] (a0, a1, a2)ᵀ = (9, 57, 393)ᵀ,
which has a unique solution of (19/132, 19/44, −1/132).
So the best fitting polynomial is:
y = 19/132 + (19/44)x − (1/132)x².
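Extra: NumPy's polyfit solves the same least squares problem (it returns coefficients from the highest power down):

```python
import numpy as np

xs = np.array([2.0, 5.0, 7.0, 8.0])
ys = np.array([1.0, 2.0, 3.0, 3.0])

a2, a1, a0 = np.polyfit(xs, ys, 2)   # highest power first: [a2, a1, a0]
print(a0, a1, a2)   # 0.14393... = 19/132, 0.43181... = 19/44, -0.00757... = -1/132
```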
General Curve Fitting:
Example: Find a curve in the form c0 + c1 sin x + c2 sin 2x which best fits the following data:
(π/6, 1), (π/4, 2), (π/3, 3), (π/2, 3),
in the sense of least squares.
Sol: Let y = c0 · 1 + c1 · sin x + c2 · sin 2x. The system
c0 · 1 + c1 · sin(π/6) + c2 · sin(2π/6) = 1
c0 · 1 + c1 · sin(π/4) + c2 · sin(2π/4) = 2
c0 · 1 + c1 · sin(π/3) + c2 · sin(2π/3) = 3
c0 · 1 + c1 · sin(π/2) + c2 · sin(2π/2) = 3
is inconsistent.
Solving AᵀAx = Aᵀb . . .
c0 = (184 − 39√2 − 89√3 + 9√6) / (78 − 18√2 − 38√3 + 6√6) ≈ −2.29169,
c1 = (9 + 3√2 − 7√3 − 2√6) / (−39 + 9√2 + 19√3 − 3√6) ≈ 5.31308,
c2 = (8 + 9√2 − 10√3 − 6√6) / (78 − 18√2 − 38√3 + 6√6) ≈ 0.673095.
So the best fitting function is:
(−2.29169) + (5.31308) sin x + (0.673095) sin 2x.
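Extra: the decimal values can be reproduced with a sine design matrix and NumPy's least squares routine:

```python
import numpy as np

xs = np.array([np.pi/6, np.pi/4, np.pi/3, np.pi/2])
ys = np.array([1.0, 2.0, 3.0, 3.0])

# Design matrix for y = c0 + c1 sin x + c2 sin 2x
A = np.column_stack([np.ones_like(xs), np.sin(xs), np.sin(2*xs)])
print(np.linalg.lstsq(A, ys, rcond=None)[0])   # approx [-2.29169  5.31308  0.673095]
```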
Extra: Continuous Curve Fitting
Find g(x) best fitting a given f(x).
[Figure: a curve f(x) and an approximating curve g(x) over an interval.]
Try to minimize the difference (area) between two curves.
• To minimize the “root-mean-square” (r.m.s.) of the area between the two curves:
√( (1/(b − a)) ∫_a^b |f(x) − g(x)|² dx ).
• Given by the following inner product:
<f, g> = (1/(b − a)) ∫_a^b f(x) \overline{g(x)} dx.
• Not in Rn, not the standard inner product... No normal equation.
• But we can use orthogonal projection.
Recall: Formula of orthogonal projection in general:
projWy = ∑_{i=1}^p (<y,ui> / <ui,ui>) ui,
where {u1, . . . ,up} is an orthogonal basis of W.
Example: Fit f(x) = x over [0, 1] by a l.c. of
S = {1, sin 2πkx, cos 2πkx; k = 1, 2, . . . , n}.
Sol: S is orthogonal under the inner product:
<f, g> = ∫_0^1 f(x)g(x) dx.
(direct checking)
So compute those “<y,ui>”:
<f(x), 1> = 1/2,
<f(x), sin 2πkx> = −1/(2πk), <f(x), cos 2πkx> = 0.
We also need those “<ui,ui>”:
<1, 1> = 1,
<sin 2πkx, sin 2πkx> = 1/2, <cos 2πkx, cos 2πkx> = 1/2.
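Extra: these integrals can be verified symbolically with SymPy (assuming k is a positive integer, as in S):

```python
import sympy as sp

x = sp.symbols('x')
k = sp.symbols('k', integer=True, positive=True)

# <f, g> = integral over [0, 1]  (here b - a = 1)
ip = lambda f, g: sp.integrate(f * g, (x, 0, 1))

print(ip(x, 1))                                       # 1/2
print(sp.simplify(ip(x, sp.sin(2*sp.pi*k*x))))        # -1/(2*pi*k)
print(sp.simplify(ip(x, sp.cos(2*sp.pi*k*x))))        # 0
print(ip(sp.sin(2*sp.pi*k*x), sp.sin(2*sp.pi*k*x)))   # 1/2
```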
So the best fitting curve is g(x) = projW f(x):
g(x) = 1/2 − (1/π)( (sin 2πx)/1 + (sin 4πx)/2 + . . . + (sin 2nπx)/n ).
When n = 5:
[Plot: f(x) = x and its best r.m.s. approximation g(x) on [0, 1].]
Example: Let f(x) = sgn(x), the sign of x:
sgn(x) = −1 for x < 0, 0 for x = 0, 1 for x > 0.
Find the best r.m.s. approximation function over [−1, 1] using a l.c. of S = {1, sin kπx, cos kπx; k = 1, 2, 3, . . . , 2n + 1}.
Sol: Interval changed. Use new inner product:
<f, g> = (1/2) ∫_{−1}^1 f(x)g(x) dx.
Then S is orthogonal (this needs another check) and:
<1, 1> = 1, <sin kπx, sin kπx> = 1/2 = <cos kπx, cos kπx>.
So, we compute:
<sgn(x), 1> = 0;
<sgn(x), sin kπx> = 0 if k is even, 2/(kπ) if k is odd;
<sgn(x), cos kπx> = 0.
Hence the best r.m.s. approx. to sgn(x) over [−1, 1] is:
(4/π)( sin πx + (sin 3πx)/3 + (sin 5πx)/5 + . . . + (sin (2n+1)πx)/(2n+1) ).
When 2n + 1 = 9:
[Plot: sgn(x) and its best r.m.s. approximation on [−1, 1].]