TRANSCRIPT
Recall: Dot product on R2:
u · v = (u1, u2) · (v1, v2) = u1v1 + u2v2,
u · u = u1² + u2² = ||u||².
Geometric Meaning:
u · v = ||u|| ||v|| cos θ.
[Figure: vectors u and v drawn from a common point, with angle θ between them.]
Reason: The side opposite θ is given by u − v.
||u − v||² = (u − v) · (u − v)
= u · u − v · u − u · v + v · v = ||u||² + ||v||² − 2 u · v.
By the cosine law: c² = a² + b² − 2ab cos θ, i.e.
||u − v||² = ||u||² + ||v||² − 2||u|| ||v|| cos θ.
Comparing the two equalities, we get:
u · v = ||u|| ||v|| cos θ.
Inner Product → Generalization of dot product.
Direct generalization to Rn:
u · v := u1v1 + . . . + unvn = ∑_{i=1}^n ui vi.
Using matrix notation:
∑_{i=1}^n ui vi = [u1 . . . un] (v1, . . . , vn)ᵀ = uᵀv = vᵀu.
This is called the (standard) inner product on Rn.
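Extra: a quick numerical check of the equivalent forms above, using NumPy (the sample vectors are arbitrary, not from the lecture):

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, -1.0, 2.0])

# Three equivalent forms of the standard inner product on R^n:
print(np.dot(u, v))   # sum of u_i v_i  ->  8.0
print(u @ v)          # u^T v
print(v @ u)          # v^T u (same value, by symmetry)
```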
Thm 1 (P.359): Let u,v,w ∈ Rn and c ∈ R. Then:
(i) u · v = v · u;
(ii) (u + v) · w = u · w + v · w;
(iii) (cu) · v = c(u · v);
(iv) u · u ≥ 0, and u · u = 0 iff u = 0.
Note: (iv) is sometimes called the “positive-definite” property.
• A general inner product is defined using the above 4 properties.
• For a complex inner product, we need to add a complex conjugate to (i).
Def: The length (or norm) of v is defined as:
||v|| := √(v · v).
Vectors with ||v|| = 1 are called unit vectors.
Def: The distance between u,v is defined as:
dist(u,v) := ||u − v||.
Def: The angle between u,v is defined as:
∠(u,v) := cos⁻¹( (u · v) / (||u|| · ||v||) ).
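Extra: a minimal NumPy sketch of these three definitions (the sample vectors u, v are arbitrary):

```python
import numpy as np

u = np.array([3.0, 4.0])
v = np.array([1.0, 0.0])

print(np.linalg.norm(u))        # ||u|| = sqrt(u . u) = 5.0
print(np.linalg.norm(u - v))    # dist(u, v) = ||u - v||
cos_theta = (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))
print(np.arccos(cos_theta))     # angle(u, v) in radians: arccos(0.6) ~ 0.927
```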
Extra: The General Inner Product Space
Let V be a vector space over R or C.
Def: An inner product on V is a real/complex-valued function of two vector variables <u,v> such that:
(a) <u,v> = \overline{<v,u>}; (conjugate symmetric)
(b) <u + v,w> = <u,w> + <v,w>;
(c) <cu,v> = c <u,v>; (linear in the first vector variable)
(d) <u,u> ≥ 0, and <u,u> = 0 iff u = 0. (positive-definite property)
Def: A real/complex vector space V equipped with an inner product is called an inner product space.
Note: (i) An inner product is conjugate linear in the second vector variable:
<u, cv1 + dv2> = \overline{c} <u,v1> + \overline{d} <u,v2>.
(ii) If we replace (a) by <u,v> “=” <v,u> (without the conjugate), consider:
<iu, iu> “=” i² <u,u> = − <u,u>,
which is incompatible with (d).
• When working with a complex inner product space, we must take the complex conjugate when interchanging u, v.
Examples of (general) inner product spaces:
1. The dot product on Cn: (∗: conjugate transpose)
<u,v> := u1\overline{v1} + . . . + un\overline{vn} = v∗u.
2. A non-standard inner product on R2:
<u,v> := u1v1 − u1v2 − u2v1 + 2u2v2 = vᵀ [1 −1; −1 2] u.
3. An inner product on the matrix space Mm×n:
<A,B> := tr(B∗A) = ∑_{j=1}^m ∑_{k=1}^n ajk \overline{bjk}.
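Extra: a NumPy check of Examples 1 and 3 (the sample values are arbitrary; note that np.vdot conjugates its first argument, which matches <u,v> = v∗u):

```python
import numpy as np

# Complex dot product <u, v> = v* u  (v*: conjugate transpose)
u = np.array([1 + 1j, 2.0])
v = np.array([1j, 1 - 1j])
print(np.vdot(v, u))          # vdot conjugates its FIRST argument
print(np.conj(v) @ u)         # same value: sum of u_k conj(v_k)

# Trace inner product on matrices: <A, B> = tr(B* A) = sum a_jk conj(b_jk)
A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])
print(np.trace(B.conj().T @ A))   # 5.0
print(np.sum(A * np.conj(B)))     # elementwise formula, also 5.0
```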
4. Consider the vector space V of continuous real/complex-valued functions defined on the interval [a, b]. Then the following is an inner product on V:
<f, g> := (1/(b − a)) ∫_a^b f(t) \overline{g(t)} dt.
[In the real case, the norm ||f|| will give the “root-mean-square” (r.m.s.) of the area bounded by the curve of f and the t-axis over the interval [a, b].]
Schwarz’s inequality:
(a1b1 + . . . + anbn)² ≤ (a1² + . . . + an²)(b1² + . . . + bn²).
Pf: Since the left side is a sum of squares, the following equation cannot have two distinct real solutions:
(a1x + b1)² + . . . + (anx + bn)² = 0,
i.e. (a1² + . . . + an²)x² + 2(a1b1 + . . . + anbn)x + (b1² + . . . + bn²) = 0.
So ∆ ≤ 0, and this gives the inequality.
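Extra: for n = 2 the discriminant is exactly −4(a1b2 − a2b1)² ≤ 0 (Lagrange's identity); a small SymPy verification, not part of the lecture:

```python
import sympy as sp

a1, a2, b1, b2 = sp.symbols('a1 a2 b1 b2', real=True)

# Coefficients of (a1 x + b1)^2 + (a2 x + b2)^2 = A x^2 + B x + C
A = a1**2 + a2**2
B = 2*(a1*b1 + a2*b2)
C = b1**2 + b2**2

# Discriminant equals -4 (a1 b2 - a2 b1)^2, hence <= 0
print(sp.simplify(B**2 - 4*A*C + 4*(a1*b2 - a2*b1)**2))  # prints 0
```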
The Cauchy-Schwarz Inequality:
|u · v| ≤ ||u|| · ||v||,
and equality holds if, and only if, {u,v} is l.d.
Proof: The case u = 0 is trivial. When u ≠ 0, we may replace u by the unit vector (1/||u||)u.
Consider w = v − (v · u)u.
[Figure: v decomposed as (v · u)u along u plus w, with w ⊥ u.]
Obviously, w · w = ||w||² ≥ 0 ⇒ Cauchy-Schwarz Inequality.
Set k = v · u:
0 ≤ (v − ku) · (v − ku)
= v · v − 2k(v · u) + k²(u · u) = ||v||² − k².
Note that, in terms of the original u, k = (v · u)/||u||, so:
k² = (v · u)²/||u||² ≤ ||v||² ⇒ (u · v)² ≤ ||u||² · ||v||².
Taking positive square roots, we obtain the result.
Thm: (Triangle Inequality) For u,v ∈ Rn:
||u+ v|| ≤ ||u||+ ||v||,
and equality holds iff one of the vectors is a non-negative scalar multiple of the other.
Proof: Consider ||u + v||².
(u + v) · (u + v) = ||u||² + 2(u · v) + ||v||²
≤ ||u||² + 2||u|| · ||v|| + ||v||² (by Cauchy-Schwarz)
= (||u|| + ||v||)².
Taking square roots, we obtain the inequality.
Orthogonality: Pythagoras’ Theorem in vector form:
[Figure: right triangle with perpendicular sides u, v and hypotenuse u + v.]
||u + v||² = ||u||² + ||v||².
But in general we have:
||u + v||² = ||u||² + 2(u · v) + ||v||²,
so we need u · v = 0.
Def: Let u, v be two vectors in Rn. When u · v = 0, we say that u is orthogonal to v, denoted by u ⊥ v.
• This generalizes the concept of perpendicularity.
• 0 is the only vector that is orthogonal to every vector v in Rn.
Example: In R2, we have: (3, 4) ⊥ (4, −3).
Thm 2 (P.362): u and v are orthogonal iff
||u + v||² = ||u||² + ||v||².
Common Orthogonality:
Def: Let S be a set of vectors in Rn. If u is orthogonal to every vector in S, we will say “u is orthogonal to S”, denoted by u ⊥ S.
i.e. We can regard u as a “common perpendicular” to S.
Examples: (i) 0 ⊥ Rn.
(ii) In R2, let S = x-axis. Then e2 ⊥ S.
(iii) In R3, let S = x-axis. Then both e2 ⊥ S and e3 ⊥ S.
Exercise: Let u,v ⊥ S. Show that:
(i) (au+ bv) ⊥ S for any numbers a, b; (ii) u ⊥ SpanS.
***
Orthogonal Complement:
Def: Let S be a set of vectors in Rn. We define:
S⊥ := {u ∈ Rn | u ⊥ S},
called the orthogonal complement of S in Rn.
i.e. S⊥ collects all the “common perpendiculars” to S.
Examples: (i) {0}⊥ = Rn, (Rn)⊥ = {0}.
(ii) In R2, let S = x-axis. Then S⊥ = y-axis.
(iii) In R3, take S = {e1}. Then S⊥ = yz-plane.
Thm: S⊥ is always a subspace of Rn.
Checking: (i) 0 ⊥ v for every v ∈ S. So 0 ∈ S⊥.
(ii) Pick any u1,u2 ∈ S⊥. For any scalars a, b ∈ R, consider:
(au1 + bu2) · v = a(u1 · v) + b(u2 · v) = a · 0 + b · 0 = 0,
whenever v ∈ S. So au1 + bu2 ∈ S⊥ (cf. previous exercise).
Note: S itself need not be a subspace.
Thm: (a) S⊥ = (SpanS)⊥. (b) SpanS ⊆ (S⊥)⊥.
Pf: (a) S⊥ ⊇ (SpanS)⊥ is easy to see, since any vector u ⊥ SpanS must also satisfy u ⊥ S.
Now, pick any u ∈ S⊥. For every v ∈ SpanS, write:
v = c1v1 + . . .+ cpvp, vi ∈ S, i = 1, . . . , p.
Then since u ⊥ S:
u · v = c1(u · v1) + . . .+ cp(u · vp) = 0,
and hence u ∈ (SpanS)⊥, so “S⊥ ⊆ (SpanS)⊥” is proved.
(b) Pick a vector w ∈ SpanS; write it as a l.c.:
w = c1v1 + . . .+ cpvp.
For any u ∈ S⊥:
w · u = c1(v1 · u) + . . .+ cp(vp · u) = 0 ⇒ w ∈ (S⊥)⊥.
Thm 3 (P.363): Let A be an m× n matrix. Then:
(RowA)⊥ = NulA and (ColA)⊥ = NulAᵀ.
Pf: Let r1, . . . , rm be the rows of A. The product Ax can be rewritten as:
Ax = [r1; . . . ; rm] x = (r1 · x, . . . , rm · x)ᵀ.
So x ∈ NulA ⇔ x ∈ {r1ᵀ, . . . , rmᵀ}⊥ ⇔ x ∈ (RowA)⊥.
Hence (RowA)⊥ = NulA. Applying the result to Aᵀ, we obtain:
(ColA)⊥ = (RowAᵀ)⊥ = NulAᵀ.
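Extra: a numerical illustration of Thm 3 (the matrix A is an arbitrary sample; scipy.linalg.null_space returns an orthonormal basis of NulA):

```python
import numpy as np
from scipy.linalg import null_space

A = np.array([[1.0, 1.0, -1.0, -1.0],
              [1.0, 2.0,  3.0,  4.0]])

N = null_space(A)              # columns form an orthonormal basis of Nul A
print(np.allclose(A @ N, 0))   # True: every row of A is orthogonal to Nul A
```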
Orthogonal sets and Orthonormal sets
Def: A set S is called orthogonal if any two vectors in S are always orthogonal to each other.
Def: A set S is called orthonormal if (i) S is orthogonal, and (ii) each vector in S is of unit length.
Example: Orthonormal set:
{ (1/√11)(3, 1, 1), (1/√6)(−1, 2, 1), (1/√66)(−1, −4, 7) }.
Thm 4 (P.366): An orthogonal set S of non-zero vectors is always linearly independent.
Pf: Let S = {u1,u2, . . . ,up} and consider the relation:
c1u1 + c2u2 + . . .+ cpup = 0.
Take inner product with u1, then:
c1(u1 · u1) + c2(u1 · u2) + . . . + cp(u1 · up) = 0 · u1,
c1||u1||² + c2 · 0 + . . . + cp · 0 = 0.
As ||u1|| ≠ 0, we must have c1 = 0. Similarly for the other ci.
So S must be l.i.
The method of proof of previous Thm 4 gives:
Thm 5 (P.367): Let S = {u1, . . . ,up} be an orthogonal set of non-zero vectors and let v ∈ SpanS. Then:
v = ((v · u1)/||u1||²) u1 + . . . + ((v · up)/||up||²) up.
Pf: Let c1, . . . , cp be such that v = c1u1 + . . . + cpup. Taking the inner product with u1, we have:
v · u1 = c1(u1 · u1) + . . . + cp(up · u1) = c1||u1||².
So c1 = (v · u1)/||u1||². Similarly for the other ci.
Thm 5′: Let S = {u1, . . . ,up} be an orthonormal set. Then for any v ∈ SpanS, we have:
v = (v · u1)u1 + . . .+ (v · up)up.
Remark: This generalizes our familiar expression in R3:
v = (v · i)i+ (v · j)j+ (v · k)k.
Example: Express v as a l.c. of the vectors in S:
v = (1, 2, 3), S = {(3, 1, 1), (−1, 2, 1), (−1, −4, 7)}.
New method: Compute c1, c2, c3 directly:
c1 = ((1, 2, 3) · (3, 1, 1)) / ||(3, 1, 1)||²,
c2 = ((1, 2, 3) · (−1, 2, 1)) / ||(−1, 2, 1)||²,
c3 = ((1, 2, 3) · (−1, −4, 7)) / ||(−1, −4, 7)||²
⇒ c1 = 8/11, c2 = 6/6 = 1, c3 = 12/66 = 2/11.
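Extra: the same coefficients checked in NumPy, using the vectors from this example:

```python
import numpy as np

v  = np.array([1.0, 2.0, 3.0])
us = [np.array([3.0, 1.0, 1.0]),
      np.array([-1.0, 2.0, 1.0]),
      np.array([-1.0, -4.0, 7.0])]

cs = [(v @ u) / (u @ u) for u in us]    # c_i = (v . u_i) / ||u_i||^2
print(cs)                               # [0.7272... = 8/11, 1.0, 0.1818... = 2/11]
print(np.allclose(sum(c * u for c, u in zip(cs, us)), v))   # True
```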
Exercise: Determine if v ∈ Span {u1,u2}.
v = (3, 2, −5), u1 = (1, 2, 2), u2 = (−2, 2, −1).
***
Orthogonal basis and Orthonormal basis
Def: A basis for a subspace W is called an orthogonal basis if it is an orthogonal set.
Def: A basis for a subspace W is called an orthonormal basis if it is an orthonormal set.
Examples: (i) {e1, . . . , en} is an orthonormal basis for Rn.
(ii) S = {(3, 4), (4, −3)} is an orthogonal basis for R2.
S′ = {(3/5, 4/5), (4/5, −3/5)} is an orthonormal basis for R2.
(iii) The following set S:
S = {(3, 1, 1), (−1, 2, 1), (−1, −4, 7)}
is an orthogonal basis for R3.
(iv) The columns of an n × n orthogonal matrix A will form an orthonormal basis for Rn.
Orthogonal matrix: a square matrix with AᵀA = In.
Checking: Write A = [v1 . . . vn].
(i, j)-th entry of AᵀA = viᵀvj = vi · vj.
(i, j)-th entry of In = 1 if i = j, 0 if i ≠ j.
The above checking also works for a non-square matrix:
Thm 6 (P.371): The n columns of an m × n matrix U are orthonormal iff UᵀU = In.
But for square matrices: AB = I ⇒ BA = I. So:
(iv)′ The rows of an n × n orthogonal matrix A (written in column form) also form an orthonormal basis for Rn.
Matrices having orthonormal columns are very special:
Thm 7 (P.371): Let T : Rn → Rm be a linear transformation given by an m × n standard matrix U with orthonormal columns. Then for any x,y ∈ Rn:
a. ||Ux|| = ||x|| (preserving length)
b. (Ux) · (Uy) = x · y (preserving inner product)
c. (Ux) · (Uy) = 0 iff x · y = 0 (preserving orthogonality)
Pf: Direct verifications using UᵀU = In.
These results are not true for merely orthogonal (non-unit) columns.
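Extra: a NumPy check of Thm 6 and Thm 7, building U from two columns of the earlier orthonormal set (the test vectors x, y are arbitrary):

```python
import numpy as np

# Columns taken from the orthonormal set on an earlier slide
U = np.column_stack([np.array([3.0, 1.0, 1.0]) / np.sqrt(11),
                     np.array([-1.0, 2.0, 1.0]) / np.sqrt(6)])

print(np.allclose(U.T @ U, np.eye(2)))   # True: orthonormal columns (Thm 6)

x = np.array([1.0, -2.0])                # arbitrary test vectors
y = np.array([0.5, 3.0])
print(np.isclose(np.linalg.norm(U @ x), np.linalg.norm(x)))   # a. preserves length
print(np.isclose((U @ x) @ (U @ y), x @ y))                   # b. preserves inner product
```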
Recall:
Let S = {u1, . . . ,up} be orthogonal. When v ∈ W = SpanS, we have:
v = ((v · u1)/||u1||²) u1 + . . . + ((v · up)/||up||²) up.
What happens if v ∉ W?
• LHS ≠ RHS, as RHS is always a vector in W.
• v′ = RHS is still computable.
What is the relation between v and v′?
LHS = v, RHS = v′ = ∑_{i=1}^p ((v · ui)/||ui||²) ui.
Take the inner product of the RHS with uj:
v′ · uj = ( ∑_{i=1}^p ((v · ui)/||ui||²) ui ) · uj
= ∑_{i=1}^p ((v · ui)/||ui||²)(ui · uj) = ((v · uj)/||uj||²)(uj · uj) = v · uj,
which is the same as LHS · uj.
In other words, (v − v′) · uj = 0 for j = 1, . . . , p.
Thm: The vector z = v − v′ is orthogonal to every vector in SpanS, i.e. z ∈ (SpanS)⊥.
[Figure: v above the subspace W; its projection v′ lies in W and z = v − v′ is perpendicular to W.]
Def: Let {u1, . . . ,up} be an orthogonal basis for W. For each v in Rn, the following vector in W:
projWv := ((v · u1)/||u1||²) u1 + . . . + ((v · up)/||up||²) up,
is called the orthogonal projection of v onto W .
Remark: {u1, . . . ,up} must be orthogonal, otherwise the RHS will not give us the correct vector v′.
Note: v = projWv ⇔ v ∈ W.
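Extra: a minimal NumPy sketch of this projection formula, assuming an orthogonal basis is given (the basis reuses the orthogonal pair (3,1,1), (−1,2,1) from earlier slides; v is an arbitrary sample):

```python
import numpy as np

def proj(v, basis):
    # Orthogonal projection onto W = Span(basis);
    # `basis` must be orthogonal (see the Remark above).
    return sum(((v @ u) / (u @ u)) * u for u in basis)

u1 = np.array([3.0, 1.0, 1.0])
u2 = np.array([-1.0, 2.0, 1.0])
v  = np.array([1.0, 2.0, 3.0])

p = proj(v, [u1, u2])
z = v - p
print(p)
print(np.isclose(z @ u1, 0), np.isclose(z @ u2, 0))   # z is orthogonal to W
```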
Example: In R3, consider S = {e1, e2}. Then W = SpanS is the xy-plane. For any vector v ∈ R3:
[Figure: v = (x, y, z) projected onto the xy-plane; the components ((v · e1)/||e1||²) e1 and ((v · e2)/||e2||²) e2 sum to projWv.]
projWv = (x, y, 0).
Exercise: Consider R3 and W = Span {u1,u2}. Find projWv:
v = (1, 0, 1), u1 = (2, −2, 1), u2 = (2, 1, −2).
***
Def: The decomposition:
v = projWv + (v − projWv), (v − projWv) ∈ W⊥,
is called the orthogonal decomposition of v w.r.t. W .
[Figure: v decomposed as w = projWv in W plus z = v − projWv in W⊥.]
Thm 8 (P.376): Orthogonal decomposition w.r.t. W is the unique way to write v = w + z with w ∈ W and z ∈ W⊥.
Exercise: Find the orthogonal projection of v onto W = NulA.
A = [1 1 −1 −1], v = (1, 2, 3, 4).
***
Thm 9 (P.378): Let v ∈ Rn and let w ∈ W . Then we have:
||v − projWv|| ≤ ||v −w||,
and equality holds only when w = projWv.
Pf: We can rewrite v −w as:
v −w = (v − projWv) + (projWv −w).
[Figure: right-angled triangle with vertices v, projWv and w (both in W); legs v − projWv and projWv − w, hypotenuse v − w.]
We can apply “Pythagoras' Theorem” to the right-angled triangle.
||v − w||² = ||v − projWv||² + ||projWv − w||²
≥ ||v − projWv||²,
and equality holds iff ||projWv −w|| = 0 iff w = projWv.
Because of the inequality:
||v − projWv|| ≤ ||v −w||,
projWv is sometimes called the best approximation of v by vectors in W.
Def: The distance of v to W is defined as:
dist(v,W ) := ||v − projWv||.
Obviously, v ∈ W iff dist(v,W ) = 0.
Exercise: Let W = Span {u1,u2,u3}. Find dist(v,W ):
u1 = (1, −1, 1, −1), u2 = (1, 1, −1, −1), u3 = (1, 1, 1, 1) and v = (2, 4, 6, 4).
Sol: Remember to check that {u1,u2,u3} is orthogonal.
***
Extension of Orthogonal Set
Let S = {u1, . . . ,up} be an orthogonal basis for W = SpanS. When W ≠ Rn, we can find a vector v ∉ W, and then:
z = v − projWv ≠ 0.
This vector z is in W⊥, i.e. it will satisfy:
z · w = 0 for every w ∈ W.
Hence the following set will again be orthogonal:
S ∪ {z} = {u1, . . . ,up, z}.
Thm: Span(S ∪ {v}) = Span(S ∪ {z}).
In other words, we can extend an orthogonal set S by adding the vector z.
• S1 = {u1} orthogonal, v2 ∉ SpanS1, then compute z2.
⇒ S2 = {u1, z2} is again orthogonal, and Span {u1,v2} = Span {u1, z2}.
• S2 = {u1,u2} orthogonal, v3 ∉ SpanS2, compute z3.
⇒ S3 = {u1,u2, z3} is again orthogonal, and Span {u1,u2,v3} = Span {u1,u2, z3}.
...
This is called the Gram-Schmidt orthogonalization process.
Thm 11 (P.383): Let {x1, . . . ,xp} be l.i. Define u1 = x1 and:
u2 = x2 − ((x2 · u1)/||u1||²) u1,
u3 = x3 − ((x3 · u2)/||u2||²) u2 − ((x3 · u1)/||u1||²) u1,
...
up = xp − ∑_{i=1}^{p−1} ((xp · ui)/||ui||²) ui.
Then {u1, . . . ,up} will be orthogonal and for 1 ≤ k ≤ p:
Span {x1, . . . ,xk} = Span {u1, . . . ,uk}.
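Extra: a minimal NumPy sketch of the process, assuming the input vectors are l.i.; run on the example below, it reproduces the same ui:

```python
import numpy as np

def gram_schmidt(xs):
    # u_k = x_k minus its projection onto the earlier u's (Thm 11)
    us = []
    for x in xs:
        u = x - sum(((x @ w) / (w @ w)) * w for w in us)
        us.append(u)
    return us

xs = [np.array([1.0, 1.0, 0.0]),
      np.array([2.0, 0.0, -1.0]),
      np.array([1.0, 1.0, 1.0])]
for u in gram_schmidt(xs):
    print(u)   # (1, 1, 0), (1, -1, -1), (1/3, -1/3, 2/3)
```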
Notes: (i) We must use {ui} to compute projWk xk+1, since the formula:
projWk xk+1 = ∑_{i=1}^k ((xk+1 · ui)/||ui||²) ui,
is only valid for an orthogonal set {ui}.
(ii) If we obtain uk = 0 for some k, i.e. xk = projWk−1 xk, we have:
xk ∈ Span {x1, . . . ,xk−1},
so {x1, . . . ,xk} will be l.d. instead.
(iii) All the ui will be non-zero vectors as {xi} is l.i.
Example: Apply the Gram-Schmidt Process to {x1,x2,x3}:
x1 = (1, 1, 0), x2 = (2, 0, −1), x3 = (1, 1, 1).
Solution: Take u1 = x1. Then:
u2 = x2 − ((x2 · u1)/||u1||²) u1 = x2 − (2/2) u1 = (1, −1, −1),
u3 = x3 − ((x3 · u2)/||u2||²) u2 − ((x3 · u1)/||u1||²) u1 = x3 − (−1/3) u2 − (2/2) u1 = (1/3, −1/3, 2/3).
Example: Apply the Gram-Schmidt Process to {x1,x3,x2}:
x1 = (1, 1, 0), x3 = (1, 1, 1), x2 = (2, 0, −1).
Solution: Take u′1 = x1. Then:
u′2 = x3 − ((x3 · u′1)/||u′1||²) u′1 = x3 − (2/2) u′1 = (0, 0, 1),
u′3 = x2 − ((x2 · u′2)/||u′2||²) u′2 − ((x2 · u′1)/||u′1||²) u′1 = x2 − (−1/1) u′2 − (2/2) u′1 = (1, −1, 0).
Exercise: Find an orthogonal basis for ColA:
A = [1 3 1 2; 3 4 −2 1; 1 1 1 1; 1 2 2 2].
Sol: First find a basis for ColA (e.g. pivot columns of A).
Then apply Gram-Schmidt Process.
***
Approximation Problems: Solve Ax = b.
Due to the presence of errors, a consistent system may appear as an inconsistent system:
x1 + x2 = 1
x1 − x2 = 0
2x1 + 2x2 = 2
→
x1 + x2 = 1.01
x1 − x2 = 0.01
2x1 + 2x2 = 2.01
Also in practice, exact solutions are usually not necessary.
• How to obtain a “good” approximate “solution” for the above inconsistent system?
Least squares solution: How to measure the “goodness” of x0 as an approximate solution to the system:
Ax = b?
• Minimize the difference ||x− x0||
Problem: But x is unknown...
Another way of approximation:
x0 ≈ x “ ⇒ ” Ax0 ≈ Ax = b.
Analysis: Find x0 such that:
Ax0 = b0,
and b0 is as close to b as possible.
• b0 must be in ColA.
• ||b − b0||² is a sum of squares → least squares solution.
Best approximation property of orthogonal projection:
||b− projWb|| ≤ ||b−w|| for every w in W = ColA.
Should take b0 = projWb.
Example: Find the least squares solution of the inconsistent system:
x1 + x2 = 1.01
x1 − x2 = 0.01
2x1 + 2x2 = 2.01
To compute projWb, we need an orthogonal basis for W = ColA first.
A basis for ColA is: {(1, 1, 2), (1, −1, 2)}.
Then by the Gram-Schmidt Process, we get an orthogonal basis for W = ColA:
{(1, 1, 2), (1, −1, 2)} −→ {(1, 1, 2), (1, −5, 2)}.
Compute b0 = projWb:
b0 = (((1.01, 0.01, 2.01) · (1, 1, 2)) / ||(1, 1, 2)||²) (1, 1, 2) + (((1.01, 0.01, 2.01) · (1, −5, 2)) / ||(1, −5, 2)||²) (1, −5, 2).
Hence:
b0 = (1.006, 0.01, 2.012).
Since b0 ∈ ColA, the system Ax0 = b0 must be consistent.
Solving Ax0 = b0:
[1 1 | 1.006; 1 −1 | 0.01; 2 2 | 2.012] → [1 0 | 0.508; 0 1 | 0.498; 0 0 | 0].
Thus we have the following least squares solution:
x0 = (0.508, 0.498).
But we have the following result:
(ColA)⊥ = NulAᵀ.
Then, since we take b0 = projColAb:
(b − b0) ∈ (ColA)⊥ ⇔ (b − b0) ∈ NulAᵀ
⇔ Aᵀ(b − b0) = 0
⇔ Aᵀb0 = Aᵀb.
So, if x0 is a least squares solution (i.e. Ax0 = b0), we have:
Aᵀ(Ax0) = Aᵀb.
The above is usually called the normal equation of Ax = b.
Thm 13 (P.389): The least squares solutions of Ax = b are the solutions of the normal equation AᵀAx = Aᵀb.
In the following case, the least squares solution will be unique:
Thm 14 (P.391): Let A be an m × n matrix with rankA = n. Then the n × n matrix AᵀA is invertible.
Example: Find again the least squares solution:
x1 + x2 = 1.01
x1 − x2 = 0.01
2x1 + 2x2 = 2.01
Solution: Solve the normal equation. Compute:
AᵀA = [1 1 2; 1 −1 2] [1 1; 1 −1; 2 2] = [6 4; 4 6],
Aᵀb = [1 1 2; 1 −1 2] (1.01, 0.01, 2.01)ᵀ = (5.04, 5.02)ᵀ.
So the normal equation is:
[6 4; 4 6] (x1, x2)ᵀ = (5.04, 5.02)ᵀ ⇒ (x1, x2)ᵀ = (0.508, 0.498)ᵀ.
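Extra: the computation can be checked in NumPy; solving the normal equation and calling the built-in least squares routine give the same answer:

```python
import numpy as np

A = np.array([[1.0,  1.0],
              [1.0, -1.0],
              [2.0,  2.0]])
b = np.array([1.01, 0.01, 2.01])

x0 = np.linalg.solve(A.T @ A, A.T @ b)        # normal equation A^T A x = A^T b
print(x0)                                     # [0.508 0.498]
print(np.linalg.lstsq(A, b, rcond=None)[0])   # built-in least squares: same answer
```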
Least Squares Problems
Linear Regression: Fitting data (xi, yi) with a straight line.
[Figure: data points in the xy-plane with a fitted straight line; red vertical intervals mark the errors.]
To “minimize” the differences indicated by the red intervals.
When a straight line y = c + mx can pass through all the points, it will of course “best fit” the data. This requires:
c + mx1 = y1
...
c + mxn = yn
↔ [1 x1; . . . ; 1 xn] (c, m)ᵀ = (y1, . . . , yn)ᵀ
being consistent.
But in general the above system Ax = b is inconsistent.
Measurement of closeness: square sum of y-distances.
|y1 − (mx1 + c)|² + . . . + |yn − (mxn + c)|².
Note that this is expressed as ||b − b0||², where:
b = (y1, . . . , yn), b0 = (c + mx1, . . . , c + mxn).
b0 ∈ ColA since Ax = b0 is consistent.
→ Use normal equation!
Example: Find a straight line that best fits the points:
(2, 1), (5, 2), (7, 3), (8, 3),
in the sense of minimizing the square-sum of y-distances.
Sol: The (inconsistent) system is:
[1 2; 1 5; 1 7; 1 8] (c, m)ᵀ = (1, 2, 3, 3)ᵀ.
We are going to find its least squares solution.
Compute:
AᵀA = [1 1 1 1; 2 5 7 8] [1 2; 1 5; 1 7; 1 8] = [4 22; 22 142],
Aᵀb = [1 1 1 1; 2 5 7 8] (1, 2, 3, 3)ᵀ = (9, 57)ᵀ.
So the normal equation is:
[4 22; 22 142] (c, m)ᵀ = (9, 57)ᵀ,
which has a unique solution of (2/7, 5/14).
The “best fit” straight line will be:
y = 2/7 + (5/14)x.
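Extra: a NumPy check of this line fit via the normal equation:

```python
import numpy as np

xs = np.array([2.0, 5.0, 7.0, 8.0])
ys = np.array([1.0, 2.0, 3.0, 3.0])

A = np.column_stack([np.ones_like(xs), xs])   # columns: 1 and x
c, m = np.linalg.solve(A.T @ A, A.T @ ys)     # normal equation
print(c, m)   # 0.28571... = 2/7 and 0.35714... = 5/14
```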
Polynomial Curve Fitting:
Example: Find a polynomial curve of degree at most 2 which best fits the following data:
(2, 1), (5, 2), (7, 3), (8, 3),
in the sense of least squares.
Sol: Consider the general form of the fitting curve:
y = a0 · 1 + a1 · x + a2 · x².
The curve cannot pass through all the 4 points, as:
a0 · 1 + a1 · 2 + a2 · 2² = 1
a0 · 1 + a1 · 5 + a2 · 5² = 2
a0 · 1 + a1 · 7 + a2 · 7² = 3
a0 · 1 + a1 · 8 + a2 · 8² = 3
is inconsistent.
Again, use normal equation.
The corresponding normal equation AᵀAx = Aᵀb is:
[4 22 142; 22 142 988; 142 988 7138] (a0, a1, a2)ᵀ = (9, 57, 393)ᵀ,
which has a unique solution of (19/132, 19/44, −1/132).
So the best fitting polynomial is:
y = 19/132 + (19/44)x − (1/132)x².
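Extra: NumPy's polyfit solves the same least squares problem (it returns coefficients from the highest power down):

```python
import numpy as np

xs = np.array([2.0, 5.0, 7.0, 8.0])
ys = np.array([1.0, 2.0, 3.0, 3.0])

a2, a1, a0 = np.polyfit(xs, ys, 2)   # highest power first: [a2, a1, a0]
print(a0, a1, a2)   # 0.14393... = 19/132, 0.43181... = 19/44, -0.00757... = -1/132
```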
General Curve Fitting:
Example: Find a curve in the form c0 + c1 sin x + c2 sin 2x which best fits the following data:
(π/6, 1), (π/4, 2), (π/3, 3), (π/2, 3),
in the sense of least squares.
Sol: Let y = c0 · 1 + c1 · sin x + c2 · sin 2x. The system
c0 · 1 + c1 · sin(π/6) + c2 · sin(2π/6) = 1
c0 · 1 + c1 · sin(π/4) + c2 · sin(2π/4) = 2
c0 · 1 + c1 · sin(π/3) + c2 · sin(2π/3) = 3
c0 · 1 + c1 · sin(π/2) + c2 · sin(2π/2) = 3
is inconsistent.
Solving AᵀAx = Aᵀb . . .
c0 = (184 − 39√2 − 89√3 + 9√6) / (78 − 18√2 − 38√3 + 6√6) ≈ −2.29169,
c1 = (9 + 3√2 − 7√3 − 2√6) / (−39 + 9√2 + 19√3 − 3√6) ≈ 5.31308,
c2 = (8 + 9√2 − 10√3 − 6√6) / (78 − 18√2 − 38√3 + 6√6) ≈ 0.673095.
So the best fitting function is:
(−2.29169) + (5.31308) sin x + (0.673095) sin 2x.
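Extra: the decimal values can be reproduced with a sine design matrix and NumPy's least squares routine:

```python
import numpy as np

xs = np.array([np.pi/6, np.pi/4, np.pi/3, np.pi/2])
ys = np.array([1.0, 2.0, 3.0, 3.0])

# Design matrix for y = c0 + c1 sin x + c2 sin 2x
A = np.column_stack([np.ones_like(xs), np.sin(xs), np.sin(2*xs)])
print(np.linalg.lstsq(A, ys, rcond=None)[0])   # approx [-2.29169  5.31308  0.673095]
```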
Extra: Continuous Curve Fitting
Find g(x) best fitting a given f(x).
[Figure: a curve f(x) and an approximating curve g(x) over an interval.]
Try to minimize the difference (area) between two curves.
• To minimize the “root-mean-square” (r.m.s.) of the area between the two curves:
√( (1/(b − a)) ∫_a^b |f(x) − g(x)|² dx ).
• Given by the following inner product:
<f, g> = (1/(b − a)) ∫_a^b f(x) \overline{g(x)} dx.
• Not in Rn, not the standard inner product... No normal equation.
• But we can use orthogonal projection.
Recall: Formula of orthogonal projection in general:
projWy = ∑_{i=1}^p (<y,ui> / <ui,ui>) ui,
where {u1, . . . ,up} is an orthogonal basis of W.
Example: Fit f(x) = x over [0, 1] by a l.c. of
S = {1, sin 2πkx, cos 2πkx; k = 1, 2, . . . , n}.
Sol: S is orthogonal under the inner product:
<f, g> = ∫_0^1 f(x)g(x) dx.
(direct checking)
So compute those “<y,ui>”:
<f(x), 1> = 1/2,
<f(x), sin 2πkx> = −1/(2πk), <f(x), cos 2πkx> = 0.
We also need those “<ui,ui>”:
<1, 1> = 1,
<sin 2πkx, sin 2πkx> = 1/2, <cos 2πkx, cos 2πkx> = 1/2.
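Extra: these integrals can be verified symbolically with SymPy (assuming k is a positive integer, as in S):

```python
import sympy as sp

x = sp.symbols('x')
k = sp.symbols('k', integer=True, positive=True)

# <f, g> = integral over [0, 1]  (here b - a = 1)
ip = lambda f, g: sp.integrate(f * g, (x, 0, 1))

print(ip(x, 1))                                       # 1/2
print(sp.simplify(ip(x, sp.sin(2*sp.pi*k*x))))        # -1/(2*pi*k)
print(sp.simplify(ip(x, sp.cos(2*sp.pi*k*x))))        # 0
print(ip(sp.sin(2*sp.pi*k*x), sp.sin(2*sp.pi*k*x)))   # 1/2
```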
So the best fitting curve is g(x) = projW f(x):
g(x) = 1/2 − (1/π)( (sin 2πx)/1 + (sin 4πx)/2 + . . . + (sin 2nπx)/n ).
When n = 5:
[Plot: f(x) = x and its best r.m.s. approximation g(x) on [0, 1].]
Example: Let f(x) = sgn(x), the sign of x:
sgn(x) = −1 for x < 0, 0 for x = 0, 1 for x > 0.
Find the best r.m.s. approximation function over [−1, 1] using a l.c. of S = {1, sin kπx, cos kπx; k = 1, 2, 3, . . . , 2n + 1}.
Sol: Interval changed. Use new inner product:
<f, g> = (1/2) ∫_{−1}^1 f(x)g(x) dx.
Then S is orthogonal (this needs another check) and:
<1, 1> = 1, <sin kπx, sin kπx> = 1/2 = <cos kπx, cos kπx>.
So, we compute:
<sgn(x), 1> = 0;
<sgn(x), sin kπx> = 0 if k is even, 2/(kπ) if k is odd;
<sgn(x), cos kπx> = 0.
Hence the best r.m.s. approx. to sgn(x) over [−1, 1] is:
(4/π)( sin πx + (sin 3πx)/3 + (sin 5πx)/5 + . . . + (sin (2n+1)πx)/(2n+1) ).
When 2n + 1 = 9:
[Plot: sgn(x) and its best r.m.s. approximation on [−1, 1].]