me471_lec01


Review of Concepts from Linear Algebra

Our text jumps right into finite element methods in Chapter 2, but I prefer to step back a bit first and try to provide a framework in which we can place the formulations and techniques we’ll use.

All of the problems we will solve have a common thread: they are linear in a sense that we will make precise shortly. The problems of linear algebra are the prototypes of all linear problems, and therefore it is important that we review some of the basic concepts of this subject so that we can see the parallels in more complex situations. An additional benefit of this review is that we will ultimately reduce all our problems to linear algebraic ones – so it is helpful to recall some basic results from this subject.

A basic notion in linear algebra is that of a vector space. The definition of a vector space is a formalization of the properties of three dimensional vectors with which we are all familiar. Let F denote the set of real numbers R or complex numbers C, and call the elements of F scalars. We say a set V with elements ~x, ~y, ~z, . . . called vectors forms a vector space over F provided:

1. For each element ~x ∈ V and each scalar α ∈ F there is defined a vector α · ~x ∈ V called the product of α and ~x (i.e., there is a function, say p : F × V → V whose value p(α, ~x) at a particular pair is denoted by α · ~x – usually we omit the · and simply write α~x).

2. For each pair of vectors ~x, ~y in V there is defined a vector ~x + ~y called the sum of ~x and ~y (i.e., there is a function s : V × V → V whose value s(~x, ~y) at a pair ~x, ~y of elements in V is denoted by ~x + ~y).

3. The functions p and s satisfy the conditions:

(a) ~x + ~y = ~y + ~x.

(b) ~x + (~y + ~z) = (~x + ~y) + ~z.

(c) There is a unique vector ~0 such that ~x + ~0 = ~x for every ~x. We usually simply write 0 instead of ~0.

(d) For every vector ~x there is a vector −~x such that ~x + (−~x) = 0.

(e) α(~x + ~y) = α~x + α~y.

(f) (α + β)~x = α~x + β~x.

(g) α(β~x) = (αβ)~x.

(h) 1 · ~x = ~x.

(i) 0 · ~x = 0.

A number of elementary properties follow immediately from these definitions (e.g. the cancellation law ~x + ~y = ~x + ~z ⇒ ~y = ~z. Proof: ~y = ~y + 0 = ~y + ~x + (−~x) = ~z + ~x + (−~x) = ~z + 0 = ~z. Also −~x = (−1) · ~x. Proof: 0 = 0~x = (1 − 1)~x = ~x + (−1) · ~x, but by (d) 0 = ~x + (−~x), so −~x = (−1) · ~x by the cancellation law.)

All these conditions are obviously satisfied in the two most important cases of vector space we will deal with:


1. n dimensional Euclidean space Rn = {~x : ~x = (x1, x2, . . . , xn)} with addition defined componentwise, ~x + ~y = (x1 + y1, . . . , xn + yn), and scalar multiplication defined by α~x = (αx1, . . . , αxn). For n = 3 this is the space of ordinary three dimensional vectors. It is useful in most cases to think of the components of a vector ~x as arranged in a column, i.e. ~x = [x1, . . . , xn]T.

2. Real or complex function spaces. As a familiar example, let [a, b] denote a closed interval on the real axis, and let C([a, b]) denote the set of all real valued continuous functions defined on [a, b]. Letting f, g be elements of C([a, b]) and x ∈ [a, b], we define the sum of f and g pointwise by (f + g)(x) = f(x) + g(x), and if c is a real number we define scalar multiplication pointwise by (cf)(x) = cf(x). It is clear that these definitions give a vector space. By the same reasoning the sets Ck([a, b]) of all functions defined, continuous, and having continuous derivatives up to and including order k all form vector spaces (all strictly contained in C([a, b])).

Vector spaces are the setting for our definition of a linear function. Let V, W be vector spaces and f : V → W be a function defined on V and giving values in W, i.e., if ~x ∈ V then f(~x) is defined and is a vector in W. f is said to be a linear function if f(α~x + β~y) = αf(~x) + βf(~y) for all scalars α, β and all ~x, ~y ∈ V. The set V is called the domain of f, and the set W the codomain. The set of vectors in W that are images of vectors in V under f, i.e., f(V) = {~y ∈ W : ~y = f(~x) for some ~x ∈ V}, is called the range of f (range(f) = f(V)).

For example, suppose a ≤ x1 < x2 ≤ b are two fixed points in [a, b]. Define A : C([a, b]) → R2 by A(f) = (f(x1), f(x2)) and we have a function from the vector space C([a, b]) to the vector space R2. This function is linear since A(αf + βg) = (αf(x1) + βg(x1), αf(x2) + βg(x2)) = αA(f) + βA(g).
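As a quick sanity check, the linearity of this sampling map can also be verified numerically. Here is a minimal Matlab sketch; the particular functions f, g, the points x1, x2, and the scalars are arbitrary choices for illustration:

% Numerical spot-check of linearity of the sampling map A(f) = (f(x1), f(x2)).
f = @(x) sin(x);  g = @(x) x.^2;       % arbitrary example functions
x1 = 0.3;  x2 = 0.7;                   % arbitrary sample points in [a, b]
alpha = 2;  beta = -1.5;               % arbitrary scalars
A = @(h) [h(x1); h(x2)];               % the sampling map, written as a column in R^2
lhs = A(@(x) alpha*f(x) + beta*g(x));  % A(alpha*f + beta*g)
rhs = alpha*A(f) + beta*A(g);          % alpha*A(f) + beta*A(g)
disp(norm(lhs - rhs))                  % 0 (up to round-off): the two sides agree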

As another example, consider the problem of fitting a pair of data vectors of length n to a straight line: we want to find a, b such that yi = axi + b, i = 1, . . . , n. We can write this as A~v = ~y, where A is the n × 2 matrix whose ith row is (1, xi), ~v = [b, a]T, and ~y = [y1, . . . , yn]T,

and we see that A is a linear map from R2 to Rn. If we write A~v in the form

A~v = b [1, . . . , 1]T + a [x1, . . . , xn]T,

we see that the range of A consists of linear combinations of just two vectors in Rn, and so we can’t hope to have exact equality of A~v and ~y except in very special cases – the range of A is smaller than Rn.
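The fact that the range of A is only a two dimensional subspace of Rn can be seen numerically with a rank computation. Here is a small Matlab sketch; the data values are arbitrary and chosen only for illustration:

% The range of A = [ones, x] is a 2-dimensional subspace of R^n.
n = 5;
x = (1:n)';                      % distinct abscissae
A = [ones(n,1), x];              % n-by-2 matrix of the fitting problem
disp(rank(A))                    % 2: the columns span a plane inside R^n
y = [1.0; 2.1; 2.9; 4.2; 4.8];   % generic data vector (not exactly on a line)
disp(rank([A, y]))               % 3: y does not lie in the range of A,
                                 % so A*v = y has no exact solution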

Our two prime examples of vector spaces have vastly different sizes. The size of a vector space V is measured by its dimension dim(V), which is either a positive integer or ∞. The Euclidean space Rn has dimension n, whereas the space of continuous functions on an interval [a, b] has dimension ∞. To get at a definition of the number dim(V) we need to introduce the notion of a system of basis vectors of V. This requires several useful ideas.


1. Given a set of vectors S contained in a vector space V, a finite linear combination of vectors in S is a sum of the form c1~v1 + c2~v2 + · · · + ck~vk with ~vj ∈ S, j = 1, . . . , k.

2. If U ⊂ V is a subset of a vector space V, then U is called a subspace if every finite linear combination of vectors in U belongs to U (in particular, the zero vector must belong to U). If U is a subspace, it is a vector space all by itself. (For example, the proper nontrivial subspaces of R2 are the lines through the origin, and those of R3 are the lines and planes passing through the origin.)

3. Given a set of vectors S contained in a vector space V, the span of S, written span(S), is the set of all finite linear combinations of vectors in S, i.e. ~v ∈ span(S) ⇒ ~v = c1~v1 + c2~v2 + · · · + ck~vk, ~vj ∈ S, j = 1, . . . , k. Of course, span(S) is a subspace of V for every S ⊂ V.

4. A set of vectors S = {~v1, . . . , ~vk} in a vector space V is called linearly independent if c1~v1 + c2~v2 + · · · + ck~vk = 0 ⇒ cj = 0, j = 1, . . . , k, that is, a linear combination of the vectors can vanish only when all the coefficients of the combination are 0. A set of vectors is linearly dependent if it is not linearly independent, i.e., if there exist scalars c1, . . . , ck, not all zero, such that c1~v1 + c2~v2 + · · · + ck~vk = 0. (A numerical check of independence is sketched just after this list.)

5. A set of vectors S in the vector space V forms a basis of V if S is linearly independent and span(S) = V. If S is a basis, then every vector in V may be written as a unique linear combination of vectors in S. (Suppose ~y = ∑j cj~vj = ∑k c′k~v′k. We can assume the same vectors appear in each finite sum – otherwise include them with 0 coefficients. Then subtracting we have ∑j(cj − c′j)~vj = 0 ⇒ cj = c′j for all j since the vectors are linearly independent.)
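For vectors in Rn, linear independence can be tested numerically with a rank computation; a set of k vectors is independent exactly when the matrix having them as columns has rank k. A minimal Matlab sketch, with arbitrary example vectors:

% Rank test for linear independence in R^3.
v1 = [1; 0; 2];
v2 = [0; 1; 1];
v3 = [1; 1; 3];           % v3 = v1 + v2, so {v1, v2, v3} is dependent
disp(rank([v1, v2]))      % 2: {v1, v2} is linearly independent
disp(rank([v1, v2, v3]))  % 2 < 3: {v1, v2, v3} is linearly dependent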

It can be shown that if a vector space is finite dimensional, then every basis has the same number of elements, and we call this number dim(V). In the case of Rn it is clear that the vectors ~ej with jth component 1 and all others 0 span Rn since ~x = ∑j xj~ej. It is also clear that the vectors ~ej are linearly independent, ∑j cj~ej = 0 ⇒ cj = 0, j = 1, . . . , n, so that dim(Rn) = n according to our definitions.

It can also be shown that any linearly independent set of vectors S can always be enlarged to a spanning set S1 ⊇ S, i.e., any linearly independent set may be increased to form a basis. Now the set of functions {xk : k = 0, 1, 2, . . .} is linearly independent in C([a, b]) (if c0 + c1x + · · · + cnxn = 0 for all x with not all cj zero, we would have a nonzero polynomial of degree at most n with more than n roots, which is impossible). Since this set is not finite, dim(C([a, b])) = ∞. This space has many finite dimensional subspaces, e.g., the set of all polynomials of degree at most n is a subspace of dimension n + 1 (spanned by {1, x, x2, . . . , xn}).

A major concern of linear algebra is the study of linear maps between finite dimensional linear spaces. We’ve already seen an example of this in the data fitting problem, where the linear map was a matrix. We’ll now see that any linear map can be thought of as a matrix. Let A : V → W be a linear map between a vector space V (dim(V) = n, ~vj, j = 1, . . . , n a basis) and a vector space W (dim(W) = m, ~wk, k = 1, . . . , m a basis). Since ~v ∈ V ⇒ ~v = ∑j xj~vj, we can write the image vector ~w = A(~v) = ∑j xjA(~vj). But each of the n vectors A(~vj) ∈ W can be expressed in terms of the basis ~wk, i.e., we have

A(~vj) = ∑_{s=1}^{m} asj ~ws,

where the m · n coefficients asj are given numbers for a fixed pair of bases and a fixed operator A.


We therefore find that

~w = ∑_{j=1}^{n} xj ∑_{s=1}^{m} asj ~ws = ∑_{s=1}^{m} ( ∑_{j=1}^{n} asj xj ) ~ws.

Since ~w = ∑s ys ~ws with unique coefficients, we have

ys = ∑_{j=1}^{n} asj xj ,   s = 1, . . . , m.

Using matrix multiplication we can write ~w = A(~v) as the matrix equation

[y1, . . . , ym]T = [aij] [x1, . . . , xn]T ,

where the m × n matrix [aij] corresponds to the operator A. It is useful to observe that the jth column of the matrix is the result of applying A to the jth basis vector. The correspondence between a linear map and a matrix is one to many since each time we choose a system of basis vectors for V and W we get, in general, a different matrix representing the map. On the other hand, linear maps can now be thought of as simply matrices – objects we are all familiar with.

As an example, let’s take V = P2 = the set of all polynomials of degree at most 2 (considered as real valued functions on R). This has the basis {e0(x) = 1, e1(x) = x, e2(x) = x2} (note indices run from 0 to 2). We define the operator D : V → W = P1 (with basis vectors e0, e1) by Dp(x) = dp(x)/dx. We have De0 = 0, De1 = e0, De2 = 2e1. Thus if p = c0 + c1x + c2x2 we have Dp = p′ = c′0 e0 + c′1 e1, or in matrix form Dp = D~c, where D is the 2 × 3 matrix with rows (0, 1, 0) and (0, 0, 2), ~c = [c0, c1, c2]T, and [c′0, c′1]T = D~c = [c1, 2c2]T.

Note that Dp = 0 is satisfied not only by the zero vector (polynomial) p ≡ 0, but also by p = c0e0. For any linear function A, we set ker(A) = {~v ∈ V : A~v = 0}, and call this set the kernel or null space of A. If ker(A) = {0}, we say that it is trivial. If ker(A) is trivial then the equation A~v = ~w has at most one solution for a given ~w ∈ W. (Suppose there were two, ~v1 and ~v2; then A(~v1 − ~v2) = ~w − ~w = 0 ⇒ ~v1 − ~v2 ∈ ker(A) ⇒ ~v1 − ~v2 = 0.) If ker(A) is nontrivial, then the solution of A~v = ~w for a given ~w ∈ W is not unique: if ~v1 is one solution, then ~v1 + ~v0 is also a solution for any ~v0 ≠ 0 in ker(A).
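The matrix form of D and its null space can be checked directly in Matlab. This is a small sketch of the computation just described; the sample coefficients are an arbitrary choice:

% Matrix of the differentiation operator D : P2 -> P1 with respect to the
% bases {1, x, x^2} and {1, x}: column j is D applied to the jth basis vector.
D = [0 1 0;
     0 0 2];
c = [1; 2; 3];          % arbitrary example: p(x) = 1 + 2x + 3x^2
disp(D*c)               % [2; 6], i.e. p'(x) = 2 + 6x
disp(null(D))           % basis for ker(D): a multiple of [1; 0; 0], the constant e0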

Up to this point, we have not introduced two very important features of the Euclidean space Rn, and we consider these features now.

1. Rn has a notion of distance defined on it: namely, if ~x ∈ Rn the length of ~x is given by ‖~x‖2 = ( ∑j xj² )^{1/2}, and if ~y is another point the distance between ~x and ~y is defined by ‖~x − ~y‖2. The function ‖ · ‖2 : Rn → R is called the Euclidean norm on Rn.

   Let V be a vector space and assume there is a function ‖ · ‖ : V → R. This function is called a norm on V if it satisfies the following three conditions:


(a) ‖~v‖ ≥ 0, ‖~v‖ = 0⇒ ~v = 0

(b) ‖~v + ~w‖ ≤ ‖~v‖ + ‖~w‖. This inequality is known as the triangle inequality. In R2 with the Euclidean norm it states that the length of any side of a triangle is less than the sum of the lengths of the other two sides. If n > 2, it may not be obvious that the triangle inequality is satisfied by ‖ · ‖2, but we’ll see in a moment that it is.

(c) ‖α~v‖ = |α|‖~v‖.

Other distance functions or norms can also be defined on Rn. For example, it is easy to show that ‖~x‖ = maxi |xi| satisfies the three conditions just given. A vector space on which a norm is defined is called a normed linear space.

2. Rn has a scalar (or inner, or dot) product defined on it by which the idea of angles between vectors may be defined; recall that (at least in two and three dimensions) 〈~x, ~y〉 = ∑_{j=1}^{n} xj yj = ‖~x‖‖~y‖ cos θ, where θ is the angle between ~x and ~y. In general, a scalar product on a vector space V is a function 〈·, ·〉 : V × V → R or C such that (assuming the codomain is R)

   (a) 〈~x, ~y〉 = 〈~y, ~x〉

   (b) 〈α~x, ~y〉 = α〈~x, ~y〉, and 〈~x + ~z, ~y〉 = 〈~x, ~y〉 + 〈~z, ~y〉

   (c) 〈~x, ~x〉 ≥ 0, 〈~x, ~x〉 = 0 ⇒ ~x = 0

The usual scalar product on Rn clearly satisfies these axioms, and we see that ‖~x‖2² = 〈~x, ~x〉, that is, the norm is actually expressible in terms of the inner product. A vector space with an inner product defined on it is called an inner product space. Any inner product space becomes a normed space with ‖~x‖ ≡ √〈~x, ~x〉 as the norm.¹

¹All norm conditions but the triangle inequality are obvious. Define the quadratic Q(α) = 〈~x − α~y, ~x − α~y〉 ≥ 0. Expanding and using the properties of the inner product gives Q = 〈~x, ~x〉 − 2α〈~x, ~y〉 + α²〈~y, ~y〉. The minimum value of Q occurs for α〈~y, ~y〉 = 〈~x, ~y〉. Assuming ~y ≠ 0 this means 〈~x, ~x〉 − 〈~x, ~y〉²/〈~y, ~y〉 = Qmin ≥ 0. From this computation we find the result |〈~x, ~y〉| ≤ ‖~x‖‖~y‖, called the Cauchy inequality. The triangle inequality follows from this and the computation ‖~x + ~y‖² = ‖~x‖² + 2〈~x, ~y〉 + ‖~y‖² ≤ ‖~x‖² + 2|〈~x, ~y〉| + ‖~y‖². Using the Cauchy inequality this last sum is ≤ ‖~x‖² + 2‖~x‖‖~y‖ + ‖~y‖² = (‖~x‖ + ‖~y‖)², and the triangle inequality is proved. (Both inequalities are checked numerically in the sketch below.)
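These definitions are easy to experiment with numerically. The following Matlab sketch, with arbitrary example vectors, checks the Cauchy and triangle inequalities for the Euclidean norm and the triangle inequality for the max norm:

% Norms, inner product, and the Cauchy and triangle inequalities in R^n.
x = [1; -2; 3];                     % arbitrary example vectors
y = [4; 0; -1];
norm2   = @(v) sqrt(sum(v.^2));     % Euclidean norm ||v||_2
normInf = @(v) max(abs(v));         % max norm
ip      = @(u, v) sum(u.*v);        % inner product <u, v>
disp(abs(ip(x, y)) <= norm2(x)*norm2(y))         % Cauchy inequality: 1 (true)
disp(norm2(x + y) <= norm2(x) + norm2(y))        % triangle inequality: 1 (true)
disp(normInf(x + y) <= normInf(x) + normInf(y))  % also holds for the max norm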

An inner product space is the type of space where most of our problems can be posed in a natural way. Aside from finite dimensional spaces like Rn we will also have to contend with infinite dimensional spaces like C([a, b]). This vector space and others like it can be made into inner product spaces in many ways, but one useful, straightforward definition of inner product is

〈f, g〉 = ∫_a^b f(x) g(x) dx.

By the properties of the integral, it is clear that the first two requirements of an inner product are satisfied, and for a continuous function f, ∫_a^b f²(x) dx = 0 implies that f ≡ 0, as you can easily check.
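In Matlab this inner product can be approximated with numerical quadrature; here is a minimal sketch, where the particular functions and interval are arbitrary choices:

% L2 inner product on C([a,b]) evaluated numerically with quadrature.
a = 0;  b = 1;
f = @(x) sin(pi*x);                        % arbitrary example functions
g = @(x) x.^2;
ip  = integral(@(x) f(x).*g(x), a, b);     % <f, g>, approximately 0.1893 here
nrm = sqrt(integral(@(x) f(x).^2, a, b));  % induced norm ||f||, approximately 0.7071
disp(ip)
disp(nrm)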

The general situation that we will deal with in determining approximate solutions to both ordinary and partial differential equations is this: We will be seeking the solution of a linear differential equation and will be able to pose our problem using linear space terminology. That is, the differential equation will be viewed as a linear operator from a vector space V to a vector space W, A : V → W, and for a given ~f ∈ W we have to find a vector ~v ∈ V such that A~v = ~f. For example, consider the boundary value problem

−u′′ = f,   0 < x < 1,   u(0) = u(1) = 0,   f ∈ C([0, 1]) given;


take V = C20([0, 1]) = {u ∈ C2([0, 1]) : u(0) = u(1) = 0}, W = C([0, 1]), and for u ∈ V take Au(x) = −u′′(x). Then the equation and the boundary conditions are reduced to finding the solution of the operator equation Au = f.

It is important to realize that a numerical approximation to a problem of solving the linear equation Au = f for a given f, where the domain space for A has infinite dimension (like C20([0, 1])), can only yield a finite number of values, say uj, j = 1, . . . , n, which may approximate the values of the solution u at certain points in [0, 1]. For our purposes, it is best to view this constraint as having to find an approximation to the solution u in a finite dimensional subspace of V. If V is an inner product space we have the following important approximation theorem.

If V is an inner product space, ~v ∈ V, and W a finite dimensional subspace of V, then the problem of finding a ~w ∈ W such that ‖~v − ~w‖ = minimum has a unique solution ~w∗, i.e., ‖~v − ~w∗‖ = min_{~w∈W} ‖~v − ~w‖.

I won’t try to prove the existence of ~w∗ rigorously. Instead I’ll develop formulas for computing its components (assuming a real vector space). Let ~ej, j = 1, . . . , n be a basis for W, and ~w = ∑j xj~ej be an element of W. We have

d ≡ ‖~v − ~w‖² = 〈~v − ~w, ~v − ~w〉 = ‖~v‖² − 2〈~v, ~w〉 + ‖~w‖²

and introducing the components of ~w

d = d(x1, . . . , xn) = ‖~v‖² − 2 ∑_{j=1}^{n} 〈~v, ~ej〉 xj + ∑_{i,j=1}^{n} 〈~ei, ~ej〉 xi xj.

In order to minimize d we set its partial derivatives with respect to xj, j = 1, . . . , n equal to 0. This gives

∑_{j=1}^{n} 〈~ei, ~ej〉 xj = 〈~v, ~ei〉,   i = 1, . . . , n,   or   G~x = ~y,

with ~x = [x1, . . . , xn]T , ~y = [〈~v,~e1〉, . . . , 〈~v,~en〉]T . These equations are called the Normal Equations.

The matrix G = [〈~ei, ~ej〉] is called the Gram matrix of the basis vectors ~ei. It is relatively easy to show that G is nonsingular, so that the normal equations always have a solution, xj = x∗j, j = 1, . . . , n. In particular, if the basis vectors are mutually perpendicular,

〈~ei, ~ej〉 = gi > 0 for i = j,   〈~ei, ~ej〉 = 0 for i ≠ j,

then the normal equations have the simple solution xi = 〈~v, ~ei〉/gi, i = 1, . . . , n.

Note the following important property of this solution: 〈~v − ~w∗, ~w〉 = 0 for all ~w ∈ W, i.e., the difference vector ~v − ~w∗ is perpendicular to W. In fact, the normal equations can be written as 〈~w∗, ~ei〉 = 〈~v, ~ei〉, i = 1, . . . , n, so that 〈~v − ~w∗, ~ei〉 = 0 for all i, so that ~v − ~w∗ ⊥ W.
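The normal equations are easy to set up and solve in Matlab. The sketch below, with arbitrary vectors in R4, computes the best approximation from a two dimensional subspace and checks the orthogonality property just noted:

% Best approximation of v by elements of W = span{e1, e2} in R^4,
% computed from the normal equations G*x = y.
e1 = [1; 1; 0; 0];      % arbitrary basis vectors for W
e2 = [0; 1; 1; 1];
v  = [1; 2; 3; 4];      % arbitrary vector to approximate
E  = [e1, e2];
G  = E'*E;              % Gram matrix [<ei, ej>]
y  = E'*v;              % right hand side [<v, ei>]
x  = G\y;               % coefficients of the best approximation
wstar = E*x;            % w* = x1*e1 + x2*e2
disp(E'*(v - wstar))    % ~[0; 0]: v - w* is perpendicular to W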

The problem of fitting a straight line to a sequence of data pairs can be viewed as a best approximation problem. Recall that this problem could be expressed as finding a, b such that

b [1, . . . , 1]T + a [x1, . . . , xn]T = b~e + a~x   approximates   ~y = [y1, . . . , yn]T


in the sense that ‖~y − (b~e + a~x)‖ = minimum. This is just the problem we are studying, with W the two dimensional space spanned by ~e, ~x. In this case, the Gram matrix is 2 by 2,

G = ( 〈~e, ~e〉   〈~e, ~x〉 )  =  ( n        ∑i xi   )
    ( 〈~e, ~x〉   〈~x, ~x〉 )     ( ∑i xi    ∑i xi²  )

and the right hand side vector is [〈~e, ~y〉, 〈~x, ~y〉]T = [∑i yi, ∑i xi yi]T.

As another example, assume we want to approximate the transcendental function sin(πx) on the interval [0, 1] by a polynomial of degree at most 2. Here V = C([0, 1]) and W is the three dimensional subspace P2. We will use the L2 norm (the norm induced by the integral inner product above) and take the basis vectors 1, x, x2 for P2. The Gram matrix is

gij = ∫_0^1 x^i x^j dx,   i, j = 0, 1, 2,   or

G = ( 1    1/2  1/3 )
    ( 1/2  1/3  1/4 )
    ( 1/3  1/4  1/5 )

The right hand side vector is [∫_0^1 x^i sin(πx) dx]T = [2/π, 1/π, (π² − 4)/π³]T. Using Matlab we find

>> g=[1 1/2 1/3;1/2 1/3 1/4;1/3 1/4 1/5]

g =

1.0000 0.5000 0.3333

0.5000 0.3333 0.2500

0.3333 0.2500 0.2000

>> rhs=[2/pi, 1/pi, (pi^2-4)/pi^3]’

rhs =

0.6366

0.3183

0.1893

>> y=g\rhs

y =

-0.0505

4.1225

-4.1225

>> xx=linspace(0,1);

>> yy=sin(pi*xx);

>> zz=y(1)*ones(1,100)+y(2)*xx+y(3)*xx.^2;

>> plot(xx,yy,xx,zz)

The plot shows close agreement.

[Figure: sin(πx) and its quadratic least squares approximation plotted together on [0, 1]; the two curves nearly coincide.]