MATHS Extensions for ECON/MOFI Honours

The Greek alphabet: α = alpha, β = beta, γ = gamma, δ = delta, ε = epsilon, ζ = zeta, η = eta, θ = theta, ι = iota, κ = kappa, λ = lambda, µ = mu, ν = nu, ξ = xi, ο = omicron, π = pi, ρ = rho, σ = sigma, τ = tau, υ = upsilon, φ = phi, χ = chi, ψ = psi, ω = omega.

0. LOGIC

Let p and q be two propositions (for example: 2 + 2 = 4; the set A is convex; the function F: X → Y has a maximum; New Zealand's Prime Minister is a woman; all swans are white; all golden mountains are golden; Socrates was mortal; this sentence is not true).

Sufficient conditions: p is a sufficient condition for q means that whenever p is true, so is q. Alternative expressions of the same relationship are p implies q (p ⇒ q), if p then q, or q if p.

Necessary conditions: p is a necessary condition for q means that p must be true whenever q is. Alternative expressions are p is implied by q (p ⇐ q) or q only if p.

For example, "4 < x < 5" is a sufficient but not a necessary condition for "x > 3". Conversely, "x > 3" is a necessary but not sufficient condition for "4 < x < 5".

Equivalent conditions: p is equivalent to q means that p is both necessary and sufficient for q. That is often expressed as p if and only if q, or p iff q (p ⇔ q).

1. ALGEBRA (MI Appendix B; AC Ch 5; TY Ch 10, 11; KL R2-R4, R6; HQ A1, A3; RS Ch 4; AT Ch 0)

A matrix with m rows (horizontal) and n columns (vertical) is said to be m x n. If m = n, then the matrix is square and so has a determinant. The determinant of a matrix can be thought of as a summary statistic. There is a sense in which it is a measure of size (when calculating the inverse matrix). Further, the sign of the determinant (positive or negative) can be thought of as a measure of orientation (when looking at quadratic forms).

For example, let

A = ( a  b )
    ( c  d )

Then det(A) = |A| = a·d - b·c.

More generally, det(A) = Σj (-1)^(i+j)·aij·Aij = Σi (-1)^(i+j)·aij·Aij, where aij is the element in row i, column j, and Aij is the determinant of the matrix found by removing row i and column j. Notice, this is an inductive definition. A more general definition exists, but this will be good enough for our purposes.

If det(A) ≠ 0, then A is said to be nonsingular, and so it has an inverse; there exists a unique matrix, B, such that A·B = I (where I is the identity matrix: a diagonal matrix with 1s on the leading diagonal and zeros everywhere else). Further, B·A = I. B is usually written A^(-1); B is the inverse of A. Finally, if det(A) = 0, then A is singular, and so it does not have an inverse.

A minor of an n x n matrix A is the determinant of any matrix obtained by choosing any h rows and any h columns of A, where 0 < h ≤ n. Here, (n - h) rows (and columns) are eliminated, and an (h x h) sub-matrix remains.

A principal minor is the determinant of a matrix found by choosing the same h rows and columns, so that the diagonal of the sub-matrix is part of the diagonal of A. For example, consider the set {1, 2, 3, ..., n}, and then take the subset {1, 4, 5}. This gives the rows and columns to be used to form a sub-matrix; the determinant of this matrix then yields a principal minor. For example, the principal minors of

A = ( a  b  c )
    ( d  e  f )
    ( g  h  i )

are a, e, i,

| a  b |    | e  f |    | a  c |
| d  e | ,  | h  i | ,  | g  i | ,

and |A|. The k-th leading principal minor of A is the determinant of the first k rows and columns of A. Thus the leading principal minors of the 3 x 3 matrix above are

a ,  | a  b |
     | d  e | ,  and |A|.

A k'th principal minor is said to be of even order if k is even (while it is of odd order if k is odd).

The transpose of a matrix A is written A^T, where A^T is constructed by turning the rows of A into columns (and the columns into rows). That is, (A^T)ij = aji, where aij is the element of A in row i, column j. The matrix A is symmetric if A^T = A. A symmetric matrix has the property that aij = aji.

Suppose x is an n-dimensional column vector with real components. Let x^T be the transpose of x (where x is a column vector, and x^T is a row vector). Then a quadratic form in x is a real-valued expression of the form Q(x) = x^T·A·x (= Σi Σj aij·xi·xj), where A is an n x n matrix which will be taken to be symmetric.

A matrix A is said to be positive definite (PD) if Q(x) = x^T·A·x > 0 for all x ≠ 0. When x = 0, Q(x) = 0. Notice, x ≠ 0 iff at least one of the components of x is non-zero.

Note: Results 1-4 all presume that A is symmetric (A^T = A).

Result 1: A is positive definite if and only if all leading principal minors of A are positive (i.e. > 0). (See note below.)

Example: The leading principal minors of the matrix

A = ( 3  1  0 )
    ( 1  4 -2 )
    ( 0 -2  4 )

are 3,

| 3  1 |
| 1  4 | = 11 ,  and |A| = 32;

all are positive, so A is PD by Result 1.
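These leading-minor tests are easy to verify numerically. Below is a minimal sketch in Python (numpy assumed available; the matrix is the one from the example, and the eigenvalue test is an alternative characterisation of positive definiteness for symmetric matrices):

import numpy as np

A = np.array([[3.0, 1.0, 0.0],
              [1.0, 4.0, -2.0],
              [0.0, -2.0, 4.0]])

# Leading principal minors: determinants of the top-left k x k blocks.
minors = [np.linalg.det(A[:k, :k]) for k in range(1, 4)]
print(minors)                                   # [3.0, 11.0, 32.0] up to rounding

# Result 1: all leading principal minors > 0 iff A is PD.
# Cross-check: a symmetric matrix is PD iff all its eigenvalues are > 0.
print(all(m > 0 for m in minors))               # True
print(bool(np.all(np.linalg.eigvalsh(A) > 0)))  # True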

A is negative definite (ND) if Q(x) < 0 for all x ≠ 0.

Result 2: A is negative definite iff the leading principal minors alternate strictly in sign, with the first one being negative (note that A is ND iff -A is PD).

A is positive semi-definite (PSD) if x^T·A·x ≥ 0 for all x, with x^T·A·x = 0 for some x ≠ 0.

Note: According to this definition a matrix cannot be both PD and PSD. However, some textbooks define semi-definiteness in such a way that there is an overlap. We will use the term non-negative definite (NND) to mean PD or PSD.

Result 3: A is positive semi-definite iff all principal minors (not just the leading ones) are non-negative (≥ 0) with |A| = 0.

Example:

A = ( 1  0 )
    ( 0  0 )

is PSD, but

B = ( 0  0  0 )
    ( 0 -1  1 )
    ( 0  1  2 )

is not PSD even though its leading principal minors are all ≥ 0 (consider x = (0, 1, 0)).

A is negative semi-definite (NSD) if x^T·A·x ≤ 0 for all x, with x^T·A·x = 0 for some x ≠ 0.

Result 4: A is NSD iff the principal minors of even order (k even) are ≥ 0, the principal minors of odd order are ≤ 0, and |A| = 0.

A is non-positive definite (NPD) if it is either ND or NSD. A is indefinite if it is neither NND nor NPD.

Eigenvalues …. Matrix manipulations, solutions to systems of equations …

2. TOPOLOGICAL CONCEPTS (MI Appendix A-3; AT, Ch 0-e, ED, GS)

Topology is used to develop concepts such as closeness, convergence, and continuity. One of the basic concepts in topology is that of an open set. Let [0, 1] = {x ∈ R : 0 ≤ x ≤ 1}, and (0, 1) = {x ∈ R : 0 < x < 1} (a square bracket means include the end point, a round bracket means do not include it; (0, 1] = {x ∈ R : 0 < x ≤ 1}).

Consider a point, x0, in Euclidean n-space, Rn, and ε > 0 (a real number). Let the open ball with radius ε and centre x0 be given by B(x0, ε) = {x ∈ Rn : ||x - x0|| < ε}, where the length of a vector y is given by ||y|| = {Σj (yj)^2}^(1/2). A subset, S, of a Euclidean space Rn is open if for every x0 in S, there exists ε > 0 (notice, ε can vary with x0) such that B(x0, ε) is contained in S. That is, S ⊃ B(x0, ε) = {x : ||x - x0|| < ε}. For example, {x : 0 < x < 1, or 3 < x < 4} is open.

A neighbourhood of a point x is any set with an open subset containing x. That is, S is a neighbourhood of x if there exists U such that (i) U is an open subset of S and (ii) x ∈ U. Thus, any open set is a neighbourhood of each of its points (but a neighbourhood need not be open).

Consider a set T, of Rn. A point, x, is an interior point of T if there exists an open set, S, such that (i) S is a subset of T, and (ii) x ∈ S. x is a boundary point of T if every neighbourhood of x contains points in T and points in Tc = {y ∈ Rn : y ∉ T} (the complement of T).

For example, 0 and 2 are boundary points of the interval (0, 2) (and of the interval [0, 2]).

A subset T, of Rn, is closed if it contains all its boundary points. Alternatively, T is closed if Tc is open. Notice, an open set does not contain any of its boundary points. For example, [0, 2] ∪ [3, 5] = {x : 0 ≤ x ≤ 2 or 3 ≤ x ≤ 5} is closed. The closure of a set S is the smallest closed set containing S; that is, S plus its boundary points. Denote the closure of S by S-. Notice, S- is well-defined, as any closed set containing S must also contain S- (further, S- is closed). Put another way, any closed set containing S must also contain all the boundary points of S.

Consider now a function from Rn to R; F: Rn → R. F is continuous at x if for any neighbourhood of F(x), call it U, there exists a neighbourhood of x, call it V, such that U ⊃ F(V). That is, no matter how small U is, I can find a neighbourhood of x that maps into U. A function, F: X → Y, is continuous (everywhere) if it is continuous at x, for all x ∈ X. Let U be a set in Y. F^(-1)(U) = {x ∈ X : F(x) ∈ U}, often called the pre-image of U. Further, define the graph of F(·) to be {(x, y) ∈ X x Y : y = F(x)}. That is, the graph of F(·) is a subset of X x Y (X x Y is often referred to as the product space), given by all pairs (x, F(x)).

Result 5. (i) F: X → Y, is continuous if and only if for any open set in Y, call it U, the set F^(-1)(U) is open. (ii) F: X → Y, is continuous if and only if for any closed set in Y, call it U, the set F^(-1)(U) is closed. (iii) F: X → Y, is continuous if and only if the graph of F(·) is closed.

Semi-continuity: see Extensions, p.3.

The derivative of a function requires the idea of convergence. A necessary condition for a function to be differentiable at x is that it is continuous at x (but this is not sufficient). Consider F: R → R, and look at G(h) = {F(x + h) - F(x)}/h. Consider now a sequence, {hn}, such that hn → 0. If G(hn) → G*, for every such sequence (G(hn) is convergent, and the limit is unique, for every sequence, {hn}), then this limit is called the derivative of F(·), at x.

F: Rn → R is differentiable if it is differentiable at x, for all x ∈ Rn. Notice, the derivative can be thought of as a function from Rn to Rn; DF: Rn → Rn. F: Rn → R, is continuously differentiable (C1) if it is (i) differentiable at x, for all x ∈ Rn, and (ii) the derivative, DF: Rn → Rn, is continuous. F: Rn → R, is twice continuously differentiable (C2) if it is differentiable, the derivative, DF, is differentiable, and the second-order derivative, D2F: Rn → R^(n^2), is continuous. The matrix of second-order derivatives is often called the Hessian matrix. F is Cr if the partial derivatives of order r exist and are continuous functions.

3. THE IMPLICIT FUNCTION THEOREM (AB pp.24-32; HQ pp.372-375; MI p.499; KL p.327)

Let x ∈ Rm, and y ∈ Rn, and consider a function G: Rm+n → Rm. Alternatively, consider m functions from Rm+n to R; G can be thought of as the list of these m functions, g1, ..., gm. Further, we can think of elements in the domain as (x, y), where x ∈ Rm, y ∈ Rn. Suppose G is continuously differentiable. The derivative, DG, is a function from Rm+n to R^(m·(m+n)). That is, DG(x*, y*) can be thought of as a matrix of derivatives (evaluated at (x*, y*)), with m rows and (m + n) columns; often called the Jacobian.

J ≡ DG(x*, y*) = ( ∂g1/∂x1  ∂g1/∂x2  ...  ∂g1/∂yn )
                 ( ∂g2/∂x1  ∂g2/∂x2  ...  ∂g2/∂yn )
                 (   ...      ...    ...    ...   )
                 ( ∂gm/∂x1  ∂gm/∂x2  ...  ∂gm/∂yn )

Let DxG(x*, y*) be the m by m matrix of derivatives with respect to x (but not y).

Result 6: The implicit function theorem. Let G: Rm+n → Rm, be Cr (r ≥ 1), consider the system of equations G(x, y) = c, and suppose (x*, y*) is a solution. That is, G(x*, y*) = c. If DxG(x*, y*) is nonsingular (the determinant is non-zero), then there exists a neighbourhood of y*, call it U (a subset of Rn), and a Cr function, F: U → Rm, such that G(F(y), y) = c, for all y ∈ U. That is, x can be expressed as a continuous function of y.

To decide whether or not a particular subclass of m variables can be so expressed, calculate the m x m determinant of the corresponding m columns of the Jacobian; if that is non-zero then those variables can be expressed as a function of the others in a neighbourhood of the point at which the determinant was evaluated.

Example: Suppose x, y and z satisfy the equations: x + y + z = 4, and x + y + 2·z = 6.

Then

J = ( 1  1  1 )
    ( 1  1  2 )

It is possible to solve for y and z in terms of x, since the determinant of the second and third columns of J is

| 1  1 |
| 1  2 | = 1.

However, it is not possible to solve for x and y in terms of z (the first two columns are linearly dependent, so their determinant is zero).

Implicit differentiation. Suppose f(x, y) = 0 and you want to know dy/dx. Calculate the total differential: df = fx·dx + fy·dy = 0 (fx, fy the partial derivatives), which yields dy/dx = -fx/fy.
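A numerical sketch of the example above (Python with numpy; the value chosen for x is arbitrary):

import numpy as np

# x + y + z = 4 and x + y + 2z = 6; fix x and solve for (y, z).
J_yz = np.array([[1.0, 1.0],
                 [1.0, 2.0]])    # columns of J for y and z; det = 1, nonsingular

x = 1.0                          # arbitrary illustrative value
rhs = np.array([4.0 - x, 6.0 - x])
y, z = np.linalg.solve(J_yz, rhs)
print(y, z)                      # y = 2 - x = 1.0, z = 2.0

# The x and y columns form [[1, 1], [1, 1]], whose determinant is 0,
# so x and y cannot be solved for in terms of z.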

4. CONCAVE AND CONVEX FUNCTIONS (MI pp.460-467; AB pp.124-128; KL pp.331-334; AT, Ch 0-A-d)

Let X be a subset of Rn. X is convex if for every pair of elements x, y ∈ X and every real number α ∈ (0, 1) ≡ {c : 0 < c < 1} (remember, round brackets mean open, square brackets mean closed, i.e. include the boundary points):

α·x + (1-α)·y ∈ X

i.e. every convex linear combination of points in X is also a member of X.

A function F: X → R, is concave if for every pair of elements x, x' of X and every α ∈ (0, 1):

F(α·x + (1-α)·x') ≥ α·F(x) + (1-α)·F(x')

That is, F(·) is concave if the graph of F(x) lies above the line segment connecting any two points, (x, F(x)) and (x', F(x')).

A function F: X → R, is convex if for every pair of elements x, x' of X and every α ∈ (0, 1):

F(α·x + (1-α)·x') ≤ α·F(x) + (1-α)·F(x')

Put another way, let F: Rn → R, and consider the sets G-(F) = {(x, y) ∈ Rn+1 : y ≤ F(x)} and G+(F) = {(x, y) ∈ Rn+1 : y ≥ F(x)}. F is concave if G-(F) is convex, while F is convex if G+(F) is convex. Concave/convex functions of a single variable look like this:

[Figure omitted: the graph of a concave F(·) lies above the chord joining (x, F(x)) and (y, F(y)); the graph of a convex F(·) lies below it.]

F is strictly concave over X if whenever x and y are two distinct elements of X and α ∈ (0, 1):

F(α·x + (1-α)·y) > α·F(x) + (1-α)·F(y)

If (x, y) and (x', y') ∈ G-(F), then α·(x, y) + (1 - α)·(x', y') is an interior point of G-(F).

F is strictly convex over X if whenever x and y are two distinct elements of X and α ∈ (0, 1):

F(α·x + (1-α)·y) < α·F(x) + (1-α)·F(y)

Note: Any linear function is concave (and convex) but not strictly so.

Result 7: Consider a twice continuously differentiable (C2) function F: X → R. (i) F(·) is concave over an open convex set X iff the Hessian of F is NPD - ND or NSD - over X. F(·) is convex over an open convex set X iff the Hessian of F is NND - PD or PSD - over X. (ii) If the Hessian is ND (PD) over X then F will be strictly concave (strictly convex) over X.

Note: The converse of the last statement is false: F may be strictly concave over X without having a ND Hessian at each point in X. For example, F(x) = 1 - x^4 is strictly concave over (-1, 1), but its Hessian, -12·x^2, is not ND at x = 0.
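A small numerical illustration of that note (a Python sketch; the random test points are arbitrary): F(x) = 1 - x^4 satisfies the strict inequality in the definition on (-1, 1) even though its second derivative vanishes at 0.

import numpy as np

F = lambda x: 1.0 - x**4

rng = np.random.default_rng(0)
x, y = rng.uniform(-1, 1, size=(2, 1000))   # random pairs in (-1, 1)
a = rng.uniform(0.01, 0.99, size=1000)      # interior weights alpha
mask = np.abs(x - y) > 0.1                  # keep clearly distinct pairs

lhs = F(a * x + (1 - a) * y)
rhs = a * F(x) + (1 - a) * F(y)
print(bool(np.all(lhs[mask] > rhs[mask])))  # True: strict concavity holds
print(-12 * 0.0**2)                         # Hessian at x = 0 is 0, not < 0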

5. OPTIMIZATION (AB Ch. 2; MI Ch. 2-4; TY Ch. 5; KL Ch. 2, 4, 5; PS Appendix A; HQ A-3; AD Ch. 8; RS Ch. 11; AT, Ch 1)

Definition: Consider a function F: X → R (X is a subset of Rn). F(·) has a global maximum at a point x* ∈ X if F(x) ≤ F(x*) for all x ∈ X. x* is a global minimum if F(x) ≥ F(x*). Notice, F may have a global maximum at x* and x', with x' ≠ x* (but F(x*) = F(x')). F has a local maximum (relative to X) at x* if F(x) ≤ F(x*) for all x ∈ X sufficiently close to x*. That is, there exists δ > 0 such that for all x satisfying ||x - x*|| < δ, F(x) ≤ F(x*).

Note: Global optima are sometimes referred to as absolute, while local optima are relative.

Result 8 (Weierstrass): If F: X → R is a continuous function defined on a non-empty, compact (closed and bounded) subset X of Rn, then F will have a global maximum (and a global minimum) somewhere in X, either on the boundary or in the interior of X.

Many techniques for solving optimization problems yield local maxima or minima rather than global ones. It is useful to be able to argue that a local optimum is also global.

Result 9: Suppose X is a convex subset of Rn, and F: X → R is a concave function. If F has a local maximum at x* in X, then F has a global maximum over X at x*. Suppose F: X → R is a convex function. If F has a local minimum at x* in X, then F has a global minimum over X at x*. If F is strictly concave (convex), then the global maximum (minimum) is unique.

UNCONSTRAINED OPTIMIZATION

A differentiable function F defined on a subset of Rn will have a local maximum (minimum) at an interior point of its domain if its gradient vector DF is zero there and F is concave (convex) over a neighbourhood of that point. It will be a strict local maximum (minimum) if in addition F is strictly concave (strictly convex) over a neighbourhood of the point.

Result 10: Necessary conditions for a C2 function F: Rn → R, to have a local maximum (minimum) at an interior point, x*, of its domain (in this case, interior means -∞ < x* < ∞), are

(10.1) DF(x*) = 0
(10.2) D2F(x*) is ND or NSD (PD or PSD)

In terms of determinants, the necessary second-order condition for a maximum is that the even-ordered principal minors of the Hessian of F are ≥ 0 while those of odd order should be ≤ 0. (For a minimum all principal minors must be ≥ 0.)

Result 11: Sufficient conditions for a C2 function F: Rn → R, to have a local maximum (minimum) at an interior point x* are

(11.1) DF(x*) = 0

(11.2) D2F(x*) is ND (PD).

Proof: (11.2) implies the Hessian of F is ND in a neighbourhood of x* (by continuity). By Taylor's expansion,

F(x* + y) = F(x*) + DF(x*)·y + (1/2)·y^T·D2F(x*)·y = F(x*) + (1/2)·y^T·D2F(x*)·y < F(x*),

for all y ≠ 0 with ||y|| < ε, for ε small enough (strictly, the Hessian is evaluated at a point between x* and x* + y, which is why negative definiteness on a neighbourhood is what is needed).

Here, Taylor's series comes from integration by parts. For one variable, F(x) = F(0) + ∫₀ˣ DF(v)·dv. Integration by parts says that ∫₀ˣ DG(v)·H(v)·dv = G(x)·H(x) - G(0)·H(0) - ∫₀ˣ G(v)·DH(v)·dv. Taking H = DF and G(v) = v - x:

F(x) = F(0) + ∫₀ˣ DF(v)·dv = F(0) + x·DF(0) + ∫₀ˣ (x - v)·D2F(v)·dv.

Integrating by parts once more, with H = D2F and G(v) = -(x - v)^2/2:

F(x) = F(0) + x·DF(0) + (x^2/2)·D2F(0) + ∫₀ˣ ((x - v)^2/2)·D3F(v)·dv. •
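A quick numerical check of that last identity (a sketch using Python with scipy; the test function F = exp and the point x = 1 are arbitrary choices, convenient because all of F's derivatives are again exp):

import numpy as np
from scipy.integrate import quad

F = np.exp            # F = DF = D2F = D3F = exp for this test function
x = 1.0

remainder, _ = quad(lambda v: ((x - v)**2 / 2.0) * F(v), 0.0, x)
taylor = F(0.0) + x * F(0.0) + (x**2 / 2.0) * F(0.0) + remainder
print(taylor, F(x))   # both equal e = 2.71828...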

Note 1: If (11.1) and (11.2) hold then the local maximum (minimum) at x* is strict; F(x*) > F(x), for all x ≠ x* such that x is close to x*.

Note 2: Suppose the function, F: Rn → R, can be characterized by m parameters, y ∈ Rm. That is, there exists a function, G: Rn+m → R, such that G(·, y) = F(·): Rn → R. If G is C2, and F is strictly concave (convex) in x - for any y - then DxG(x, y) = 0 yields a C1 function in (x, y). Hence the implicit function theorem can be invoked to show that the unconstrained maximum (minimum) can be expressed as a C1 function of the parameters; written x*(y).

OPTIMIZATION WITH EQUALITY CONSTRAINTS

Suppose we want to maximise (minimize) a function F: Rn → R, subject to m constraints g1(x) = b1, g2(x) = b2, ..., gm(x) = bm, or G(x) = b, where G: Rn → Rm, and where F, g1, ..., gm are all C2 (twice continuously differentiable). Notice, we have n variables (x1, ..., xn), and m constraints, G(x) = b. If F has a constrained max (or min) at an interior point x* of its domain (in this case, x* is finite), and the m x n Jacobian matrix DG(x*) has rank m at x*, then there will exist an m-dimensional row vector λ* such that:

(A) DF(x*)^T - λ*^T·DG(x*) = 0^T

(For a proof see the Appendix or MI Chapter 3.) This is the justification behind the Lagrange multiplier procedure, i.e. forming the Lagrangian function

L(x, λ) = F(x) + λ^T·(b - G(x))

and solving the equations

(B) DxL(x*, λ*) = 0, and DλL(x*, λ*) = 0.

Note: DxL(x*, λ*) = 0 is equivalent to (A) above, while DλL(x*, λ*) = 0 reduces to the constraint equations, G(x) = b. Equations (B) are first-order necessary conditions for a maximum or a minimum; identification of the type of (local) optimum may usually be accomplished by means of the following set of sufficient conditions.

Result 12: If F is C2 and G is C2 with rank DG = rank DxG = m (the matrix of partials is of full rank; the rationale for this requirement is pointed out in the Appendix), then F will have a strict local maximum (minimum) at a point x* if there exists λ* such that

(12.1) DxL(x*, λ*) = 0, and DλL(x*, λ*) = 0.

(12.2) The matrix D2xL(x*, λ*) = D2xF(x*) - λ*·D2xG(x*) is ND (PD) at x* subject to the m conditions DG(x*)·dx = 0 (this ensures dx moves so as to preserve G(x) = b). That is, for all dx ≠ 0 such that DG(x*)·dx = 0, (dx)^T·{D2xL(x*, λ*)}·dx < 0.

(12.2) can be expressed in terms of determinants as follows (see, for example, GD):

(12.2)' The last (n - m) leading principal minors of the bordered Hessian (of size (m + n))

( 0m,m      DxG  )
( (DxG)^T   D2xL )

strictly alternate in sign, with the first having the sign of (-1)^(m+1) and the last the sign of (-1)^n (for a minimum they should all have the sign of (-1)^m).

For example, if F is a function of 3 variables and there is a single constraint g(x) = b, then condition (12.2)' for a maximum is

| 0   g1  g2  |
| g1  L11 L12 | > 0 ,
| g2  L21 L22 |

| 0   g1  g2  g3  |
| g1  L11 L12 L13 |
| g2  L21 L22 L23 | < 0
| g3  L31 L32 L33 |

while for a minimum the condition is that the two determinants should both be negative.

Note: Result 12 gives sufficient conditions, not necessary ones; there may still be a maximum or minimum even if one or more of the leading principal minors should equal zero.

INTERPRETATION OF THE LAGRANGE MULTIPLIER

Let F* denote the maximum value of F(x) subject to G(x) = b. F* can be interpreted as a function of the parameters, b. That is, F*(b). Then it can be shown that the Lagrange multiplier vector corresponding to the optimal x* is

λ* = DbF*(b) = ∂F*/∂b

i.e. λ*j is the rate of increase (sensitivity) of the optimal value of the objective function with respect to increases in the constant for the j'th constraint. (For a minimization problem, λ*j is the shadow price of the quantity limited by the j'th constraint.)

Proof: (Also a proof of the Envelope Theorem; see Varian, Appendix A.13.) The optimal x is x* ≡ {x : max. F(x), G(x) = b} = x(b), which is often stated

x(b) ≡ ARG max {F(x) : G(x) = b}

where ARG stands for the argument of the function F(·). To say x is an argument of F(·) means F is a function of x. The optimal value of the objective function is

F(x(b)) ≡ F(x(b)) + λ(b)^T·{b - G(x(b))}

because b - G(x(b)) = 0. Note that λ(b) is the optimal value of λ and so it is also a function of b. Now, change bk. The effect of this on the objective function will be

∂F(x(b))/∂bk = ∂{F(x(b)) + λ(b)^T·[b - G(x(b))]}/∂bk
             = DF·(∂x/∂bk) + (∂λ(b)^T/∂bk)·(b - G(x(b))) + λk(b) - λ(b)^T·DG·(∂x/∂bk)
             = λk(b) + (DF - λ(b)^T·DG)·(∂x/∂bk)    (as b - G[x(b)] = 0)
             = λk(b)    (from the first-order conditions (A)).

This establishes the result. The envelope theorem would get there directly by using the fact that changes in x(b) and λ(b) can be ignored when differentiating F(x(b)) with respect to one of the b's (the constraining variables).
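A numerical sketch of this interpretation (Python with scipy; the problem max x1·x2 subject to 2·x1 + x2 = b is an illustrative choice, with analytic solution x = (b/4, b/2), F*(b) = b^2/8 and λ* = b/4):

import numpy as np
from scipy.optimize import minimize

def Fstar(b):
    # maximize x1*x2 subject to 2*x1 + x2 = b (minimize the negative)
    res = minimize(lambda x: -x[0] * x[1], x0=[1.0, 1.0],
                   constraints=[{"type": "eq",
                                 "fun": lambda x: 2 * x[0] + x[1] - b}])
    return -res.fun

b, h = 4.0, 1e-4
print((Fstar(b + h) - Fstar(b)) / h)   # ~ 1.0: finite-difference dF*/db
print(b / 4)                           # analytic multiplier, also 1.0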

OPTIMIZATION UNDER INEQUALITY CONSTRAINTS

Let F: Rn → R, and G: Rn → Rm. Consider the optimization of F(x) subject to the inequality constraints G(x) ≥ b (which can include non-negativity constraints x ≥ 0). Let G(x) = (G1(x), ..., Gm(x)) ∈ Rm.

Note: An equality constraint Gj(x) = bj can be written as two inequality constraints: Gj(x) ≥ bj and -Gj(x) ≥ -bj.

Result 13: Suppose F, G1(x), ..., and Gm(x) are (i) C1, (ii) concave functions, and (iii) there exists x0 such that G(x0) » b (called Slater's Constraint Qualification; weaker constraint qualifications are possible, so that failure of Slater's CQ does not necessarily rule out the Lagrange multiplier method). Suppose x* is an interior point of the domain (in this case the domain is Rn). If G(x*) ≥ b and F(x*) ≥ F(x), for all x such that G(x) ≥ b, then there exists a vector λ* such that the Lagrangian function

L(x, λ) = F(x) + λ^T·(G(x) - b)

satisfies the following conditions at (x*, λ*):

(1) DxL(x*, λ*) = 0 (that is, DF(x*) + λ*^T·DG(x*) = 0^T)
(2) DλL(x*, λ*) ≥ 0 (that is, G(x*) ≥ b)
(3) λ*^T·DλL(x*, λ*) = 0
(4) λ* ≥ 0

Conditions (1)-(4) are called the Kuhn-Tucker conditions. See, for example, AT, pp.72-73 and 90-92; MI pp.56-58. Notice, equations (2)-(4) imply that for each j = 1, ..., m, either λ*j = 0 or Gj(x*) = bj (or both). The interpretation of the optimal values of the Lagrange multipliers is the same as for the case of equality constraints:

λ*j = -∂F*/∂bj

(as an increase in bj reduces the set of feasible x-values).

Result 13 is derived using a Separating Hyperplane Theorem such as the following:

Let X be a non-empty, convex set in Rn, such that 0 ∉ X. Then there exists p ∈ Rn, with p ≠ 0 and ||p|| < ∞, such that p·x ≤ p·0 = 0, for all x ∈ X. That is, the hyperplane, p·x = 0, separates X from 0.

Now, let x* be such that G(x*) ≥ b, and F(x*) ≥ F(x), for all x such that G(x) ≥ b. Consider the new function, H(x) = {F(x) - F(x*), G1(x) - b1, ..., Gm(x) - bm} ∈ Rm+1. First, there exists no x ∈ Rn such that H(x) » 0. Second, let Zx = {z ∈ Rm+1 : z « H(x)}, and define Z = ∪x∈Rn Zx. Notice, 0 ∉ Z and Z is convex - as H is a concave function. By the separating hyperplane theorem, there exists p such that p·z ≤ 0, for all z ∈ Z. As zi can be made as negative as we like, it follows that pi ≥ 0, for all i = 1, ..., m + 1 (if pi < 0, fix zj, for all j ≠ i, and let zi become very negative, yielding p·z > 0). Let Z* be the closure of Z (to consider z ≤ H(x)). By continuity, p·z ≤ 0, for all z ∈ Z*. Changing the notation slightly, there exists (q, p) such that q·{F(x) - F(x*)} + p·{G(x) - b} ≤ 0, for all x ∈ Rn. Further, if there exists x0 ∈ Rn such that G(x0) » b (Slater's constraint), then q ≠ 0 (if q = 0, then p ≠ 0 - with p ≥ 0 - and so p·{G(x0) - b} > 0; a contradiction), and so:

F(x) - F(x*) + (p/q)·{G(x) - b} ≤ 0. Let (p/q) = λ* ≥ 0.

Hence, F(x) + λ*·{G(x) - b} ≤ F(x*), for all x ∈ Rn. Now, consider x = x*; F(x) + λ*·{G(x) - b} ≤ F(x*) implies λ*·{G(x*) - b} = 0. Further, as λ* ≥ 0 and {G(x*) - b} ≥ 0, it must be the case that, for each i, either Gi(x*) - bi = 0 or λ*i = 0. Finally, λ·{G(x*) - b} ≥ λ*·{G(x*) - b} = 0, for all λ ≥ 0. Hence:

(i) F(x) + λ*·{G(x) - b} ≤ F(x*) + λ*·{G(x*) - b}, for all x ∈ Rn.
(ii) F(x*) + λ*·{G(x*) - b} ≤ F(x*) + λ·{G(x*) - b}, for all λ ≥ 0.

That is, the point (x*, λ*) yields a saddlepoint for F(x) + λ·{G(x) - b}; varying x does not increase the function, while varying λ does not decrease it. The final step in establishing Result 13 involves showing that a saddlepoint satisfies the Kuhn-Tucker conditions. To maximize with respect to x requires the first-order condition, DF(x*) + λ*·DG(x*) = 0, as x is unconstrained, while to minimize with respect to λ ≥ 0 requires, for each i, either Gi(x*) - bi = 0 or λ*i = 0.

Corollary to Result 13: Consider now the problem of maximizing F(x), subject to G(x) ≤ b and x ≥ 0. Notice, the non-negativity constraints are no longer included in G(·), and we look at b - G(x) ≥ 0, rather than G(x) - b ≥ 0. Suppose F, (-1)·G1(x), ..., and (-1)·Gm(x) are (i) C1, (ii) concave functions (and so each Gi(x) is convex), and (iii) there exists x0 such that G(x0) « b and x0 » 0. If x* ≥ 0, G(x*) ≤ b, and F(x*) ≥ F(x), for all x such that x ≥ 0 and G(x) ≤ b, then there exists a vector λ* such that the Lagrangian function

L(x, λ) = F(x) + λ^T·(b - G(x))

satisfies the following conditions at (x*, λ*):

(1') DxL(x*, λ*) ≤ 0 (that is, DF(x*) - λ*^T·DG(x*) ≤ 0^T)
(2') DxL(x*, λ*)·x* = 0
(3') x* ≥ 0
(4') DλL(x*, λ*) ≥ 0 (that is, b - G(x*) ≥ 0)
(5') λ*^T·DλL(x*, λ*) = 0
(6') λ* ≥ 0

Notice, either ∂L/∂xi = 0 or xi = 0, and either Gj(x*) = bj or λ*j = 0.

Note: To derive the Kuhn-Tucker conditions for a minimization problem - minimize F: Rn → R, subject to G(x) ≥ b (where F(·) is convex and each Gi(x) is concave) - simply maximize (-1)·F. This yields the Lagrangian

L = (-1)·F(x) + λ^T·(G(x) - b)

in which case equation (1) becomes

(1)" DxL(·) = 0 (that is, (-1)·DF(x*) + λ*^T·DG(x*) = 0^T)

The interpretation of the optimal values of the Lagrange multipliers is now λ*j = ∂F*/∂bj.

Return to the original problem: maximise F(x) subject to the inequality constraints G(x) ≥ b (F: Rn → R, and G: Rn → Rm).

Result 14: If F(x), G1(x), ..., and Gm(x) are (i) C1, and (ii) concave functions, then the Kuhn-Tucker conditions, (1)-(4), are sufficient for a global maximum.

In simple cases it may be possible to solve the Kuhn-Tucker conditions by following these steps:

(1) Draw a diagram of the feasible region and hence (or otherwise) identify the corner-points.
(2) Try for an interior solution by solving DxF(x*) = 0.
(3) If (2) fails, try for a solution at a likely corner-point, setting xi = 0 or Gj(x) = bj as appropriate. Here, a corner solution means the constraints (holding with equality) completely characterise x*. Recall that λ*j = 0 if Gj(x*) > bj.
(4) If (3) fails, try for a solution on a likely part of the boundary (some constraints hold with equality, but not enough to completely characterise x*).
(5) Having found a solution, try to show that it gives a global optimum, using Result 14 (or extensions - see next section) or heuristic arguments.

Example: Maximise 6·x1 - 2·x1^2 + 2·x1·x2 - 2·x2^2 subject to 3·x1 + 4·x2 ≤ 6, x1 + 2 ≥ 4·x2^2, x1 ≥ 0, x2 ≥ 0.

L = 6·x1 - 2·x1^2 + 2·x1·x2 - 2·x2^2 + λ1·(6 - 3·x1 - 4·x2) + λ2·(x1 + 2 - 4·x2^2) + µ1·x1 + µ2·x2.

[Diagram omitted: the feasible region in the (x1, x2) plane, bounded by the line 3·x1 + 4·x2 = 6, the parabola x1 + 2 = 4·x2^2 and the axes, with corner-point A = (2, 0) on the x1-axis and corner-point B where the line and the parabola intersect.]

Solving ∂F/∂x1 = ∂F/∂x2 = 0 gives x1 = 2, x2 = 1, but this point is not feasible (3·2 + 4·1 > 6).

Trying for a solution at A = (2, 0), we set x1 = 2, x2 = 0, λ2 = 0, and µ1 = 0, and solve

∂L/∂x1 = 6 - 4·x1 + 2·x2 - 3·λ1 + λ2 + µ1 = 0

That gives λ1 = -2/3 < 0, so A cannot be a solution.

Trying for a solution on the boundary segment between A and B, we set λ2 = µ1 = µ2 = 0 and solve

∂L/∂x1 = 6 - 4·x1 + 2·x2 - 3·λ1 = 0
∂L/∂x2 = 2·x1 - 4·x2 - 4·λ1 = 0
3·x1 + 4·x2 = 6

That gives the solution x1 = 54/37, x2 = 15/37, λ1 = 12/37, satisfying all the Kuhn-Tucker conditions.

The objective function is strictly concave and both constraint functions are concave (in the G(x) ≥ b form), so we have the unique global maximum by Result 14. Alternatively, the result is geometrically clear, since the contours of F are ellipses centred on (2, 1).
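A numerical cross-check of this example (a sketch using Python with scipy; the SLSQP solver and the starting point are arbitrary choices):

import numpy as np
from scipy.optimize import minimize

F = lambda x: 6*x[0] - 2*x[0]**2 + 2*x[0]*x[1] - 2*x[1]**2

cons = [{"type": "ineq", "fun": lambda x: 6 - 3*x[0] - 4*x[1]},    # 3x1 + 4x2 <= 6
        {"type": "ineq", "fun": lambda x: x[0] + 2 - 4*x[1]**2}]   # x1 + 2 >= 4x2^2

res = minimize(lambda x: -F(x), x0=[0.5, 0.5], bounds=[(0, None), (0, None)],
               constraints=cons, method="SLSQP")
print(res.x)                      # ~ [1.4595, 0.4054]
print(np.array([54, 15]) / 37)    # the analytic solution (54/37, 15/37)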

6. QUASICONCAVITY (AE; MI p.464; RS pp.377-391; AT, Ch 1-E)

Let X be a convex subset of Rn, and F: X → R, a function. F is quasiconcave on X if for every distinct pair x, y ∈ X and every real number α ∈ (0, 1):

F(α·x + (1-α)·y) ≥ min {F(x), F(y)}

That is, on the straight line between x and y, F(·) never falls below the minimum of F(x) and F(y). Similarly, F is quasiconvex if, for x, y, and α as above, F(α·x + (1-α)·y) ≤ max {F(x), F(y)}.

Result 15: Let X be a convex subset of Rn, and F: X → R, a function. F is quasiconcave over X iff {x ∈ X : F(x) ≥ c} is a convex set for all real numbers c.

In other words, a function is quasiconcave if its AsGood sets, {x ∈ X : F(x) ≥ c}, are all convex (for quasiconvexity the AsBad sets, {x ∈ X : F(x) ≤ c}, are all convex). What quasiconcavity means is that the function is essentially unimodal (single-humped) in every direction; i.e. the graph may continue to rise indefinitely, but once it starts decreasing it may not increase again. Thus these are graphs of quasiconcave functions of one variable (or profiles of quasiconcave functions of more than one variable in a given direction):

[Graphs omitted: single-peaked or monotone profiles, which are quasiconcave.]

but the following are not:

[Graphs omitted: profiles that decrease and then increase again.]

Note: (i) F is strictly quasiconcave (strictly quasiconvex) if F(α·x + (1-α)·y) > min {F(x), F(y)} (respectively, < max {F(x), F(y)}) for all distinct x, y and all α ∈ (0, 1). (ii) F is quasiconvex iff -F is quasiconcave. (iii) A function G: Rn → Rm, is quasiconcave (quasiconvex) if each Gj is. (iv) Any concave function is also quasiconcave, since α·F(x) + (1-α)·F(y) ≥ min {F(x), F(y)}. However, the converse is false. For example, F(x1, x2) = x1·x2 is quasiconcave on the non-negative orthant but not concave. It is possible for even a convex function to be quasiconcave; for example, F(x) = x^2 is strictly quasiconcave over (0, ∞). In general, any monotone function of a single variable, x, will be both quasiconcave and quasiconvex. Further, any linear function will be both quasiconcave and quasiconvex.

Result 16: A non-decreasing transformation of a concave (or quasiconcave) function will be quasiconcave. That is, consider F: Rn → R, concave; G: R → R, non-decreasing (x ≥ y implies G(x) ≥ G(y)); and H: Rn → R, given by H(x) = G{F(x)}. Then H(·) is quasiconcave.
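A numerical sketch of note (iv) above (Python with numpy; the test points are arbitrary): on the positive orthant, F(x1, x2) = x1·x2 satisfies the quasiconcavity inequality everywhere, yet the concavity inequality fails along the diagonal.

import numpy as np

F = lambda p: p[..., 0] * p[..., 1]

x, y = np.array([1.0, 1.0]), np.array([3.0, 3.0])
mid = 0.5 * x + 0.5 * y
print(F(mid) >= min(F(x), F(y)))           # True: quasiconcavity inequality
print(F(mid) >= 0.5 * F(x) + 0.5 * F(y))   # False: concavity fails (4 < 5)

# Many random pairs in the positive orthant; the min-inequality always holds.
rng = np.random.default_rng(1)
P, Q = rng.uniform(0.1, 5.0, size=(2, 1000, 2))
a = rng.uniform(0.05, 0.95, size=(1000, 1))
print(bool(np.all(F(a * P + (1 - a) * Q) >= np.minimum(F(P), F(Q)))))  # True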

Result 17: Suppose ∂F/∂xi does not change sign throughout X. Then a sufficient condition for F to be quasiconcave over X is that (-1)^r·Dr should be positive for all x ∈ X and r = 1, 2, ..., n, where

Dr = | 0   F1  F2  ...  Fr  |
     | F1  F11 F12 ...  F1r |
     | ..  ..           ..  |
     | Fr  Fr1 Fr2 ...  Frr |

Note:
1. n must be ≥ 2 for this result to hold.
2. The condition requires that the last n leading principal minors of a bordered Hessian should alternate -, +, -, +, ... This condition is sufficient for the quadratic form dx^T·D2F·dx to be negative for all dx ≠ 0 satisfying DxF·dx = 0, i.e. for F to be strictly concave along the tangent to its contour (which means that the contour should lie on the preferred side of the tangent).
3. Necessary conditions are that (-1)^r·Dr ≥ 0 for all x ∈ X and r = 1, ..., n.
4. Arrow and Enthoven state a result very like Result 17, but with X = {x : x ≥ 0} and without any restriction on the sign of ∂F/∂x (AE pp. 781-2).
5. Often you can show that the conditions of Result 17 hold in the open orthant {x : x » 0} but not on the axes. In such cases you can use the following result, provided the function is continuous.

Result 18: If F is quasiconcave over a set S and continuous over its closure S-, then F is also quasiconcave over S-.

QUASICONCAVITY AND SUFFICIENCY RESULTS

Some of our earlier results may be extended by weakening concavity requirements to quasiconcavity. For example, the local-global theorem (Result 9) still holds if F is strictly quasiconcave, and the global maximum will be unique (quasiconcavity alone is not enough to ensure a global maximum). Result 14 on the sufficiency of the Kuhn-Tucker conditions may be extended by requiring only that the constraint functions be quasiconcave. The concavity requirement on F may be weakened to quasiconcavity provided that an additional condition is imposed.

Result 19: Suppose F is a C1 quasiconcave function and each Gj is a C1 quasiconcave function (j = 1, 2, ..., m). If the Kuhn-Tucker conditions hold at (x*, λ*) and DxF(x*) ≠ 0, then F has a global maximum at x* subject to G(x) ≥ b.

The role of the condition DxF(x*) ≠ 0 is to rule out F(x*) lying on a plateau that is not a global maximum (F(x) = F(x*) for x near x*, while there exists x+ such that F(x+) > F(x*)). (See Arrow and Enthoven (p. 783) for alternative conditions.)

APPENDIX (MI p. 32)

The rationale for the procedure used to optimise constrained problems by means of Lagrangean techniques is based on the ability to convert the constrained problem into an unconstrained problem and apply techniques for unconstrained problems. Let x ∈ Rm and y ∈ Rn, F: Rm+n → R, and G: Rm+n → Rm. We wish to

Maximise F(x, y) subject to G(x, y) = b

Now if we could solve for x from the m constraints, which we could express as

x = H(b, y)    (A1)

then we could substitute x in F and end up with the unconstrained problem

max F(H(b, y), y)    (A2)

which involves the choice of y ∈ Rn. The optimal value of x is then obtained by substituting the optimal value for y in (A1). Notice that the requirement of Result 12 that the rank of DG be m means, by the implicit function theorem, that H(·) in (A1) exists. Substituting H(·) in the constraint equations we get

G(x, y) = G(H(b, y), y) = b

which upon differentiation by y yields DxG·DyH + DyG = 0, which (as the m x m Jacobian DxG is nonsingular, by assumption) means that

DyH = (-1)·(DxG)^(-1)·DyG    (A3)

Now a necessary condition for a solution to (A2) yields the first-order condition:

(DyF)^T + (DxF)^T·DyH = 0, or (from (A3))
(DyF)^T - (DxF)^T·(DxG)^(-1)·DyG = 0.    (A4.1)

By definition (as A^(-1)·A = I for any nonsingular matrix A)

(DxF)^T - (DxF)^T·(DxG)^(-1)·DxG = 0    (A4.2)

If we define the Lagrange multiplier to be λ^T = (DxF)^T·(DxG)^(-1)

then the first-order conditions for the unconstrained problem (A2) are

(DyF)^T - λ^T·DyG = 0, and (DxF)^T - λ^T·DxG = 0,

which are those we would obtain by setting up a Lagrangean.

REFERENCES

(AE) Arrow, K. and Enthoven, A., "Quasi-concave programming", Econometrica 29 (1961), 779-800.
(AB) Benavie, A., Mathematical Techniques for Economic Analysis (HB 74/M3/B456/M)
(AC) Chiang, A., Fundamental Methods of Mathematical Economics (HB135 C532 F)
(GD) Debreu, G., "Definite and semi-definite quadratic forms", Econometrica 20 (1952), 295-300.
(ED) Dierker, E., Topological Methods in Walrasian Economics (HB135 D563 T)
(AD) Dixit, A.K., Optimization in Economic Theory (HB135/F619/0.)
(HQ) Henderson, J. and Quandt, R., Microeconomic Theory (3rd edition) (HB171/H496/M)
(MI) Intriligator, M., Mathematical Optimization and Economic Theory (HB74/M3/161/M)
(KL) Lancaster, K., Mathematical Economics (HB74/M3/L244/M)
(RS) Roberts, B. and Schulze, D., Modern Mathematics and Economic Analysis (HB74/M3/R643.)
(PS) Samuelson, P., Foundations of Economic Analysis (HB171/S193/F)
(GS) Simmons, G.F., Introduction to Topology and Modern Analysis (QA611 S592 I)
(AT) Takayama, A., Mathematical Economics (2nd edition) (HB135/T136/M)
(TY) Yamane, T., Mathematics for Economists (2nd edition) (HB74/M3/Y19/M)


EXERCISES

1. Identify the following matrices as PD, PSD, ND, NSD, or none of these:

A = ( 1  2 )    B = ( 5  0 )    C = ( -2  1 )
    ( 2  3 )        ( 0  0 )        (  1 -2 )

D = ( -2 -1 )   E = (  4 -1 )   F = ( 0  0 )
    ( -1 -2 )       ( -1  4 )       ( 0  0 )

G = ( 1  0  1 )   H = (  1  0 -2 )   J = ( 1  2  5 )
    ( 0  2  1 )       (  0  4  4 )       ( 2  4 10 )
    ( 1  1  1 )       ( -2  4  5 )       ( 5 10 10 )

2. (a) Under what conditions will a square matrix possess an inverse? (b) Which of the following types of matrices will have inverses? (i) PD, (ii) PSD, (iii) ND, (iv) NSD, (v) NND, (vi) indefinite matrices. Explain your answers.

3. Give the second-order partial derivatives and also the second total differential of these functions:

f(x1, x2) = 3·x1^2 - 5·x1/x2 + 7·x1·x2^4 + 11

g(x1, x2, x3) = (2·x1)^(1/2) + 4·x2·log(x3)

4. (a) Find the solutions for y and z in terms of x for the system of equations in the example in Section 3, and try to solve the system for x and y in terms of z.

(b) Suppose w, x, y and z satisfy the equations

7·w - x - y + 4·z = 0
10·w - 2·x + y + z = 0
6·w + 3·x - 2·y - 11·z = 0

Use the implicit function theorem to verify that w, x and y can be expressed as functions of z, and calculate these functions.

(c) Suppose x, y and z satisfy the equations

x + y + z = 1
x^2 + y^2 + z^2 = 1

Is it possible to solve for x and y in terms of z? (Use the implicit function theorem, but also draw a diagram.)

5. Is q = u1^2 + 6·u2^2 + 3·u3^2 - 2·u1·u2 - 4·u2·u3 a convex function?

6. What does the Weierstrass Theorem tell you about the existence of solutions to these problems:

(i) Maximise x·e^(-x^2) subject to 0 < x < ∞
(ii) Minimise x·e^(-x^2) subject to 0 < x < ∞
(iii) Maximise x1^2 + 2·x1·x2 + 4·x2^2 subject to 3·x1 + 4·x2 = 7
(iv) Minimise the same function subject to the same constraint

Do solutions exist?

7. Obtain solutions to these problems:

(i) Maximise x1^2 + x2^2 for 2 ≤ x1 ≤ 4, 3 ≤ x2 ≤ 5
(ii) Minimise 2·x^2 + 3·x for -10 ≤ x ≤ 10
(iii) Maximise 2·(x1)^(1/2)·(x2)^(1/3) - x1 - x2 for x1 ≥ 0, x2 ∈ R
(iv) Minimise (x1)^(-1/4)·(x2)^(-1/2) subject to 2·x1 + 12·x2 = 4

Justify your answers with reference to second-order conditions.

8. Consider the following problem:

max 2·x1 + x2 subject to x2 - (1 - x1)^3 ≤ 0 (i.e. x2 ≤ (1 - x1)^3), x1 ≥ 0, x2 ≥ 0

(a) Use Weierstrass' Theorem to show that the problem has a solution.
(b) Solve the problem graphically.
(c) Show that the Kuhn-Tucker conditions are not satisfied at the optimum point.
(d) Reconcile the above with Result 13.

9. Solve the following problems using the Kuhn-Tucker conditions:

(a) Maximise x1^2 + x2^2 subject to 2 ≤ x1 ≤ 4, 2 ≤ x2 ≤ 4.

(b) max 2·x1 + x2 subject to

4·x1^2 + x2^2 ≤ 16
x1 - 2·x2 ≥ -6
x1 + x2 ≥ 3
x1 ≥ 0, x2 ≥ 0

10. Explain, without solving, whether or not the following problem has a solution:

min 3·x1^2 + x2^2 subject to

x1 + 2·x2 < 10
x1 + x2 ≥ 4
x1 ≥ 0

11. Express the following problems in the form max F(x) subject to g(x) ≤ b, x ≥ 0, and write down the Kuhn-Tucker conditions in each case:

(a) max 4·x1^3 + 5·x1·x2 subject to

x1^3 + x2 = 8
0 ≤ x1 ≤ 3, x2 ≥ 0
x1 + x2 ≥ -1

(b) min x1^2 + 3·x1·x2 + 5·x2^3 subject to

x1 ≥ -2, x2 ≥ 0
2·x1 - x2 ≥ -4

12. Identify whether the following functions are quasiconcave, quasiconvex, or both/neither:

(a) x + x^3
(b) (x - 1)^3
(c) x1·x2^2
(d) x^2/(1 + x^2)
(e) x1·x2·x3
(f) x2/x1
(g) e^(-(x1^2 + 2·x2^2))

13. Consider the function

F(x) = { -(x+1)^2   if x < -1
       {  0         if -1 ≤ x ≤ 1
       { (x-1)^2    if x > 1

Show that F is quasiconcave and has a local maximum at x = 0, but does not have a global maximum there. (cf. the remarks following Result 19.)

14. Consider the problem

max (x - 2)^3 subject to 0 ≤ x ≤ 3

(a) Show that the Kuhn-Tucker conditions hold at x = 2 and that F is quasiconcave, but that F does not have a global maximum at x = 2.
(b) Reconcile the above with Result 19.

ANSWERS

1. A - none; B - PSD; C - ND; D - ND; E - PD; F - PSD and NSD; G - none; H - none; J - none.

2. (a) A non-zero determinant. (b) (i) Yes, (ii) No, (iii) Yes, (iv) No, (v) and (vi) cannot say (consider G and H above). To see (ii) and (iv), I will appeal to a result (there may be an easier way, but it is not obvious to me): if A is a nonsingular, symmetric, positive semi-definite matrix, then A = D^T·D for some nonsingular matrix D (see, for example, Strang, p.36). Now, with y = D·x, x^T·A·x = x^T·D^T·D·x = y^T·y, and y^T·y = 0 implies y = 0, and so x = 0. Hence a nonsingular matrix cannot be PSD (it would be PD); similarly for NSD.

3. f11 = 6; f12 = f21 = 5/x2^2 + 28·x2^3; f22 = -10·x1/x2^3 + 84·x1·x2^2.

d2f = 6·dx1^2 + 2·(5/x2^2 + 28·x2^3)·dx1·dx2 + (-10·x1/x2^3 + 84·x1·x2^2)·dx2^2.

The Hessian of g is

( -(2·x1)^(-3/2)   0       0          )
(  0               0       4/x3       )
(  0               4/x3   -4·x2/x3^2  )

d2g = -(2·x1)^(-3/2)·dx1^2 + 2·(4/x3)·dx2·dx3 - (4·x2/x3^2)·dx3^2.
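These derivatives are easy to machine-check (a sketch using Python with sympy):

import sympy as sp

x1, x2, x3 = sp.symbols("x1 x2 x3", positive=True)

f = 3*x1**2 - 5*x1/x2 + 7*x1*x2**4 + 11
print(sp.hessian(f, (x1, x2)))      # matches f11, f12 = f21, f22 above

g = sp.sqrt(2*x1) + 4*x2*sp.log(x3)
print(sp.hessian(g, (x1, x2, x3)))  # matches the Hessian of g above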

4. (a) The system

( 1  1 ) ( y )   ( 4 - x )
( 1  2 )·( z ) = ( 6 - x )

yields

( y )   (  2 -1 ) ( 4 - x )   ( 2 - x )
( z ) = ( -1  1 )·( 6 - x ) = (   2   )

i.e. y = 2 - x and z = 2.

(b) The system is

(  7 -1 -1 ) ( w )       ( -4 )
( 10 -2  1 )·( x ) = z · ( -1 )
(  6  3 -2 ) ( y )       ( 11 )

The determinant of the matrix is -61 (non-zero, so w, x and y can be expressed as functions of z), yielding

( w )            (  1  -5  -3 ) ( -4 )            (  32 )   ( 0.52459 )
( x ) = (-1/61)· ( 26  -8 -17 )·( -1 )·z = (1/61)·( 283 )·z = ( 4.63934 )·z
( y )            ( 42 -27  -4 ) ( 11 )            ( 185 )   ( 3.03279 )
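A quick cross-check with numpy (z = 1 is an arbitrary normalisation):

import numpy as np

A = np.array([[7.0, -1.0, -1.0],
              [10.0, -2.0, 1.0],
              [6.0, 3.0, -2.0]])
rhs = np.array([-4.0, -1.0, 11.0])   # right-hand side for z = 1

print(np.linalg.det(A))              # -61.0: nonsingular, so the IFT applies
print(np.linalg.solve(A, rhs))       # [0.52459  4.63934  3.03279]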

(c) Locally, yes. The matrix of partials (with respect to x and y) is

( 1   1  )
( 2x  2y )

which is nonsingular so long as x ≠ y. But then there are two solutions (interchange x and y); hence only locally. Notice, x = y if z = 1 (x = y = 0) or z = -1/3 (x = y = 2/3).

5. The matrix of second-order partials is

(  2  -2  0 )
( -2  12 -4 )
(  0  -4  6 )

which is PD. Hence, the function is convex.

6. Weierstrass tells us nothing, as the domain is never compact.

(i) A solution exists: f' = 0 implies x = 2^(-1/2), and f'' < 0 there (as x → ∞, f' → 0 from above).
(ii) The infimum is approached as x → 0 or as x → ∞, neither of which is in the domain; there is no minimum.
(iii) The objective function can be rewritten (x1 + x2)^2 + 3·x2^2, which → ∞ as x2 → -∞ or x2 → +∞ along the constraint; there is no maximum.
(iv) The function can never be negative, and so must have a minimum. Substituting in for x1 yields f = (1/9)·(7 - x2)^2 + 3·x2^2; f' = (1/9)·(56·x2 - 14) = 0, and so x2 = 1/4.

7. (i) Corner solution: let x1 and x2 be as large as possible; x1 = 4, x2 = 5.

(ii) Consider the unconstrained derivative: f' = 4·x + 3 = 0, f'' = 4 > 0. Hence x = -3/4 (interior to [-10, 10]) is the solution.

(iii) Consider the unconstrained partials: f1 = (x1)^(-1/2)·(x2)^(1/3) - 1 = 0 and f2 = (2/3)·(x1)^(1/2)·(x2)^(-2/3) - 1 = 0, yielding (x1)^(1/2) = (x2)^(1/3), and so (x2)^(1/3) = 2/3, or x1 = 4/9, x2 = 8/27. The second-order matrix is

( -(1/2)·x1^(-3/2)·x2^(1/3)    (1/3)·x1^(-1/2)·x2^(-2/3) )
(  (1/3)·x1^(-1/2)·x2^(-2/3)  -(4/9)·x1^(1/2)·x2^(-5/3)  )

which at (4/9, 8/27) equals

( -9/8   9/8 )
(  9/8  -9/4 )

This is ND (negative diagonal, positive determinant), so the point is a strict local maximum.

(iv) Let L = -x1^(-1/4)·x2^(-1/2) + λ·(4 - 2·x1 - 12·x2). Then

L1 = (1/4)·x1^(-5/4)·x2^(-1/2) - 2·λ = 0
L2 = (1/2)·x1^(-1/4)·x2^(-3/2) - 12·λ = 0
Lλ = 4 - 2·x1 - 12·x2 = 0

From the first two equations eliminate λ, yielding x1 = 3·x2, and so x1 = 2/3, x2 = 2/9. The bordered Hessian becomes

(  0    -2                                 -12                              )
( -2    -(5/16)·(3/2)^(9/4)·(9/2)^(1/2)    -(1/8)·(3/2)^(5/4)·(9/2)^(3/2)  )
( -12   -(1/8)·(3/2)^(5/4)·(9/2)^(3/2)     -(3/4)·(3/2)^(1/4)·(9/2)^(5/2)  )

which has a positive determinant, as required for a maximum of L (i.e. a minimum of the original objective).

8. (a) Since 2·x1 + x2 is continuous and the feasible region is closed and bounded (see diagram), the problem must have a solution by Weierstrass.

(b) The maximum value is attained at x1 = 1, x2 = 0, where the last contour of 2·x1 + x2 touches the feasible region.

[Diagram omitted: the feasible region bounded by the curve x2 = (1 - x1)^3 and the axes, with a cusp at (1, 0).]

(c) With L = 2·x1 + x2 + λ·((1 - x1)^3 - x2), at (1, 0) the Kuhn-Tucker conditions imply

(1) ∂L/∂x1 = 2 - 3·λ·(1 - x1)^2 = 0
(2) ∂L/∂x2 = 1 - λ ≤ 0
(3) x2 - (1 - x1)^3 = 0
(4) λ ≥ 0

However, at (1, 0), ∂L/∂x1 = 2 - 3·λ·0 = 2, so (1) cannot be satisfied.

(d) The constraints do not satisfy a constraint qualification, and there is a cusp at (1, 0).

9. (a) Since x1 and x2 are both > 0 throughout the feasible region, the Kuhn-Tucker conditions reduce to

∂L/∂x1 = 2·x1 + λ1 - λ2 = 0
∂L/∂x2 = 2·x2 + λ3 - λ4 = 0
2 ≤ x1 ≤ 4, 2 ≤ x2 ≤ 4
λ1·(x1 - 2) = 0, λ2·(4 - x1) = 0, λ3·(x2 - 2) = 0, λ4·(4 - x2) = 0
λi ≥ 0

The only solution is x1 = x2 = 4, λ1 = λ3 = 0, λ2 = λ4 = 8.

(b) [Diagram omitted: the feasible region bounded by the ellipse 4·x1^2 + x2^2 = 16, the lines x1 - 2·x2 = -6 and x1 + x2 = 3, and the axes, with corner-point A where the ellipse meets x1 - 2·x2 = -6, and B further along the ellipse.]

At A the Kuhn-Tucker conditions reduce to

(1) ∂L/∂x1 = 2 - 8·x1·λ1 + λ2 = 0
(2) ∂L/∂x2 = 1 - 2·x2·λ1 - 2·λ2 = 0
(3) 4·x1^2 + x2^2 = 16
(4) x1 - 2·x2 = -6
λ1 ≥ 0, λ2 ≥ 0

Solving (3) and (4) gives x1 = 0.98, x2 = 3.49. Substituting in (1) and (2) gives λ1 = 0.22, λ2 = -0.27. Therefore the Kuhn-Tucker conditions fail at A.

For a solution on AB the Kuhn-Tucker conditions reduce to

(1) 2 - 8·x1·λ1 = 0
(2) 1 - 2·x2·λ1 = 0
(3) 4·x1^2 + x2^2 = 16
λ1 ≥ 0

Eliminating λ1 between (1) and (2) gives x1 = √2, x2 = 2·√2. Substituting in (1) gives λ1 = 1/(4·√2). Thus the Kuhn-Tucker conditions hold at (√2, 2·√2). Since the objective function is linear and therefore concave, this gives a global maximum by Result 14. (All the constraint functions, written in the G(x) ≥ b form, are concave.)
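A numerical sketch of 9(b) (Python with scipy; the solver and starting point are arbitrary):

import numpy as np
from scipy.optimize import minimize

cons = [{"type": "ineq", "fun": lambda x: 16 - 4*x[0]**2 - x[1]**2},
        {"type": "ineq", "fun": lambda x: x[0] - 2*x[1] + 6},
        {"type": "ineq", "fun": lambda x: x[0] + x[1] - 3}]

res = minimize(lambda x: -(2*x[0] + x[1]), x0=[1.0, 2.0],
               bounds=[(0, None), (0, None)], constraints=cons, method="SLSQP")
print(res.x)                        # ~ [1.4142, 2.8284]
print([np.sqrt(2), 2*np.sqrt(2)])   # the analytic solution (√2, 2√2)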

10. The feasible region is not closed (or bounded), so Weierstrass' theorem cannot be used, but it is graphically clear that there will be a minimum, attained on the line x1 + x2 = 4 (in fact, at x1 = 1, x2 = 3).

[Diagram omitted: the feasible region lying above the line x1 + x2 = 4, with the constraint x1 + 2·x2 < 10 an open boundary.]

11. (a) max 4·x1^3 + 5·x1·x2 subject to

x1^3 + x2 ≤ 8
-x1^3 - x2 ≤ -8
x1 ≤ 3
-x1 - x2 ≤ 1
x1 ≥ 0, x2 ≥ 0

The Kuhn-Tucker conditions (since the last inequality constraint is in fact superfluous) are

∂L/∂x1 = 12·x1^2 + 5·x2 - 3·x1^2·λ1 + 3·x1^2·λ2 - λ3 ≤ 0
∂L/∂x2 = 5·x1 - λ1 + λ2 ≤ 0
(∂L/∂x1)·x1 = 0, (∂L/∂x2)·x2 = 0
x1^3 + x2 = 8, 0 ≤ x1 ≤ 3, x2 ≥ 0
λ1 ≥ 0, λ2 ≥ 0, λ3 ≥ 0

(You could replace λ1 - λ2 by a single λ which would be unrestricted in sign.) The solution is x1 = 1.49493, x2 = 4.65911, λ = 7.47465, λ3 = 0, L* = 48.1888.

(b) Writing x1 = u - v with u ≥ 0, v ≥ 0 (so that x1 is unrestricted in sign), the problem becomes

max -(u - v)^2 - 3·(u - v)·x2 - 5·x2^3 subject to

v - u ≤ 2
2·(v - u) + x2 ≤ 4
u ≥ 0, v ≥ 0, x2 ≥ 0

In terms of x1 and x2, the Kuhn-Tucker conditions are

-2·x1 - 3·x2 + λ1 + 2·λ2 = 0
-3·x1 - 15·x2^2 - λ2 ≤ 0
(3·x1 + 15·x2^2 + λ2)·x2 = 0
x1 ≥ -2, x2 ≥ 0
x2 - 2·x1 ≤ 4
λ1·(2 + x1) = 0
λ2·(4 + 2·x1 - x2) = 0
λ1 ≥ 0, λ2 ≥ 0

The solution is x1 = -0.45, x2 = 0.3, λ1 = λ2 = 0, L* = 0.0675.

12. (a) both; (b) both; (c) neither (it is quasiconcave if restricted to a single quadrant); (d) quasiconvex (use the AsBad-set analogue of Result 15); (e) neither (quasiconcave if restricted to a single orthant); (f) both; (g) quasiconcave (e^x is a non-decreasing transformation of the concave function -x1^2 - 2·x2^2; cf. Result 16).

13. The quasiconcavity is obvious from the graph:

[Graph omitted: F rises to 0 at x = -1, stays flat at 0 on [-1, 1], then rises again for x > 1; every AsGood set is an interval.]

For the extension of the local-global theorem we would need F to be strictly quasiconcave, which it is not.

14. (a) The global maximum is attained at x = 3.

(b) Result 19 does not apply since ∂F/∂x = 0 at x = 2.