
Numer Algor (2011) 58:23–52. DOI 10.1007/s11075-011-9446-9

ORIGINAL PAPER

Extending the applicability of the Gauss–Newton method under average Lipschitz–type conditions

Ioannis K. Argyros · Saïd Hilout

Received: 25 October 2010 / Accepted: 17 January 2011 / Published online: 1 February 2011
© Springer Science+Business Media, LLC 2011

Abstract We extend the applicability of the Gauss–Newton method for solving singular systems of equations under the notions of average Lipschitz–type conditions introduced recently in Li et al. (J Complex 26(3):268–295, 2010). Using our idea of recurrent functions, we provide a tighter local as well as semilocal convergence analysis for the Gauss–Newton method than in Li et al. (J Complex 26(3):268–295, 2010), who recently extended and improved earlier results (Hu et al. J Comput Appl Math 219:110–122, 2008; Li et al. Comput Math Appl 47:1057–1067, 2004; Wang Math Comput 68(255):169–186, 1999). We also note that our results are obtained under weaker or the same hypotheses as in Li et al. (J Complex 26(3):268–295, 2010). Applications to some special cases of Kantorovich–type conditions are also provided in this study.

Keywords Gauss–Newton method · Newton's method · Majorizing sequences · Recurrent functions · Local/semilocal convergence · Kantorovich hypothesis · Average Lipschitz conditions

Mathematics Subject Classifications (2010) 65G99 · 65H10 · 65B05 · 65N30 ·47H17 · 49M15

I. K. Argyros
Department of Mathematical Sciences, Cameron University, Lawton, OK 73505, USA
e-mail: [email protected]

S. Hilout (B)
Laboratoire de Mathématiques et Applications, Poitiers University, Bd. Pierre et Marie Curie, Téléport 2, B.P. 30179, 86962 Futuroscope Chasseneuil Cedex, France
e-mail: [email protected]


1 Introduction

In this study we are concerned with the problem of approximating a locally unique solution x* of the equation

F(x) = 0, (1.1)

where F is a Fréchet–differentiable operator defined on an open, nonempty, convex subset D of ℝᵐ with values in ℝˡ, where m, l ∈ ℕ*.

The field of computational sciences has seen considerable development in mathematics, engineering sciences, and economic equilibrium theory. For example, dynamic systems are mathematically modeled by difference or differential equations, and their solutions usually represent the states of the systems. For the sake of simplicity, assume that a time–invariant system is driven by the equation x = T(x), for some suitable operator T, where x is the state. Then the equilibrium states are determined by solving equation (1.1). Similar equations are used in the case of discrete systems. The unknowns of engineering equations can be functions (difference, differential, and integral equations), vectors (systems of linear or nonlinear algebraic equations), or real or complex numbers (single algebraic equations with single unknowns). Except in special cases, the most commonly used solution methods are iterative: starting from one or several initial approximations, a sequence is constructed that converges to a solution of the equation. Iteration methods are also applied for solving optimization problems. In such cases, the iteration sequences converge to an optimal solution of the problem at hand. Since all of these methods have the same recursive structure, they can be introduced and discussed in a general framework. We note that in computational sciences, the practice of numerical analysis for finding such solutions is essentially connected to variants of Newton's method.

We shall use the Gauss–Newton method (GNM)

xn+1 = xn − F ′(xn)+ F(xn) (n ≥ 0), (x0 ∈ D), (1.2)

to generate a sequence {xn} approximating a solution x� of equation

F ′(x)+ F(x) = 0, (1.3)

where F′(x)⁺ denotes the Moore–Penrose inverse of the matrix F′(x), x ∈ D (see Definition 2.3).

If m = l and F′(xn) is invertible, then (GNM) reduces to Newton's method (NM), given by

xn+1 = xn − F′(xn)⁻¹ F(xn) (n ≥ 0), x₀ ∈ D.

There is an extensive literature on the local as well as the semilocal con-vergence of (GNM) and (NM) under Lipschitz–type conditions. We refer thereader to [4, 6] and the references there for convergence results on Newton–type methods (see also [1–3, 5, 9–30, 32–52]). In particular, we recommendthe paper by Xu and Li [52], where (GNM) is studied under average Lipschitzconditions.
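As a concrete illustration of iteration (1.2), the following minimal sketch applies (GNM) to an overdetermined system with m = 1 and l = 2; the residual function and the starting point are our own illustrative choices, not taken from this paper. For a full-column-rank Jacobian, the Moore–Penrose inverse reduces to F′(x)⁺ = (F′(x)ᵀF′(x))⁻¹F′(x)ᵀ, which for a single column is simply F′(x)ᵀ/‖F′(x)‖².

```python
import math

def F(x):
    # Illustrative residual F: R -> R^2, with zero residual at x = 0.
    return (x, math.exp(x) - 1.0)

def dF(x):
    # Jacobian F'(x) is a 2x1 column vector.
    return (1.0, math.exp(x))

def gauss_newton(x0, iterations=10):
    # x_{n+1} = x_n - F'(x_n)^+ F(x_n); for a single column,
    # F'(x)^+ v = (F'(x) . v) / ||F'(x)||^2.
    x = x0
    for _ in range(iterations):
        j1, j2 = dF(x)
        f1, f2 = F(x)
        x -= (j1 * f1 + j2 * f2) / (j1 * j1 + j2 * j2)
    return x

x_star = gauss_newton(0.5)
```

Since the residual vanishes at the solution x* = 0, the iteration converges to it rapidly.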


In [31], Li, Hu and Wang provided a Kantorovich–type convergence analysis for (GNM) by introducing a certain type of average Lipschitz conditions. (GNM) is also studied there using Smale's point estimate theory. In this way, they unified convergence criteria for (GNM). Special cases of their results extend and/or improve important known results [3, 23].

In this study, we are motivated by the elegant work in [31] and by optimization considerations. In particular, using our new concept of recurrent functions, we provide a tighter convergence analysis for (GNM) under weaker or the same hypotheses as in [31], for both the local as well as the semilocal case.

The study is organized as follows: Section 2 contains some preliminaries on majorizing sequences for (GNM), and well-known properties of Moore–Penrose inverses. In Sections 3 and 4, we provide a semilocal and a local convergence analysis for (GNM), respectively, using the Kantorovich approach. The semilocal convergence of (GNM) using recurrent functions is given in Section 5. Finally, applications and further comparisons between the Kantorovich and the recurrent-functions approach are given in Section 6.

2 Preliminaries

Let ℝ̄ = ℝ ∪ {+∞} and ℝ̄₊ = [0, +∞]. We assume that L and L₀ are non-decreasing functions on [0, R), where R ∈ ℝ̄₊, and

L0(t) ≤ L(t), for all t ∈ [0, R). (2.1)

Let β > 0 and 0 ≤ λ < 1 be given parameters. Define the function g : [0, R] → ℝ by

g(t) = β − t + ∫₀ᵗ L₀(u) (t − u) du. (2.2)

Moreover, define the majorizing function hλ : [0, R] → ℝ corresponding to a fixed pair (λ, L) by:

hλ(t) = β − (1 − λ) t + ∫₀ᵗ L(u) (t − u) du. (2.3)

We have, for t ∈ [0, R],

g′(t) = −1 + ∫₀ᵗ L₀(u) du, (2.4)

g′′(t) = L₀(t), (2.5)

h′λ(t) = −(1 − λ) + ∫₀ᵗ L(u) du, (2.6)


and

h′′λ(t) = L(t). (2.7)

It follows from (2.1), (2.4), and (2.6) that

g′(t) ≤ h′0(t), for all t ∈ [0, R]. (2.8)

Define

rλ := sup { r ∈ (0, R) : ∫₀ʳ L(u) du ≤ 1 − λ }, (2.9)

and

bλ := (1 − λ) rλ − ∫₀^{rλ} L(u) (rλ − u) du. (2.10)

Set

Γ = ∫₀ᴿ L(u) du.

Then, we have:

rλ = R if Γ < 1 − λ, and rλ = tλ if Γ ≥ 1 − λ, (2.11)

where tλ ∈ [0, R] is such that

∫₀^{tλ} L(u) du = 1 − λ,

and is guaranteed to exist, since Γ ≥ 1 − λ in this case. We also get

bλ ≥ ∫₀^{rλ} L(u) u du if Γ < 1 − λ, and bλ = ∫₀^{rλ} L(u) u du if Γ ≥ 1 − λ. (2.12)

Let us define the scalar sequence {sλ,n} by

sλ,0 = 0, sλ,n+1 = sλ,n − hλ(sλ,n)/g′(sλ,n) (n ≥ 0). (2.13)

Note that if equality holds in (2.1), then

g(t) = h₀(t), t ∈ [0, R], (2.14)

and {sλ,n} reduces to {tλ,n} introduced in [31], given by:

tλ,0 = 0, tλ,n+1 = tλ,n − hλ(tλ,n)/h′₀(tλ,n) (n ≥ 0). (2.15)
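The sequences (2.13) and (2.15) are easy to compare numerically. A hedged sketch: take constant functions L(u) ≡ L and L₀(u) ≡ L₀ ≤ L (so that g′(t) = −1 + L₀t and h′₀(t) = −1 + Lt); the values of β, λ, L, L₀ below are sample data of ours, chosen so that β ≤ bλ.

```python
import math

beta, lam, L, L0 = 0.2, 0.1, 1.0, 0.5   # sample data with L0 <= L and beta <= b_lam

def h_lam(t):
    # h_lam(t) = beta - (1 - lam) t + (L/2) t^2, cf. (2.3) with constant L
    return beta - (1.0 - lam) * t + 0.5 * L * t * t

s = [0.0]
t = [0.0]
for n in range(50):
    s.append(s[-1] - h_lam(s[-1]) / (-1.0 + L0 * s[-1]))   # (2.13): divide by g'
    t.append(t[-1] - h_lam(t[-1]) / (-1.0 + L  * t[-1]))   # (2.15): divide by h0'

# Common limit t*_lam in the quadratic case, cf. (6.8)
t_star = (1.0 - lam - math.sqrt((1.0 - lam) ** 2 - 2.0 * beta * L)) / L
```

Lemma 2.2 below predicts sλ,n ≤ tλ,n for all n, and Lemma 2.1 that both sequences increase to the zero t*λ of hλ.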


We shall show in Lemma 2.2 that, under the same hypothesis (see (2.16)), the scalar sequence {sλ,n} is at least as tight as {tλ,n}. But first, we need a crucial result on majorizing sequences for (GNM).

Lemma 2.1 Assume:

β ≤ bλ. (2.16)

Then, the following hold:

(i) The function hλ is strictly decreasing on [0, rλ], and has exactly one zero t*λ ∈ [0, rλ], such that

β < t*λ. (2.17)

(ii) The sequence {sλ,n} given by (2.13) is strictly increasing, and converges to t*λ.

Proof

(i) This part follows immediately from (2.3), (2.6), and (2.10).
(ii) We shall show this part using induction on n. It follows from (2.13) and (2.17) that

0 = sλ,0 < sλ,1 = β < t*λ.

Assume

sλ,k−1 < sλ,k < t*λ for all k ≤ n. (2.18)

In view of (2.5), g′ is non-decreasing on [0, R] (so −g′ is non-increasing), and by (2.8), (2.18), and the definition of rλ, we have

−g′(sλ,k) ≥ −g′(t*λ) ≥ −g′(rλ) ≥ −h′λ(rλ) + λ ≥ 0. (2.19)

We also have hλ(sλ,k) > 0 by (i). It then follows that

sλ,k+1 = sλ,k − hλ(sλ,k)/g′(sλ,k) > sλ,k. (2.20)

Let us define pλ on [0, t*λ] by

pλ(t) = t − hλ(t)/g′(t). (2.21)

We have

g′(t) < 0 for t ∈ [0, t*λ], (2.22)

except if λ = 0 and t = t*λ = rλ. As in [31], we set by convention

hλ(t*λ)/g′(t*λ) = lim_{t→t*λ⁻} hλ(t)/g′(t) = 0, (2.23)

by L'Hôpital's rule.


Therefore, the function pλ is well defined and continuous on [0, t*λ]. It then follows from part (i), (2.7), and (2.22) that

p′λ(t) = 1 − [h′λ(t) g′(t) − hλ(t) g′′(t)] / (g′(t))²
= [−(λ + ∫₀ᵗ (L(u) − L₀(u)) du) g′(t) + hλ(t) L₀(t)] / (g′(t))² > 0 for a.e. t ∈ [0, t*λ). (2.24)

Using (2.18), (2.20), and (2.24), we get

sλ,k < sλ,k+1 = pλ(sλ,k) < pλ(t*λ) = t*λ, (2.25)

which completes the induction. Hence {sλ,n} is increasing, bounded above by t*λ, and as such it converges to its unique least upper bound s* ∈ (0, t*λ], with hλ(s*) = 0. Using part (i), we get s* = t*λ.

That completes the proof of Lemma 2.1.

Next, we compare sequence {sλ,n} with {tλ,n}.

Lemma 2.2 Assume that condition (2.16) holds. Then, the following hold for n ≥ 0:

sλ,n ≤ tλ,n, (2.26)

and

sλ,n+1 − sλ,n ≤ tλ,n+1 − tλ,n. (2.27)

Moreover, if L₀(t) < L(t) for t ∈ [0, t*λ], then (2.26) and (2.27) hold as strict inequalities for n ≥ 1 and n ≥ 0, respectively.

Proof It was shown in [31] that, under hypothesis (2.16), assertions (i) and (ii) of Lemma 2.1 hold with {tλ,n} replacing {sλ,n}.

We shall show (2.26) and (2.27) using induction. It follows from (2.1), (2.13), and (2.15) that

sλ,0 = tλ,0 = 0, sλ,1 = tλ,1 = β,

sλ,1 = sλ,0 − hλ(sλ,0)/g′(sλ,0) ≤ tλ,0 − hλ(tλ,0)/h′₀(tλ,0) = tλ,1, (2.28)

and

sλ,1 − sλ,0 = −hλ(sλ,0)/g′(sλ,0) ≤ −hλ(tλ,0)/h′₀(tλ,0) = tλ,1 − tλ,0. (2.29)


Hence, (2.26) and (2.27) hold for n = 0. Let us assume that (2.26) and (2.27) hold for all k ≤ n. Then, we have in turn:

sλ,k+1 = sλ,k − hλ(sλ,k)/g′(sλ,k) ≤ tλ,k − hλ(tλ,k)/h′₀(tλ,k) = tλ,k+1, (2.30)

and

sλ,k+1 − sλ,k = −hλ(sλ,k)/g′(sλ,k) ≤ −hλ(tλ,k)/h′₀(tλ,k) = tλ,k+1 − tλ,k, (2.31)

which completes the induction for (2.26) and (2.27). Moreover, if L₀(t) < L(t) for t ∈ [0, t*λ], then (2.28)–(2.31) hold as strict inequalities.

That completes the proof of Lemma 2.2.

It is convenient for us to provide some well-known definitions and properties of the Moore–Penrose inverse [4, 12].

Definition 2.3 Let M be an l × m matrix. The m × l matrix M⁺ is called the Moore–Penrose inverse of M if the following four axioms hold:

(M⁺M)ᵀ = M⁺M, (MM⁺)ᵀ = MM⁺, M⁺MM⁺ = M⁺, MM⁺M = M, (2.32)

where Mᵀ is the adjoint (transpose) of M. In the case of a full-rank l × m matrix M, with rank M = m, the Moore–Penrose inverse is given by:

M⁺ = (MᵀM)⁻¹ Mᵀ.

Denote by Ker M and Im M the kernel and image of M, respectively, and by Π_E the projection onto a subspace E of ℝᵐ. We then have

M⁺M = Π_{(Ker M)⊥} and MM⁺ = Π_{Im M}. (2.33)

Note that if M is surjective or, equivalently, M has full row rank, then

MM⁺ = I_{ℝˡ}. (2.34)
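The axioms (2.32) and the full-rank formula M⁺ = (MᵀM)⁻¹Mᵀ can be checked numerically; the 3 × 2 full-column-rank matrix below is our own toy example, not taken from the paper.

```python
def matmul(A, B):
    # Plain nested-loop matrix product for small dense matrices.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def close(A, B, tol=1e-12):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

M = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]           # full column rank (rank 2)

# M^T M = [[2, 1], [1, 2]], whose inverse is (1/3) [[2, -1], [-1, 2]]
MtM_inv = [[2.0 / 3.0, -1.0 / 3.0], [-1.0 / 3.0, 2.0 / 3.0]]
Mplus = matmul(MtM_inv, transpose(M))              # M^+ = (M^T M)^{-1} M^T

MMp = matmul(M, Mplus)
MpM = matmul(Mplus, M)
```

Both products M M⁺ and M⁺M come out symmetric and the remaining axioms hold; since rank M = m = 2, M⁺M is the identity on ℝ², in line with (2.33).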

We also state a result providing a perturbation upper bound for Moore–Penrose inverses [4, 50]:

Lemma 2.4 Let M and N be two l × m matrices with rank N ≥ 1. Assume that

0 ≤ rank M ≤ rank N, (2.35)


and

‖M − N‖ ‖N⁺‖ < 1. (2.36)

Then, the following hold:

rank M = rank N, (2.37)

and

‖M⁺‖ ≤ ‖N⁺‖ / (1 − ‖M − N‖ ‖N⁺‖). (2.38)
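For diagonal matrices, the quantities in (2.36)–(2.38) can be computed by hand, since the spectral norm of a diagonal matrix is its largest absolute entry; the 2 × 2 diagonal pair below is our own sample data.

```python
# N = diag(2, 1), M = diag(2.1, 0.9): a small perturbation of N.
N_diag = [2.0, 1.0]
M_diag = [2.1, 0.9]

norm_N_plus = max(1.0 / abs(d) for d in N_diag)              # ||N^+|| = 1
norm_M_plus = max(1.0 / abs(d) for d in M_diag)              # ||M^+|| = 1/0.9
norm_diff = max(abs(m - n) for m, n in zip(M_diag, N_diag))  # ||M - N|| = 0.1

assert norm_diff * norm_N_plus < 1.0                         # hypothesis (2.36)
bound = norm_N_plus / (1.0 - norm_diff * norm_N_plus)        # right side of (2.38)
```

In this example the bound (2.38) is attained: ‖M⁺‖ = 1/0.9 = ‖N⁺‖/(1 − ‖M − N‖‖N⁺‖), so the estimate is sharp.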

3 Semilocal convergence I of (GNM)

We shall provide a semilocal convergence analysis for (GNM) using majorizing sequences and the Kantorovich approach.

Let U(x, r) denote the open ball in ℝᵐ with center x and radius r > 0, and let Ū(x, r) denote its closure. For the remainder of this study, we assume that F : ℝᵐ → ℝˡ is continuously Fréchet–differentiable,

‖F′(y)⁺ (I − F′(x) F′(x)⁺) F(x)‖ ≤ κ ‖x − y‖ for all x, y ∈ D, (3.1)

where κ ∈ [0, 1), I is the identity matrix,

F′(x₀) ≠ 0 for some x₀ ∈ D, (3.2)

and

rank(F′(x)) ≤ rank(F′(x₀)) for all x ∈ D. (3.3)

We need the definition of the modified L–average Lipschitz condition onU(x0, r).

Definition 3.1 [31] Let r > 0 be such that U(x₀, r) ⊆ D. The mapping F′ satisfies the modified L–average Lipschitz condition on U(x₀, r) if, for any x, y ∈ U(x₀, r) with ‖x − x₀‖ + ‖y − x‖ < r,

‖F′(x₀)⁺‖ ‖F′(y) − F′(x)‖ ≤ ∫_{‖x−x₀‖}^{‖x−x₀‖+‖y−x‖} L(u) du. (3.4)

Condition (3.4) was used in [31] as an alternative to the L–average Lipschitz condition [26]

‖F′(x₀)⁺ (F′(y) − F′(x))‖ ≤ ∫_{‖x−x₀‖}^{‖x−x₀‖+‖y−x‖} L(u) du, (3.5)

which is a modification of Wang's condition [41, 42]. Condition (3.4) fits the case when F′(x₀) is not surjective [31].

Page 9: Extending the applicability of the Gauss–Newton method under average Lipschitz–type conditions

Numer Algor (2011) 58:23–52 31

We also introduce the following condition.

Definition 3.2 Let r > 0 be such that U(x₀, r) ⊆ D. The mapping F′ satisfies the modified center L₀–average Lipschitz condition on U(x₀, r) if, for any x ∈ U(x₀, r) with 2‖x − x₀‖ < r,

‖F′(x₀)⁺‖ ‖F′(x) − F′(x₀)‖ ≤ ∫_{‖x−x₀‖}^{2‖x−x₀‖} L₀(u) du. (3.6)

If (3.4) holds, then so do (3.6) and (2.1). Therefore, (3.6) is not an additional hypothesis. We also note that L/L₀ can be arbitrarily large [1–7].

We shall use the perturbation result on Moore–Penrose inverses.

Lemma 3.3 Let

r₀ = sup { r ∈ (0, R) : ∫₀ʳ L₀(u) du ≤ 1 }.

Suppose that 0 ≤ r ≤ r₀ satisfies U(x₀, r) ⊆ D, and that F′ satisfies (3.6) on U(x₀, r).

Then, for each x ∈ U(x₀, r), rank(F′(x)) = rank(F′(x₀)), and

‖F′(x)⁺‖ ≤ −g′(‖x − x₀‖)⁻¹ ‖F′(x₀)⁺‖. (3.7)

Remark 3.4 If equality holds in (2.1), then Lemma 3.3 reduces to Lemma 3.2 in [31]. Otherwise (i.e. if strict inequality holds in (2.1)), (3.7) is a more precise estimate than

‖F′(x)⁺‖ ≤ −h′₀(‖x − x₀‖)⁻¹ ‖F′(x₀)⁺‖ (3.8)

given in [31], since

−g′(r)⁻¹ < −h′₀(r)⁻¹, r ∈ [0, R]. (3.9)

We also have

r̄₀ ≤ r₀, (3.10)

and

λ̄₀ = κ (1 − ∫₀^β L(u) du) ≤ λ₀ = κ (1 − ∫₀^β L₀(u) du), (3.11)

where r̄₀, λ̄₀ denote the quantities corresponding to r₀, λ₀ in [31] (with L₀ replaced by L), and

β = ‖F′(x₀)⁺ F(x₀)‖. (3.12)


We also state the following result.

Lemma 3.5 [26, 31] Let 0 ≤ c < R. Define

χ(t) = (1/t) ∫₀ᵗ L(c + u) (t − u) du, 0 ≤ t < R − c.

Then, the function χ is increasing on [0, R − c).

We can show the following semilocal convergence result for (GNM). Our approach differs from the corresponding result [31, Theorem 3.1, p. 273], since we use (3.7) instead of (3.8).

Theorem 3.6 Let λ ≥ λ₀. Assume

β ≤ bλ and U(x₀, t*λ) ⊆ D; (3.13)

and F′ satisfies (3.4) and (3.6) on U(x₀, t*λ).

Then, the sequence {xn} generated by (GNM) is well defined, remains in U(x₀, t*λ) for all n ≥ 0, and converges to a zero x* of F′(·)⁺ F(·) in U(x₀, t*λ). Moreover, the following estimates hold:

‖xn+1 − xn‖ ≤ sλ,n+1 − sλ,n, (3.14)

and

‖xn − x*‖ ≤ t*λ − sλ,n. (3.15)

Proof We first use mathematical induction to prove that

‖xn − xn−1‖ ≤ sλ,n − sλ,n−1 (3.16)

holds for each n ≥ 1. We first have

‖x1 − x0‖ = ‖F′(x0)⁺ F(x0)‖ = β ≤ sλ,1 − sλ,0,

so (3.16) holds for n = 1. Assume that (3.16) holds for all n ≤ k. Then

‖xk − xk−1‖ ≤ sλ,k − sλ,k−1. (3.17)

For θ ∈ [0, 1], we denote

xk^θ = xk−1 + θ (xk − xk−1) and sλ,k^θ = sλ,k−1 + θ (sλ,k − sλ,k−1).

Then, for all θ ∈ [0, 1],

‖xk^θ − x0‖ ≤ ‖xk^θ − xk−1‖ + Σ_{i=1}^{k−1} ‖xi − xi−1‖ ≤ sλ,k^θ < t*λ ≤ rλ ≤ r₀. (3.18)

In particular,

‖xk−1 − x0‖ ≤ sλ,k−1 and ‖xk − x0‖ ≤ sλ,k. (3.19)


By Lemma 3.3 and the monotonicity of the function −g′(·)⁻¹, we obtain

‖F′(xk)⁺‖ ≤ −g′(‖xk − x0‖)⁻¹ ‖F′(x0)⁺‖ ≤ −g′(sλ,k)⁻¹ ‖F′(x0)⁺‖. (3.20)

We deduce by Lemma 3.5, (2.7), and (3.17) that

∫₀¹ ∫_{‖xk−1−x0‖}^{‖xk−1−x0‖+θ‖xk−xk−1‖} L(u) du ‖xk − xk−1‖ dθ
= (1/‖xk − xk−1‖) ∫₀^{‖xk−xk−1‖} L(‖xk−1 − x0‖ + u) (‖xk − xk−1‖ − u) du ‖xk − xk−1‖
≤ (1/(sλ,k − sλ,k−1)) ∫₀^{sλ,k−sλ,k−1} h′′λ(sλ,k−1 + u) (sλ,k − sλ,k−1 − u) du ‖xk − xk−1‖
= [hλ(sλ,k) − hλ(sλ,k−1) − h′λ(sλ,k−1) (sλ,k − sλ,k−1)] ‖xk − xk−1‖ / (sλ,k − sλ,k−1). (3.21)

Since h′λ ≥ g′ + λ (by (2.1)), and, by (2.13),

−hλ(sλ,k−1) − g′(sλ,k−1) (sλ,k − sλ,k−1) = 0,

estimate (3.21) becomes

∫₀¹ ∫_{‖xk−1−x0‖}^{‖xk−1−x0‖+θ‖xk−xk−1‖} L(u) du ‖xk − xk−1‖ dθ ≤ hλ(sλ,k) − λ (sλ,k − sλ,k−1). (3.22)

Using (1.2) and the properties of the Moore–Penrose inverse, we obtain the identity

xk+1 − xk = −F′(xk)⁺ F(xk)
= −F′(xk)⁺ ∫₀¹ (F′(xk−1 + θ (xk − xk−1)) − F′(xk−1)) (xk − xk−1) dθ
− F′(xk)⁺ (I − F′(xk−1) F′(xk−1)⁺) F(xk−1). (3.23)

By (3.1), (3.4), (3.20), (3.21), and (3.23), we obtain

‖xk+1 − xk‖ = ‖F′(xk)⁺ F(xk)‖
≤ ∫₀¹ ‖F′(xk)⁺‖ ‖F′(xk−1 + θ (xk − xk−1)) − F′(xk−1)‖ ‖xk − xk−1‖ dθ + κ ‖xk − xk−1‖
≤ −g′(sλ,k)⁻¹ ∫₀¹ ∫_{‖xk−1−x0‖}^{‖xk−1−x0‖+θ‖xk−xk−1‖} L(u) du ‖xk − xk−1‖ dθ + κ ‖xk − xk−1‖
≤ −g′(sλ,k)⁻¹ hλ(sλ,k) + (κ + λ g′(sλ,k)⁻¹) (sλ,k − sλ,k−1). (3.24)


Note that β = sλ,1 ≤ sλ,k. By (3.11), we have

λ ≥ λ₀ ≥ −g′(sλ,k) κ,

so κ + λ g′(sλ,k)⁻¹ ≤ 0. Consequently, by (3.24), we get

‖xk+1 − xk‖ ≤ −g′(sλ,k)⁻¹ hλ(sλ,k) = sλ,k+1 − sλ,k,

and the induction for (3.14) is completed. In view of Lemma 2.1, the sequence {xn} (n ≥ 0) converges to some x* ∈ Ū(x₀, t*λ). Estimate (3.15) follows from (3.14) by using standard majorization techniques [4, 6].

By letting k → ∞ in the inequality

‖F′(x*)⁺ F(xk)‖ ≤ ‖F′(x*)⁺ (I − F′(xk) F′(xk)⁺) F(xk)‖ + ‖F′(x*)⁺‖ ‖F′(xk) F′(xk)⁺ F(xk)‖
≤ κ ‖xk − x*‖ + ‖F′(x*)⁺‖ ‖F′(xk)‖ ‖xk+1 − xk‖,

we obtain F′(x*)⁺ F(x*) = 0.

That completes the proof of Theorem 3.6.

Remark 3.7

(a) If equality holds in (2.1), then Theorem 3.6 reduces to Theorem 3.1 in [31]. Otherwise, it follows from Lemmas 2.1, 2.2, and Remark 3.4 that, under the same computational cost and hypotheses, we obtain the advantages over the works in [31] stated in the introduction of this study. It also follows from (3.24) that {vλ,n}, defined by

vλ,0 = 0, vλ,1 = β,
vλ,n+1 = vλ,n − [∫₀¹ ∫_{vλ,n−1}^{vλ,n−1+θ(vλ,n−vλ,n−1)} L(u) du dθ + κ] (vλ,n − vλ,n−1) / g′(vλ,n), (3.25)

is a tighter majorizing sequence for (GNM) than {sλ,n}, so that

vλ,n ≤ sλ,n, (3.26)

vλ,n − vλ,n−1 ≤ sλ,n − sλ,n−1, (3.27)

and

lim_{n→∞} vλ,n = v*λ ≤ t*λ (3.28)

(see also the proofs of Lemma 2.2 and Theorem 3.6). Hence, {vλ,n}, v*λ can replace {sλ,n}, s*λ, respectively, in Theorem 3.6.
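A quick numerical check of (3.26): for constant L, L₀ and κ = λ = 0, the double integral in (3.25) evaluates to L(vλ,n − vλ,n−1)/2, and −g′(vλ,n) = 1 − L₀vλ,n; the values of β, L, L₀ below are sample data of ours, chosen so that β ≤ b₀.

```python
beta, L, L0, kappa = 0.2, 1.0, 0.5, 0.0   # sample data; beta <= b_0 = 1/(2L)

def h0(t):
    # h_0(t) = beta - t + (L/2) t^2, i.e. (2.3) with lambda = 0 and constant L
    return beta - t + 0.5 * L * t * t

s = [0.0, beta]   # (2.13) starting values
v = [0.0, beta]   # (3.25) starting values
for n in range(30):
    s.append(s[-1] - h0(s[-1]) / (-1.0 + L0 * s[-1]))
    step = v[-1] - v[-2]
    # For constant L the double integral in (3.25) equals L * step / 2
    v.append(v[-1] + (0.5 * L * step + kappa) * step / (1.0 - L0 * v[-1]))
```

Every vλ,n stays below the corresponding sλ,n, in agreement with (3.26).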

At this point, we wonder whether conditions (3.13) can be weakened, since in this way the applicability of (GNM) would be extended. It turns out that if {vλ,n} is used as a majorizing sequence for (GNM), conditions (3.13) can be replaced by weaker ones in many cases. This claim is justified in Section 5. However, before doing that, we present the rest of our results for Section 3, which also constitute improvements of the corresponding results in [31]. The proofs are omitted, since they follow from [31] by simply using g′ instead of h′₀. Note also that Corollary 3.8, Theorem 3.9, and Corollary 3.10 hold with {vn}, v*₀ replacing {sn}, s*₀, respectively.

(b) The condition (3.13) depends upon the choice of λ. The best choice of λ is λ₀, given by (3.11), but in this case (3.13) would be implicit. In the following corollary, we consider the simple choice λ = κ, for which condition (3.13) is explicit.

Corollary 3.8 Assume

β ≤ bκ and U(x₀, t*κ) ⊆ D, (3.29)

and F′ satisfies (3.4) and (3.6) on U(x₀, t*κ).

Then, the conclusions of Theorem 3.6 hold in U(x₀, t*κ) with λ = κ.

We have the following improved version of Theorem 3.6 for κ = 0.

Theorem 3.9 Assume

β ≤ b₀ and U(x₀, t*₀) ⊆ D, (3.30)

‖F′(y)⁺ (I − F′(x) F′(x)⁺) F(x)‖ = 0 for all x, y ∈ D, (3.31)

and F′ satisfies (3.4) and (3.6) on U(x₀, t*₀).

Then, the conclusions of Theorem 3.6 hold in U(x₀, t*₀) with λ = 0. Furthermore, we have the additional estimate for n ≥ 1:

‖xn+1 − xn‖ ≤ [(s0,n+1 − s0,n) / (s0,n − s0,n−1)] ‖xn − xn−1‖.

In the case when F′(x₀) is surjective, we have the following corollary.

Corollary 3.10 Assume that F′(x₀) is surjective, and that the conditions of Theorem 3.9 hold.

Then, the conclusions of Theorem 3.9 hold with t*₀ replaced by s*₀. Furthermore, we have the additional estimate for n ≥ 1:

‖F′(x₀)⁺ F(xn)‖ ≤ [(s0,n+1 − s0,n) / (s0,n − s0,n−1)] ‖F′(x₀)⁺ F(xn−1)‖.

4 Local convergence of (GNM)

We shall provide a local convergence analysis for (GNM).


Assume: there exists x* ∈ D such that F(x*) = 0 and F′(x*) ≠ 0;

U(x*, r₀) ⊆ D,

and

rank(F′(x)) ≤ rank(F′(x*)) for all x ∈ D. (4.1)

Next, we provide an upper bound on the norm ‖F′(x₀)⁺ F(x₀)‖, needed in the proof of the main result of this section.

Lemma 4.1 Let 0 < r ≤ r₀. Suppose that F′ satisfies (3.4) and (3.6) in U(x*, r). Then for each x₀ ∈ U(x*, r), the following hold:

rank(F′(x₀)) = rank(F′(x*)),

and

‖F′(x₀)⁺ F(x₀)‖ ≤ [‖x₀ − x*‖ + ∫₀^{‖x₀−x*‖} L(u) (u − ‖x₀ − x*‖) du] / [1 − ∫₀^{‖x₀−x*‖} L₀(u) du]. (4.2)

Proof Let x₀ ∈ U(x*, r). Using Lemma 3.3, we have rank(F′(x₀)) = rank(F′(x*)), and

‖F′(x₀)⁺‖ ≤ ‖F′(x*)⁺‖ / [1 − ∫₀^{‖x₀−x*‖} L₀(u) du]. (4.3)

We also have the identity

−F′(x₀)⁺ F(x₀) = F′(x₀)⁺ (F(x*) − F(x₀) − F′(x₀) (x* − x₀)) + F′(x₀)⁺ F′(x₀) (x* − x₀)
= F′(x₀)⁺ ∫₀¹ (F′(x* + θ (x₀ − x*)) − F′(x₀)) (x* − x₀) dθ + Π_{(Ker F′(x₀))⊥} (x* − x₀). (4.4)

Using (3.4), (3.6), (4.3), and (4.4), we obtain

‖F′(x₀)⁺ F(x₀)‖ ≤ [1 / (1 − ∫₀^{‖x₀−x*‖} L₀(u) du)] ∫₀¹ ∫_{θ‖x₀−x*‖}^{‖x₀−x*‖} L(u) ‖x₀ − x*‖ du dθ + ‖x₀ − x*‖
= [‖x₀ − x*‖ + ∫₀^{‖x₀−x*‖} L(u) (u − ‖x₀ − x*‖) du] / [1 − ∫₀^{‖x₀−x*‖} L₀(u) du].


That completes the proof of Lemma 4.1.

Remark 4.2 If equality holds in (2.1), then Lemma 4.1 reduces to Lemma 4.1 in [31]. Otherwise, it constitutes an improvement, since the estimate

‖F′(x₀)⁺ F(x₀)‖ ≤ [‖x₀ − x*‖ + ∫₀^{‖x₀−x*‖} L(u) (u − ‖x₀ − x*‖) du] / [1 − ∫₀^{‖x₀−x*‖} L(u) du] (4.5)

is used in [31], and in this case

1 / (1 − ∫₀^{‖x₀−x*‖} L₀(u) du) < 1 / (1 − ∫₀^{‖x₀−x*‖} L(u) du).
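With constant moduli L and L₀ < L, as in Section 6, the estimates (4.2) and (4.5) differ only in their denominators, and the improvement is immediate; ρ, L, L₀ below are illustrative values of ours (ρ stands for ‖x₀ − x*‖):

```python
rho, L, L0 = 0.2, 1.0, 0.5    # sample data; rho stands for ||x0 - x*||

# For constant L, the integral of L(u)(u - rho) over [0, rho] equals -L rho^2 / 2
numerator = rho - 0.5 * L * rho * rho

bound_42 = numerator / (1.0 - L0 * rho)   # new estimate (4.2): denominator uses L0
bound_45 = numerator / (1.0 - L * rho)    # estimate (4.5) from [31]: uses L
```

Here bound_42 ≈ 0.18/0.9 = 0.2, while bound_45 ≈ 0.18/0.8 = 0.225, so the new estimate (4.2) is strictly tighter.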

Using (4.2) instead of (4.5), we obtain the following improvements of Lemmas 4.2, 4.3, Theorem 4.1, and Corollaries 4.1–4.3 in [31].

Lemma 4.3 Suppose that F′ satisfies (3.4) and (3.6) in U(x*, r₀). Let x₀ ∈ U(x*, r₀), and let L̄ : Ω := [0, R − ‖x₀ − x*‖) → ℝ be defined by

L̄(u) = L(u + ‖x₀ − x*‖) / (1 − ∫₀^{‖x₀−x*‖} L₀(v) dv) for all u ∈ Ω. (4.6)

Then, the following hold:

(i) rκ ≤ r̄κ + ‖x₀ − x*‖ ≤ r₀, where

r̄κ := sup { r ∈ (0, R − ‖x₀ − x*‖) : ∫₀ʳ L̄(u) du ≤ 1 − κ }.

(ii) F′ satisfies the L̄–average Lipschitz condition in U(x₀, r₀ − ‖x₀ − x*‖).

Lemma 4.4 Let φκ : [0, rκ] → ℝ be defined by

φκ(t) = bκ − (2 − κ) t + κ (rκ − t) ∫₀ᵗ L(u) du + 2 ∫₀ᵗ L(u) (t − u) du.

Then, φκ is a strictly decreasing function on [0, rκ], and has exactly one zero r̂κ ∈ [0, rκ], satisfying

bκ / (2 − κ) < r̂κ < rκ.

Theorem 4.5 Suppose that F′ satisfies (3.4) and (3.6) in U(x*, r₀). Let x₀ ∈ U(x*, r̂κ), where r̂κ is given in Lemma 4.4. Then {xn} generated by (GNM) starting at x₀ converges to a zero of F′(·)⁺ F(·).


Corollary 4.6 Suppose that F′ satisfies (3.4) in U(x*, r₀). Let x₀ ∈ U(x*, bκ/(2 − κ)). Then {xn} generated by (GNM) starting at x₀ converges to a zero of F′(·)⁺ F(·).

Corollary 4.7 Suppose that F′ satisfies (3.4) in U(x*, r₀), and that condition (3.31) holds. Let x₀ ∈ U(x*, r̂₀), where r̂₀ is the unique zero of the function φ₀ : [0, r₀] → ℝ defined by

φ₀(t) = b₀ − 2 t + 2 ∫₀ᵗ L(u) (t − u) du.

Then {xn} generated by (GNM) starting at x₀ converges to a zero of F′(·)⁺ F(·).

Corollary 4.8 Suppose that F′ is surjective and satisfies (3.4) in U(x*, r₀). Let x₀ ∈ U(x*, r̂₀), where r̂₀ is given in Corollary 4.7. Then {xn} generated by (GNM) starting at x₀ converges to a solution of F(x) = 0.

Remark 4.9 The local results obtained here can also be used to solve equations of the form F(x) = 0, where F′ satisfies the autonomous differential equation [4]:

F′(x) = T(F(x)), (4.7)

where T : Y → X is a known continuous operator. Since F′(x*) = T(F(x*)) = T(0), we can apply our results without actually knowing the solution x* of the equation F(x) = 0. As an example, let X = Y = (−∞, +∞), D = U(0, 1), and define the function F on D by

F(x) = eˣ − 1. (4.8)

Then, for x* = 0, we can set T(x) = x + 1 in (4.7).

5 Semilocal convergence II of (GNM)

We shall provide a semilocal convergence analysis for (GNM) using our new concept of recurrent functions. This idea has already produced a finer convergence analysis for iterative methods using invertible operators or outer or generalized inverses [5–7].

We need to define some parameters, sequences, and functions.

Definition 5.1 Let x0 ∈ D, κ ∈ [0, 1), and λ ∈ [0, 1).


Define the parameter β, the iteration {vλ,n}, the functions fλ,n, ελ,n, μλ,n on [0, 1), and ξλ on Iξλ = [0, 1]² × [1, 1/(1 − dλ)]² (dλ ∈ [0, 1)) by

vλ,0 = 0, vλ,1 = β,
vλ,n+1 = vλ,n + [(δλ,n + κ) / (1 − δ̄λ,n)] (vλ,n − vλ,n−1) (n ≥ 1), (5.1)

fλ,n(dλ) = ∫₀¹ ∫_{wλ,n−1(dλ)}^{zλ,n−1(dλ)} L(u) du dθ + dλ ∫₀^{wλ,n(dλ)} L₀(u) du + cλ, (5.2)

ελ,n(dλ) = ∫₀¹ [∫_{wλ,n(dλ)}^{zλ,n(dλ)} L(u) du − ∫_{wλ,n−1(dλ)}^{zλ,n−1(dλ)} L(u) du] dθ + dλ ∫_{wλ,n(dλ)}^{wλ,n+1(dλ)} L₀(u) du, (5.3)

μλ,n(dλ) = ∫₀¹ [∫_{wλ,n+1(dλ)}^{zλ,n+1(dλ)} L(u) du + ∫_{wλ,n−1(dλ)}^{zλ,n−1(dλ)} L(u) du − 2 ∫_{wλ,n(dλ)}^{zλ,n(dλ)} L(u) du] dθ
+ dλ [∫_{wλ,n+1(dλ)}^{wλ,n+2(dλ)} L₀(u) du − ∫_{wλ,n(dλ)}^{wλ,n+1(dλ)} L₀(u) du], (5.4)

ξλ(θ, dλ, aλ, eλ) = ∫₀¹ [∫_{(aλ+eλ+eλdλ)β}^{(aλ+eλ+eλdλ+θdλ²eλ)β} L(u) du + ∫_{aλβ}^{(aλ+θeλ)β} L(u) du − 2 ∫_{(aλ+eλ)β}^{(aλ+eλ+θeλdλ)β} L(u) du] dθ
+ dλ [∫_{(aλ+eλ+eλdλ)β}^{(aλ+eλ+eλdλ+dλ²eλ)β} L₀(u) du − ∫_{(aλ+eλ)β}^{(aλ+eλ+eλdλ)β} L₀(u) du], (5.5)

where

δλ,n = ∫₀¹ ∫_{vλ,n−1}^{vλ,n−1+θ(vλ,n−vλ,n−1)} L(u) du dθ, (5.6)

δ̄λ,n = ∫₀^{vλ,n} L₀(u) du, (5.7)

zλ,n(dλ) = [(1 − dλ^n)/(1 − dλ) + θ dλ^n] β, (5.8)

wλ,n(dλ) = [(1 − dλ^n)/(1 − dλ)] β, (5.9)

and

cλ = κ − dλ. (5.10)

Define the function fλ,∞ on [0, 1) by

fλ,∞(dλ) = lim_{n→∞} fλ,n(dλ). (5.11)

Remark 5.2 Using (5.2) and (5.11), we get

fλ,∞(dλ) = dλ ∫₀^{β/(1−dλ)} L₀(u) du + cλ. (5.12)

It then follows from (5.2)–(5.10) that the following identities hold:

fλ,n+1(dλ) = fλ,n(dλ) + ελ,n(dλ), (5.13)

ελ,n+1(dλ) = ελ,n(dλ) + μλ,n(dλ), (5.14)

and

μλ,n(dλ) = ξλ(θ, dλ, aλ = 1 + dλ + · · · + dλ^{n−2}, eλ = dλ^{n−1}). (5.15)

We need the following result on majorizing sequences for (GNM).

Lemma 5.3 Let the parameters β, κ, λ, the iteration {vλ,n}, and the functions fλ,n, ελ,n, μλ,n, and ξλ be as in Definition 5.1.

Assume there exists αλ ∈ (0, 1) such that

∫₀¹ ∫₀^{θβ} L(u) du dθ + κ ≤ αλ (1 − ∫₀^β L₀(u) du), (5.16)

cλ = κ − αλ < 0, (5.17)

ξλ(θ, q₁, q₂, q₃) ≥ 0 on Iξλ, (5.18)

ελ,1(αλ) ≥ 0, (5.19)

and

fλ,∞(αλ) ≤ 0. (5.20)


Then, the iteration {vλ,n} given by (5.1) is non-decreasing, bounded from above by

v**λ = β / (1 − αλ), (5.21)

and converges to its unique least upper bound v*λ such that

v*λ ∈ [0, v**λ]. (5.22)

Moreover, the following estimates hold for all n ≥ 1:

0 ≤ vλ,n+1 − vλ,n ≤ αλ (vλ,n − vλ,n−1) ≤ αλ^n β, (5.23)

and, for all n ≥ 0,

0 ≤ v*λ − vλ,n ≤ αλ^n β / (1 − αλ). (5.24)

Proof Estimate (5.23) is true if

δλ,n + κ ≤ αλ (1 − δ̄λ,n) (5.25)

holds for all n ≥ 1. It follows from (5.1), (5.16), and (5.17) that estimate (5.25) holds for n = 1. Then, we also have that (5.23) holds for n = 1, and

vλ,n ≤ [(1 − αλ^n)/(1 − αλ)] β < v**λ. (5.26)

Using the induction hypotheses and (5.26), estimate (5.25) is true if

δλ,k + αλ δ̄λ,k + cλ ≤ 0 (5.27)

or

∫₀¹ ∫_{wλ,k−1(αλ)}^{zλ,k−1(αλ)} L(u) du dθ + αλ ∫₀^{wλ,k(αλ)} L₀(u) du + cλ ≤ 0 (5.28)

hold for all k ≤ n. Estimate (5.28) (for dλ = αλ) motivates us to introduce the function fλ,k given by (5.2), and to show, instead of (5.28), that

fλ,k(αλ) ≤ 0. (5.29)

We have by (5.13)–(5.15) (for dλ = αλ) and (5.19) that

fλ,k+1(αλ) ≥ fλ,k(αλ). (5.30)

In view of (5.11), (5.12), and (5.30), estimate (5.29) holds if (5.20) is true, since

fλ,k(αλ) ≤ fλ,∞(αλ), (5.31)

and the induction is completed.


It follows from (5.23) and (5.26) that the iteration {vλ,n} is non-decreasing, bounded from above by v**λ given by (5.21), and as such it converges to v*λ. Finally, estimate (5.24) follows from (5.23) by using standard majorization techniques [4–6].

That completes the proof of Lemma 5.3.

We can show the following semilocal convergence result for (GNM) using recurrent functions, which is the analog of Theorem 3.6.

Theorem 5.4 Let λ ≥ λ₀. Assume that the hypotheses of Lemma 5.3 hold,

U(x₀, v*λ) ⊆ D, (5.32)

and F′ satisfies (3.4) and (3.6) on U(x₀, v*λ).

Then, the sequence {xn} generated by (GNM) is well defined, remains in U(x₀, v*λ) for all n ≥ 0, and converges to a zero x* of F′(·)⁺ F(·) in U(x₀, v*λ). Moreover, the following estimates hold:

‖xn+1 − xn‖ ≤ vλ,n+1 − vλ,n, (5.33)

and

‖xn − x*‖ ≤ v*λ − vλ,n. (5.34)

Proof As in the proof of Theorem 3.6, we arrive at the estimate displayed above (3.24) (with vλ,k replacing sλ,k), which in view of (5.1) leads to

‖xk+1 − xk‖ ≤ vλ,k+1 − vλ,k. (5.35)

Estimates (3.18), (3.19), (5.35), and Lemma 5.3 imply that the sequence {xk} is a complete (Cauchy) sequence in ℝᵐ, and as such it converges to some x* ∈ Ū(x₀, v*λ) (since Ū(x₀, v*λ) is a closed set).

That completes the proof of Theorem 5.4.

Remark 5.5

(a) The point v**λ, given in closed form by (5.21), can replace v*λ in hypothesis (5.32).
(b) The hypotheses of Lemma 5.3 involve only computations at the initial data. These hypotheses differ from (3.13) given in Theorem 3.6. In practice, we shall test to see which of the two is satisfied, if any. If both sets of conditions are satisfied, we shall use the more precise error bounds given in Theorem 5.4 (see also Section 4).

In Section 6, we show that the conditions of Theorem 5.4 can be weaker than those of Theorem 3.6.


6 Applications

We compare the Kantorovich–type conditions introduced in Section 5 with the corresponding ones in Section 3.

6.1 Semilocal case

An operator Q : ℝᵐ → ℝˡ is said to be Lipschitz continuous on D₀ ⊆ D with modulus L > 0 if

‖Q(x) − Q(y)‖ ≤ L ‖x − y‖ for all x, y ∈ D₀, (6.1)

and center–Lipschitz continuous on D₀ with modulus L₀ > 0 if

‖Q(x) − Q(x₀)‖ ≤ L₀ ‖x − x₀‖ for all x ∈ D₀. (6.2)

Let x₀ ∈ D, and let r > 0 be such that U(x₀, r) ⊆ D. Clearly, if ‖F′(x₀)⁺‖ F′ is Lipschitz continuous on U(x₀, r) with modulus L, then F′ satisfies the modified L–average Lipschitz condition on U(x₀, r). Similarly, if ‖F′(x₀)⁺‖ F′ is center–Lipschitz continuous on U(x₀, r) with modulus L₀, then F′ satisfies the modified center L₀–average Lipschitz condition on U(x₀, r).

Using (2.2), (2.3), (2.9), and (2.10), we get for t ≥ 0:

g(t) = β − t + (L0/2) t², (6.3)

hλ(t) = β − (1 − λ) t + (L/2) t², (6.4)

rλ = (1 − λ)/L, (6.5)

and

bλ = (1 − λ)²/(2 L). (6.6)

Moreover, if

β ≤ bλ, (6.7)

then,

t⋆λ = (1 − λ − √((1 − λ)² − 2 β L))/L. (6.8)
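For concreteness, the quantities (6.3)–(6.8) are elementary to evaluate numerically. Below is a minimal sketch (the sample data β = .1, L = 2.6, λ = 0 are assumptions chosen for illustration; the same β and L reappear in Example 6.4(b)) that computes t⋆λ from (6.8) and checks that it is indeed a zero of hλ in (6.4):

```python
import math

def t_star(beta, L, lam):
    """Smaller zero t*_lambda of h_lambda(t) = beta - (1 - lam) t + (L/2) t^2, cf. (6.8)."""
    b_lam = (1 - lam) ** 2 / (2 * L)          # b_lambda from (6.6)
    if beta > b_lam:                          # condition (6.7) fails: no real zero
        return None
    return (1 - lam - math.sqrt((1 - lam) ** 2 - 2 * beta * L)) / L

beta, L, lam = 0.1, 2.6, 0.0                  # illustrative sample data
t = t_star(beta, L, lam)
residual = beta - (1 - lam) * t + (L / 2) * t ** 2   # h_lambda(t*) should vanish
print(t, residual)
```

For β > bλ the function returns None, reflecting that hλ then has no real zero and condition (6.7) is violated.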

We have the following improvement of Theorem 5.1 in [31].

Theorem 6.1 Let λ = λ0 = (1 − β L0) κ. Assume

β L ≤ Δ = (1 − κ)² / (κ² − κ + 1 + √(2 κ² − 2 κ + 1)); (6.9)

U(x0, v⋆λ) ⊆ D; (6.10)

and ‖ F′(x0)⁺ ‖ F′ satisfies (6.1), and (6.2) on U(x0, t⋆λ).


Then, sequence {xn} generated by (GNM) is well defined, remains in U(x0, t⋆λ) for all n ≥ 0, and converges to a zero x⋆ of F′(·)⁺F(·) in U(x0, t⋆λ).

Moreover, the following estimates hold:

‖ xn+1 − xn ‖ ≤ sλ,n+1 − sλ,n, (6.11)

and

‖ xn − x⋆ ‖ ≤ t⋆λ − sλ,n. (6.12)

Proof Similarly, replace {tλ,n} by {sλ,n} in the proof of Theorem 5.1 in [31]. That completes the proof of Theorem 6.1.
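The right-hand side Δ of condition (6.9) depends only on κ and is easy to tabulate. A small sketch (evaluating it at κ = 0, where it equals 1/2, and at κ = .4, the constant arising in Example 6.2 below):

```python
import math

def Delta(kappa):
    # Right-hand side of the sufficient convergence condition (6.9)
    return (1 - kappa) ** 2 / (
        kappa ** 2 - kappa + 1 + math.sqrt(2 * kappa ** 2 - 2 * kappa + 1))

print(Delta(0.0))   # 0.5
print(Delta(0.4))
```

The bound shrinks as κ grows, so the condition on β L becomes more restrictive for larger κ.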

We now provide an example, where κ ≠ 0, and the hypotheses of Theorem 6.1 are satisfied, but not the earlier ones [23, 26].

Example 6.2 [8, 26] Let m = l = 2, and R² be equipped with the ℓ1–norm. Choose:

x0 = (.2505, 0)ᵀ, D = {x = (v, w)ᵀ : −1 < v < 1 and −1 < w < 1}.

Define function F on U(x0, σ) ⊆ D (σ = .72) by

F(x) = (v − w, .5 (v − w)²), x = (v, w)ᵀ ∈ D. (6.13)

Then, for each x = (v, w)ᵀ ∈ D, the Fréchet–derivative of F at x, and the Moore–Penrose pseudoinverse of F′(x), are given by

F′(x) = [ 1       −1
          v − w   w − v ] (6.14)

and

F′(x)⁺ = 1/(2 (1 + (v − w)²)) [ 1    v − w
                                −1   w − v ] (6.15)

respectively. Let x = (v1, w1)ᵀ ∈ D and y = (v2, w2)ᵀ ∈ D. By (6.14), we have

‖ F′(x) − F′(y) ‖ = |(v1 − v2) − (w1 − w2)| ≤ ‖ x − y ‖.

That is, L = L0 = 1. Using (6.14), (6.15) and (3.1), we obtain:

‖ F′(y)⁺ (I − F′(x) F′(x)⁺) F(x) ‖ ≤ (2/5) ‖ x − y ‖.

Hence the constant κ in hypothesis (3.1) is given by:

κ = .4.

With the data (6.13)–(6.15), Theorem 2.4 in [23] and Theorem 3.1 in [26] are not applicable. However, the hypotheses of our Theorem 6.1 are satisfied, so Theorem 6.1 can be applied to solve equation (6.13).
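Formula (6.15) can be checked against the Moore–Penrose axioms with plain 2 × 2 arithmetic. A sketch (the sample point v = .3, w = .1 is an arbitrary choice in D):

```python
def matmul(A, B):
    # 2 x 2 matrix product
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

v, w = 0.3, 0.1                                # arbitrary sample point in D
c = v - w
Fp = [[1.0, -1.0], [c, -c]]                    # F'(x) from (6.14)
d = 2.0 * (1.0 + c * c)
Fpp = [[1.0 / d, c / d], [-1.0 / d, -c / d]]   # F'(x)^+ from (6.15)

# two of the Moore-Penrose axioms: A A+ A = A and A+ A A+ = A+
print(close(matmul(matmul(Fp, Fpp), Fp), Fp))    # True
print(close(matmul(matmul(Fpp, Fp), Fpp), Fpp))  # True
```

Note that F′(x) has rank one, so an ordinary inverse does not exist and the pseudoinverse is genuinely needed here.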


Remark 6.3 If L = L0, Theorem 6.1 reduces to [31, Theorem 5.1]. Otherwise, it constitutes an improvement (see Lemma 2.2).

If κ = 0 (Newton's method for h0), we have λ = 0, and Δ = 1/2. Then sequences {tλ,n} and {vλ,n} reduce to:

t0 = 0, t1 = β, tn+1 = tn + L (tn − tn−1)² / (2 (1 − L tn)), (6.16)

and

v0 = 0, v1 = β, vn+1 = vn + L (vn − vn−1)² / (2 (1 − L0 vn)), (6.17)

respectively. The corresponding sufficient convergence conditions are:

hLHW = β L ≤ 1/2 [31], (6.18)

and

hAH = β L̄ ≤ 1/2, (6.19)

where,

L̄ = (1/8) (L + 4 L0 + √(L² + 8 L0 L)) (6.20)

(see Lemma 5.3). Note that

hLHW ≤ 1/2 =⇒ hAH ≤ 1/2, (6.21)

but not necessarily vice versa, unless L = L0. Moreover, since L0/L can be arbitrarily small, we have by (6.18), and (6.19) that

hAH/hLHW −→ 1/4 as L0/L −→ 0. (6.22)

That is, our approach extends the applicability of (GNM) by at most four times.
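The limit (6.22) can be read off from (6.20): as L0/L → 0, L̄ → L/4, so hAH/hLHW = L̄/L → 1/4. A short numerical illustration (a sketch, with L normalized to 1):

```python
import math

def L_bar(L, L0):
    # averaged Lipschitz constant from (6.20)
    return (L + 4 * L0 + math.sqrt(L ** 2 + 8 * L0 * L)) / 8

L = 1.0
for ratio in (1.0, 0.1, 0.01, 0.0001):
    # hAH / hLHW = L_bar / L for a fixed beta
    print(ratio, L_bar(L, ratio * L) / L)
```

The printed quotient equals 1 when L0 = L (the two conditions coincide) and decreases toward 1/4 as L0/L shrinks.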

Concerning the error bounds, we have already shown (see Lemma 2.2) that {vn} is a tighter majorizing sequence for {xn} than {tn} (see Example 6.4 (b)).

Example 6.4 Let X = Y = R², be equipped with the max–norm, D = [ℓ, 2 − ℓ]², ℓ ∈ [0, 1), and define function F on D by

F(x) = (ξ1³ − ℓ, ξ2³ − ℓ)ᵀ, x = (ξ1, ξ2)ᵀ. (6.23)

The Fréchet–derivative of operator F is given by

F′(x) = [ 3 ξ1²   0
          0       3 ξ2² ].


(a) Let x0 = (1, 1)ᵀ. Using the hypotheses of Theorem 6.1, we get:

β = (1/3) (1 − ℓ), L0 = 3 − ℓ, and L = 2 (2 − ℓ).

Condition (6.18) is violated, since

2 hLHW = (4/3) (1 − ℓ) (2 − ℓ) > 1 for all ℓ ∈ [0, .5).

Hence, there is no guarantee that (NM) converges to x⋆ = (∛ℓ, ∛ℓ)ᵀ, starting at x0. However, our condition (6.19) is true for all ℓ ∈ I = [.450339002, 1/2). Hence, the conclusions of our Theorem 6.1 can apply to solve equation (6.23) for all ℓ ∈ I.

(b) Let x0 = (.9, .9)ᵀ, and ℓ = .7. Using the hypotheses of Theorem 6.1, we get:

β = .1, L = 2.6, L0 = 2.3, L̄ = 2.39864766, hLHW = .26 and hAH = .239864766.

Then (6.18) and (6.19) are satisfied. We also have:

F′(.9, .9)⁻¹ = .4115226337 I,

where, I is the 2 × 2 identity matrix. The hypotheses of our Theorem 6.1 and the Kantorovich theorem are satisfied. Then (NM) converges to x⋆ = (.8879040017, .8879040017)ᵀ, starting at x0. We also provide the following comparison table, computed using the software Maple 13.

Comparison table

n     (NM) ‖ xn+1 − xn ‖    (6.17) vn+1 − vn      (6.16) tn+1 − tn
0     .01193415638          .1                    .1
1     .0001618123748        .01688311688          .01756756757
2     2.946995467e-8        .0005067933842        .0005778355237
3     4.228114294e-11       4.57383463010e-7      6.265131450e-7
4     ∼                     3.725461907e-13       7.365175646e-13
5     ∼                     2.471607273e-25       1.017862116e-24
6     ∼                     1.087872853e-49       1.944019580e-48
7     ∼                     2.107538365e-98       7.091269701e-96
8     ∼                     7.909885354e-196      9.435626465e-191
9     ∼                     1.114190851e-390      1.670568212e-380
10    ∼                     2.210743650e-780      5.236621208e-760

The table shows that our error bounds vn+1 − vn are tighter than tn+1 − tn.
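The first rows of the table can be reproduced with a few lines of code (a sketch; Newton's method is applied componentwise, since both components of (6.23) iterate identically, and double precision limits the comparison to the first few rows):

```python
beta, L, L0 = 0.1, 2.6, 2.3               # data of Example 6.4(b)

t = [0.0, beta]                           # sequence (6.16): denominator uses L
v = [0.0, beta]                           # sequence (6.17): denominator uses L0
for n in range(1, 4):
    t.append(t[n] + L * (t[n] - t[n - 1]) ** 2 / (2 * (1 - L * t[n])))
    v.append(v[n] + L * (v[n] - v[n - 1]) ** 2 / (2 * (1 - L0 * v[n])))

x = [0.9]                                 # Newton's method for xi^3 = .7
for n in range(4):
    x.append(x[n] - (x[n] ** 3 - 0.7) / (3 * x[n] ** 2))

for n in range(3):
    print(n, abs(x[n + 1] - x[n]), v[n + 1] - v[n], t[n + 1] - t[n])
```

Each printed row shows the actual Newton step alongside the two majorizing increments, with the v-column always at most the t-column.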

We also have the following result on error bounds for (GNM).


Lemma 6.5 [7] Assume that there exist constants L0 ≥ 0, L ≥ 0, with L0 ≤ L, and β ≥ 0, such that:

q0 = L̄ β ≤ 1/2 if L0 ≠ 0, and q0 = L̄ β < 1/2 if L0 = 0, (6.24)

where, L̄ is given by (6.20). Then, sequence {vk} (k ≥ 0) given by (6.17) is well defined, nondecreasing, bounded above by v⋆⋆, and converges to its unique least upper bound v⋆ ∈ [0, v⋆⋆], where

v⋆⋆ = 2 β/(2 − δ), 1 ≤ δ = 4 L/(L + √(L² + 8 L0 L)) < 2 for L0 ≠ 0.

Moreover, the following estimates hold:

L0 v⋆ ≤ 1,

0 ≤ vk+1 − vk ≤ (δ/2) (vk − vk−1) ≤ ··· ≤ (δ/2)^k β, (k ≥ 1),

vk+1 − vk ≤ (δ/2)^k (2 q0)^(2^k − 1) β, (k ≥ 0),

and

0 ≤ v⋆ − vk ≤ (δ/2)^k (2 q0)^(2^k − 1) β / (1 − (2 q0)^(2^k)), (2 q0 < 1), (k ≥ 0).
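A quick numerical sanity check of Lemma 6.5 on the data of Example 6.4(b) (β = .1, L = 2.6, L0 = 2.3; a sketch, not part of the proof):

```python
import math

beta, L, L0 = 0.1, 2.6, 2.3

L_bar = (L + 4 * L0 + math.sqrt(L ** 2 + 8 * L0 * L)) / 8   # (6.20)
q0 = L_bar * beta                                           # (6.24)
delta = 4 * L / (L + math.sqrt(L ** 2 + 8 * L0 * L))
v_ss = 2 * beta / (2 - delta)                               # upper bound v**

v = [0.0, beta]                                             # sequence (6.17)
for n in range(1, 6):
    v.append(v[n] + L * (v[n] - v[n - 1]) ** 2 / (2 * (1 - L0 * v[n])))

print(delta, v_ss, v[-1])
```

On this data 2 q0 < 1 and 1 ≤ δ < 2, and the computed iterates stay nondecreasing and below v⋆⋆, as the lemma asserts.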

Remark 6.6 If L = L0, the error bounds reduce to the classical ones [31]. Otherwise, the error bounds are tighter, since the ratio 2 hAH is smaller than 2 hLHW (see also the comparison table in Example 6.4).

We provide numerical examples [6], where L0 < L.

Example 6.7 Define the scalar function F by F(x) = c0 x + c1 + c2 sin e^(c3 x), x0 = 0, where ci, i = 0, 1, 2, 3 are given parameters. Then it can easily be seen that for c3 large and c2 sufficiently small, L/L0 can be arbitrarily large.

Example 6.8 Let X = Y = C[0, 1], equipped with the max–norm. Consider the following nonlinear boundary value problem [4]:

u″ = −u³ − γ u², u(0) = 0, u(1) = 1.


It is well known that this problem can be formulated as the integral equation

u(s) = s + ∫₀¹ Q(s, t) (u³(t) + γ u²(t)) dt, (6.25)

where, Q is the Green function:

Q(s, t) = t (1 − s) if t ≤ s, and Q(s, t) = s (1 − t) if s < t.

We observe that

max_{0≤s≤1} ∫₀¹ |Q(s, t)| dt = 1/8.
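The value 1/8 follows since ∫₀¹ Q(s, t) dt = s(1 − s)/2, which is maximized at s = 1/2. A quick numerical confirmation by composite midpoint quadrature (a sketch):

```python
def Q(s, t):
    # Green function of the boundary value problem
    return t * (1 - s) if t <= s else s * (1 - t)

def integral(s, n=2000):
    # composite midpoint rule for the integral of |Q(s, .)| over [0, 1]
    # (Q is nonnegative here, so no absolute value is needed)
    return sum(Q(s, (k + 0.5) / n) for k in range(n)) / n

peak = max(integral(i / 200) for i in range(201))
print(peak)   # approximately 1/8 = 0.125
```
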

Then problem (6.25) is in the form (1.1), where, F : D −→ Y is defined as

[F(x)](s) = x(s) − s − ∫₀¹ Q(s, t) (x³(t) + γ x²(t)) dt.

If we set u0(s) = s, and D = U(u0, R), then since ‖ u0 ‖ = 1, it is easy to verify that U(u0, R) ⊂ U(0, R + 1). If 2 γ < 5, then, the operator F′ satisfies conditions (6.1), and (6.2), with

β = (1 + γ)/(5 − 2 γ), L = (γ + 6 R + 3)/4, L0 = (2 γ + 3 R + 6)/8.

Note that L0 < L.

Other applications and examples, including the solution of nonlinear Chandrasekhar–type integral equations appearing in radiative transfer, can also be found in [4–6].

6.2 Local case

Let x⋆ ∈ D be such that F(x⋆) = 0, F′(x⋆) ≠ 0, and (4.1) holds. We present the following improvement of Theorem 5.2 in [31].

Theorem 6.9 Assume that ‖ F′(x⋆)⁺ ‖ F′ satisfies (6.1), and (6.2) on U(x⋆, 1/L0). Then, sequence {xn} generated by (GNM) is well defined, remains in U(x⋆, R0) for all n ≥ 0, and converges to a zero x⋆ of F′(·)⁺F(·) in U(x⋆, R0), where,

R0 = (L + 2 Δ L0 − √((L + 2 Δ L0)² − 2 Δ (L² + 2 Δ L0²))) / (L² + 2 Δ L0²) if L + 2 Δ (2 L0 − L) ≥ 0,

and

R0 = 1/L0 if L + 2 Δ (2 L0 − L) < 0. (6.26)


Proof Note that the quantity under the radical is non–negative by (6.26). Let

L̃ = L/(1 − L0 ‖ x0 − x⋆ ‖), r = 1/L0 − ‖ x0 − x⋆ ‖,

and

λ0 = (1 − β L0) Δ.

Then, by Lemma 4.3, ‖ F′(x⋆)⁺ ‖ F′ is L̃–Lipschitz continuous on U(x0, r), and

t⋆λ0 ≤ t⋆κ ≤ rκ ≤ r0 − ‖ x0 − x⋆ ‖ = r, (6.27)

since

λ0 ≤ λ.

That is

U(x⋆, t⋆λ) ⊆ U(x0, r) ⊆ D.

Using Lemma 4.1, we have

β ≤ (2 − L ‖ x0 − x⋆ ‖) ‖ x0 − x⋆ ‖ / (2 (1 − L0 ‖ x0 − x⋆ ‖)),

so,

β L̃ ≤ (2 − L ‖ x0 − x⋆ ‖) L ‖ x0 − x⋆ ‖ / (2 (1 − L0 ‖ x0 − x⋆ ‖)²) ≤ Δ

by the choice of R0, and the function

Υ : (s, t) −→ (2 − t) t / (2 (1 − s)²).

For s ≤ t, Υ is increasing on (0, 1)². Therefore, Theorem 6.1 is applicable. That completes the proof of Theorem 6.9.

Remark 6.10 If L = L0, Theorem 6.9 reduces to Theorem 5.2 in [31]. Otherwise, it constitutes an improvement, since the radius of convergence

R1 = (1 − 1/√(2 Δ + 1)) (1/L)

given in [31] is smaller than R0. Hence, our Theorem 6.9 allows a wider choice of initial guesses x0, and also provides tighter error bounds than Theorem 5.2 in [31].

Example 6.11 [4, 6] Let X = Y = R. Define function F on D = [−1, 1] by

F(x) = e^x − 1. (6.28)


Then, for x⋆ = 0, using (6.28), we have F(x⋆) = 0, and F′(x⋆) = e⁰ = 1. Moreover, L = e > L0 = e − 1.
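The values L = e and L0 = e − 1 can be confirmed by sampling the ratios in (6.1) and (6.2) over D = [−1, 1]; a rough numerical sketch (here ‖ F′(x⋆)⁺ ‖ = 1, so the sampled operator is just F′(x) = e^x):

```python
import math

# sample the Lipschitz ratio in (6.1) and the center-Lipschitz ratio in (6.2)
# for F'(x) = exp(x) on D = [-1, 1], with center x* = 0
grid = [-1 + 2 * i / 400 for i in range(401)]
lip = max(abs(math.exp(a) - math.exp(b)) / abs(a - b)
          for a in grid for b in grid if a != b)
center = max(abs(math.exp(a) - 1) / abs(a) for a in grid if a != 0)

print(lip, center)   # close to e, and e - 1 (attained at a = 1)
```

The center ratio is attained at the endpoint a = 1, while the Lipschitz ratio only approaches e from below as the sample points cluster near 1, illustrating that L0 is genuinely smaller than L here.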

Example 6.12 Let X = Y = C[0, 1], the space of continuous functions defined on [0, 1], equipped with the max norm, and D = U(0, 1). Define function F on D by

F(h)(x) = h(x) − 5 ∫₀¹ x θ h(θ)³ dθ. (6.29)

Then, we have:

F′(h)[u](x) = u(x) − 15 ∫₀¹ x θ h(θ)² u(θ) dθ for all u ∈ D.

Using (6.29), for x⋆(x) = 0 for all x ∈ [0, 1], we get

L = 15 and L0 = 7.5.

We also have in this example L0 < L.

7 Conclusion

Using our new concept of recurrent functions, and a combination of average Lipschitz/center–Lipschitz conditions, we provided a semilocal/local convergence analysis for (GNM) to approximate a locally unique solution of a system of equations in finite dimensional spaces. Our analysis has the following advantages over the work in [31]: weaker sufficient convergence conditions, and larger convergence domain. Applications of our results to Kantorovich's analysis are also provided in this study.

References

1. Argyros, I.K.: On the Newton–Kantorovich hypothesis for solving equations. J. Comput. Appl. Math. 169, 315–332 (2004)
2. Argyros, I.K.: A unifying local–semilocal convergence analysis and applications for two–point Newton–like methods in Banach space. J. Math. Anal. Appl. 298, 374–397 (2004)
3. Argyros, I.K.: On the semilocal convergence of the Gauss–Newton method. Adv. Nonlinear Var. Inequal. 8, 93–99 (2005)
4. Argyros, I.K.: Computational theory of iterative methods. In: Chui, C.K., Wuytack, L. (eds.) Studies in Computational Mathematics, vol. 15. Elsevier, New York (2007)
5. Argyros, I.K.: On a class of Newton–like methods for solving nonlinear equations. J. Comput. Appl. Math. 228, 115–122 (2009)
6. Argyros, I.K., Hilout, S.: Efficient Methods for Solving Equations and Variational Inequalities. Polimetrica Publisher, Milano (2009)
7. Argyros, I.K., Hilout, S.: Enclosing roots of polynomial equations and their applications to iterative processes. Surv. Math. Appl. 4, 119–132 (2009)
8. Argyros, I.K., Hilout, S.: On the solution of systems of equations with constant rank derivatives. Numer. Algor. (to appear). doi:10.1007/s11075-010-9426-5
9. Argyros, I.K., Hilout, S.: Improved generalized differentiability conditions for Newton–like methods. J. Complex. 26(3), 316–333 (2010)
10. Blum, L., Cucker, F., Shub, M., Smale, S.: Complexity and Real Computation. With a Foreword by Richard M. Karp. Springer–Verlag, New York (1998)
11. Ben–Israel, A.: A Newton–Raphson method for the solution of systems of equations. J. Math. Anal. Appl. 15, 243–252 (1966)
12. Ben–Israel, A., Greville, T.N.E.: Generalized Inverses. Theory and Applications, 2nd edn. CMS Books in Mathematics/Ouvrages de Mathématiques de la SMC, vol. 15. Springer–Verlag, New York (2003)
13. Chen, P.Y.: Approximate zeros of quadratically convergent algorithms. Math. Comput. 63, 247–270 (1994)
14. Dedieu, J.P., Kim, M.-H.: Newton's method for analytic systems of equations with constant rank derivatives. J. Complex. 18, 187–209 (2002)
15. Dedieu, J.P., Shub, M.: Newton's method for overdetermined systems of equations. Math. Comput. 69, 1099–1115 (2000)
16. Deuflhard, P.: A study of the Gauss–Newton algorithm for the solution of nonlinear least squares problems. Special topics of applied mathematics (Proc. Sem., Ges. Math. Datenverarb., Bonn, 1979), pp. 129–150. North–Holland, Amsterdam–New York (1980)
17. Deuflhard, P., Heindl, G.: Affine invariant convergence theorems for Newton's method and extensions to related methods. SIAM J. Numer. Anal. 16, 1–10 (1979)
18. Ezquerro, J.A., Hernández, M.A.: Generalized differentiability conditions for Newton's method. IMA J. Numer. Anal. 22, 187–205 (2002)
19. Ezquerro, J.A., Hernández, M.A.: On an application of Newton's method to nonlinear operators with w–conditioned second derivative. BIT 42, 519–530 (2002)
20. Gragg, W.B., Tapia, R.A.: Optimal error bounds for the Newton–Kantorovich theorem. SIAM J. Numer. Anal. 11, 10–13 (1974)
21. Gutiérrez, J.M.: A new semilocal convergence theorem for Newton's method. J. Comput. Appl. Math. 79, 131–145 (1997)
22. Gutiérrez, J.M., Hernández, M.A.: Newton's method under weak Kantorovich conditions. IMA J. Numer. Anal. 20, 521–532 (2000)
23. Häußler, W.M.: A Kantorovich–type convergence analysis for the Gauss–Newton–method. Numer. Math. 48, 119–125 (1986)
24. He, J.S., Wang, J.H., Li, C.: Newton's method for underdetermined systems of equations under the γ–condition. Numer. Funct. Anal. Optim. 28, 663–679 (2007)
25. Hernández, M.A.: The Newton method for operators with Hölder continuous first derivative. J. Optim. Theory Appl. 109, 631–648 (2001)
26. Hu, N., Shen, W., Li, C.: Kantorovich's type theorems for systems of equations with constant rank derivatives. J. Comput. Appl. Math. 219, 110–122 (2008)
27. Kantorovich, L.V., Akilov, G.P.: Functional Analysis. Pergamon Press, Oxford (1982)
28. Li, C., Ng, K.F.: Majorizing functions and convergence of the Gauss–Newton method for convex composite optimization. SIAM J. Optim. 18(2), 613–642 (2007)
29. Li, C., Wang, J.: Newton's method on Riemannian manifolds: Smale's point estimate theory under the γ–condition. IMA J. Numer. Anal. 26(2), 228–251 (2006)
30. Li, C., Zhang, W.-H., Jin, X.-Q.: Convergence and uniqueness properties of Gauss–Newton's method. Comput. Math. Appl. 47, 1057–1067 (2004)
31. Li, C., Hu, N., Wang, J.: Convergence behavior of Gauss–Newton's method and extensions to the Smale point estimate theory. J. Complex. 26(3), 268–295 (2010)
32. Potra, F.A.: On the convergence of a class of Newton–like methods. In: Iterative Solution of Nonlinear Systems of Equations (Oberwolfach, 1982). Lecture Notes in Math., vol. 953, pp. 125–137. Springer, Berlin–New York (1982)
33. Potra, F.A.: On an iterative algorithm of order 1.839··· for solving nonlinear operator equations. Numer. Funct. Anal. Optim. 7(1), 75–106 (1984/85)
34. Potra, F.A.: Sharp error bounds for a class of Newton–like methods. Libertas Mathematica 5, 71–84 (1985)
35. Shub, M., Smale, S.: Complexity of Bezout's theorem. IV. Probability of success, extensions. SIAM J. Numer. Anal. 33, 128–148 (1996)
36. Smale, S.: The fundamental theorem of algebra and complexity theory. Bull. Am. Math. Soc. 4, 1–36 (1981)
37. Smale, S.: Newton's method estimates from data at one point. In: The Merging of Disciplines: New Directions in Pure, Applied, and Computational Mathematics (Laramie, Wyo., 1985), pp. 185–196. Springer, New York (1986)
38. Smale, S.: Complexity theory and numerical analysis. In: Acta Numerica, pp. 523–551. Cambridge University Press, Cambridge (1997)
39. Stewart, G.W., Sun, J.G.: Matrix Perturbation Theory. Computer Science and Scientific Computing. Academic Press, Inc., Boston, MA (1990)
40. Traub, J.F., Wozniakowski, H.: Convergence and complexity of Newton iteration for operator equations. J. Assoc. Comput. Mach. 26, 250–258 (1979)
41. Wang, X.H.: Convergence of Newton's method and inverse function theorem in Banach space. Math. Comput. 68(255), 169–186 (1999)
42. Wang, X.H.: Convergence of Newton's method and uniqueness of the solution of equations in Banach space. IMA J. Numer. Anal. 20(1), 123–134 (2000)
43. Wang, X.H., Han, D.F.: On dominating sequence method in the point estimate and Smale theorem. Sci. China, Ser. A 33(2), 135–144 (1990)
44. Wang, X.H., Han, D.F.: Criterion α and Newton's method under weak conditions (Chinese). Math. Numer. Sin. 19(1), 103–112 (1997); translation in Chinese J. Numer. Math. Appl. 19(2), 96–105 (1997)
45. Wang, X.H., Li, C.: The local and global behaviors of methods for solving equations (Chinese). Kexue Tongbao 46(6), 444–451 (2001)
46. Wang, X.H., Li, C.: Local and global behavior for algorithms of solving equations. Chin. Sci. Bull. 46(6), 441–448 (2001)
47. Wang, G.R., Wei, Y.M., Qiao, S.Z.: Generalized Inverses: Theory and Computations. Science Press, Beijing, New York (2004)
48. Wang, X.H., Xuan, X.H.: Random polynomial space and computational complexity theory. Sci. Sin. Ser. A 30(7), 673–684 (1987)
49. Wang, D.R., Zhao, F.G.: The theory of Smale's point estimation and its applications. Linear/nonlinear iterative methods and verification of solution (Matsuyama, 1993). J. Comput. Appl. Math. 60(1–2), 253–269 (1995)
50. Wedin, P.A.: Perturbation theory for pseudo–inverses. BIT 13, 217–232 (1973)
51. Xu, X.B., Li, C.: Convergence of Newton's method for systems of equations with constant rank derivatives. J. Comput. Math. 25, 705–718 (2007)
52. Xu, X.B., Li, C.: Convergence criterion of Newton's method for singular systems with constant rank derivatives. J. Math. Anal. Appl. 345, 689–701 (2008)

