Volume 2012, Issue 1 751975
Research Article
Open Access

On a New Method for Computing the Numerical Solution of Systems of Nonlinear Equations

H. Montazeri

H. Montazeri

Department of Mathematics, Islamic Azad University, Sirjan Branch, Sirjan, Iran intl.iau.ir

Search for more papers by this author
F. Soleymani

F. Soleymani

Department of Mathematics, Islamic Azad University, Zahedan Branch, Zahedan, Iran intl.iau.ir

Search for more papers by this author
S. Shateyi

Corresponding Author

S. Shateyi

Department of Mathematics, University of Venda, Private Bag X5050, Thohoyandou 0950, South Africa univen.ac.za

Search for more papers by this author
S. S. Motsa

S. S. Motsa

School of Mathematical Sciences, University of KwaZulu-Natal, Private Bag X01, Pietermaritzburg 3209, South Africa ukzn.ac.za

Search for more papers by this author
First published: 10 November 2012
Citations: 41
Academic Editor: Changbum Chun

Abstract

We consider a system of nonlinear equations F(x) = 0. A new iterative method for solving this problem numerically is suggested. The analytical discussions of the method are provided to reveal its sixth order of convergence. A discussion on the efficiency index of the contribution with comparison to the other iterative methods is also given. Finally, numerical tests illustrate the theoretical aspects using the programming package Mathematica.

1. Preliminaries

In this work, we consider the following system of nonlinear equations:
(1.1)
wherein the function F is F(x) = (f1(x), f2(x), …, fn(x)) T and the functions f1(x), f2(x), …, fn(x) are the coordinate functions, [1].

There are many approaches to solve the system (1.1). One of the best iterative methods to challenge this problem is Newton’s method. This method starts with an initial guess x(0) and after k updates and by considering a stopping criterion satisfies (1.1).

To find an update per full cycle of such fixed point methods, the linear system(s) involved in the process should be solved by direct or indirect methods. In terms of computational point of view, when dealing with large-scale systems arising from the discretization of nonlinear PDEs or integral equations, solving the system of linear equations by a direct method such as LU decomposition is not that easy. Hence, it seems reasonable to solve the linearized system approximately using iterative methods such as GMRES. In this way, one of such approaches may be categorized as an inexact method [2]. For example, Newton’s method to solve (1.1) can be written in the form below:
(1.2)
and it could be seen in the inexact form by considering η(k) which satisfies
(1.3)
with
(1.4)
as the forcing term [3].

A benefit of Newton’s method is that the obtained sequence converges quadratically if the initial estimate is close to the exact solution. However, there are some downsides of the method. One is the selection of the initial guess. A good initial guess can lead to convergence in a couple of steps. To improve the initial guess, several ideas can be found in the literature; see, for example, [46].

In what follows, we assume that F(x) is a smooth function of x in the open convex set Dn. There are plenty of other solvers to tackle the problem (1.1); see, for example, [710]. Among such methods, the third-order iterative methods like the Halley and Chebyshev methods [11] are considered less practically from a computational point of view because they need to compute the expensive second-order Frechet derivatives in contrast to the quadratically convergent method (1.2), in which the first-order Frechet derivative needs to be calculated.

Let us now review some of the most current methods for solving (1.1). In 2012, Sharma et al. in [12] presented the following quadratically convergent method:
(1.5)
The fourth-order method of Soleymani (the first two steps of relation (11) in [13]) for systems of nonlinear equations is as follows:
(1.6)

As could be seen, (1.5) and (1.6) include the evaluations F(x(k)), F(x(k)), and F(y(k)).

The primary goal of the present paper is to achieve both high order of convergence and low computational load in order to solve (1.1) with a special attention to the matter of efficiency index. Hence, we propose a new iterative method with sixth-order convergence to find both real and complex solutions. The new proposed method needs not the evaluation of the second-order Frechet derivative.

The rest of this paper is prepared as follows. In Section 2, the construction of the novel scheme is offered. Section 3 includes its analysis of convergence behavior, and it shows the suggested method has sixth order. Section 4 contains a discussion on the efficiency of the iterative method. This is followed by Section 5 where some numerical tests will be furnished to illustrate the accuracy and efficiency of the method. Section 6 ends the paper, where a conclusion of the study is given.

2. The Proposed Method

This section contains the new method of this paper. We aim at having an iteration method to have high order of convergence with an acceptable efficiency index. Hence, in order to reach the sixth order of convergence without imposing the computation of further Frechet derivatives, we consider a three-step structure using a Jarratt-type method as the predictor, while the corrector step would be designed as if no new Frechet derivative has been used. Thus, we should use F(x(k)) and F(y(k)) in the third step, and hence we could suggest the following iteration method:
(2.1)
Per computing step of the new method (2.1), for not large-scale problems, we may use the LU decomposition to prevent the computation of the matrix inversion which is costly. Simplifying method (2.1) for the sake of implementation yields in
(2.2)
wherein F(x(k))V(k) = F(x(k)), F(x(k))M(k) = F(y(k)), and F(x(k))W(k) = F(z(k)).

Now the implementation of (2.2) depends on the involved linear algebra problems. For large-scale problems, one may apply the GMRES iterative solver which is well known for its efficiency for large sparse linear systems.

Remark 2.1. The interesting point in (2.2) is that three linear systems should be solved per computing step, but all have the same coefficient matrix. Hence, one LU factorization per full cycle is needed, which reduces the computational load of the method when implementing.

We are about to prove the convergence behavior of the proposed method (2.2) using n-dimensional Taylor expansion. This is illustrated in the next section. An important remark on the error equation in this case comes next.

Remark 2.2. In the next section, e(k) = x(k)x* is the error in the kth iteration and is the error equation, where L is a p-linear function, that is, L(n, n, …, n) and p is the order of convergence. Observe that .

3. Convergence Analysis

Let us assess the analytical convergence rate of method (2.2) in what follows.

Theorem 3.1. Let F : Dnn be sufficiently Frechet differentiable at each point of an open convex neighborhood D of x*n, that is a solution of the system F(x) = 0. Let one suppose that F(x) is continuous and nonsingular in x*. Then, the sequence {x(k)} k≥0 obtained using the iterative method (2.2) converges to x* with convergence rate six, and the error equation reads

(3.1)

Proof. Let F : Dnn be sufficiently Frechet differentiable in D. By using the notation introduced in [14], the qth derivative of F at un, q ≥ 1, is the q-linear function F(q)(u) : n × ⋯×nn such that F(q)(u)(v1, …, vq) ∈ n. It is well known that, for x* + hn lying in a neighborhood of a solution x* of the nonlinear system F(x) = 0, Taylor expansion can be applied and we have

(3.2)
where Cq = (1/q!)[F(x*)] −1F(q)(x*), q ≥ 2. We observe that Cqhqn since F(q)(x*) ∈ (n × ⋯×n, n) and [F(x*)] −1(n). In addition, we can express F as
(3.3)
wherein I is the identity matrix of the same order to the Jacobian matrix. Note that qCqhq−1(n). From (3.2) and (3.3) we obtain
(3.4)
(3.5)
where Ck = (1/k!)[F(x*)] −1F(k)(x*), k = 2,3, …, and e(k) = x(k)x*. From (3.5), we have
(3.6)
where X1 = −2C2, , , …. Note that based on Remark 2.2. is a singular matrix, not a vector. Then,
(3.7)
and the expression for y(k) is
(3.8)
The Taylor expansion of the Jacobian matrix F(y(k)) is
(3.9)
where N1 = (2/3)C2, , and . Therefore,
(3.10)
From an analogous reasoning as in (3.9), we obtain
(3.11)
Hence, taking into account (3.11), it will be easy to write the Taylor series of F(z(k)) as follows:
(3.12)
We now should find the Taylor series at the third step of (2.2), thus using (3.6) and (3.12), we have
(3.13)
By using (3.13), we could have
(3.14)
Combining the error equation terms (3.14) and (3.11) in the iteration method (2.2) will end in the final error equation (3.1), which shows that the new method has sixth order of convergence for solving systems of nonlinear equations.

4. Concerning the Efficiency Index

Now we assess the efficiency index of the new iterative method in contrast to the existing methods for systems of nonlinear equations. In the iterative method (2.2), three linear systems based on LU decompositions are needed to obtain the sixth order of convergence. The point is that for applying to large-scale problems one may solve the linear systems by iterative solvers and the number of linear systems should be in harmony with the convergence rate. For example, the method of Sharma et al. (1.5) requires three different linear systems for large sparse nonlinear systems, that is, the same with the new method (2.2), but its convergence rate is only 4, which clearly shows that method (2.2) is better than (1.5).

Now let us invite the number of functional evaluations to obtain the classical efficiency index for different methods. The iterative method (2.2) has the following computational cost (without the index k): n evaluations of scalar functions for F(x), n evaluations of scalar functions F(z), n2 evaluations of scalar functions for the Jacobian F(x), and n2 evaluations of scalar functions for the Jacobian F(y).

We now provide the comparison of the classical efficiency indices for methods (1.2), (1.5), and (1.6) alongside the new proposed method (2.2). The plot of the index of efficiencies according to the definition of the efficiency index of an iterative method, which is given by E = p1/C, where p is the order of convergence and C stands for the total computational cost per iteration in terms of the number of functional evaluations, is given in Figure 1. A comparison over the number of functional evaluations of some iterative methods is also illustrated in Table 1. Note that both (1.5) and (1.6) in this way have the same classical efficiency index.

Table 1. Comparison of efficiency indices for different methods.
Iterative methods (1.2) (1.5) (1.6) (2.2)
Number of steps 1 2 2 3
Rate of convergence 2 4 4 6
Number of functional evaluations n + n2 n + 2n2 n + 2n2 2n + 2n2
The classical efficiency index
Number of LU factorizations 1 2 2 1
  • Cost of LU factorizations
  • (based on flops)
2n3/3 4n3/3 4n3/3 2n3/3
Cost of linear systems (based on flops) (2n3/3) + 2n2 (10n3/3) + 2n2 (7n3/3) + 2n2 (5n3/3) + 4n2
Flops-like efficiency index
Details are in the caption following the image
The plot of the (traditional) efficiency indices for different methods ((a), for n = 2, …, 10 and (b), for n = 11, …, 20).
Details are in the caption following the image
The plot of the (traditional) efficiency indices for different methods ((a), for n = 2, …, 10 and (b), for n = 11, …, 20).

In Figures 1 and 2, the colors yellow, black, purple, and red stand for methods (1.6), (1.2), (1.5), and (2.2), respectively. It is clearly obvious that the new method (2.2) for any n ≥ 2 has dominance with respect to the traditional efficiency indices on the other well-known and recent methods.

Details are in the caption following the image
The plot of the flops-like efficiency indices for different methods ((a), for n = 2, …, 10 and (b), for n = 11, …, 20).
Details are in the caption following the image
The plot of the flops-like efficiency indices for different methods ((a), for n = 2, …, 10 and (b), for n = 11, …, 20).

As was positively pointed out by the second referee, taking into account only the number of evaluations for scalar functions cannot be the effecting factor for evaluating the efficiency of nonlinear solvers. The number of scalar products, matrix products, decomposition LU of the first derivative, and the resolution of the triangular linear systems are of great importance in assessing the real efficiency of such schemes. Some extensive discussions on this matter can be found in the recent literature [15, 16].

To achieve this goal, we in what follows consider a different way. Let us count the number of matrix quotients, products, summations, and subtractions along with the cost of solving two triangular systems, that is, based on flops (the real cost of solving systems of linear equations). In this case, we remark that the flops for obtaining the LU factorization are 2n3/3, and to solve two triangular system, the flops would be 2n2. Note that if the right-hand side is a matrix, the cost (flops) of the two triangular systems is 2n3, or roughly n3 as considered in this paper. Table 1 also reveals the comparisons of flops and the flops-like efficiency index. Note that to the best of the authors’ knowledge, such an index has not been given in any other work. Results of this are reported in Table 1 and Figure 2 as well. It is observed that the new scheme again competes all the recent or well-known iterations when comparing the computational efficiency indices.

5. Numerical Testing

We employ here the second-order method of Newton (1.2), the fourth-order scheme of Sharma et al. (1.5), the fourth-order scheme of Soleymani (1.6), and the proposed sixth-order iterative method (2.2) to compare the numerical results obtained from these methods in solving test nonlinear systems. In this section, the residual norm along with the number of iterations in Mathematica 8 [17] is reported in Tables 2-3. The computer specifications are Microsoft Windows XP Intel(R), Pentium(R) 4 CPU, 3.20 GHz with 4 GB of RAM. In numerical comparisons, we have chosen the floating point arithmetic to be 256 using the command SetAccuarcy[expr, 256] in the written codes. The stopping criterion (for the first two examples) is the residual norm of the multivariate function to be less than 1 · E − 150, that is, | | F(x(k))|| ≤ 10−150. Note that for some iterative methods if their residual norms at the specified iteration exceed the bound of 10−200, we will consider that such an approximation is the exact solution and denoted the residual norm by 0 in some cells of the tables. We consider the test problems as follows.

Table 2. Results of comparisons for different methods in Example 5.1.
Iterative methods (1.2) (1.5) (1.6) (2.2)
Number of iterations 13 7 7 6
Residual norm 3.03 × 10−108 2.44 × 10−162 0 6.54 × 10−137
Number of functional evaluations 3120 3255 3255 2830
Time 1.59 1.76 1.79 1.51
Table 3. Results of comparisons for different methods in Example 5.2 when n = 99.
Iterative methods (1.2) (1.5) (1.6) (2.2)
Number of iterations 9 4 4 4
Residual norm 0 1.57 × 10−101 7.63 × 10−112 0
Number of functional evaluations 89100 78804 78804 79200
Time 1.12 1.58 2.64 1.20

Example 5.1. As the first test, we take into account the following hard system of 15 nonlinear equations with 15 unknowns having a complex solution to reveal the capability of the new scheme in finding (n-dimensional) complex zeros:

(5.1)
where its complex solution up to 10 decimal places is as follows: x* ≈ (1.981336305+ 0.983441720i, 3.409384824 − 0.764284796i, 1.813731796 − 0.637692532i, 3.491727320 + 0.872252620i, 6.550770690 − 0.907243589i, 1.336919493 − 1.019442606i, 79.096785866 + 48.25743733i, 3.082975105 + 0.835126126i, 5.320491462 − 1.520266411i, 0.000020217 + 0.000010961i, 0.013114096 + 0.08934367440i, 13.79912588 − 26.64001284i, 1.144969346 + 2.175231550i, −2.699288335 − 6.949621654i, −3.302042978 − 0.005478294i)T.

In this test problem, the approximate solution up to 2 decimal places can be constructed based on a line search method for obtaining robust initial value as follows: x* ≈ (1.98 + 0.98I, 3.40 − 0.76I,1.81 − 0.63I, 3.49 + 0.87I, 6.55 − 0.90I,1.33 − 1.01I, 79.10 + 48.26I,3.08 + 0.83I, 5.32 − 1.52I, 0.02 + 0.01I, 0.00 + 0.1I, 13.80 − 26.64I,1.14 + 2.17I, −2.69 − 6.94I, −3.30 + 0.00I) T. The results for this test problem are given in Table 2.

Results in Table 2 show that the new scheme can be considered for complex solutions of hard nonlinear systems as well. In this test, we have used the LU decomposition, when dealing with linear systems.

Example 5.2. In order to tackle with large-scale nonlinear systems, we have included this example in this work:

(5.2)
where its solution is the vector x* = (1, …, 1) T for odd n, and its first Frechet derivative has the following sparse pattern:
(5.3)

We report the numerical results for solving Example 5.2 in Table 3 based on the initial value x0 = Table  [2., {i, 1,99}]. The case for 99   ×   99 is considered. In Table 3, the new scheme is good in terms of the obtained residual norm in a few number of iterations.

Throughout the paper, the computational time has been reported by the command AbsoluteTiming[]. The mean of 5 implementations is listed as the time. In the following example, we consider the application of such nonlinear solvers when challenging hard nonlinear PDEs or nonlinear integral equations, some of such applications have also been given in [1820].

Example 5.3. Consider the following nonlinear system of Volterra integral equations:

(5.4)
where f1(x) = cos (x)−(1/2)sin2(x), f2(x) = sin(x) − x. In order to solve this nonlinear problem, there are many methods. Recently, Samadi and Tohidi [21] showed that the spectral method is a much more reliable treatment in solving such problems. In fact, they discussed that traditional solvers will require very fine grid point which may cause obtaining large-scale (ill-conditioned) nonlinear systems of equations after implementation to find the final solution. As a remedy, they presented a stable Legendre collocation method for solving systems of Volterra integral equations (SVIEs) of the second kind. Hence, now by applying the same procedure in [21] and by considering N = 3, and 5 digits at the first part of the process, the following nonlinear system of eight equations (5.5) will be attained where its solutions vector is {0.9975…, 0.946…, 0.783…, 0.597…, 0.0693…, 0.324…, 0.62…, 0.80…}.

Numerical results for this example are reported in Table 4. We again used 256-digit floating point in our calculations but the stopping criterion is the residual norm to be accurate for seven decimal places at least, that is to say, | | F(x(k))|| ≤ 10−7. The starting values are chosen as (−10, …, −10) T. In this case, we only report the time of solving the system (5.4) by the iterative methods and do not include the computational time of spectral method to first discretize (5.4).

Table 4. Results of comparisons for different methods in Example 5.3 when n = 8.
Iterative methods (1.2) (1.5) (1.6) (2.2)
Number of iterations 8 5 5 4
Residual norm 2.47 × 10−7 3.40 × 10−16 1.26 × 10−16 4.47 × 10−10
Number of functional evaluations 576 680 680 576
Time 0.03 0.03 0.03 0.015
The aim of the above example was twofold, first to show the clear reduction in number of steps in solving practical problems and also to reveal the fact that the proposed iterative method could be applied on inexact systems as well. By the word inexact, we mean the coefficients of the nonlinear systems are not integer and they are some number with floating arithmetic in essence, which itself may cause some round-off errors. We observe from Tables 24 that not only the order of convergence but also the number of new functional evaluations and operations is important in order to obtain new efficient iterative methods to solve nonlinear systems of equations.
(5.5)

6. Conclusions

In this paper, an efficient iterative method for finding the real and complex solutions of nonlinear systems has been presented. We have supported the proposed iteration by a mathematical proof through the n-dimensional Taylor expansion. This let us analytically find the sixth order of convergence. Per computing step, the method is free from second-order Frechet derivative.

A complete discussion on the efficiency index of the new scheme was given. Nevertheless, the efficiency index is not the only aspect to take into account: the number of operations per iteration is also important; hence, we have given the comparisons of efficiencies based on the flops and functional evaluations. Some different numerical tests have been used to compare the consistency and stability of the proposed iteration in contrast to the existing methods. The numerical results obtained in Section 5 reverified the theoretical aspects of the paper. In fact, numerical tests have been performed, which not only illustrate the method practically but also serve to check the validity of theoretical results we had derived.

Future studies could focus on two aspects, one to extend the order of convergence alongside the computational efficiency and second to present some hybrid algorithm based on convergence guaranteed methods (at the beginning of the iterations) to ensure being in the trust region to apply the high-order methods.

Acknowledgments

The authors thank the reviewers for offering some helpful comments on an earlier version of this work. The work of the third author is financially supported through a Grant by the University of Venda, South Africa.

      The full text of this article hosted at iucr.org is unavailable due to technical difficulties.