Volume 2013, Issue 1, Article ID 845459
Research Article
Open Access

A Prediction-Correction Dynamic Method for Large-Scale Generalized Eigenvalue Problems

Xin-long Luo (corresponding author), Jia-ru Lin, and Wei-ling Wu

School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, P.O. Box 101, Beijing 100876, China
First published: 11 September 2013
Academic Editor: Chang-Hua Lien

Abstract

This paper gives a new prediction-correction method, based on a dynamical system of differential-algebraic equations, for the smallest generalized eigenvalue problem. First, the smallest generalized eigenvalue problem is converted into an equivalent equality-constrained optimization problem. Second, according to the Karush-Kuhn-Tucker conditions of this special equality-constrained problem, a special continuous dynamical system of differential-algebraic equations is obtained. Third, based on the implicit Euler method and an analogous trust-region technique, a prediction-correction method is constructed to follow this system of differential-algebraic equations to its steady-state solution. Consequently, the smallest generalized eigenvalue of the original problem is obtained. The local superlinear convergence property of the new algorithm is also established. Finally, some promising numerical experiments, in comparison with other methods, are presented.

1. Introduction

In this paper, we consider the smallest generalized eigenvalue problem, which is often encountered in engineering applications such as automatic control, structural dynamic analysis, electronic structure calculations, and quantum chemistry (see [1, 2] and the references therein). For this old and still active problem, Gao et al. recently gave an interesting continuous projected method [3, 4]. This article follows that line of work and gives a new prediction-correction method based on a dynamical system of differential-algebraic equations (DAEs) for this problem.

First, we convert the smallest generalized eigenvalue problem into an equivalent equality-constrained optimization problem. Second, from the Karush-Kuhn-Tucker conditions of this optimization problem, we obtain a variant of the Rayleigh quotient gradient flow [5, 6], which is formulated as a system of DAEs. Third, applying the implicit Euler method [7], a projection technique [4, 8, 9], and a new time-stepping strategy to that dynamical system of DAEs, we construct a new prediction-correction method to compute a steady-state solution of that special dynamical system. Consequently, we obtain the smallest generalized eigenvalue of (A, B). We also establish the local superlinear convergence property of this new method. Finally, in comparison with other methods, some promising numerical experiments are presented. Throughout this article, ∥·∥ denotes the Euclidean vector norm and the corresponding induced matrix norm.

2. Continuous Dynamical Model

The generalized eigenvalue problem of (A, B) is to find a scalar λ and a nonzero vector x ∈ ℝ^n satisfying

    Ax = λBx, (1)
where A and B are n × n real symmetric matrices and B is positive definite. Problem (1) can be converted into an equivalent equality-constrained optimization problem [4]:

    min_x x^T A x   subject to   x^T B x = 1. (2)
If x* is an optimal solution of problem (2), then, according to the Karush-Kuhn-Tucker conditions [10, p. 328], there exists a Lagrange multiplier λ* such that the following conditions are satisfied at (x*, λ*):

    Ax* − λ*Bx* = 0,   (x*)^T B x* = 1. (3)
Therefore, the solution x* of problem (2) is a generalized eigenvector of (A, B).

Property 1 (Gao et al., 2008 [4]). If x* is an optimal solution of problem (2) and λ* is the corresponding Lagrange multiplier, then they satisfy the second-order optimality condition

    d^T (A − λ*B) d ≥ 0   for all d satisfying d^T B x* = 0. (4)
Furthermore, a local minimizer of (2) is also its global minimizer. If x* is a global minimizer of (2), then it is an eigenvector associated with the smallest generalized eigenvalue of (1).

From the first-order optimality conditions (3) of problem (2), we construct the following negative gradient flow:

    B (dx/dt) = −(Ax − λBx), (5)

on the generalized spherical surface

    x^T B x = 1. (6)
The system of DAEs (5)-(6) is a special dynamical system derived from the equality-constrained optimization problem (2). According to the definition of the differential index of DAEs [11], we can verify that the index of the DAEs (5)-(6) is two. Therefore, their solution is not easily obtained by a general-purpose solver such as ode15s [7, 11, 12], and we need to devise a special technique to compute the steady-state solution.
Differentiating the algebraic constraint x^T B x = 1 along the trajectory (x(t), λ(t)) of (5) and (6), we obtain

    d(x^T B x)/dt = 2 x^T B (dx/dt) = −2 x^T (Ax − λBx) = 0. (7)

Consequently, we obtain

    λ(t) = (x(t)^T A x(t))/(x(t)^T B x(t)). (8)
Substituting (8) into (5), we also obtain the following generalized gradient flow:

    B (dx/dt) = −(Ax − ((x^T A x)/(x^T B x)) Bx). (9)
Conversely, if x(t) is a solution of (9) with x(0)^T B x(0) = 1, then d(x^T B x)/dt = 0; that is, x(t) satisfies x^T B x = 1, and consequently x(t) also satisfies (5)-(6). Therefore, the continuous dynamical system (5)-(6) is equivalent to the generalized gradient flow (9) on the surface (6).
Since the right-hand-side function of (9) is Lipschitz continuous in a neighborhood of the surface (6), the flow has a unique solution x(t) for any initial point x(0) = x0. We can also verify that the objective function f(x) = x^T A x is monotonically decreasing along the trajectory x(t) of (5)-(6). Actually, from (5)–(8) and the positive definiteness of B, we have

    df(x(t))/dt = 2 x^T A (dx/dt) = −2 (Ax − λBx)^T B^{-1} (Ax − λBx) ≤ 0. (10)
Furthermore, the solution x(t) of (9) converges to an eigenvector x* associated with the smallest generalized eigenvalue λ* of (A, B) for almost every initial condition x(0) = x0 satisfying x0^T B x0 = 1; the exceptional initial points form a set of measure zero [5]. Therefore, by following the trajectory x(t) of (5)-(6), we can compute a global minimum point x* of problem (2) and consequently obtain the smallest generalized eigenvalue λ* of (A, B).
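To make the continuous model concrete, the following Matlab sketch integrates the generalized gradient flow (9) with a plain explicit Euler scheme on a small random pencil; the pencil, step size, and iteration count are illustrative choices of ours, not part of the method analyzed in this paper.

    % Follow the generalized gradient flow (9) by explicit Euler and
    % check that f(x) = x'*A*x decreases to the smallest eigenvalue.
    n = 50; rng(1);                       % illustrative size and fixed seed
    A = randn(n); A = (A + A')/2;         % random symmetric A
    B = randn(n); B = B'*B + n*eye(n);    % symmetric positive definite B
    x = randn(n,1); x = x/sqrt(x'*B*x);   % initial point on x'*B*x = 1
    h = 1e-3;                             % small explicit Euler step
    for k = 1:20000
        lam = (x'*A*x)/(x'*B*x);          % Rayleigh quotient (8)
        x = x - h*(B\(A*x - lam*B*x));    % Euler step on (9)
    end
    lam = (x'*A*x)/(x'*B*x);              % final Rayleigh quotient estimate
    disp([lam, min(eig(A, B))])           % lam approximates min eig(A, B)

Of course, such an explicit scheme is only reliable with a small step size; the point of Section 3 is precisely to replace it by the implicit Euler method so that large steps can be taken near the steady state.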

3. A Prediction-Correction Method

In this section, in order to obtain a steady-state solution of DAEs (5)-(6), we construct a prediction-correction method to follow its trajectory. According to the discussion of Section 2, we know that a steady-state solution of DAEs (5)-(6) is a generalized eigenvector associated with the smallest generalized eigenvalue of (A, B). Therefore, by following the discrete trajectory of DAEs (5)-(6), we can compute an approximate eigenvector associated with the smallest generalized eigenvalue and consequently obtain the smallest generalized eigenvalue.

Note that we mainly consider the steady state of the system of DAEs (5)-(6) and do not require an accurate solution in the transient phase. In order to avoid consuming unnecessary computing time, we adopt the first-order implicit Euler method to follow its trajectory. The time steps of the implicit Euler method are not restricted by the absolute stability requirement for the linear test equation dx/dt = −μx, where μ > 0 [7]; consequently, the implicit Euler method can take large steps in the steady-state phase, so that the iteration sequence {xk} converges rapidly to a stationary point x* of the system of DAEs (5)-(6).

By applying one implicit Euler step to the system of DAEs (5)-(6), we obtain the following iteration formula:

    B(xk+1 − xk) = −Δtk (Axk+1 − λk+1Bxk+1), (11)

    xk+1^T B xk+1 = 1, (12)

where Δtk is the time step. Note that the system (11)-(12) is nonlinear in (xk+1, λk+1) and, in general, its solution cannot be obtained explicitly. Therefore, we use the alternating projection idea [13, 14] to handle it.
We replace λk+1 with λk in (11) to obtain the predicted point x̂k+1 as follows:

    (B + Δtk(A − λkB)) x̂k+1 = B xk. (13)
Since the predicted point x̂k+1 generally moves off the elliptic spherical surface, we project it onto the unit elliptic spherical surface so that the iterate xk+1 satisfies the algebraic constraint (6); namely, we find the point of the unit elliptic spherical surface nearest to x̂k+1. We achieve this aim by solving the following equality-constrained problem:

    min_x ∥x − x̂k+1∥   subject to   x^T B x = 1. (14)
Since the optimal solution of the previous problem is not available in closed form, we settle for a suboptimal solution. Since matrix B is positive definite, it can be decomposed as B = L^T L. Let y = Lx and ŷk+1 = Lx̂k+1. Then, problem (14) is equivalent to the following problem:

    min_y ∥L^{-1}(y − ŷk+1)∥   subject to   y^T y = 1. (15)
Since ∥L^{-1}(y − ŷk+1)∥ ≤ ∥L^{-1}∥ ∥y − ŷk+1∥, we obtain the suboptimal solution by solving the following problem:

    min_y ∥y − ŷk+1∥   subject to   ∥y∥ = 1. (16)
Since the point of the unit spherical surface nearest to a given point lies on the ray from the center of the sphere through that point, the minimum point of the previous problem is yk+1 = ŷk+1/∥ŷk+1∥. Then, the suboptimal solution point of problem (14) is obtained as

    xk+1 = x̂k+1/(x̂k+1^T B x̂k+1)^{1/2}. (17)
Now, we use the correction point xk+1 to update λk+1, so that the steady-state equation Ax − λBx = 0 is satisfied as nearly as possible. Namely, we find the minimum point of the following problem:

    min_λ ∥Axk+1 − λBxk+1∥². (18)
Since matrix B is positive definite, it can be decomposed as B = L^T L. Let zk+1 = Lxk+1 and C = L^{-T} A L^{-1}. Then, the previous minimization problem is equivalent to the following problem:

    min_λ ∥L^T (Czk+1 − λzk+1)∥². (19)
Using the property ∥L^T(Czk+1 − λzk+1)∥ ≤ ∥L^T∥ ∥Czk+1 − λzk+1∥, we obtain the suboptimal solution of problem (18) by solving the following problem:

    min_λ ∥Czk+1 − λzk+1∥². (20)
It is not difficult to obtain the minimum solution of the previous problem as

    λk+1 = (zk+1^T C zk+1)/(zk+1^T zk+1) = xk+1^T A xk+1, (21)

which is the suboptimal solution of problem (18); here we use zk+1^T C zk+1 = xk+1^T A xk+1 and zk+1^T zk+1 = xk+1^T B xk+1 = 1.
Another issue is how to choose the time step Δtk at every iteration. We adopt an analogous trust-region technique to adjust the time step Δtk adaptively. Our intuition is that the time step Δtk can be enlarged when the predicted point x̂k+1 is near the elliptic spherical surface x^T B x = 1; otherwise, the time step Δtk should be reduced. In order to measure the distance between x̂k+1 and the unit elliptic spherical surface, we construct the following approximate model:

    ρk = |x̂k+1^T B x̂k+1 − 1|. (22)

Then,

    Δtk+1 = γ1Δtk,  if ρk ≤ η1;
    Δtk+1 = Δtk,    if η1 < ρk < η2;
    Δtk+1 = γ2Δtk,  if ρk ≥ η2, (23)

where the constants η1, η2, γ1, and γ2 satisfy

    0 < η1 < η2 < 1,   0 < γ2 < 1 < γ1. (24)
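In code, the time-stepping rule (23) amounts to a few lines; the sketch below assumes that rho has been computed from (22) and that the parameters satisfy (24), for example the default values η1 = 0.25, η2 = 0.75, γ1 = 2, and γ2 = 0.5 used in Section 5:

    % Analogous trust-region adjustment (23) of the time step.
    if rho <= eta1
        dt = gamma1*dt;    % predicted point close to the surface: enlarge
    elseif rho >= eta2
        dt = gamma2*dt;    % predicted point far from the surface: reduce
    end                    % for eta1 < rho < eta2, keep dt unchanged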

According to the previous discussion, we give the following discrete dynamical method for the smallest generalized eigenvalue of (A, B).

Algorithm 1. Prediction-correction method for the smallest generalized eigenvalue.

Step 1. Initialize the parameters. Specify a tolerated error TOL and an initial point x0 satisfying

    x0^T B x0 = 1. (25)

Compute λ0 = x0^T A x0 and the residual r0 = ∥Ax0 − λ0Bx0∥.

Step  2. Repeat for k = 0,1, 2, … until rk = ∥AxkλkBxk∥ ≤ TOL:

  • (a)

    solve (13) to obtain the predicted point x̂k+1;

  • (b)

    evaluate (17) to obtain the corrected point xk+1;

  • (c)

    compute (21) to update the Lagrange multiplier λk+1;

  • (d)

    compute the residual

        rk+1 = ∥Axk+1 − λk+1Bxk+1∥; (26)

  • (e)

    update the time step Δtk+1 according to the time-stepping scheme (22)-(23);

  • (f)

    accept or reject the trial point: if ρk < η2, accept (xk+1, λk+1, rk+1); otherwise, set xk+1 = xk, λk+1 = λk, and rk+1 = rk;

  • (g)

    return to (a) with k ← k + 1.
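Putting the pieces together, Algorithm 1 admits the following compact Matlab sketch; it is our reconstruction from (13), (17), (21), (22), and (23), not the authors' original code, and the initial step length dt must be supplied since the formula for Δt0 (given in Section 5) is not reproduced here.

    function [lam, x, k] = pcm(A, B, x0, dt, TOL)
    % Prediction-correction method (Algorithm 1), reconstructed sketch.
    eta1 = 0.25; eta2 = 0.75; gamma1 = 2; gamma2 = 0.5;  % default parameters
    x   = x0/sqrt(x0'*B*x0);                 % Step 1: enforce (25)
    lam = x'*A*x;                            % lambda_0
    r   = norm(A*x - lam*B*x); k = 0;        % residual r_0
    while r > TOL
        xhat = (B + dt*(A - lam*B))\(B*x);   % (a) prediction (13)
        rho  = abs(xhat'*B*xhat - 1);        % distance measure (22)
        xnew = xhat/sqrt(xhat'*B*xhat);      % (b) correction (17)
        lnew = xnew'*A*xnew;                 % (c) multiplier update (21)
        if rho <= eta1, dt = gamma1*dt; end  % (e) time-stepping (23)
        if rho >= eta2, dt = gamma2*dt; end
        if rho < eta2                        % (f) accept the trial point
            x = xnew; lam = lnew; r = norm(A*x - lam*B*x);
        end
        k = k + 1;                           % (g) next iteration
    end
    end

A rejected step (ρk ≥ η2) keeps the old iterate and retries with the reduced time step, which mirrors the accept/reject logic of a trust-region method.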

4. Local Convergence Analysis

The sequence (xk, λk) generated by Algorithm 1 follows the continuous trajectory of (5)-(6). Adopting an estimation technique similar to that in [15, 16], we obtain that the local truncation error of xk is O(Δtk). According to the standard analysis of numerical ordinary differential equations [7, 11], we know that xk is close to x(tk) when Δtk is small enough on a finite time interval [0, T]. Therefore, we only need to estimate the local convergence rate of xk in the steady-state phase.

For the convenience of analysis, we decompose matrix B as B = L^T L and denote C = L^{-T} A L^{-1}. Then, λ* is the smallest generalized eigenvalue of (A, B) with eigenvector x* if and only if λ* is the smallest eigenvalue of the symmetric matrix C with eigenvector y* = Lx*. Therefore, we only need to consider the convergence property of (yk, λk), where yk = Lxk and xk is generated by Algorithm 1.

We denote the error angle between yk and y* by ϕk = ∠(yk, y*). Then, the current iterate yk can be decomposed as

    yk = y* cos ϕk + uk sin ϕk, (27)

where uk^T y* = 0 and ∥uk∥ = ∥y*∥ = 1. From (27) and (21), we obtain

    λk = yk^T C yk = λ* + (uk^T C uk − λ*) sin²ϕk. (28)

According to the previous discussion, we give the following local convergence property for Algorithm 1.

Theorem 2. Assume that the sequence (xk, λk) is generated by Algorithm 1. Then, the sequence of error angles ϕk tends to zero superlinearly.

Proof. From (21), (27), and Cy* = λ*y*, we have

    λk = yk^T C yk = λ* cos²ϕk + (uk^T C uk) sin²ϕk. (29)

It gives

    |λk − λ*| = |uk^T C uk − λ*| sin²ϕk. (30)

Thus, we choose a small enough positive ε1 > 0, such as ε1 = (3/4)|λ* − μ2|, where μ2 is the second smallest eigenvalue of C, and a large enough k to satisfy

    |λk − λ*| ≤ ε1. (31)

Then, combining inequality (30), we have

    μ2 − λk ≥ (μ2 − λ*) − |λk − λ*| ≥ (1/4)(μ2 − λ*) > 0. (32)

Consequently, we obtain

    1 + Δtk(μ2 − λk) ≥ 1 + (Δtk/4)(μ2 − λ*) > 1. (33)

We can verify that there exists a positive constant Δtmin such that all the time steps Δtk are bounded below by Δtmin; that is, Δtk ≥ Δtmin. Actually, from (13), we know that x̂k+1 will be close to xk when Δtk is small enough. Consequently, for small enough Δtk, we have

    ρk = |x̂k+1^T B x̂k+1 − 1| ≤ η1. (34)

According to the time-stepping strategy (23), Δtk+1 will then be enlarged. Therefore, the Δtk are bounded below by a positive number Δtmin.

On the other hand, from (13) and (27), we obtain

    ŷk+1 = (I + Δtk(C − λkI))^{-1} yk = (cos ϕk/(1 + Δtk(λ* − λk))) y* + sin ϕk ∥(I + Δtk(C − λkI))^{-1} uk∥ uk+1, (35)

where uk+1 = (I + Δtk(C − λkI))^{-1}uk/∥(I + Δtk(C − λkI))^{-1}uk∥ and yk+1 = ŷk+1/∥ŷk+1∥. Noticing uk+1^T y* = 0 and yk+1 = y* cos ϕk+1 + uk+1 sin ϕk+1, from (35), we obtain

    tan ϕk+1 = (1 + Δtk(λ* − λk)) ∥(I + Δtk(C − λkI))^{-1} uk∥ tan ϕk. (36)

Since uk is orthogonal to y* and ∥uk∥ = 1, uk can be decomposed as uk = α2z2 + ⋯ + αnzn, where α2² + ⋯ + αn² = 1. Let μi (i = 1, 2, …, n) be the eigenvalues of C = L^{-T}AL^{-1} in ascending order, with corresponding orthonormal eigenvectors zi, so that μ1 = λ* and z1 = y*. Thus, from the spectral decomposition of C, for large enough k, we obtain

    ∥(I + Δtk(C − λkI))^{-1} uk∥² = Σ_{i=2}^{n} αi²/(1 + Δtk(μi − λk))² ≤ 1/(1 + Δtk(μ2 − λk))². (37)

It gives

    ∥(I + Δtk(C − λkI))^{-1} uk∥ ≤ 1/(1 + Δtk(μ2 − λk)). (38)
From (36) and (38), for large enough k, we have

    tan ϕk+1 ≤ ((1 + Δtk(λ* − λk))/(1 + Δtk(μ2 − λk))) tan ϕk. (39)

Combining (33) and (39), we know that ϕk converges linearly to zero; namely, yk converges linearly to y*, provided the initial index k is chosen such that (yk, λk) is close enough to (y*, λ*).

Since (yk, λk) converges linearly to (y*, λ*), according to the time-stepping strategy (23), the time step Δtk tends to infinity. As Δtk → ∞, the contraction factor in (39) tends to |λk − λ*|/(μ2 − λk) in magnitude, which itself tends to zero as λk → λ*. Consequently, from inequality (39), we know that ϕk converges superlinearly to 0.

Remark 3. From (28) and inequality (39), we have

    tan ϕk+1 ≤ ((1/Δtk + |λk − λ*|)/(1/Δtk + (μ2 − λk))) tan ϕk,   with |λk − λ*| = O(sin²ϕk). (40)

Consequently, the error angles ϕk tend to zero cubically if we adopt a suitable time-stepping strategy such that 1/Δtk ≈ sin²ϕk for large enough k.

Remark 4. The convergence rate of the Rayleigh quotient iteration is cubic; however, its iterate sequence {xk} is not guaranteed to converge to an eigenvector associated with the smallest generalized eigenvalue of (A, B) [17]. The convergence rate of Algorithm 1 is lower than that of the Rayleigh quotient iteration, but the iterate sequence {xk} of Algorithm 1 follows the trajectory of the special dynamical system of DAEs (5)-(6), and consequently it converges to a stationary point x* of this system, that is, a generalized eigenvector associated with the smallest generalized eigenvalue λ* of (A, B).

5. Numerical Experiments

We give some numerical experiments for Algorithm 1 (denoted the PCM method), in comparison with the recent continuous projected method of Gao et al. (denoted the GGL method [4]), the restarted Arnoldi method (the EIGS method [18]), and the Jacobi-Davidson method (JDQZ [19, 20]). For Algorithm 1, we pick η1 = 0.25, η2 = 0.75, γ1 = 2, and γ2 = 0.5 and compute an initial time-step length Δt0 as follows:
()
For the GGL method, we use the solver ode45 [12] for the continuous projected dynamical method and set RelTol = 10^{-6} and AbsTol = 10^{-9}, as suggested in [4]. All of our tests were run in the Matlab 2009a environment on a Lenovo ThinkPad X200 laptop with a 2.4 GHz processor. The compared methods are terminated for a test problem when the condition

    rk = ∥Axk − λkBxk∥ ≤ 10^{-6} (42)
is satisfied. The descriptions of test problems and the numerical results are presented by the following.

Example 5 (see [4, 21]). This test problem has four generalized eigenvalues: three of them are equal to 0, and the fourth is 2. The stiffness and mass matrices are given as follows:

()

It is not difficult to verify that matrix B is positive definite. We choose the initial point x0 = [0.2, 0.6, −0.8, 0.6]^T.

Example 6 (see [4], [21].)The stiffness and mass matrices are given as follows:

()
and B = diag (m1, m2, …, mn) with n = 20, ki = 4 + i, and mi = 35 − i for i = 1,2, …, 20.

Example 7 (see [22]). This test problem comes from a fluid flow generalized eigenvalue problem (see the BCSSTRUC1 set in [22]).

Consider the following:

()

Here, the mass matrix M (bcsstm13.mtx) is a real symmetric positive semidefinite matrix, and the stiffness matrix K (bcsstk13.mtx) is a real symmetric positive definite matrix. Therefore, its smallest generalized eigenvalue is zero. In order to test the effectiveness of the algorithms, we apply a shift to M as follows:

()

Then, the smallest generalized eigenvalue of (K, B) is 1/2.

Example 8. In order to test the performance of the compared methods on large-scale problems, we construct a large sparse matrix A as follows:

  • (1)

    let A1 = sparse(diag (ones(n, 1)));

  • (2)

    let A2 = sparse(diag (ones(n − 1,1), 1));

  • (3)

    let A3 = sparse(diag (ones(n − 1,1), −1));

  • (4)

    let A = 2A1 + A2 + A3;

  • (5)

    let B = 4A1 + A2 + A3.

Here, sparse, diag, and ones are built-in Matlab functions, and n is the dimension of A. In our numerical experiments, we set n = 1000 for this test problem.
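For reference, the EIGS baseline on this pencil can be invoked as below; the spdiags construction is equivalent to steps (1)-(5) above, and the tolerance is an illustrative choice:

    % Smallest generalized eigenvalue of the pencil (A, B) of Example 8
    % via the restarted Arnoldi method (Matlab's eigs).
    n = 1000;
    T = spdiags(ones(n,2), [-1 1], n, n);  % ones on sub- and superdiagonals
    A = 2*speye(n) + T;                    % A = 2*A1 + A2 + A3
    B = 4*speye(n) + T;                    % B = 4*A1 + A2 + A3
    opts.tol = 1e-6;
    [v, lam] = eigs(A, B, 1, 'sa', opts);  % 'sa': smallest algebraic eigenvalue
    disp(lam)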

Numerical results for these four test problems are reported in Table 1, where Iter and CPU denote the number of iterations and the computing time (in seconds), respectively. From the results for Examples 5 and 8 in Table 1, we see that the proposed PCM method finds a smaller minimum eigenvalue than the compared GGL method. Furthermore, the computing time of the PCM method is less than that of the GGL and JDQZ methods for these test problems. For Example 5, the EIGS method fails, giving the message "(A-sigma*B) is singular. The shift is an eigenvalue. Try to use some other shift please." when we execute it.

Table 1. Comparison of the different methods.

                 PCM                      GGL                       EIGS     JDQZ
Examples     λk         Iter (CPU)    λk         Iter (CPU)      CPU      CPU
Example 5    1.0e−13    12 (0.021)    4.5e−11    77 (0.178)      FAIL     FAIL
Example 6    0.01337    18 (0.004)    0.01344    3389 (0.36)     0.01     0.49
Example 7    0.4998     49 (2.556)    0.5        33 (6.643)      0.24     14.17
Example 8    1.1e−5     28 (0.024)    0.0122     514 (169.3)     0.02     0.686

In order to record the convergence history, we draw the iteration trajectory of the smallest eigenvalue computed by the PCM method and by the GGL method for every test problem in Figures 1, 2, 3, and 4. From those figures, we find that Algorithm 1 (the PCM method) achieves the steady state of the dynamical system faster than the GGL method.

Figure 1: Numerical results of Example 5 with the PCM and GGL methods.
Figure 2: Numerical results of Example 6 with the PCM and GGL methods.
Figure 3: Numerical results of Example 7 with the PCM and GGL methods.
Figure 4: Numerical results of Example 8 with the PCM and GGL methods.
In order to test the sensitivity of Algorithm 1, we run it on the test problems with different parameters γ1, γ2, η1, and η2. The robustness and efficiency of Algorithm 1 are mainly determined by the parameters η1 and η2. Thus, we fix the parameters γ1 = 2 and γ2 = 1/2 and choose four groups of parameters η1 and η2 as follows:
  • (I)

    η1 = 0.25, η2 = 0.75;

  • (II)

    η1 = 0.1,   η2 = 0.9;

  • (III)

    η1 = 0.3, η2 = 0.7;

  • (IV)

    η1 = 0.4, η2 = 0.8.

The numerical results are reported in Table 2. From Table 2, we find that Algorithm 1 is not sensitive to the choice of parameters for η1 near 0.25 and η2 near 0.75. Thus, we set the default parameters η1 = 0.25 and η2 = 0.75 in Algorithm 1.

Table 2. Numerical results (iteration counts) for different parameters of Algorithm 1.

Examples     (I)    (II)    (III)    (IV)
Example 5    12     12      12       12
Example 6    19     19      19       17
Example 7    49     49      50       45
Example 8    26     31      22       22

6. Conclusions

This paper discusses the connection between the smallest generalized eigenvalue problem and a continuous dynamical system of DAEs. The trajectory of this special dynamical system is followed by a new prediction-correction method (Algorithm 1), which is derived from the implicit Euler formula and an analogous trust-region technique. Consequently, an equilibrium point x* of this continuous dynamical system is obtained, and this equilibrium point is a generalized eigenvector associated with the smallest generalized eigenvalue. The local superlinear convergence property of the new prediction-correction method is established. Numerical experiments indicate that Algorithm 1 is promising for the smallest generalized eigenvalue problem. Another interesting issue is whether the second-order pseudo-transient method [23] attains superior numerical properties for this special dynamical system of DAEs (5)-(6); we would like to consider this issue in future work.

Acknowledgments

Xin-long Luo is grateful to Professor Li-Zhi Liao for fruitful discussions during his visit to Hong Kong Baptist University in the winter of 2011. This work was supported in part by Grant 2009CB320401 from the National Basic Research Program of China, Grant YBWL2011085 from Huawei Technologies Co., Ltd., Grant YJCB2011003HI from the Innovation Research Program of Huawei Technologies Co., Ltd., and Grant BUPT2009RC0118 from the Fundamental Research Funds for the Central Universities.
