Volume 2013, Issue 1, Article ID 435730
Research Article
Open Access

Geometrical and Spectral Properties of the Orthogonal Projections of the Identity

Luis González, Antonio Suárez, and Dolores García

Department of Mathematics, Research Institute SIANI, University of Las Palmas de Gran Canaria, Campus de Tafira, 35017 Las Palmas de Gran Canaria, Spain

Corresponding author: Luis González
First published: 09 May 2013
Academic Editor: K. Sivakumar

Abstract

We analyze the best approximation AN (in the Frobenius sense) to the identity matrix in an arbitrary matrix subspace AS (A ∈ ℝ^{n×n} nonsingular, S being any fixed subspace of ℝ^{n×n}). Some new geometrical and spectral properties of the orthogonal projection AN are derived. In particular, new inequalities for the trace and for the eigenvalues of matrix AN are presented for the special case that AN is symmetric and positive definite.

1. Introduction

The set of all n × n real matrices is denoted by ℝ^{n×n}, and I denotes the identity matrix of order n. In the following, A^T and tr(A) denote, as usual, the transpose and the trace of matrix A ∈ ℝ^{n×n}. The notations 〈·,·〉_F and ∥·∥_F stand for the Frobenius inner product and matrix norm, defined on the matrix space ℝ^{n×n}. Throughout this paper, the terms orthogonality, angle, and cosine will be used in the sense of the Frobenius inner product.

Our starting point is the linear system
$$Ax = b, \qquad A \in \mathbb{R}^{n\times n},\ \ x, b \in \mathbb{R}^n, \tag{1}$$
where A is a large, nonsingular, and sparse matrix. The resolution of this system is usually performed by iterative methods based on Krylov subspaces (see, e.g., [1, 2]). The coefficient matrix A of the system (1) is often extremely ill-conditioned and highly indefinite, so that in this case, Krylov subspace methods are not competitive without a good preconditioner (see, e.g., [2, 3]). Then, to improve the convergence of these Krylov methods, the system (1) can be preconditioned with an adequate nonsingular preconditioning matrix N, transforming it into any of the equivalent systems
$$NAx = Nb, \qquad ANy = b \quad (x = Ny), \tag{2}$$
the so-called left and right preconditioned systems, respectively. In this paper, we address only the case of the right-hand side preconditioned matrices AN, but analogous results can be obtained for the left-hand side preconditioned matrices NA.
The preconditioning of the system (1) is often performed in order to get a preconditioned matrix AN as close as possible to the identity in some sense, and the preconditioner N is called an approximate inverse of A. The closeness of AN to I may be measured by using a suitable matrix norm like, for instance, the Frobenius norm [4]. In this way, the problem of obtaining the best preconditioner N (with respect to the Frobenius norm) of the system (1) in an arbitrary subspace S of ℝ^{n×n} is equivalent to the minimization problem (see, e.g., [5])
$$\min_{M \in S}\,\|AM - I\|_F = \|AN - I\|_F. \tag{3}$$

The solution N to the problem (3) will be referred to as the "optimal" or the "best" approximate inverse of matrix A in the subspace S. Since matrix AN is the best approximation to the identity in subspace AS, it will also be referred to as the orthogonal projection of the identity matrix onto the subspace AS. Although many of the results presented in this paper are also valid for the case that matrix N is singular, from now on, we assume that the optimal approximate inverse N (and thus also the orthogonal projection AN) is a nonsingular matrix. The solution N to the problem (3) has been studied as a natural generalization of the classical Moore–Penrose inverse in [6], where it has been referred to as the S-Moore–Penrose inverse of matrix A.
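With respect to any basis {M₁, …, M_d} of S, problem (3) is an ordinary linear least-squares problem in the coordinates of N. The following minimal NumPy sketch (an illustration added here, not part of the original paper; the matrix A and the diagonal subspace S are arbitrary choices) makes this concrete:

```python
import numpy as np

def best_approximate_inverse(A, basis):
    """Solve min_{M in span(basis)} ||A M - I||_F by least squares.

    Vectorizing, ||sum_i c_i (A M_i) - I||_F = ||G c - vec(I)||_2,
    where the columns of G are vec(A M_i).
    """
    n = A.shape[0]
    G = np.column_stack([(A @ M).ravel() for M in basis])
    c, *_ = np.linalg.lstsq(G, np.eye(n).ravel(), rcond=None)
    return sum(ci * Mi for ci, Mi in zip(c, basis))

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n)) + n * np.eye(n)   # safely nonsingular
# Example subspace S: diagonal matrices, with basis {e_i e_i^T}.
basis = [np.outer(np.eye(n)[i], np.eye(n)[i]) for i in range(n)]
N = best_approximate_inverse(A, basis)
print(np.linalg.norm(A @ N - np.eye(n)))          # minimal residual over S
```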

The main goal of this paper is to derive new geometrical and spectral properties of the best approximations AN (in the sense of formula (3)) to the identity matrix. Such properties could be used to analyze the quality and theoretical effectiveness of the optimal approximate inverse N as preconditioner of the system (1). However, it is important to highlight that the purpose of this paper is purely theoretical, and we are not looking for immediate numerical or computational approaches (although our theoretical results could be potentially applied to the preconditioning problem). In particular, the term “optimal (or best) approximate inverse” is used in the sense of formula (3) and not in any other sense of this expression.

Among the many different works dealing with practical algorithms that can be used to compute approximate inverses, we refer the reader to, for example, [4, 7–9] and to the references therein. In [4], the author presents an exhaustive survey of preconditioning techniques and, in particular, describes several algorithms for computing sparse approximate inverses based on Frobenius norm minimization like, for instance, the well-known SPAI and FSAI algorithms. A different approach (which is also focused on approximate inverses based on minimizing ∥AM − I∥_F) can be found in [7], where an iterative descent-type method is used to approximate each column of the inverse, and the iteration is done with "sparse matrix by sparse vector" operations. When the system matrix is expressed in block-partitioned form, some preconditioning options are explored in [8]. In [9], the idea of "target" matrix is introduced, in the context of sparse approximate inverse preconditioners, and the generalized Frobenius norms ∥·∥_H (H symmetric positive definite) are used, for minimization purposes, as an alternative to the classical Frobenius norm.

The last results of our work are devoted to the special case that matrix AN is symmetric and positive definite. In this sense, let us recall that the cone of symmetric and positive definite matrices has a rich geometrical structure and, in this context, the angle that any symmetric and positive definite matrix forms with the identity plays a very important role [10]. There, the authors extend this geometrical point of view and analyze the geometrical structure of the subspace of symmetric matrices of order n, including the location of all orthogonal matrices, not only the identity matrix.

This paper has been organized as follows. In Section 2, we present some preliminary results required to make the paper self-contained. Sections 3 and 4 are devoted to obtaining new geometrical and spectral relations, respectively, for the orthogonal projections AN of the identity matrix. Finally, Section 5 closes the paper with its main conclusions.

2. Some Preliminaries

Now, we present some preliminary results concerning the orthogonal projection AN of the identity onto the matrix subspace AS ⊆ ℝ^{n×n}. For more details about these results and for their proofs, we refer the reader to [5, 6, 11].

Taking advantage of the prehilbertian character of the matrix Frobenius norm, the solution N to the problem (3) can be obtained using the orthogonal projection theorem. More precisely, the matrix product AN is the orthogonal projection of the identity onto the subspace AS, and it satisfies the conditions stated by the following lemmas; see [5, 11].

Lemma 1. Let A ∈ ℝ^{n×n} be nonsingular and let S be a linear subspace of ℝ^{n×n}. Let N be the solution to the problem (3). Then,

$$\operatorname{tr}(AN) = \|AN\|_F^2, \qquad 0 \le \operatorname{tr}(AN) \le n, \tag{4}$$
$$\|AN - I\|_F^2 = n - \operatorname{tr}(AN). \tag{5}$$

An explicit formula for matrix N can be obtained by expressing the orthogonal projection AN of the identity matrix onto the subspace AS by its expansion with respect to an orthonormal basis of AS [5]. This is the idea of the following lemma.

Lemma 2. Let A ∈ ℝ^{n×n} be nonsingular. Let S be a linear subspace of ℝ^{n×n} of dimension d and {M₁, …, M_d} a basis of S such that {AM₁, …, AM_d} is an orthogonal basis of AS. Then, the solution N to the problem (3) is

$$N = \sum_{i=1}^{d} \frac{\operatorname{tr}(AM_i)}{\|AM_i\|_F^2}\, M_i, \tag{6}$$
and the minimum (residual) Frobenius norm is
$$\|AN - I\|_F^2 = n - \sum_{i=1}^{d} \frac{\left[\operatorname{tr}(AM_i)\right]^2}{\|AM_i\|_F^2}. \tag{7}$$

Let us mention two possible options, both taken from [5], for choosing in practice the subspace S and its corresponding basis {M₁, …, M_d}. The first example consists of considering the subspace S of n × n matrices with a prescribed sparsity pattern, that is,
$$S = \left\{ M \in \mathbb{R}^{n\times n} :\ m_{ij} = 0 \ \ \forall (i, j) \notin K \right\}, \qquad K \subseteq \{1, 2, \dots, n\} \times \{1, 2, \dots, n\}. \tag{8}$$
Then, denoting by M_{i,j} the n × n matrix whose only nonzero entry is m_{ij} = 1, a basis of subspace S is clearly {M_{i,j} : (i, j) ∈ K}, and then {AM_{i,j} : (i, j) ∈ K} will be a basis of subspace AS (since we have assumed that matrix A is nonsingular). In general, this basis of AS is not orthogonal, so we first need to use the Gram–Schmidt procedure to obtain an orthogonal basis of AS, in order to apply the orthogonal expansion (6); see the sketch below.
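The following NumPy sketch (an illustration; the pattern K and the matrix A are arbitrary choices) carries out this construction: Gram–Schmidt with respect to the Frobenius inner product is applied to {AM_{i,j}}, mirrored on the M_{i,j}, and the orthogonal expansion (6) is then evaluated.

```python
import numpy as np

def frob(X, Y):
    """Frobenius inner product <X, Y>_F = tr(X^T Y)."""
    return float(np.tensordot(X, Y))

def optimal_N_for_pattern(A, K):
    """Optimal N for the pattern subspace S = {M : m_ij = 0 for (i,j) not in K}."""
    n = A.shape[0]
    pairs = []  # (M_i, A M_i), with {A M_i} Frobenius-orthogonal
    for (i, j) in K:
        M = np.zeros((n, n)); M[i, j] = 1.0
        Q = A @ M
        for (Mp, Qp) in pairs:              # Gram-Schmidt on AS, mirrored on S
            c = frob(Q, Qp) / frob(Qp, Qp)
            M, Q = M - c * Mp, Q - c * Qp
        pairs.append((M, Q))
    # Orthogonal expansion (6): N = sum_i tr(A M_i)/||A M_i||_F^2 * M_i.
    return sum((np.trace(Q) / frob(Q, Q)) * M for (M, Q) in pairs)

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n)) + n * np.eye(n)
K = [(i, i) for i in range(n)] + [(0, 1), (2, 3)]   # hypothetical pattern
N = optimal_N_for_pattern(A, K)
# Check the minimal residual against formula (5).
print(np.linalg.norm(A @ N - np.eye(n))**2, n - np.trace(A @ N))
```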
For the second example, consider a linearly independent set of n × n real symmetric matrices {P1, …, Pd} and the corresponding subspace
$$S = \operatorname{span}\left\{ P_1 A^T,\, P_2 A^T,\, \dots,\, P_d A^T \right\}, \tag{9}$$
which clearly satisfies
$$AS = \operatorname{span}\left\{ AP_1A^T,\, AP_2A^T,\, \dots,\, AP_dA^T \right\} \subseteq \left\{ X \in \mathbb{R}^{n\times n} : X^T = X \right\}. \tag{10}$$

Hence, we can explicitly obtain the solution N to the problem (3) for subspace S, from its basis {P₁A^T, …, P_dA^T}, as follows. If {AP₁A^T, …, AP_dA^T} is an orthogonal basis of subspace AS, then we just use the orthogonal expansion (6) for obtaining N. Otherwise, we use again the Gram–Schmidt procedure to obtain an orthogonal basis of subspace AS, and then we apply formula (6). The interest of this second example lies in the possibility of using the conjugate gradient method for solving the preconditioned linear system, when the symmetric matrix AN is positive definite. For a more detailed exposition of the computational aspects related to these two examples, we refer the reader to [5]; a small sketch follows.
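A minimal NumPy sketch of this second construction (an illustration; the particular symmetric matrices P_i are arbitrary choices), verifying that the resulting projection AN is indeed symmetric:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
A = rng.standard_normal((n, n)) + n * np.eye(n)

# Hypothetical symmetric matrices P_i: the identity plus two random ones.
Ps = [np.eye(n)]
for _ in range(2):
    B = rng.standard_normal((n, n))
    Ps.append(B + B.T)

basis = [P @ A.T for P in Ps]       # basis of S = span{P_i A^T}
# Each A (P_i A^T) = A P_i A^T is symmetric, so AS consists of
# symmetric matrices and the projection AN is symmetric as well.
G = np.column_stack([(A @ M).ravel() for M in basis])
c, *_ = np.linalg.lstsq(G, np.eye(n).ravel(), rcond=None)
N = sum(ci * M for ci, M in zip(c, basis))
AN = A @ N
print(np.allclose(AN, AN.T))        # True: AN is symmetric
print(np.linalg.eigvalsh(AN))       # positive definiteness is not automatic
```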

Now, we present some spectral properties of the orthogonal projection AN. From now on, we denote by {λ_i}_{i=1}^n and {σ_i}_{i=1}^n the sets of eigenvalues and singular values, respectively, of matrix AN arranged, as usual, in nonincreasing order, that is,
$$\left|\lambda_1\right| \ge \left|\lambda_2\right| \ge \cdots \ge \left|\lambda_n\right| > 0, \qquad \sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_n > 0. \tag{11}$$

The following lemma [11] provides some inequalities involving the eigenvalues and singular values of the preconditioned matrix AN.

Lemma 3. Let A ∈ ℝ^{n×n} be nonsingular and let S be a linear subspace of ℝ^{n×n}. Then,

$$\sum_{i=1}^{n} \lambda_i^2 \le \sum_{i=1}^{n} \sigma_i^2 = \operatorname{tr}(AN) = \sum_{i=1}^{n} \lambda_i \le \sum_{i=1}^{n} \sigma_i. \tag{12}$$

The following fact [11] is a direct consequence of Lemma 3.

Lemma 4. Let A ∈ ℝ^{n×n} be nonsingular and let S be a linear subspace of ℝ^{n×n}. Then, the smallest singular value and the smallest eigenvalue's modulus of the orthogonal projection AN of the identity onto the subspace AS are never greater than 1. That is,

$$\sigma_n \le 1, \qquad \left|\lambda_n\right| \le 1. \tag{13}$$

The following theorem [11] establishes a tight connection between the closeness of matrix AN to the identity matrix and the closeness of σ_n (respectively, |λ_n|) to unity.

Theorem 5. Let A ∈ ℝ^{n×n} be nonsingular and let S be a linear subspace of ℝ^{n×n}. Let N be the solution to the problem (3). Then,

$$\left(1 - \sigma_n\right)^2 \le \|AN - I\|_F^2 \le n\left(1 - \sigma_n^2\right), \qquad \left(1 - \left|\lambda_n\right|\right)^2 \le \|AN - I\|_F^2 \le n\left(1 - \left|\lambda_n\right|^2\right). \tag{14}$$

Remark 6. Theorem 5 states that the closer the smallest singular value σ_n of matrix AN is to unity, the closer matrix AN will be to the identity, that is, the smaller ∥AN − I∥_F will be, and conversely. The same happens with the smallest eigenvalue's modulus |λ_n| of matrix AN. In other words, we get a good approximate inverse N of A when σ_n (respectively, |λ_n|) is sufficiently close to 1.
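A quick numerical sanity check of Lemmas 1 and 4 and of the two-sided bounds in (14), as reconstructed above (a sketch; the matrix A and the diagonal subspace S are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6
A = rng.standard_normal((n, n)) + n * np.eye(n)

# S = diagonal matrices; optimal N (problem (3)) via least squares.
G = np.column_stack([(A @ np.diag(np.eye(n)[i])).ravel() for i in range(n)])
c, *_ = np.linalg.lstsq(G, np.eye(n).ravel(), rcond=None)
AN = A @ np.diag(c)

res2 = np.linalg.norm(AN - np.eye(n))**2
sig = np.linalg.svd(AN, compute_uv=False)                # nonincreasing order
lam = np.linalg.eigvals(AN)
print(np.isclose(np.trace(AN), np.linalg.norm(AN)**2))   # (4)
print(np.isclose(res2, n - np.trace(AN)))                # (5)
print(sig[-1] <= 1, np.min(np.abs(lam)) <= 1)            # (13)
print((1 - sig[-1])**2 <= res2 <= n * (1 - sig[-1]**2))  # (14)
```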

To finish this section, let us mention that, recently, lower and upper bounds on the normalized Frobenius condition number of the orthogonal projection AN of the identity onto the subspace AS have been derived in [12]. In addition, this work proposes a natural generalization (related to an arbitrary matrix subspace S ⊆ ℝ^{n×n}) of the normalized Frobenius condition number of the nonsingular matrix A.

3. Geometrical Properties

In this section, we present some new geometrical properties for matrix AN, N being the optimal approximate inverse of matrix A, defined by (3). Our first lemma states some basic properties involving the cosine of the angle between matrix AN and the identity, that is,
$$\cos(AN, I) = \frac{\langle AN, I\rangle_F}{\|AN\|_F\,\|I\|_F} = \frac{\operatorname{tr}(AN)}{\sqrt{n}\,\|AN\|_F}. \tag{15}$$

Lemma 7. Let A ∈ ℝ^{n×n} be nonsingular and let S be a linear subspace of ℝ^{n×n}. Let N be the solution to the problem (3). Then,

$$\cos(AN, I) = \frac{\|AN\|_F}{\sqrt{n}} = \sqrt{\frac{\operatorname{tr}(AN)}{n}}, \tag{16}$$
$$0 \le \cos(AN, I) \le 1, \tag{17}$$
$$\|AN - I\|_F^2 = n\left(1 - \cos^2(AN, I)\right). \tag{18}$$

Proof. First, using (15) and (4) we immediately obtain (16). As a direct consequence of (16), we derive that cos (AN, I) is always nonnegative. Finally, using (5) and (16), we get

$$\|AN - I\|_F^2 = n - \operatorname{tr}(AN) = n - n\cos^2(AN, I) = n\left(1 - \cos^2(AN, I)\right), \tag{19}$$
and the proof is concluded.

Remark 8. In [13], the authors consider an arbitrary approximate inverse Q of matrix A and derive the following equality:

$$\|AQ - I\|_F^2 = \left(\|AQ\|_F - \|I\|_F\right)^2 + 2\left(1 - \cos(AQ, I)\right)\|AQ\|_F\,\|I\|_F, \tag{20}$$
that is, the typical decomposition (valid in any inner product space) of the strong convergence into the convergence of the norms and the weak convergence (1 − cos(AQ, I))∥AQ∥_F∥I∥_F. Note that for the special case that Q is the optimal approximate inverse N defined by (3), formula (18) states that the strong convergence reduces just to the weak convergence and, indeed, just to the cosine cos(AN, I).

Remark 9. More precisely, formula (18) states that the closer cos(AN, I) is to unity (i.e., the smaller the angle ∠(AN, I) is), the smaller ∥AN − I∥_F will be, and conversely. This gives us a new measure of the quality (in the Frobenius sense) of the approximate inverse N of matrix A, by comparing the minimum residual norm ∥AN − I∥_F with the cosine of the angle between AN and the identity, instead of with tr(AN), ∥AN∥_F (Lemma 1), or σ_n, |λ_n| (Theorem 5). So for a fixed nonsingular matrix A ∈ ℝ^{n×n} and for two different subspaces S, S′ ⊆ ℝ^{n×n} with corresponding optimal approximate inverses N and N′, we have

$$\cos(AN, I) \le \cos(AN', I) \;\Longleftrightarrow\; \|AN - I\|_F \ge \|AN' - I\|_F. \tag{21}$$
Obviously, the optimal theoretical situation corresponds to the case
$$\cos(AN, I) = 1 \;\Longleftrightarrow\; \|AN - I\|_F = 0 \;\Longleftrightarrow\; N = A^{-1}. \tag{22}$$

Remark 10. Note that the ratio between cos (AN, I) and cos (A, I) is independent of the order n of matrix A. Indeed, assuming that tr (A) ≠ 0 and using (16), we immediately obtain

$$\frac{\cos(AN, I)}{\cos(A, I)} = \frac{\|AN\|_F\,\|A\|_F}{\operatorname{tr}(A)}. \tag{23}$$
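The identities (16), (18), and (23) are easy to confirm numerically (a sketch; the matrix A and the diagonal subspace S are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 6
A = rng.standard_normal((n, n)) + n * np.eye(n)
G = np.column_stack([(A @ np.diag(np.eye(n)[i])).ravel() for i in range(n)])
c, *_ = np.linalg.lstsq(G, np.eye(n).ravel(), rcond=None)
AN = A @ np.diag(c)

def cos_with_identity(X):
    """Frobenius cosine cos(X, I) = tr(X) / (sqrt(n) ||X||_F)."""
    return np.trace(X) / (np.sqrt(X.shape[0]) * np.linalg.norm(X))

cosAN = cos_with_identity(AN)
print(np.isclose(cosAN, np.linalg.norm(AN) / np.sqrt(n)))              # (16)
print(np.isclose(np.linalg.norm(AN - np.eye(n))**2,
                 n * (1 - cosAN**2)))                                  # (18)
print(np.isclose(cosAN / cos_with_identity(A),
                 np.linalg.norm(AN) * np.linalg.norm(A) / np.trace(A)))  # (23)
```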

The following lemma compares the trace and the Frobenius norm of the orthogonal projection AN.

Lemma 11. Let A ∈ ℝ^{n×n} be nonsingular and let S be a linear subspace of ℝ^{n×n}. Let N be the solution to the problem (3). Then,

$$\operatorname{tr}(AN) \le \|AN\|_F \;\Longleftrightarrow\; \|AN\|_F \le 1 \;\Longleftrightarrow\; \operatorname{tr}(AN) \le 1 \;\Longleftrightarrow\; \cos(AN, I) \le \frac{1}{\sqrt{n}}, \tag{24}$$
$$\operatorname{tr}(AN) \ge \|AN\|_F \;\Longleftrightarrow\; \|AN\|_F \ge 1 \;\Longleftrightarrow\; \operatorname{tr}(AN) \ge 1 \;\Longleftrightarrow\; \cos(AN, I) \ge \frac{1}{\sqrt{n}}. \tag{25}$$

Proof. Using (4), we immediately obtain the four leftmost equivalences. Using (16), we immediately obtain the two rightmost equivalences.

The next lemma provides us with a relationship between the Frobenius norms of the inverses of matrix A and of its best approximate inverse N in subspace S.

Lemma 12. Let A ∈ ℝ^{n×n} be nonsingular and let S be a linear subspace of ℝ^{n×n}. Let N be the solution to the problem (3). Then,

$$\left\|N^{-1}\right\|_F \ge \frac{\sqrt{n}}{\left\|A^{-1}\right\|_F}. \tag{26}$$

Proof. Using (4), we get

$$n = \operatorname{tr}\left(AN(AN)^{-1}\right) \le \|AN\|_F\,\left\|(AN)^{-1}\right\|_F = \sqrt{\operatorname{tr}(AN)}\,\left\|N^{-1}A^{-1}\right\|_F \le \sqrt{n}\,\left\|N^{-1}\right\|_F\,\left\|A^{-1}\right\|_F, \tag{27}$$
and hence
$$\left\|N^{-1}\right\|_F\,\left\|A^{-1}\right\|_F \ge \sqrt{n}, \tag{28}$$
and the proof is concluded.

The following lemma compares the minimum residual norm ∥AN − I∥_F with the distance (with respect to the Frobenius norm) between the inverse of A and the optimal approximate inverse N of A in any subspace S ⊆ ℝ^{n×n}. First, note that for any two matrices A, B ∈ ℝ^{n×n} (A nonsingular), from the submultiplicative property of the Frobenius norm, we immediately get
$$\left\|A^{-1} - B\right\|_F = \left\|A^{-1}\left(I - AB\right)\right\|_F \le \left\|A^{-1}\right\|_F\,\|AB - I\|_F. \tag{29}$$

However, for the special case that B = N (the solution to the problem (3)), we also get the following inequality.

Lemma 13. Let A ∈ ℝ^{n×n} be nonsingular and let S be a linear subspace of ℝ^{n×n}. Let N be the solution to the problem (3). Then,

$$\left\|A^{-1} - N\right\|_F \ge \frac{\|AN - I\|_F^2}{\|A\|_F}. \tag{30}$$

Proof. Using the Cauchy-Schwarz inequality and (5), we get

$$\|AN - I\|_F^2 = n - \operatorname{tr}(AN) = \operatorname{tr}\left(A\left(A^{-1} - N\right)\right) = \left\langle A^T,\, A^{-1} - N\right\rangle_F \le \|A\|_F\,\left\|A^{-1} - N\right\|_F. \tag{31}$$
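Both (29) (with B = N) and (30) can be checked numerically (a sketch; the matrix A and the diagonal subspace S are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 6
A = rng.standard_normal((n, n)) + n * np.eye(n)
G = np.column_stack([(A @ np.diag(np.eye(n)[i])).ravel() for i in range(n)])
c, *_ = np.linalg.lstsq(G, np.eye(n).ravel(), rcond=None)
N = np.diag(c)

Ainv = np.linalg.inv(A)
dist = np.linalg.norm(Ainv - N)            # ||A^{-1} - N||_F
res = np.linalg.norm(A @ N - np.eye(n))    # ||AN - I||_F
print(dist <= np.linalg.norm(Ainv) * res)  # (29) with B = N
print(dist >= res**2 / np.linalg.norm(A))  # (30)
```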

The following extension of the Cauchy–Schwarz inequality, in a real or complex inner product space (H, 〈·, ·〉), was obtained by Buzano [14]. For all a, x, b ∈ H, we have
$$\left|\langle a, x\rangle \cdot \langle x, b\rangle\right| \le \frac{1}{2}\left(\|a\|\,\|b\| + \left|\langle a, b\rangle\right|\right)\|x\|^2. \tag{32}$$

The next lemma provides us with lower and upper bounds on the inner product 〈AN,BF, for any n × n real matrix B.

Lemma 14. Let A ∈ ℝ^{n×n} be nonsingular and let S be a linear subspace of ℝ^{n×n}. Let N be the solution to the problem (3). Then, for every B ∈ ℝ^{n×n}, we have

$$\left|\left\langle AN, B\right\rangle_F\right| \le \frac{1}{2}\left(\sqrt{n}\,\|B\|_F + \left|\operatorname{tr}(B)\right|\right). \tag{33}$$

Proof. Using (32) for a = I,  x = AN,  b = B, and (4), we get

$$\operatorname{tr}(AN)\left|\left\langle AN, B\right\rangle_F\right| = \left|\left\langle I, AN\right\rangle_F\,\left\langle AN, B\right\rangle_F\right| \le \frac{1}{2}\left(\|I\|_F\,\|B\|_F + \left|\left\langle I, B\right\rangle_F\right|\right)\|AN\|_F^2 = \frac{1}{2}\left(\sqrt{n}\,\|B\|_F + \left|\operatorname{tr}(B)\right|\right)\operatorname{tr}(AN), \tag{34}$$
and it suffices to divide by tr(AN) = ∥AN∥²_F > 0 (N being nonsingular).

The next lemma provides an upper bound on the arithmetic mean of the squares of the n2 terms in the orthogonal projection AN. By the way, it also provides us with an upper bound on the arithmetic mean of the n diagonal terms in the orthogonal projection AN. These upper bounds (valid for any matrix subspace S) are independent of the optimal approximate inverse N, and thus they are independent of the subspace S and only depend on matrix A.

Lemma 15. Let A ∈ ℝ^{n×n} be nonsingular with tr(A) ≠ 0 and let S be a linear subspace of ℝ^{n×n}. Let N be the solution to the problem (3). Then,

$$\frac{1}{n^2}\sum_{i,j=1}^{n}\left[(AN)_{ij}\right]^2 \le \left(\frac{\|A\|_F}{\operatorname{tr}(A)}\right)^2, \qquad \frac{1}{n}\operatorname{tr}(AN) \le n\left(\frac{\|A\|_F}{\operatorname{tr}(A)}\right)^2. \tag{35}$$

Proof. Using (32) for a = A^T, x = I, and b = AN, the Cauchy–Schwarz inequality for 〈A^T, AN〉_F, and (4), we get

$$\left|\operatorname{tr}(A)\right|\operatorname{tr}(AN) = \left|\left\langle A^T, I\right\rangle_F\,\left\langle I, AN\right\rangle_F\right| \le \frac{n}{2}\left(\|A\|_F\,\|AN\|_F + \left|\left\langle A^T, AN\right\rangle_F\right|\right) \le n\,\|A\|_F\,\|AN\|_F = n\,\|A\|_F\,\sqrt{\operatorname{tr}(AN)}, \tag{36}$$
so that tr(AN) ≤ n²∥A∥²_F/[tr(A)]², and both bounds in (35) follow from (4).
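A numerical check of (33) and (35) (a sketch; the matrices A, B and the diagonal subspace S are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 6
A = rng.standard_normal((n, n)) + n * np.eye(n)
G = np.column_stack([(A @ np.diag(np.eye(n)[i])).ravel() for i in range(n)])
c, *_ = np.linalg.lstsq(G, np.eye(n).ravel(), rcond=None)
AN = A @ np.diag(c)

B = rng.standard_normal((n, n))
lhs = abs(float(np.tensordot(AN, B)))               # |<AN, B>_F|
rhs = 0.5 * (np.sqrt(n) * np.linalg.norm(B) + abs(np.trace(B)))
print(lhs <= rhs)                                   # (33)

ratio = np.linalg.norm(A) / np.trace(A)
print(np.linalg.norm(AN)**2 / n**2 <= ratio**2)     # (35), first bound
print(np.trace(AN) / n <= n * ratio**2)             # (35), second bound
```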

Remark 16. Lemma 15 has the following interpretation in terms of the quality of the optimal approximate inverse N of matrix A in subspace S. The closer the ratio nAF/|tr (A)| is to zero, the smaller tr (AN) will be, and thus, due to (5), the larger ∥ANIF will be, and this happens for any matrix subspace S.

Remark 17. By the way, from Lemma 15, we obtain the following inequality for any nonsingular matrix A ∈ ℝ^{n×n}. Consider any matrix subspace S s.t. A^{-1} ∈ S. Then, N = A^{-1}, tr(AN) = n, and using Lemma 15, we get

$$\left[\operatorname{tr}(A)\right]^2 \le n\,\|A\|_F^2. \tag{37}$$

4. Spectral Properties

In this section, we present some new spectral properties for matrix AN, N being the optimal approximate inverse of matrix A, defined by (3). Mainly, we focus on the case that matrix AN is symmetric and positive definite. This has been motivated by the following reason. When solving a large nonsymmetric linear system (1) by using Krylov methods, a possible strategy consists of searching for an adequate optimal preconditioner N such that the preconditioned matrix AN is symmetric positive definite [5]. This enables one to use the conjugate gradient method (CG-method), which is, in general, a computationally efficient method for solving the new preconditioned system [2, 15].

Our starting point is Lemma 3, which has established that the sets of eigenvalues and singular values of any orthogonal projection AN satisfy
$$\sum_{i=1}^{n} \lambda_i^2 \le \sum_{i=1}^{n} \sigma_i^2 = \operatorname{tr}(AN) = \sum_{i=1}^{n} \lambda_i \le \sum_{i=1}^{n} \sigma_i. \tag{38}$$

Let us particularize (38) for some special cases.

First, note that if AN is normal (i.e., for all 1 ≤ in: σi = |λi| [16]), then (38) becomes
$$\sum_{i=1}^{n} \left|\lambda_i\right|^2 = \operatorname{tr}(AN) = \sum_{i=1}^{n} \lambda_i \le \sum_{i=1}^{n} \left|\lambda_i\right|. \tag{39}$$
In particular, if AN is symmetric (σi = |λi| = ±λi), then (38) becomes
$$\sum_{i=1}^{n} \lambda_i^2 = \operatorname{tr}(AN) = \sum_{i=1}^{n} \lambda_i \le \sum_{i=1}^{n} \left|\lambda_i\right|. \tag{40}$$
In particular, if AN is symmetric and positive definite (σ_i = |λ_i| = λ_i > 0), then the equality holds in all (38), that is,
$$\sum_{i=1}^{n} \lambda_i^2 = \sum_{i=1}^{n} \sigma_i^2 = \operatorname{tr}(AN) = \sum_{i=1}^{n} \lambda_i = \sum_{i=1}^{n} \sigma_i. \tag{41}$$

The next lemma compares the traces of matrices AN and (AN)2.

Lemma 18. Let A ∈ ℝ^{n×n} be nonsingular and let S be a linear subspace of ℝ^{n×n}. Let N be the solution to the problem (3). Then,

  • (i)

    for any orthogonal projection AN

    $$\operatorname{tr}\left((AN)^2\right) \le \operatorname{tr}(AN); \tag{42}$$

  • (ii)

    for any symmetric orthogonal projection AN

    $$\operatorname{tr}\left((AN)^2\right) = \operatorname{tr}(AN); \tag{43}$$

  • (iii)

    for any symmetric positive definite orthogonal projection AN

    $$\operatorname{tr}\left((AN)^2\right) \le \left[\operatorname{tr}(AN)\right]^2. \tag{44}$$

Proof. (i) Using (38), we get

$$\operatorname{tr}\left((AN)^2\right) = \sum_{i=1}^{n} \lambda_i^2 \le \sum_{i=1}^{n} \sigma_i^2 = \operatorname{tr}(AN). \tag{45}$$

(ii) It suffices to use the obvious fact that tr((AN)²) = Σ_{i=1}^{n} λ_i² and the following equalities taken from (40):

$$\sum_{i=1}^{n} \lambda_i^2 = \operatorname{tr}(AN) = \sum_{i=1}^{n} \lambda_i. \tag{46}$$

(iii) It suffices to use (43) and the fact that tr(PQ) ≤ tr(P) tr(Q) for symmetric positive definite matrices P and Q (see, e.g., [17, 18]), applied to P = Q = AN.
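A numerical check of Lemma 18 (a sketch; the matrix A and the subspaces are arbitrary choices; taking S = span{A^T} is one convenient way to force a symmetric positive definite projection, since then, by formula (6), AN is a positive multiple of AA^T):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 6
A = rng.standard_normal((n, n)) + n * np.eye(n)

# (i) A generic (nonsymmetric) projection: S = diagonal matrices.
G = np.column_stack([(A @ np.diag(np.eye(n)[i])).ravel() for i in range(n)])
c, *_ = np.linalg.lstsq(G, np.eye(n).ravel(), rcond=None)
AN = A @ np.diag(c)
print(np.trace(AN @ AN) <= np.trace(AN))                  # (42)

# (ii)-(iii) A symmetric positive definite projection: S = span{A^T},
# so that, by formula (6), AN = tr(AA^T)/||AA^T||_F^2 * AA^T.
AAT = A @ A.T
AN = (np.trace(AAT) / np.linalg.norm(AAT)**2) * AAT
print(np.isclose(np.trace(AN @ AN), np.trace(AN)))        # (43)
print(np.trace(AN @ AN) <= np.trace(AN)**2)               # (44)
```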

The rest of the paper is devoted to obtaining new properties about the eigenvalues of the orthogonal projection AN for the special case that this matrix is symmetric positive definite.

First, let us recall that the smallest singular value and the smallest eigenvalue’s modulus of the orthogonal projection AN are never greater than 1 (see Lemma 4). The following theorem establishes the dual result for the largest eigenvalue of matrix AN (symmetric positive definite).

Theorem 19. Let A ∈ ℝ^{n×n} be nonsingular and let S be a linear subspace of ℝ^{n×n}. Let N be the solution to the problem (3). Suppose that matrix AN is symmetric and positive definite. Then, the largest eigenvalue of the orthogonal projection AN of the identity onto the subspace AS is never less than 1. That is,

$$\lambda_1 \ge 1. \tag{47}$$

Proof. Using (41), we get

$$\sum_{i=1}^{n} \lambda_i^2 = \sum_{i=1}^{n} \lambda_i \;\Longrightarrow\; \sum_{i=1}^{n} \lambda_i\left(1 - \lambda_i\right) = 0 \;\Longrightarrow\; -\lambda_n\left(1 - \lambda_n\right) = \sum_{i=1}^{n-1} \lambda_i\left(1 - \lambda_i\right). \tag{48}$$
Now, since 0 < λ_n ≤ 1 (Lemma 4), then −λ_n(1 − λ_n) ≤ 0. This implies that at least one summand in the rightmost sum in (48) must be less than or equal to zero. Suppose that such a summand is the kth one (1 ≤ k ≤ n − 1). Since AN is positive definite, then λ_k > 0, and thus
$$\lambda_k\left(1 - \lambda_k\right) \le 0 \;\Longrightarrow\; 1 - \lambda_k \le 0 \;\Longrightarrow\; \lambda_1 \ge \lambda_k \ge 1, \tag{49}$$
and the proof is concluded.

In Theorem 19, the assumption that matrix AN is positive definite is essential for assuring that |λ₁| ≥ 1, as the following simple counterexample shows. Moreover, from Lemma 4 and Theorem 19, we have that the smallest and largest eigenvalues of AN (symmetric positive definite) satisfy λ_n ≤ 1 and λ₁ ≥ 1, respectively. Nothing can be asserted about the remaining eigenvalues of the symmetric positive definite matrix AN, which can be greater than, equal to, or less than the unity, as the same counterexample also shows.

Example 20. For n = 3, let

$$A_k = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & k \end{pmatrix} \qquad (k \in \mathbb{R},\ k \ne 0), \tag{50}$$
let I₃ be the identity matrix of order 3, and let S be the subspace of all 3 × 3 scalar matrices; that is, S = span{I₃}. Then the solution N_k to the problem (3) for subspace S can be immediately obtained by using formula (6) as follows:
$$N_k = \frac{\operatorname{tr}(A_k)}{\|A_k\|_F^2}\, I_3 = \frac{4 + k}{10 + k^2}\, I_3, \tag{51}$$
and then we get
$$A_k N_k = \frac{4 + k}{10 + k^2} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & k \end{pmatrix}. \tag{52}$$

Let us arrange the eigenvalues and singular values of matrix AkNk, as usual, in nonincreasing order (as shown in (11)).

On one hand, for k = −2, we have
$$A_{-2} N_{-2} = \frac{1}{7} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & -2 \end{pmatrix}, \tag{53}$$
and then
$$\left\{\lambda_i\right\}_{i=1}^{3} = \left\{\frac{3}{7},\, -\frac{2}{7},\, \frac{1}{7}\right\}, \qquad \left\{\sigma_i\right\}_{i=1}^{3} = \left\{\frac{3}{7},\, \frac{2}{7},\, \frac{1}{7}\right\}. \tag{54}$$
Hence, A−2N−2 is indefinite and σ1 = |λ1| = 3/7 < 1.
On the other hand, for 1 < k < 3, we have (see matrix (52))
$$\left\{\lambda_i\right\}_{i=1}^{3} = \left\{\sigma_i\right\}_{i=1}^{3} = \left\{\frac{3(4+k)}{10+k^2},\; \frac{k(4+k)}{10+k^2},\; \frac{4+k}{10+k^2}\right\}, \tag{55}$$
and then
$$\lambda_2 = \frac{k(4+k)}{10+k^2} \;\;\begin{cases} < 1 & \text{for } 1 < k < \frac{5}{2}, \\ = 1 & \text{for } k = \frac{5}{2}, \\ > 1 & \text{for } \frac{5}{2} < k < 3. \end{cases} \tag{56}$$
Hence, for AkNk positive definite, we have (depending on k) λ2 < 1,  λ2 = 1, or λ2 > 1.
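The example (with matrix (50) as reconstructed above) is easy to reproduce numerically:

```python
import numpy as np

def example_eigenvalues(k):
    """Eigenvalues of A_k N_k for A_k = diag(1, 3, k), S = span{I_3}."""
    A = np.diag([1.0, 3.0, k])
    N = (np.trace(A) / np.linalg.norm(A)**2) * np.eye(3)   # formula (51)
    return np.sort(np.linalg.eigvalsh(A @ N))[::-1]        # nonincreasing

print(example_eigenvalues(-2.0))    # [3/7, 1/7, -2/7]: indefinite, sigma_1 = 3/7 < 1
print(example_eigenvalues(2.0)[1])  # lambda_2 = 6/7 < 1
print(example_eigenvalues(2.5)[1])  # lambda_2 = 1
print(example_eigenvalues(2.8)[1])  # lambda_2 > 1
```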

The following corollary improves the lower bound zero on both tr (AN), given in (4), and cos (AN, I), given in (17).

Corollary 21. Let A ∈ ℝ^{n×n} be nonsingular and let S be a linear subspace of ℝ^{n×n}. Let N be the solution to the problem (3). Suppose that matrix AN is symmetric and positive definite. Then,

$$\operatorname{tr}(AN) \ge 1, \tag{57}$$
$$\cos(AN, I) \ge \frac{1}{\sqrt{n}}. \tag{58}$$

Proof. Denote by ∥·∥2 the spectral norm. Using the well-known inequality ∥·∥2 ≤ ∥·∥F [19], Theorem 19, and (4), we get

$$\operatorname{tr}(AN) = \|AN\|_F^2 \ge \|AN\|_2^2 = \lambda_1^2 \ge 1. \tag{59}$$
Finally, (58) follows immediately from (57) and (25).

Let us mention that an upper bound on the moduli of all eigenvalues and on all singular values of any orthogonal projection AN can be immediately obtained from (38) and (4) as follows:
$$\sigma_i,\ \left|\lambda_i\right| \;\le\; \sigma_1 \;\le\; \|AN\|_F = \sqrt{\operatorname{tr}(AN)} \;\le\; \sqrt{n}, \qquad i = 1, 2, \dots, n. \tag{60}$$

Our last theorem improves the upper bound given in (60) for the special case that the orthogonal projection AN is symmetric positive definite.

Theorem 22. Let A ∈ ℝ^{n×n} be nonsingular and let S be a linear subspace of ℝ^{n×n}. Let N be the solution to the problem (3). Suppose that matrix AN is symmetric and positive definite. Then, all the eigenvalues of matrix AN satisfy

$$\lambda_i \le \frac{1 + \sqrt{n}}{2}, \qquad i = 1, 2, \dots, n. \tag{61}$$

Proof. First, note that the assertion is obvious for the smallest eigenvalue, since λ_n ≤ 1 ≤ (1 + √n)/2 for any orthogonal projection AN (Lemma 4). For the remaining eigenvalues of AN, we use the fact that x − x² ≤ 1/4 for all x > 0. Then, from (41), we get

$$\lambda_i^2 = \sum_{j=1}^{n} \lambda_j - \sum_{j \ne i} \lambda_j^2 = \lambda_i + \sum_{j \ne i} \left(\lambda_j - \lambda_j^2\right) \le \lambda_i + \frac{n-1}{4} \;\Longrightarrow\; \lambda_i^2 - \lambda_i - \frac{n-1}{4} \le 0 \;\Longrightarrow\; \lambda_i \le \frac{1 + \sqrt{n}}{2}. \tag{62}$$
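The spectral bounds of this section can be checked together on a symmetric positive definite projection (a sketch; as in the second example of Section 2 with P₁ = I, taking S = span{A^T} guarantees that AN is a positive multiple of AA^T, hence symmetric positive definite):

```python
import numpy as np

rng = np.random.default_rng(8)
n = 8
A = rng.standard_normal((n, n)) + n * np.eye(n)

# S = span{A^T}: by formula (6), AN = tr(AA^T)/||AA^T||_F^2 * AA^T.
AAT = A @ A.T
AN = (np.trace(AAT) / np.linalg.norm(AAT)**2) * AAT
lam = np.linalg.eigvalsh(AN)             # ascending order

print(lam[0] <= 1.0)                     # Lemma 4: smallest eigenvalue
print(lam[-1] >= 1.0)                    # Theorem 19: largest eigenvalue
print(lam[-1] <= (1 + np.sqrt(n)) / 2)   # Theorem 22
print(lam[-1] <= np.sqrt(np.trace(AN)))  # the cruder bound (60)
```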

5. Conclusion

In this paper, we have considered the orthogonal projection AN (in the Frobenius sense) of the identity matrix onto an arbitrary matrix subspace AS (A ∈ ℝ^{n×n} nonsingular, S ⊆ ℝ^{n×n}). Among other geometrical properties of matrix AN, we have established a strong relation between the quality of the approximation AN ≈ I and the cosine of the angle ∠(AN, I). Also, the distance between AN and the identity has been related to the ratio n∥A∥_F/|tr(A)| (which is independent of the subspace S). The spectral analysis has provided lower and upper bounds on the largest eigenvalue of the symmetric positive definite orthogonal projections of the identity.

Acknowledgments

The authors are grateful to the anonymous referee for valuable comments and suggestions, which have improved the earlier version of this paper. This work was partially supported by the “Ministerio de Economía y Competitividad” (Spanish Government), and FEDER, through Grant Contract CGL2011-29396-C03-01.
