On Nonnegative Moore-Penrose Inverses of Perturbed Matrices
Abstract
Nonnegativity of the Moore-Penrose inverse of a perturbation of the form A − XGYT is considered when A† ≥ 0. Using a generalized version of the Sherman-Morrison-Woodbury formula, conditions for (A − XGYT)† to be nonnegative are derived. Applications of the results are presented briefly. Iterative versions of the results are also studied.
1. Introduction
The main objective of the present work is to study certain structured perturbations A − XYT of matrices A such that the Moore-Penrose inverse of the perturbation is nonnegative whenever the Moore-Penrose inverse of A is nonnegative. Clearly, this class of matrices includes the class of matrices that have nonnegative inverses, in particular M-matrices. In our approach, extensions of the Sherman-Morrison-Woodbury (SMW) formula to singular matrices play a crucial role. Let us mention that this problem has been studied in the literature (see, for instance, [2] for matrices and [3] for operators over Hilbert spaces). We refer the reader to the references in the latter for other recent extensions.
In this paper, we first present alternative proofs of generalizations of the SMW formula for the cases of the Moore-Penrose inverse (Theorem 5) and the group inverse (Theorem 6) in Section 3. In Section 4, we characterize the nonnegativity of (A − XGYT)†. This is done in Theorem 9, which is one of the main results of the present work. As a consequence, we present a result for M-matrices which seems new. We present a couple of applications of the main result in Theorems 13 and 15. In the concluding section, we study iterative versions of the results of the earlier sections. We prove two characterizations for (A − XYT)† to be nonnegative in Theorems 18 and 21.
Before concluding this introductory section, let us give a motivation for the work that we have undertaken here. It is a well-documented fact that M-matrices arise quite often in solving sparse systems of linear equations. An extensive theory of M-matrices has been developed relative to their role in numerical analysis involving the notion of splitting in iterative methods and discretization of differential equations, in the mathematical modeling of an economy, optimization, and Markov chains [4, 5]. Specifically, the inspiration for the present study comes from the work of [1], where the authors consider a system of linear inequalities arising out of a problem in third generation wireless communication systems. The matrix defining the inequalities there is an M-matrix. If the matrix of this problem happens to be singular (due, for instance, to truncation or round-off errors), the earlier method becomes inapplicable. Our endeavour is to extend the applicability of these results to more general matrices, for instance, matrices with nonnegative Moore-Penrose inverses. Finally, since matrices with nonnegative generalized inverses include M-matrices in particular, our results are expected to enlarge the applicability of the methods presently available for M-matrices, including for the specific problem mentioned above.
2. Preliminaries
Let ℝ, ℝn, and ℝm×n denote the set of all real numbers, the n-dimensional real Euclidean space, and the set of all m × n matrices over ℝ, respectively. For A ∈ ℝm×n, let R(A), N(A), R(A)⊥, and AT denote the range space, the null space, the orthogonal complement of the range space, and the transpose of A, respectively. For x = (x1, x2, …, xn)T ∈ ℝn, we say that x is nonnegative, and write x ≥ 0, if and only if xi ≥ 0 for all i = 1,2, …, n. For a matrix B, we use B ≥ 0 to denote that all the entries of B are nonnegative. Also, we write A ≤ B if B − A ≥ 0.
Let ρ(A) denote the spectral radius of the matrix A. If ρ(A) < 1 for A ∈ ℝn×n, then I − A is invertible. The next result gives a necessary and sufficient condition for the nonnegativity of (I − A) −1. This will be one of the results that will be used in proving the first main result.
Lemma 1 (see [5], Lemma 2.1, Chapter 6). Let A ∈ ℝn×n be nonnegative. Then, ρ(A) < 1 if and only if (I − A)−1 exists and (I − A)−1 = I + A + A2 + ⋯ ≥ 0.
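As a quick numerical illustration of Lemma 1 (ours, not part of the original development; the nonnegative matrix A below is an arbitrary choice with spectral radius 1/2), one may compare a truncated Neumann series with the exact inverse:

```python
import numpy as np

# Arbitrary nonnegative matrix with spectral radius 0.5 (an illustrative choice).
A = np.array([[0.2, 0.3],
              [0.1, 0.4]])

rho = max(abs(np.linalg.eigvals(A)))               # spectral radius of A
inv = np.linalg.inv(np.eye(2) - A)                 # (I - A)^{-1}

# Truncated Neumann series I + A + A^2 + ... approximating (I - A)^{-1}.
series = sum(np.linalg.matrix_power(A, k) for k in range(200))

print(rho < 1)                   # True
print(np.allclose(inv, series))  # True: the series converges to (I - A)^{-1}
print((inv >= 0).all())          # True: (I - A)^{-1} is nonnegative
```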
More generally, matrices having nonnegative inverses are characterized using a property called monotonicity. The notion of monotonicity was introduced by Collatz [6]. A real n × n matrix A is called monotone if Ax ≥ 0⇒x ≥ 0. It was proved by Collatz [6] that A is monotone if and only if A−1 exists and A−1 ≥ 0.
One of the frequently used tools in studying monotone matrices is the notion of a regular splitting. We refer the reader to the book [5] for more details on the relationship between these concepts.
The notion of monotonicity has been extended in a variety of ways to singular matrices using generalized inverses. First, let us briefly review two important generalized inverses.
For A ∈ ℝm×n, the Moore-Penrose inverse is the unique Z ∈ ℝn×m satisfying the Penrose equations: AZA = A, ZAZ = Z, (AZ)T = AZ, and (ZA)T = ZA. The Moore-Penrose inverse of A is denoted by A†; it coincides with A−1 when A is invertible.
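As a side note (our illustration, not part of the paper), the four Penrose equations are easy to verify numerically: numpy's `pinv` computes A† via the singular value decomposition.

```python
import numpy as np

A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])      # an arbitrary 2 x 3 matrix
Z = np.linalg.pinv(A)                # Moore-Penrose inverse A†

assert np.allclose(A @ Z @ A, A)         # AZA = A
assert np.allclose(Z @ A @ Z, Z)         # ZAZ = Z
assert np.allclose((A @ Z).T, A @ Z)     # (AZ)^T = AZ
assert np.allclose((Z @ A).T, Z @ A)     # (ZA)^T = ZA
```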
The following theorem by Desoer and Whalen, which is used in the sequel, gives an equivalent definition for the Moore-Penrose inverse. Let us mention that this result was proved for operators between Hilbert spaces.
Theorem 2 (see [7]). Let A ∈ ℝm×n. Then, A† ∈ ℝn×m is the unique matrix Z ∈ ℝn×m satisfying
- (i)
ZAx = x, ∀x ∈ R(AT),
- (ii)
Zy = 0, ∀y ∈ N(AT).
Now, for A ∈ ℝn×n, any Z ∈ ℝn×n satisfying the equations AZA = A, ZAZ = Z, and AZ = ZA is called the group inverse of A, denoted by A#. The group inverse does not exist for every matrix, but whenever it exists, it is unique. A necessary and sufficient condition for the existence of the group inverse of A is that the index of A is 1, where the index of a matrix is the smallest positive integer k such that rank(Ak+1) = rank(Ak).
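The index condition is straightforward to test numerically. The following sketch (ours; the helper `index_of` is a hypothetical utility, not from the paper) computes the index by comparing ranks of successive powers:

```python
import numpy as np

def index_of(A, tol=1e-10):
    """Smallest positive integer k with rank(A^(k+1)) == rank(A^k)."""
    k, P = 1, A.copy()
    while np.linalg.matrix_rank(P @ A, tol=tol) != np.linalg.matrix_rank(P, tol=tol):
        k, P = k + 1, P @ A
    return k

# A nonzero nilpotent matrix has index 2, so its group inverse does not exist.
N = np.array([[0.0, 1.0],
              [0.0, 0.0]])
print(index_of(N))   # 2

# A projector has index 1, so its group inverse exists (and equals itself).
P = np.array([[1.0, 0.0],
              [0.0, 0.0]])
print(index_of(P))   # 1
```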
Some of the well-known properties of the Moore-Penrose inverse and the group inverse are given as follows: R(AT) = R(A†), N(AT) = N(A†), A†A = PR(AT), and AA† = PR(A). In particular, x ∈ R(AT) if and only if x = A†Ax. Also, R(A) = R(A#), N(A) = N(A#), and A#A = AA# = PR(A),N(A). Here, for complementary subspaces L and M of ℝk, PL,M denotes the projection of ℝk onto L along M; PL denotes PL,M when M = L⊥. For details, we refer the reader to the book [8].
In matrix analysis, a decomposition (splitting) of a matrix is considered in order to study the convergence of iterative schemes that are used in the solution of linear systems of algebraic equations. As mentioned earlier, regular splittings are useful in characterizing matrices with nonnegative inverses, whereas proper splittings are used for studying singular systems of linear equations. Let us next recall this notion. For a matrix A ∈ ℝm×n, a decomposition A = U − V is called a proper splitting [9] if R(A) = R(U) and N(A) = N(U). It is rather well-known that a proper splitting exists for every matrix and that it can be obtained using a full-rank factorization of the matrix. For details, we refer to [10]. Certain properties of a proper splitting are collected in the next result.
Theorem 3 (see [9], Theorem 1). Let A = U − V be a proper splitting of A ∈ ℝm×n. Then,
- (a)
A = U(I − U†V),
- (b)
I − U†V is nonsingular, and
- (c)
A† = (I − U†V)−1U†.
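These identities can be checked numerically. In the sketch below (our illustration, not from the paper), we use the trivial proper splitting U = 2A, V = A, which clearly satisfies R(U) = R(A) and N(U) = N(A); part (c) then reproduces numpy's pinv:

```python
import numpy as np

# A rank-one (hence singular) matrix and a trivial proper splitting A = U - V.
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])
U, V = 2 * A, A                       # R(U) = R(A), N(U) = N(A)

Ud = np.linalg.pinv(U)                # U†
I = np.eye(2)

assert np.allclose(A, U @ (I - Ud @ V))                  # (a) A = U(I - U†V)
assert abs(np.linalg.det(I - Ud @ V)) > 1e-12            # (b) I - U†V is nonsingular
assert np.allclose(np.linalg.pinv(A),
                   np.linalg.inv(I - Ud @ V) @ Ud)       # (c) A† = (I - U†V)^{-1} U†
```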
The following result by Berman and Plemmons [9] gives a characterization for A† to be nonnegative when A has a proper splitting. This result will be used in proving our first main result.
Theorem 4 (see [9], Corollary 4). Let A = U − V be a proper splitting of A ∈ ℝm×n, where U† ≥ 0 and U†V ≥ 0. Then, A† ≥ 0 if and only if ρ(U†V) < 1.
3. Extensions of the SMW Formula for Generalized Inverses
The primary objects of consideration in this paper are generalized inverses of perturbations of certain types of a matrix A. Naturally, extensions of the SMW formula for generalized inverses are relevant in the proofs. In what follows, we present two generalizations of the SMW formula for matrices. We would like to emphasize that our proofs also carry over to infinite dimensional spaces (the proof of the first result carries over verbatim; the proof of the second carries over with slight modifications, using range spaces instead of ranks of the operators concerned). However, we confine our attention to the case of matrices. Let us also add that these results have been proved in [3] for operators over infinite dimensional spaces. We have chosen to include them here since our proofs are different from the ones in [3] and since our intention is to provide a self-contained treatment.
Theorem 5 (see [3], Theorem 2.1). Let A ∈ ℝm×n, X ∈ ℝm×k, and Y ∈ ℝn×k be such that R(X)⊆R(A) and R(Y)⊆R(AT) (3). Let G ∈ ℝk×k, let H = G† − YTA†X, and suppose that R(XT)⊆R(HT), R(YT)⊆R(H), R(XT)⊆R(G), and R(YT)⊆R(GT) (4). Let Ω = A − XGYT. Then, Ω† = A† + A†XH†YTA†.
Proof. Set Z = A† + A†XH†YTA†, where H = G† − YTA†X. From the conditions (3) and (4), it follows that AA†X = X, A†AY = Y, H†HXT = XT, HH†YT = YT, GG†XT = XT, and G†GYT = YT.
Now, a direct computation using these relations shows that ZΩx = x for all x ∈ R(ΩT). Let y ∈ N(ΩT). Then, ATy − YGTXTy = 0, so that ATy = YGTXTy. Substituting XT = GG†XT and simplifying, we get HTGTXTy = 0. Also, ATy = YGTXTy = YHH†GTXTy = 0, and so A†y = 0. Thus, Zy = 0 for y ∈ N(ΩT). Hence, by Theorem 2, Z = Ω†.
An analogous result holds for the group inverse.
Theorem 6. Let A ∈ ℝn×n be such that A# exists. Let X, Y ∈ ℝn×k, and let G ∈ ℝk×k be nonsingular. Assume that R(X)⊆R(A), R(Y)⊆R(AT), and G−1 − YTA#X is nonsingular. Then, (A − XGYT)# exists if and only if rank(A − XGYT) = rank(A), in which case the following formula holds: (A − XGYT)# = A# + A#X(G−1 − YTA#X)−1YTA#.
Proof. Since A# exists, R(A) and N(A) are complementary subspaces of ℝn.
Suppose that rank(A − XGYT) = rank(A). As R(X)⊆R(A), it follows that R(A − XGYT)⊆R(A), and so R(A − XGYT) = R(A). By the rank-nullity theorem, A − XGYT and A have the same nullity. Again, since R(Y)⊆R(AT), we have N(A)⊆N(A − XGYT), and hence N(A − XGYT) = N(A). Thus, R(A − XGYT) and N(A − XGYT) are complementary subspaces. This guarantees the existence of the group inverse of A − XGYT.
Conversely, suppose that (A − XGYT)# exists. It can be verified by direct computation that Z = A# + A#X(G−1 − YTA#X)−1YTA# is the group inverse of A − XGYT. Also, we have (A − XGYT)(A − XGYT)# = AA#, so that R(A − XGYT) = R(A), and hence rank(A − XGYT) = rank(A).
We conclude this section with a fairly old result [2], obtained here as a consequence of Theorem 5.
Theorem 7 (see [2], Theorem 15). Let A ∈ ℝm×n be of rank r, X ∈ ℝm×r, and Y ∈ ℝn×r. Let G be an r × r nonsingular matrix. Assume that R(X)⊆R(A), R(Y)⊆R(AT), and G−1 − YTA†X is nonsingular. Let Ω = A − XGYT. Then, Ω† = A† + A†X(G−1 − YTA†X)−1YTA†.
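A quick numerical sanity check of this formula (ours, with randomly generated data; the factors X and Y are built by multiplying A and AT by random matrices so that the range inclusions hold):

```python
import numpy as np

rng = np.random.default_rng(0)

# A random matrix of rank 2 in R^{4x3}.
A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 3))
X = A @ rng.standard_normal((3, 2))        # R(X) ⊆ R(A)
Y = A.T @ rng.standard_normal((4, 2))      # R(Y) ⊆ R(A^T)
G = np.eye(2)                              # a nonsingular G

Ad = np.linalg.pinv(A)
M = np.linalg.inv(np.linalg.inv(G) - Y.T @ Ad @ X)   # (G^{-1} - Y^T A† X)^{-1}

Omega = A - X @ G @ Y.T
print(np.allclose(np.linalg.pinv(Omega),
                  Ad + Ad @ X @ M @ Y.T @ Ad))       # True, up to roundoff
```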
4. Nonnegativity of (A − XGYT)†
In this section, we consider perturbations of the form A − XGYT and derive characterizations for (A − XGYT) † to be nonnegative when A† ≥ 0, X ≥ 0, Y ≥ 0 and G ≥ 0. In order to motivate the first main result of this paper, let us recall the following well known characterization of M-matrices [5].
Theorem 8. Let A be a Z-matrix with the representation A = sI − B, where B ≥ 0 and s ≥ 0. Then the following statements are equivalent:
- (a)
A−1 exists and A−1 ≥ 0.
- (b)
There exists x > 0 such that Ax > 0.
- (c)
ρ(B) < s.
Let us now prove the first main result of this article, which extends Theorem 8 to singular matrices. We will be interested in extensions of conditions (a) and (c) only.
Theorem 9. Let A = U − V be a proper splitting of A ∈ ℝm×n with U† ≥ 0, U†V ≥ 0, and ρ(U†V) < 1. Let G ∈ ℝk×k be nonsingular and nonnegative, and let X ∈ ℝm×k and Y ∈ ℝn×k be nonnegative such that R(X)⊆R(A), R(Y)⊆R(AT), and G−1 − YTA†X is nonsingular. Let Ω = A − XGYT. Then, the following statements are equivalent:
- (a)
Ω† ≥ 0.
- (b)
(G−1 − YTA†X) −1 ≥ 0.
- (c)
ρ(U†W) < 1, where W = V + XGYT.
Proof. First, we observe that since R(X)⊆R(A), R(Y)⊆R(AT), and G−1 − YTA†X is nonsingular, by Theorem 7, we have Ω† = A† + A†X(G−1 − YTA†X)−1YTA† (9). Consequently, R(Ω) = R(A) and N(Ω) = N(A) (10).
(a)⇒(b): By taking E = A and F = XGYT, we get Ω = E − F as a proper splitting for Ω (by (10)) such that E† = A† ≥ 0 and E†F = A†XGYT ≥ 0 (since XGYT ≥ 0). Since Ω† ≥ 0, by Theorem 4, we have ρ(A†XGYT) = ρ(E†F) < 1. This implies that ρ(YTA†XG) < 1. We also have YTA†XG ≥ 0. Thus, by Lemma 1, (I − YTA†XG)−1 exists and is nonnegative. But we have I − YTA†XG = (G−1 − YTA†X)G, so that 0 ≤ (I − YTA†XG)−1 = G−1(G−1 − YTA†X)−1. This implies that (G−1 − YTA†X)−1 ≥ 0, since G ≥ 0. This proves (b).
(b)⇒(c): We have U − W = U − V − XGYT = A − XGYT = Ω. Also, R(Ω) = R(A) = R(U) and N(Ω) = N(A) = N(U). So, Ω = U − W is a proper splitting. Also, U† ≥ 0 and U†W = U†(V + XGYT) ≥ U†V ≥ 0. Since (G−1 − YTA†X)−1 ≥ 0, it follows from (9) that Ω† ≥ 0. (c) now follows from Theorem 4.
(c)⇒(a): Since A = U − V, we have Ω = U − V − XGYT = U − W. Also we have U† ≥ 0, U†V ≥ 0. Thus, U†W ≥ 0, since XGYT ≥ 0. Now, by Theorem 4, we are done if the splitting Ω = U − W is a proper splitting. Since A = U − V is a proper splitting, we have R(U) = R(A) and N(U) = N(A). Now, from the conditions in (10), we get that R(U) = R(Ω) and N(U) = N(Ω). Hence Ω = U − W is a proper splitting, and this completes the proof.
The following result is a special case of Theorem 9.
Theorem 10. Let A = U − V be a proper splitting of A ∈ ℝm×n with U† ≥ 0, U†V ≥ 0 and ρ(U†V) < 1. Let X ∈ ℝm×k and Y ∈ ℝn×k be nonnegative such that R(X)⊆R(A), R(Y)⊆R(AT), and I − YTA†X is nonsingular. Let Ω = A − XYT. Then the following are equivalent:
- (a)
Ω† ≥ 0.
- (b)
(I − YTA†X) −1 ≥ 0.
- (c)
ρ(U†W) < 1, where W = V + XYT.
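The three equivalent statements of Theorem 10 can be probed numerically. A minimal sketch (our illustration; A is a nonnegative rank-one matrix, U = 2A, V = A is the trivial proper splitting used earlier, and the scale t = 0.05 is an arbitrary choice keeping YTA†X small):

```python
import numpy as np

v = np.array([1.0, 2.0])
A = np.outer(v, v)                    # A = v v^T, so A† = A / ||v||^4 >= 0
Ad = np.linalg.pinv(A)

U, V = 2 * A, A                       # proper splitting with U† >= 0, U†V >= 0
Ud = np.linalg.pinv(U)

t = 0.05                              # arbitrary small scale
X = (t * A @ np.ones(2)).reshape(-1, 1)    # R(X) ⊆ R(A), X >= 0
Y = (t * A.T @ np.ones(2)).reshape(-1, 1)  # R(Y) ⊆ R(A^T), Y >= 0

Omega = A - X @ Y.T
S = np.linalg.inv(np.eye(1) - Y.T @ Ad @ X)      # (b): (I - Y^T A† X)^{-1}
W = V + X @ Y.T
rho = max(abs(np.linalg.eigvals(Ud @ W)))        # (c): spectral radius of U†W

print((np.linalg.pinv(Omega) >= -1e-12).all())   # (a): Ω† >= 0
print((S >= 0).all())                            # (b) holds
print(rho < 1)                                   # (c) holds
```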
The following consequence of Theorem 10 appears to be new. This gives two characterizations for a perturbed M-matrix to be an M-matrix.
Corollary 11. Let A = sI − B ∈ ℝn×n, where B ≥ 0 and ρ(B) < s (i.e., A is a nonsingular M-matrix). Let X, Y ∈ ℝn×k be nonnegative such that I − YTA†X is nonsingular. Let Ω = A − XYT. Then the following are equivalent:
- (a)
Ω−1 ≥ 0.
- (b)
(I − YTA†X) −1 ≥ 0.
- (c)
ρ(B + XYT) < s.
Proof. Since A is nonsingular, we have R(Ω) = R(A) = ℝn and N(Ω) = N(A) = {0}, so that Ω is nonsingular and Ω† = Ω−1. The rest of the proof is omitted, as it is an easy consequence of Theorem 10 with the proper splitting U = sI, V = B.
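In code, Corollary 11 can be illustrated as follows (our sketch; s, B, x, and y are arbitrary choices making A an M-matrix and the perturbation small):

```python
import numpy as np

s = 2.0
B = np.array([[0.0, 1.0],
              [1.0, 0.0]])        # B >= 0 with rho(B) = 1 < s
A = s * np.eye(2) - B             # a nonsingular M-matrix, A^{-1} >= 0

x = np.array([[0.1], [0.2]])      # nonnegative rank-one perturbation
y = np.array([[0.2], [0.1]])
Omega = A - x @ y.T

inv_omega = np.linalg.inv(Omega)                                 # (a)
cond_b = np.linalg.inv(np.eye(1) - y.T @ np.linalg.inv(A) @ x)   # (b)
rho = max(abs(np.linalg.eigvals(B + x @ y.T)))                   # (c)

print((inv_omega >= 0).all(), (cond_b >= 0).all(), rho < s)      # True True True
```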
In the rest of this section, we discuss two applications of Theorem 10. First, we characterize the least element in a polyhedral set defined by a perturbed matrix. Next, we consider the following question: suppose that the “endpoints” of an interval matrix satisfy a certain positivity property; then all matrices in a particular subset of the interval also satisfy that positivity property. The problem is to determine what conditions on a given structured perturbation of these endpoints guarantee that the positivity property for the corresponding subset remains valid.
The first result is motivated by Theorem 12 below. Let us recall that, with respect to the usual order, an element x* ∈ 𝒳⊆ℝn is called a least element of 𝒳 if it satisfies x* ≤ x for all x ∈ 𝒳. Note that a nonempty set need not have a least element; if a least element exists, then it is unique. In this connection, the following result is known.
Theorem 12 (see [11], Theorem 3.2). For A ∈ ℝm×n and b ∈ ℝm, let
Now, we obtain the nonnegative least element of a polyhedral set defined by a perturbed matrix. This is an immediate application of Theorem 10.
Theorem 13. Let A ∈ ℝm×n be such that A† ≥ 0. Let X ∈ ℝm×k and Y ∈ ℝn×k be nonnegative such that R(X)⊆R(A), R(Y)⊆R(AT), and I − YTA†X is nonsingular. Suppose that (I − YTA†X)−1 ≥ 0. For b ∈ ℝm, b ≥ 0, let
Proof. From the assumptions, using Theorem 10, it follows that Ω† ≥ 0. The conclusion now follows from Theorem 12.
To state and prove the result for interval matrices, let us first recall the notion of an interval matrix. For A, B ∈ ℝm×n, an interval (matrix) J = [A, B] is defined as J = [A, B] = {C : A ≤ C ≤ B}. The interval J = [A, B] is said to be range-kernel regular if R(A) = R(B) and N(A) = N(B). The following result [12] provides necessary and sufficient conditions for C† ≥ 0 for C ∈ K, where K = {C ∈ J : R(C) = R(A) = R(B), N(C) = N(A) = N(B)}.
Theorem 14. Let J = [A, B] be range-kernel regular. Then, the following are equivalent:
- (a)
C† ≥ 0 whenever C ∈ K,
- (b)
A† ≥ 0 and B† ≥ 0.
In such a case, we have B† ≤ C† ≤ A†.
Now, we present a corresponding result for perturbations of the endpoints of such an interval.
Theorem 15. Let J = [A, B] be range-kernel regular with A† ≥ 0 and B† ≥ 0. Let X1, X2, Y1, and Y2 be nonnegative matrices such that R(X1)⊆R(A), R(Y1)⊆R(AT), R(X2)⊆R(B), and R(Y2)⊆R(BT). Suppose that I − Y1TA†X1 and I − Y2TB†X2 are nonsingular with nonnegative inverses. Suppose further that X2Y2T ≤ X1Y1T. Let J̃ = [E, F], where E = A − X1Y1T and F = B − X2Y2T. Finally, let K̃ = {Z ∈ J̃ : R(Z) = R(E) = R(F), N(Z) = N(E) = N(F)}. Then,
- (a)
E† ≥ 0 and F† ≥ 0.
- (b)
Z† ≥ 0 whenever Z ∈ K̃.
In that case, F† ≤ Z† ≤ E†.
Proof. It follows from Theorem 7 that E† = A† + A†X1(I − Y1TA†X1)−1Y1TA† and F† = B† + B†X2(I − Y2TB†X2)−1Y2TB†, so that E† ≥ 0 and F† ≥ 0, proving (a). Also, R(E) = R(A), N(E) = N(A), R(F) = R(B), and N(F) = N(B), so that J̃ is range-kernel regular; since X2Y2T ≤ X1Y1T, we have E ≤ F. Statement (b) and the inequalities F† ≤ Z† ≤ E† now follow by applying Theorem 14 to the interval J̃.
5. Iterations That Preserve Nonnegativity of the Moore-Penrose Inverse
In this section, we present results that typically provide conditions for iteratively defined matrices to have nonnegative Moore-Penrose inverses given that the matrices that we start with have this property. We start with the following result about the rank-one perturbation case, which is a direct consequence of Theorem 10.
Theorem 16. Let A ∈ ℝm×n be such that A† ≥ 0. Let x ∈ ℝm and y ∈ ℝn be nonnegative vectors such that x ∈ R(A), y ∈ R(AT), and 1 − yTA†x ≠ 0. Then, (A − xyT)† ≥ 0 if and only if yTA†x < 1.
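Theorem 16 is easy to experiment with. In the sketch below (ours; A is an arbitrary nonnegative rank-one matrix, and the scalar t sweeps the perturbation through the threshold yTA†x = 1):

```python
import numpy as np

v = np.array([1.0, 2.0])
A = np.outer(v, v)               # A† = A / 25 >= 0
Ad = np.linalg.pinv(A)

for t in (0.5, 0.9, 1.1):        # the threshold y^T A† x = 1 sits at t = 1
    x = y = t * v                # x ∈ R(A), y ∈ R(A^T), both nonnegative
    val = float(y @ Ad @ x)      # here y^T A† x = t^2
    P = np.linalg.pinv(A - np.outer(x, y))
    print(f"y^T A† x = {val:.2f} -> (A - x y^T)† >= 0: {bool((P >= -1e-12).all())}")
```

For t < 1 the perturbed Moore-Penrose inverse stays nonnegative, and for t > 1 it does not, in line with the theorem.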
Example 17. Let us consider a 3 × 3 matrix A that can be written in terms of e1 = (1,0)T and the 2 × 2 circulant matrix C generated by the row (1, 1/2), and for which A† ≥ 0. The matrix A admits a decomposition A = U − V that is a proper splitting of A with U† ≥ 0, U†V ≥ 0, and ρ(U†V) < 1.
Let x = y = (1/3,1/6,1/3)T. Then, x and y are nonnegative, x ∈ R(A), and y ∈ R(AT). We have 1 − yTA†x = 5/12 > 0 and ρ(U†W) < 1 for W = V + xyT. Also, it can be seen that (A − xyT)† ≥ 0. This illustrates Theorem 10.
Let x1, x2, …, xk ∈ ℝm and y1, y2, …, yk ∈ ℝn be nonnegative. Denote Xi = (x1, x2, …, xi) and Yi = (y1, y2, …, yi) for i = 1,2, …, k. Then, XiYiT = x1y1T + x2y2T + ⋯ + xiyiT. The following theorem is obtained by a repeated application of the rank-one result in Theorem 16.
Theorem 18. Let A ∈ ℝm×n and let Xi, Yi be as above. Further, suppose that xi ∈ R(A − Xi−1Yi−1T), yi ∈ R((A − Xi−1Yi−1T)T), and 1 − yiT(A − Xi−1Yi−1T)†xi is nonzero, and let (A − Xi−1Yi−1T)† ≥ 0, for all i = 1,2, …, k, where A − X0Y0T is taken as A. Then, (A − XkYkT)† ≥ 0 if and only if yiT(A − Xi−1Yi−1T)†xi < 1 for all i = 1,2, …, k.
Proof. Set Bi = A − XiYiT for i = 0,1, …, k, where B0 is identified as A. The conditions in the theorem can be written as: xi ∈ R(Bi−1), yi ∈ R(Bi−1T), and 1 − yiTBi−1†xi ≠ 0, for all i = 1,2, …, k. (14)
Now, assume that Bk† ≥ 0. Then, by Theorem 16 and the conditions in (14) for i = k, we have ykTBk−1†xk < 1. Also, since Bi† ≥ 0 by assumption for all i = 0,1, …, k − 1, we have yiTBi−1†xi < 1 for all i = 1,2, …, k.
The converse part can be proved iteratively. The condition y1TA†x1 < 1 and the conditions in (14) for i = 1 imply that B1† ≥ 0. Repeating the argument for i = 2 to k proves the result.
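The iteration in Theorem 18 can be carried out directly. A minimal sketch (ours; the matrix and the update vectors are arbitrary nonnegative choices lying in the required ranges):

```python
import numpy as np

v = np.array([1.0, 2.0])
A = np.outer(v, v)                       # A† >= 0
B = A.copy()                             # B_0 = A

# Small nonnegative rank-one updates; multiples of v stay in R(A) and R(A^T).
updates = [(0.2 * v, 0.1 * v), (0.1 * v, 0.1 * v)]

for i, (x, y) in enumerate(updates, start=1):
    Bd = np.linalg.pinv(B)               # B_{i-1}†
    cond = float(y @ Bd @ x)             # y_i^T B_{i-1}† x_i, must be < 1
    B = B - np.outer(x, y)               # B_i = B_{i-1} - x_i y_i^T
    ok = bool((np.linalg.pinv(B) >= -1e-12).all())
    print(f"step {i}: y^T B† x = {cond:.3f} < 1, B_i† >= 0: {ok}")
```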
The following result is an extension of Lemma 2.3 in [1], which is in turn obtained here as a corollary (Corollary 20). This corollary will be used in proving another characterization for (A − XkYkT)† to be nonnegative.
Theorem 19. Let A ∈ ℝn×n and b, c ∈ ℝn be such that −b ≥ 0, −c ≥ 0, and let α ∈ ℝ, α > 0. Further, suppose that b ∈ R(A) and c ∈ R(AT). Let Ã = (A b; cT α) denote the (n + 1) × (n + 1) bordered matrix with blocks A, b, cT, and α. Then, Ã† ≥ 0 if and only if A† ≥ 0 and cTA†b < α.
Proof. Let us first observe that AA†b = b and cTA†A = cT. Set β = α − cTA†b and Z = (A† + β−1A†bcTA† −β−1A†b; −β−1cTA† β−1), partitioned conformally with Ã.
Suppose that A† ≥ 0 and β = α − cTA†b > 0. Then, Z ≥ 0 (since −b ≥ 0 and −c ≥ 0), and it can be verified that Z = Ã†, so that Ã† ≥ 0. Conversely, suppose that Ã† ≥ 0. Then, we must have β > 0. Let {e1, e2, …, en+1} denote the standard basis of ℝn+1. Then, for i = 1,2, …, n, we have Ã†ei = ((A†ei + β−1(cTA†ei)A†b)T, −(cTA†ei/β))T. Since c ≤ 0, it follows that A†ei ≥ 0 for i = 1,2, …, n. Thus, A† ≥ 0.
Corollary 20 (see [1], Lemma 2.3). Let A ∈ ℝn×n be nonsingular. Let b, c ∈ ℝn be such that −b ≥ 0, −c ≥ 0, and let α ∈ ℝ, α > 0. Let Ã = (A b; cT α). Then, Ã is nonsingular with Ã−1 ≥ 0 if and only if A−1 ≥ 0 and cTA−1b < α.
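A numerical sketch of the bordered-matrix characterization (ours; A, b, c, and α are arbitrary choices with −b ≥ 0 and −c ≥ 0):

```python
import numpy as np

A = np.array([[2.0, -1.0],
              [-1.0, 2.0]])          # A^{-1} >= 0 (a nonsingular M-matrix)
b = np.array([-0.5, -0.5])           # -b >= 0
c = np.array([-0.3, -0.2])           # -c >= 0
alpha = 1.0

Atilde = np.block([[A, b[:, None]],
                   [c[None, :], np.array([[alpha]])]])

schur = alpha - c @ np.linalg.inv(A) @ b     # is c^T A^{-1} b < alpha ?
inv = np.linalg.inv(Atilde)

print(schur > 0, (inv >= 0).all())           # True True
```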
Now, we obtain another necessary and sufficient condition for (A − XkYkT)† to be nonnegative.
Theorem 21. Let A ∈ ℝm×n be such that A† ≥ 0. Let xi and yi be nonnegative vectors in ℝm and ℝn, respectively, for every i = 1, …, k, such that xi ∈ R(A) and yi ∈ R(AT), i = 1,2, …, k (16). For each i, set Hi = I − YiTA†Xi, and suppose that Hi is nonsingular and that 1 − yiTA†xi > 0. Then, (A − XkYkT)† ≥ 0 if and only if yiTA†Xi−1Hi−1−1Yi−1TA†xi < 1 − yiTA†xi for all i = 1, …, k (17), where, for i = 1, the left-hand side is interpreted as zero.
Proof. The range conditions in (16) imply that R(Xk)⊆R(A) and R(Yk)⊆R(AT). Also, from the assumptions, it follows that Hk is nonsingular. By Theorem 10, (A − XkYkT)† ≥ 0 if and only if Hk−1 = (I − YkTA†Xk)−1 ≥ 0. Now, Hk can be written in the partitioned form Hk = (Hk−1 −Yk−1TA†xk; −ykTA†Xk−1 1 − ykTA†xk).
Now, applying Corollary 20 to the matrix Hk−1 (the off-diagonal blocks above are nonpositive, and 1 − ykTA†xk > 0), we have that Hk−1 ≥ 0 holds if and only if Hk−1−1 ≥ 0 holds and ykTA†Xk−1Hk−1−1Yk−1TA†xk < 1 − ykTA†xk. Continuing the above argument, we get (A − XkYkT)† ≥ 0 if and only if yiTA†Xi−1Hi−1−1Yi−1TA†xi < 1 − yiTA†xi, ∀i = 1, …, k. This is condition (17).
We conclude the paper by considering an extension of Example 17.
Example 22. For a fixed n, let C be the circulant matrix generated by the row vector (1, 1/2, 1/3, …, 1/(n − 1)). Consider the n × n matrix A constructed from C and e1 as in Example 17.
Let xi be the first row of Bi−1 written as a column vector multiplied by 1/ni, and let yi be the first column of Bi−1 multiplied by 1/ni, where Bi = A − XiYiT, with B0 being identified with A, i = 1,2, …, k. We then have xi ≥ 0, yi ≥ 0, x1 ∈ R(A), and y1 ∈ R(AT). Now, if Bi−1 ≥ 0 at each iteration, then the vectors xi and yi will be nonnegative. Hence, it is enough to check that yiTBi−1†xi < 1 at each iteration in order to get Bi† ≥ 0. Experimentally, it has been observed that for n > 3, this condition holds true for every iteration. However, for n = 3, it is observed that y1TB0†x1 < 1, y2TB1†x2 < 1, y3TB2†x3 < 1, and y4TB3†x4 < 1. But y5TB4†x5 > 1, and in that case, we observe that B5† is not nonnegative.
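The experiment described above can be reproduced in a few lines. In the sketch below (ours, under stated assumptions: we take A to be the n × n circulant generated by (1, 1/2, …, 1/n) as an assumed stand-in for the matrix of the example, and we track the quantities yiTBi−1†xi):

```python
import numpy as np

def circulant(row):
    """Circulant matrix whose first row is `row`."""
    row = np.asarray(row, dtype=float)
    return np.stack([np.roll(row, k) for k in range(len(row))])

n, iters = 5, 6
A = circulant([1.0 / (j + 1) for j in range(n)])  # assumed stand-in for A
B = A.copy()                                      # B_0

for i in range(1, iters + 1):
    Bd = np.linalg.pinv(B)
    x = B[0, :] / n**i            # first row of B_{i-1}, scaled by 1/n^i
    y = B[:, 0] / n**i            # first column of B_{i-1}, scaled by 1/n^i
    cond = float(y @ Bd @ x)      # the quantity y_i^T B_{i-1}† x_i
    B = B - np.outer(x, y)
    print(f"iteration {i}: y^T B† x = {cond:.3e}")
```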