Norm-Constrained Least-Squares Solutions to the Matrix Equation AXB = C
Abstract
An iterative method is proposed to compute the least-squares solutions of the matrix equation AXB = C subject to a norm inequality constraint. In the absence of rounding errors, this method yields a desired solution in finitely many iteration steps. Numerical experiments are performed to illustrate the efficiency of the algorithm and a practical application.
1. Introduction
Throughout this paper, Rm×n denotes the set of all m × n real matrices, and I represents the identity matrix whose size is implied by context. AT and ∥A∥ denote, respectively, the transpose and the Frobenius norm of the matrix A. For matrices A = (aij) ∈ Rm×n and B = (bij) ∈ Rp×q, A ⊗ B denotes the Kronecker product of A and B, defined as A ⊗ B = (aijB) ∈ Rmp×nq. The inner product on the matrix space Rm×n is defined as 〈A, B〉 = trace(BTA) for all A, B ∈ Rm×n. With this inner product, Rm×n is a Hilbert space, and the norm it induces is the Frobenius norm.
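These definitions are easy to verify numerically. The following NumPy sketch (illustrative only, not part of the paper) checks that 〈A, B〉 = trace(BTA) coincides with the elementwise Frobenius inner product, that the induced norm is the Frobenius norm, and that the Kronecker product has the stated dimensions:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((3, 4))

# <A, B> = trace(B^T A) equals the sum of elementwise products of A and B.
inner = np.trace(B.T @ A)
assert np.isclose(inner, np.sum(A * B))

# The norm induced by this inner product is the Frobenius norm.
assert np.isclose(np.sqrt(np.trace(A.T @ A)), np.linalg.norm(A, 'fro'))

# Kronecker product shape: (m x n) kron (p x q) -> (mp x nq).
P = rng.standard_normal((2, 5))
K = np.kron(A, P)
assert K.shape == (3 * 2, 4 * 5)
```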
In this paper, an iterative method is proposed to compute the solutions of problem (1). We use the generalized Lanczos trust region (GLTR) algorithm [8], which is based on the Steihaug-Toint algorithm [9, 10], as the framework for deriving this method. The basic idea is as follows. First, using the Kronecker product of matrices, we transform the least-squares problem (1) into a trust-region subproblem in vector form, which can be solved by the GLTR algorithm. Then we translate the vector iteration into matrix form. Finally, numerical experiments are given to illustrate the efficiency and a practical application of the proposed algorithm.
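The transformation to vector form rests on the identity vec(AXB) = (BT ⊗ A)vec(X) for column-major vectorization, so the constrained problem becomes min ∥(BT ⊗ A)vec(X) − vec(C)∥ subject to ∥vec(X)∥ ≤ Δ, a standard trust-region subproblem. A quick numerical check of the identity (a sketch, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3))
X = rng.standard_normal((3, 5))
B = rng.standard_normal((5, 2))

vec = lambda M: M.flatten(order='F')   # column-major (Fortran-order) vectorization

# vec(A X B) = (B^T kron A) vec(X), so min ||A X B - C|| subject to
# ||X|| <= Delta is a trust-region subproblem in the vector vec(X).
lhs = vec(A @ X @ B)
rhs = np.kron(B.T, A) @ vec(X)
assert np.allclose(lhs, rhs)
```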
2. Iteration Methods to Solve Problem (1)
In this section we first give necessary and sufficient conditions for problem (1) to have a solution. We then propose an iteration method to compute the solution and establish some properties of this algorithm.
Theorem 1. Matrix X* is a solution of the problem (3) if and only if there is a scalar λ* ≥ 0 such that the following conditions are satisfied:
Proof. Assume that there is a scalar λ* ≥ 0 such that the conditions (4) are satisfied. Let
Conversely, assuming that X* is a global solution of the problem (3), we show that there is a nonnegative λ* that satisfies conditions (4). For this purpose we consider two cases: ∥X*∥ < Δ and ∥X*∥ = Δ.
In case ∥X*∥ < Δ, X* is certainly an unconstrained minimizer of φ(X). So X* satisfies the stationary point condition ∇φ(X*) = 0; that is, ATAX*BBT − ATCBT = 0. This implies that the properties (4) hold for λ* = 0. In the case ∥X*∥ = Δ, the second equality is immediately satisfied, and X* also solves the constrained problem
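The interior case can be illustrated numerically: for a consistent system with a small-norm solution, the unconstrained least-squares minimizer satisfies the stationarity condition ATAX*BBT − ATCBT = 0, so λ* = 0 works whenever ∥X*∥ < Δ. The sketch below (names and sizes are illustrative assumptions) solves the vectorized normal system and checks the gradient:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 3))
B = rng.standard_normal((3, 4))
X_true = 0.1 * rng.standard_normal((3, 3))
C = A @ X_true @ B                       # consistent system by construction

# Unconstrained least-squares solution via the Kronecker (vec) form.
K = np.kron(B.T, A)
x = np.linalg.lstsq(K, C.flatten(order='F'), rcond=None)[0]
X = x.reshape((3, 3), order='F')

# Interior case of the optimality conditions: the gradient of
# phi(X) = 0.5 * ||A X B - C||^2 vanishes at X, so lambda* = 0 suffices
# whenever ||X|| < Delta.
grad = A.T @ A @ X @ B @ B.T - A.T @ C @ B.T
assert np.linalg.norm(grad) < 1e-8
```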
We give an iteration method to solve problem (1) as in Algorithm 2.
Algorithm 2. (i) Given matrices X0 = 0, Q−1 = 0 and a small tolerance ε > 0, compute
R0 = −ATCBT, t0 = −ATCBT, γ0 = ∥R0∥, P0 = −R0, T−1 = [].
Set k ← 0.
(ii) Compute Qk = tk/γk, δk = 〈Qk, ATAQkBBT〉, tk+1 = ATAQkBBT − δkQk − γkQk−1, γk+1 = ∥tk+1∥, and update
Tk = [Tk−1, Γk; ΓkT, δk], where Γk = (0, …, 0, γk)T ∈ Rk.
(iii) If APkB ≠ 0, compute αk = ∥Rk∥2/∥APkB∥2.
If ∥Xk + αkPk∥ ≤ Δ, compute Rk+1 = Rk + αkATAPkBBT, βk = ∥Rk+1∥2/∥Rk∥2, Xk+1 = Xk + αkPk, Pk+1 = −Rk+1 + βkPk; else go to Step (iv).
If ∥Rk+1∥ < ε, stop; else set k ← k + 1 and go to Step (ii).
(iv) Find the solution hk to the following optimization problem:
(v) If γk+1|〈ek+1, hk〉| < ε (here ek+1 denotes the last column of the identity matrix I), set
and then stop; else set k ← k + 1 and go to Step (ii).
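The interior (CG) phase of Algorithm 2 can be sketched in matrix form as follows. This is an illustrative implementation under the reconstruction αk = ∥Rk∥2/∥APkB∥2 and βk = ∥Rk+1∥2/∥Rk∥2; the function name `cg_axb` and the early-exit handling are mine, and the boundary hand-off (Step (iv)) is only stubbed out:

```python
import numpy as np

def cg_axb(A, B, C, delta=np.inf, tol=1e-10, maxit=500):
    """CG phase (Step (iii) of Algorithm 2) for min ||A X B - C||_F,
    stopping early if the next iterate would leave the ball ||X|| <= delta."""
    n, p = A.shape[1], B.shape[0]
    X = np.zeros((n, p))
    R = -(A.T @ C @ B.T)              # gradient of 0.5*||AXB - C||^2 at X0 = 0
    P = -R
    for _ in range(maxit):
        APB = A @ P @ B
        if np.linalg.norm(APB) == 0:
            break
        alpha = np.linalg.norm(R)**2 / np.linalg.norm(APB)**2
        if np.linalg.norm(X + alpha * P) > delta:
            break                      # boundary case: hand over to Step (iv)
        X = X + alpha * P
        R_new = R + alpha * (A.T @ (APB @ B.T))
        if np.linalg.norm(R_new) < tol:
            return X
        beta = np.linalg.norm(R_new)**2 / np.linalg.norm(R)**2
        P, R = -R_new + beta * P, R_new
    return X

# Usage on a consistent system (interior case, delta = infinity):
rng = np.random.default_rng(3)
A = rng.standard_normal((6, 4))
B = rng.standard_normal((3, 5))
C = A @ rng.standard_normal((4, 3)) @ B
X = cg_axb(A, B, C)
assert np.linalg.norm(A @ X @ B - C) < 1e-6
```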
The iteration in Algorithm 2 proceeds in two phases. First, the CG method (Step (iii)) is used to compute a solution of problem (1) in the interior of the feasible region. If this phase fails, that is, no solution of problem (1) in the feasible region can be obtained by the CG method, then a solution of problem (1) on the boundary is obtained by solving the optimization problem (14). The properties of Algorithm 2 are given as follows.
Theorem 3. Assume that the sequences {Ri}, {Pi}, and {APiB} are generated by Algorithm 2; then the following equalities hold for all i ≠ j, 0 ≤ i, j ≤ k:
Proof. Since 〈A, B〉 = 〈B, A〉 holds for all matrices A and B, we only need to prove the conclusion for all 0 ≤ i < j ≤ k. The proof is by induction and requires two steps.
Step 1. Show that 〈Ri, Ri+1〉 = 0, 〈Pi, Ri+1〉 = 0, and 〈APiB, APi+1B〉 = 0 hold for all i = 0, 1, 2, …, k. We use the principle of mathematical induction to prove these conclusions. When i = 0, we have
Step 2. Assume that 〈Pi, Ri+l〉 = 0, 〈APiB, APi+lB〉 = 0, and 〈Ri, Ri+l〉 = 0 for all 0 ≤ i ≤ k and 1 ≤ l < k; we show that 〈Pi, Ri+l+1〉 = 0, 〈APiB, APi+l+1B〉 = 0, and 〈Ri, Ri+l+1〉 = 0. The proof is as follows:
Theorem 4. Assume that the sequence {Qi} is generated by Algorithm 2; then the following equalities hold:
Proof. By the definition of Qi, we immediately know that 〈Qi, Qi〉 = 1 (i = 0, 1, 2, …). As in the proof of Theorem 3, we use the principle of mathematical induction in the following two steps.
Step 1. Show that 〈Qi, Qi+1〉 = 0 for all i = 0, 1, 2, …, k.
When i = 0, we have
Step 2. Assume that 〈Qi, Qi+l〉 = 0 for all 0 ≤ i ≤ k and 1 ≤ l < k; we show that 〈Qi, Qi+l+1〉 = 0. The proof is as follows:
Theorem 5. Assume that the sequences {γk}, {Tk}, and {Qi} are generated by Algorithm 2. Let
Proof. By the definition of Tk and Qk (k = 0,1, 2, …), we have
Theorem 6. Assume that the sequences {Qk}, {Rk}, {γk}, {δk}, {αk}, and {βk} are generated by Algorithm 2; then the following equalities hold for all k = 0, 1, 2, …:
Proof (the first equality in (27)). By the definitions of Qk and Rk, we have
(The second equality in (27)). Since the first equality in (27) holds, when k = 0 we have
Remark 7. This theorem gives relations among the sequences {Qk}, {Rk}, {γk}, {δk}, {αk}, and {βk} that can be used to reduce the computational cost.
3. The Main Results and Improvement of the Iteration Method
We will show that the solution of problem (1) can be obtained within finitely many iteration steps in the absence of round-off errors. We also give the details of solving problem (14), which completes Algorithm 2. Finally, by analyzing the characteristics of the proposed iteration method, a further optimization of it is given at the end of this section.
Theorem 8. Assume that the sequences {Xk}, {Rk} are generated by Algorithm 2. Then the following equalities hold for all k = 0,1, 2, …:
Proof. We use the principle of mathematical induction. When k = 0, the conclusion obviously holds. Assume that the conclusion holds for k − 1; then
Remark 9. By Theorem 3, the sequence R0, R1, R2, … is orthogonal in the finite-dimensional matrix space Rn×n, so there certainly exists a positive integer k + 1 ≤ n2 such that Rk+1 = 0. Hence, in the absence of rounding errors, the first stopping criterion of the algorithm is met within finitely many steps. From Theorem 8, we get ATAXk+1BBT − ATCBT = 0. According to Theorem 1, with λ* = 0, Xk+1 is a solution of problem (3).
Theorem 10. Assume that the sequences {Qk}, {γk}, and {hk} are generated by Algorithm 2. Let
Proof. Assume that hk is the solution of optimization problem (14); then there exists a nonnegative number λk such that the following optimality Karush-Kuhn-Tucker (KKT) conditions are satisfied:
Since the first equality in (38) can be rewritten as
Theorem 11. Assume that γ0, γ1, …, γk ≠ 0 and γk+1 = 0. Then X* = (Q0, Q1, …, Qk)(hk ⊗ I) is a solution of problem (1).
Proof. Since γk+1 = 0, we have by Theorem 10 that
Remark 12. According to Theorem 4, the sequence Q0, Q1, Q2, … is orthogonal in the finite-dimensional matrix space Rn×n, so there certainly exists a positive integer k ≤ n2 such that Qk = 0. Since tk = γkQk = 0, we have γk = 0. Hence, in the absence of rounding errors, the second stopping criterion of the algorithm is also met within finitely many steps.
Remark 13. According to Remarks 9 and 12, in the absence of rounding errors a desired solution can be obtained by Algorithm 2 in finitely many iteration steps.
Theorem 14. The solution hk of the problem (14) obtained by Algorithm 2 is on the boundary. In other words, hk is the solution of the following optimization problem:
Proof. Assume, to the contrary, that the solution hk of problem (14) obtained by Algorithm 2 lies in the interior of the feasible region. Then by (38) we have Tkhk = −γ0e1. By Theorem 5, Tk is a positive semidefinite matrix. If Tk is positive definite, then hk = −γ0Tk−1e1 with ∥hk∥ < Δ is the unique solution of problem (14). Hence, by Theorem 5, X = (Q0, Q1, …, Qk)(hk ⊗ I) with ∥X∥ = ∥hk∥ < Δ is the unique solution of problem (1). In this case, the step of solving problem (14) in Algorithm 2 would not be invoked. If Tk is positive semidefinite but not positive definite, then there exists a vector Z such that Tk(hk + Z) = −γ0e1 and ∥hk + Z∥ = Δ, which implies that hk + Z is a solution of problem (1) on the boundary. This contradicts our assumption.
Now we use the following Algorithm 15, proposed by Moré and Sorensen in [11], to solve problem (43).
Algorithm 15. (I) Let a suitable starting value λ0 and Δ > 0 be given.
(II) For i = 0, 1, … until convergence:
(a) Factorize Tk + λiI = QΛQT, where Q and Λ are unit bidiagonal and diagonal matrices, respectively.
(b) Solve QΛQTh = −γ0e1.
(c) Solve Qw = h.
(d) Set λi+1 = λi + (∥h∥2/wTΛ−1w)((∥h∥ − Δ)/Δ).
In the implementation of Algorithm 15, the initial value of λ can be chosen by the following principle: if ∥hk(λk−1)∥ ≥ Δ, take λ0 = λk−1; otherwise take λ0 = 0, where λk−1 is obtained at the (k − 1)th iteration of Algorithm 2. The stopping criterion |∥h∥ − Δ| ≤ εΔ can be used, where ε is a small tolerance.
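The Moré–Sorensen iteration on the tridiagonal subproblem can be sketched as follows. This is an easy-case sketch only: it uses a dense Cholesky factorization of Tk + λI in place of the unit-bidiagonal factorization of Algorithm 15, and the function name, starting value, and safeguards are my assumptions:

```python
import numpy as np

def boundary_solve(T, gamma0, Delta, tol=1e-10, maxit=100):
    """Newton iteration on the secular equation 1/Delta - 1/||h(lam)||,
    where (T + lam*I) h(lam) = -gamma0 * e1 (easy case only).
    Uses Cholesky instead of the unit-bidiagonal factorization."""
    k = T.shape[0]
    g = np.zeros(k)
    g[0] = -gamma0
    # Start with a lam making T + lam*I positive definite.
    lam = max(0.0, -np.linalg.eigvalsh(T)[0] + 1e-8)
    h = np.zeros(k)
    for _ in range(maxit):
        L = np.linalg.cholesky(T + lam * np.eye(k))
        h = np.linalg.solve(L.T, np.linalg.solve(L, g))
        w = np.linalg.solve(L, h)            # so ||w||^2 = h^T (T+lam*I)^{-1} h
        nh = np.linalg.norm(h)
        if abs(nh - Delta) <= tol * Delta:
            break
        lam = max(0.0, lam + (nh / np.linalg.norm(w))**2 * (nh - Delta) / Delta)
    return h, lam

# Usage: force the boundary case by shrinking Delta below the
# unconstrained minimizer's norm.
rng = np.random.default_rng(4)
k = 6
d = rng.uniform(1, 2, k)
e = rng.uniform(0.1, 0.5, k - 1)
T = np.diag(d) + np.diag(e, 1) + np.diag(e, -1)   # positive definite tridiagonal
gamma0 = 2.0
g = np.zeros(k); g[0] = -gamma0
Delta = 0.5 * np.linalg.norm(np.linalg.solve(T, g))
h, lam = boundary_solve(T, gamma0, Delta)
assert abs(np.linalg.norm(h) - Delta) < 1e-6 * Delta
```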
By fully using the result of Theorem 6, Algorithm 2 can be optimized as in Algorithm 16.
Algorithm 16. (i) Given matrices X0 = 0, Q−1 = 0 and a small tolerance ε > 0, compute R0 = −ATCBT, t0 = −ATCBT.
Set γ0 = ∥R0∥, P0 = −R0, T−1 = [], and k ← 0.
(ii) If APkB ≠ 0, compute the iteration quantities by the recurrences of Theorem 6.
Else, compute Qk = tk/γk (with the first one Qk = (−1)kRk/∥Rk∥).
(iii) If ∥Xk + αkPk∥ ≤ Δ, compute Xk+1 = Xk + αkPk, Pk+1 = −Rk+1 + βkPk.
If ∥Rk+1∥ ≤ ε, stop; else set k ← k + 1 and go to Step (ii).
Else, go to Step (iv).
(iv) Use Algorithm 15 to compute the solution hk of problem (43).
(v) If γk+1|〈ek+1, hk〉| < ε, set X* = (Q0, Q1, …, Qk)(hk ⊗ I) and stop.
Else, set k ← k + 1 and go to Step (ii).
4. Numerical Experiments
Example 17. Given the matrices A, B, C as follows:
Example 18. We work with a 2D first-kind Fredholm integral equation of the generic form
An example of such problem is image denoising with a Gaussian point spread function:
Based on [7], Δ represents the energy of the target image, so we set Δ = ∥F′∥. Solving the above problem by Algorithm 16, we obtain the recovered image F* in Figure 1(c). This shows that our algorithm is suitable for image denoising.
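For a separable Gaussian point spread function, blurring an image F takes exactly the form of the model AXB = C, since the 2D convolution factors into row and column 1D convolutions: G = A F BT, where A and B are 1D Gaussian convolution matrices. The toy construction below is my own illustration (sizes, σ, and the test image are assumptions, not the paper's data):

```python
import numpy as np

def gauss1d_matrix(n, sigma):
    """Dense 1D Gaussian convolution matrix, rows normalized to sum to 1
    (illustrative construction, ignoring boundary refinements)."""
    i = np.arange(n)
    K = np.exp(-(i[:, None] - i[None, :])**2 / (2 * sigma**2))
    return K / K.sum(axis=1, keepdims=True)

# A separable Gaussian PSF blurs an image F as G = A F B^T,
# i.e. exactly the model A X B = C of problem (1).
n = 32
A = gauss1d_matrix(n, 2.0)
Bt = gauss1d_matrix(n, 2.0).T
F = np.zeros((n, n))
F[12:20, 12:20] = 1.0        # a toy "target image": a bright square
G = A @ F @ Bt

# Blurring is an averaging operation: it cannot increase the peak value.
assert G.shape == F.shape
assert G.max() < F.max()
```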
Acknowledgments
The research was supported by National Natural Science Foundation of China (11261014) and Innovation Project of GUET Graduate Education (XJYC2012023).