Volume 2013, Issue 1 101974
Research Article
Open Access

s-Goodness for Low-Rank Matrix Recovery

Lingchen Kong (Corresponding Author)
Department of Applied Mathematics, Beijing Jiaotong University, Beijing 100044, China

Levent Tunçel
Department of Combinatorics and Optimization, Faculty of Mathematics, University of Waterloo, Waterloo, ON, Canada N2L 3G1

Naihua Xiu
Department of Applied Mathematics, Beijing Jiaotong University, Beijing 100044, China
First published: 09 April 2013
Academic Editor: Jein-Shan Chen

Abstract

Low-rank matrix recovery (LMR) is a rank minimization problem subject to linear equality constraints, and it arises in many fields such as signal and image processing, statistics, computer vision, and system identification and control. This class of optimization problems is generally 𝒩𝒫-hard. A popular approach replaces the rank function with the nuclear norm of the matrix variable. In this paper, we extend and characterize the concept of s-goodness for a sensing matrix in sparse signal recovery (proposed by Juditsky and Nemirovski (Math. Program., 2011)) to linear transformations in LMR. Using the two characteristic s-goodness constants, γs and γ̂s, of a linear transformation, we derive necessary and sufficient conditions for a linear transformation to be s-good. Moreover, we establish the equivalence of s-goodness and the null space properties. Therefore, s-goodness is a necessary and sufficient condition for exact s-rank matrix recovery via nuclear norm minimization.

1. Introduction

Low-rank matrix recovery (LMR for short) is a rank minimization problem (RMP) with linear constraints, also known as the affine matrix rank minimization problem, which is defined as follows:

min rank(X)  subject to  𝒜X = b,    (1)

where X ∈ ℝm×n is the matrix variable, 𝒜 : ℝm×n → ℝp is a linear transformation, and b ∈ ℝp. Although specific instances can often be solved by specialized algorithms, the LMR is 𝒩𝒫-hard. A popular approach for solving LMR in the systems and control community is to minimize the trace of a positive semidefinite matrix variable instead of its rank (see, e.g., [1, 2]). A generalization of this approach to nonsymmetric matrices, introduced by Fazel et al. [3], is the famous convex relaxation of LMR (1), which is called nuclear norm minimization (NNM):

min ∥X∥*  subject to  𝒜X = b,    (2)

where ∥X∥* is the nuclear norm of X, that is, the sum of its singular values. When m = n and the matrix X := Diag(x), x ∈ ℝn, is diagonal, the LMR (1) reduces to sparse signal recovery (SSR), which is the so-called cardinality minimization problem (CMP):

min ∥x∥0  subject to  Φx = b,    (3)

where ∥x∥0 denotes the number of nonzero entries in the vector x and Φ ∈ ℝm×n is a given sensing matrix. A well-known heuristic for SSR is the ℓ1-norm minimization relaxation (basis pursuit problem):

min ∥x∥1  subject to  Φx = b,    (4)

where ∥x∥1 is the ℓ1-norm of x, that is, the sum of the absolute values of its entries.
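
As a concrete illustration of the relaxations (2) and (4), the following sketch (not part of the original paper) solves small randomly generated instances; it assumes the cvxpy modeling package, and the problem sizes and Gaussian data are illustrative assumptions only.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)

# Nuclear norm minimization (2): min ||X||_* s.t. <A_i, X> = b_i, i = 1, ..., p.
m, n, p, s = 8, 8, 40, 2
X0 = rng.standard_normal((m, s)) @ rng.standard_normal((s, n))   # an s-rank target
A_list = [rng.standard_normal((m, n)) for _ in range(p)]         # sensing matrices defining A
b = np.array([np.sum(Ai * X0) for Ai in A_list])

X = cp.Variable((m, n))
constraints = [cp.sum(cp.multiply(Ai, X)) == bi for Ai, bi in zip(A_list, b)]
cp.Problem(cp.Minimize(cp.norm(X, "nuc")), constraints).solve()
print("NNM recovery error:", np.linalg.norm(X.value - X0))

# l1-norm minimization (4): min ||x||_1 s.t. Phi x = b.
nv, pv, k = 60, 25, 4
x0 = np.zeros(nv)
x0[rng.choice(nv, k, replace=False)] = rng.standard_normal(k)
Phi = rng.standard_normal((pv, nv))
x = cp.Variable(nv)
cp.Problem(cp.Minimize(cp.norm1(x)), [Phi @ x == Phi @ x0]).solve()
print("l1 recovery error:", np.linalg.norm(x.value - x0))
```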

LMR problems have many applications, and they have appeared in the literature of a diverse set of fields including signal and image processing, statistics, computer vision, and system identification and control. For more details, see the recent paper [4]. LMR and NNM have been the focus of some recent research in the optimization community; see, for example, [4–15]. Although there are many papers dealing with algorithms for NNM, such as interior-point methods, fixed point and Bregman iterative methods, and proximal point methods, there are fewer papers dealing with the conditions that guarantee the success of low-rank matrix recovery via NNM. For instance, following the program laid out in the work of Candès and Tao in compressed sensing (CS; see, e.g., [16–18]), Recht et al. [4] provided a certain restricted isometry property (RIP) condition on the linear transformation which guarantees that the minimum nuclear norm solution is the minimum rank solution. Recht et al. [14, 19] gave the null space property (NSP), which characterizes a particular property of the null space of the linear transformation and is also discussed by Oymak et al. [20, 21]. Note that NSP states a necessary and sufficient condition for exactly recovering the low-rank matrix via nuclear norm minimization. Recently, Chandrasekaran et al. [22] showed that a fixed s-rank matrix X0 can be recovered if and only if the null space of 𝒜 does not intersect the tangent cone of the nuclear norm ball at X0.

In the setting of CS, in addition to RIP and null space properties, there are other characterizations of the sensing matrix under which ℓ1-norm minimization can be guaranteed to yield an optimal solution to SSR; see, for example, [23–26]. In particular, Juditsky and Nemirovski [24] established necessary and sufficient conditions for a sensing matrix to be “s-good”, that is, to allow for exact ℓ1-recovery of sparse signals with s nonzero entries when no measurement noise is present. They also demonstrated that these characteristics, although difficult to evaluate, lead to verifiable sufficient conditions for exact SSR and to efficiently computable upper bounds on those s for which a given sensing matrix is s-good. Furthermore, they established instructive links between s-goodness and RIP in the CS context. One may wonder whether we can generalize the s-goodness concept to LMR and still maintain many of the nice properties, as done in [24]. Here, we deal with this issue. Our approach is based on the singular value decomposition (SVD) of a matrix and the partition technique generalized from CS. In the next section, following Juditsky and Nemirovski’s terminology, we propose definitions of s-goodness and of the G-numbers γs and γ̂s of a linear transformation in LMR, and then we provide some basic properties of the G-numbers. In Section 3, we characterize s-goodness of a linear transformation in LMR via the G-numbers. We consider the connections between s-goodness, NSP, and RIP in Section 4. We eventually obtain that every 𝒜 satisfying δ2s < 0.472 is s-good.

Let W ∈ ℝm×n, r := min {m, n}, and let W = U Diag(σ(W))VT be an SVD of W, where U ∈ ℝm×r, V ∈ ℝn×r, and Diag(σ(W)) is the diagonal matrix of σ(W) = (σ1(W), …, σr(W))T, the vector of the singular values of W. Also let Ξ(W) denote the set of pairs of matrices (U, V) in the SVD of W; that is,

Ξ(W) := {(U, V) ∈ ℝm×r × ℝn×r : W = U Diag(σ(W))VT, UTU = VTV = Ir}.    (5)

For s ∈ {0,1, 2, …, r}, we say W ∈ ℝm×n is an s-rank matrix to mean that the rank of W is no more than s. For an s-rank matrix W, it is convenient to take W = Um×sWsVTn×s as its SVD, where Um×s ∈ ℝm×s, Vn×s ∈ ℝn×s are orthogonal matrices and Ws = Diag((σ1(W), …, σs(W))T). For a vector y ∈ ℝp, let ∥·∥d be the dual norm of ∥·∥ specified by ∥y∥d := maxv {〈v, y〉 : ∥v∥ ≤ 1}. In particular, ∥·∥∞ is the dual norm of ∥·∥1 for a vector. Let ∥X∥ denote the spectral or the operator norm of a matrix X ∈ ℝm×n, that is, the largest singular value of X. In fact, ∥X∥ is the dual norm of ∥X∥*. Let ∥X∥F be the Frobenius norm of X, which is equal to the 2-norm of the vector of its singular values. We denote by XT the transpose of X. For a linear transformation 𝒜 : ℝm×n → ℝp, we denote by 𝒜* : ℝp → ℝm×n the adjoint of 𝒜.
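
For readers who wish to verify this notation numerically, here is a minimal numpy sketch (illustrative only, with arbitrary dimensions) computing the singular values and the nuclear, spectral, and Frobenius norms of a random matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.standard_normal((5, 7))

U, sigma, Vt = np.linalg.svd(W, full_matrices=False)   # W = U Diag(sigma) V^T
nuclear   = sigma.sum()               # ||W||_* : sum of singular values
spectral  = sigma.max()               # ||W||   : largest singular value
frobenius = np.linalg.norm(W, "fro")  # ||W||_F : 2-norm of the singular value vector

assert np.isclose(nuclear,  np.linalg.norm(W, ord="nuc"))
assert np.isclose(spectral, np.linalg.norm(W, ord=2))
assert np.isclose(frobenius, np.linalg.norm(sigma))
```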

2. Definitions and Basic Properties

2.1. Definitions

We first go over some concepts related to s-goodness of the linear transformation in LMR (RMP). These are extensions of those given for SSR (CMP) in [24].

Definition 1. Let 𝒜 : ℝm×n → ℝp be a linear transformation and s ∈ {0,1, 2, …, r}. One says that 𝒜 is s-good if, for every s-rank matrix W ∈ ℝm×n, W is the unique optimal solution to the optimization problem

min ∥X∥*  subject to  𝒜X = 𝒜W.    (6)

We denote by s*(𝒜) the largest integer s for which 𝒜 is s-good. Clearly, s*(𝒜) ∈ {0,1, …, r}. To characterize s-goodness we introduce two useful s-goodness constants: γs and γ̂s. We call γs and γ̂s G-numbers.
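
Definition 1 quantifies over every s-rank W, so it cannot be verified by finitely many tests; still, the defining property can be spot-checked at a single randomly drawn s-rank W, as in the sketch below (illustrative assumptions only: a random Gaussian 𝒜 and cvxpy as the solver interface).

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(2)
m, n, p, s = 6, 6, 30, 1
A_list = [rng.standard_normal((m, n)) for _ in range(p)]   # defines A(X)_i = <A_i, X>

# One fixed s-rank matrix W; s-goodness of A requires recovery of *every* such W.
W = rng.standard_normal((m, s)) @ rng.standard_normal((s, n))
bW = [float(np.sum(Ai * W)) for Ai in A_list]

X = cp.Variable((m, n))
cons = [cp.sum(cp.multiply(Ai, X)) == bi for Ai, bi in zip(A_list, bW)]
cp.Problem(cp.Minimize(cp.norm(X, "nuc")), cons).solve()

# If W is (numerically) the unique optimal solution of problem (6), this error is tiny.
print("||X - W||_F =", np.linalg.norm(X.value - W))
```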

Definition 2. Let 𝒜 : ℝm×n → ℝp be a linear transformation, β ∈ [0, +∞], and s ∈ {0,1, 2, …, r}. Then we have the following.

(i) G-number γs(𝒜, β) is the infimum of γ ≥ 0 such that for every matrix X ∈ ℝm×n with singular value decomposition X = Um×sVTn×s (i.e., s nonzero singular values, all equal to 1), there exists a vector y ∈ ℝp such that

∥y∥d ≤ β,  𝒜*y = U Diag(σ(𝒜*y))VT,    (7)

where U = [Um×s Um×(r−s)], V = [Vn×s Vn×(r−s)] are orthogonal matrices, and

σi(𝒜*y) = 1 for i ∈ {1, …, s},  σi(𝒜*y) ≤ γ for i ∈ {s + 1, …, r}.    (8)

If there does not exist such y for some X as above, we set γs(𝒜, β) = +∞.

(ii) G-number γ̂s(𝒜, β) is the infimum of γ ≥ 0 such that for every matrix X ∈ ℝm×n with s nonzero singular values, all equal to 1, there exists a vector y ∈ ℝp such that 𝒜*y and X share the same orthogonal row and column spaces:

()
If there does not exist such y for some X as above, we set γ̂s(𝒜, β) = +∞. To be compatible with the special case given by [24], we write γs(𝒜) and γ̂s(𝒜) instead of γs(𝒜, +∞) and γ̂s(𝒜, +∞), respectively.

From the above definition, we easily see that the set of values that γ takes is closed. Thus, when γs(𝒜, β) < +∞, for every matrix X ∈ ℝm×n with s nonzero singular values, all equal to 1, there exists a vector y ∈ ℝp such that
()
Similarly, for every matrix X ∈ ℝm×n with s nonzero singular values, all equal to 1, there exists a vector ŷ ∈ ℝp such that 𝒜*ŷ and X share the same orthogonal row and column spaces:
()
Observing that the set {𝒜*y : ∥y∥d ≤ β} is convex, we obtain that if γs(𝒜, β) < +∞ then for every matrix X with at most s nonzero singular values and ∥X∥ ≤ 1 there exist vectors y satisfying (10) and there exist vectors ŷ satisfying (11).
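
The matrices over which Definition 2 quantifies — s nonzero singular values, all equal to 1 — are exactly the products UVT of two factors with s orthonormal columns; a quick numpy construction (illustrative only, with random factors) is:

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, s = 6, 7, 2
U, _ = np.linalg.qr(rng.standard_normal((m, s)))   # U in R^{m x s}, orthonormal columns
V, _ = np.linalg.qr(rng.standard_normal((n, s)))   # V in R^{n x s}, orthonormal columns
X = U @ V.T

sigma = np.linalg.svd(X, compute_uv=False)
print(np.round(sigma, 8))                                   # s ones followed by zeros
print("||X|| =", sigma.max(), " ||X||_* =", sigma.sum())    # spectral norm 1, nuclear norm s
```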

2.2. Basic Properties of G-Numbers

In order to characterize the s-goodness of a linear transformation 𝒜, we study the basic properties of G-numbers. We begin with the result that the G-numbers γs(𝒜, β) and γ̂s(𝒜, β) are convex nonincreasing functions of β.

Proposition 3. For every linear transformation 𝒜 and every s ∈ {0,1, …, r}, the G-numbers γs(𝒜, β) and γ̂s(𝒜, β) are convex nonincreasing functions of β ∈ [0, +∞].

Proof. We only need to demonstrate that the quantity γs(𝒜, β) is a convex nonincreasing function of β ∈ [0, +∞]. It is evident from the definition that γs(𝒜, β) is nonincreasing for given 𝒜, s. It remains to show that γs(𝒜, β) is a convex function of β. In other words, for every pair β1, β2 ∈ [0, +∞], we need to verify that

()
The above inequality follows immediately if one of β1, β2 is +∞. Thus, we may assume β1, β2 ∈ [0, +∞). In fact, from the argument around (10) and the definition of γs(𝒜, ·), we know that for every matrix X = U Diag(σ(X))VT with s nonzero singular values, all equal to 1, there exist vectors y1, y2 ∈ ℝp such that for k ∈ {1,2}
()
It is immediate from (13) that . Moreover, from the above information on the singular values of 𝒜*y1, 𝒜*y2, we may set 𝒜*yk = X + Yk, k ∈ {1,2} such that
()
This implies that for every α ∈ [0,1]
()
and hence rank[αY1 + (1 − α)Y2] ≤ r − s, and X and αY1 + (1 − α)Y2 have orthogonal row and column spaces. Thus, noting that 𝒜*[αy1 + (1 − α)y2] = X + αY1 + (1 − α)Y2, we obtain that and
()
for every α ∈ [0,1]. Combining this with the fact
()
we obtain the desired conclusion.

The following observation, that the G-numbers γs(𝒜, β) and γ̂s(𝒜, β) are nondecreasing in s, is immediate.

Proposition 4. For every s′ ≤ s, one has γs′(𝒜, β) ≤ γs(𝒜, β) and γ̂s′(𝒜, β) ≤ γ̂s(𝒜, β).

We further investigate the relationship between the G-numbers γs(𝒜, β) and γ̂s(𝒜, β).

Proposition 5. Let 𝒜 : ℝm×n → ℝp be a linear transformation, β ∈ [0, +∞], and s ∈ {0,1, 2, …, r}. Then one has

()

Proof. Let γ := γs(𝒜, β) < 1. Then, for every matrix Z ∈ ℝm×n with s nonzero singular values, all equal to 1, there exists y ∈ ℝp, ∥y∥d ≤ β, such that 𝒜*y = Z + W, where ∥W∥ ≤ γ and W and Z have orthogonal row and column spaces. For a given pair Z, y as above, take . Then we have and

()
where the first term under the maximum comes from the fact that 𝒜*y and Z agree on the subspace corresponding to the nonzero singular values of Z. Therefore, we obtain
()
Now, we assume that . Fix orthogonal matrices U ∈ ℝm×r, V ∈ ℝn×r. For an s-element subset J of the index set {1,2, …, r}, we define a set SJ with respect to the orthogonal matrices U, V as
()
In the above, denotes the complement of J. It is immediately seen that SJ is a closed convex set in ℝr. Moreover, we have the following claim.

Claim 1. SJ contains the ∥·∥∞-ball of radius centered at the origin in ℝr.

Proof. Note that SJ is closed and convex. Moreover, SJ is the direct sum of its projections onto the pair of subspaces

()
Let Q denote the projection of SJ onto LJ. Then, Q is closed and convex (because of the direct sum property above and the fact that SJ is closed and convex). Note that LJ can be naturally identified with ℝs, and our claim is that the image of Q under this identification contains the ∥·∥∞-ball Bs of radius centered at the origin in ℝs. For a contradiction, suppose Bs is not contained in this image. Then there exists v ∈ Bs outside the image. Since the image is closed and convex, by a separating hyperplane theorem, there exists a vector u ∈ ℝs, ∥u∥1 = 1, such that
()
Let z ∈ ℝr be defined by
()
By definition of , for the s-rank matrix U Diag(z)VT, there exists y ∈ ℝp such that ∥y∥d ≤ β and
()
where W and U Diag(z)VT have the same orthogonal row and column spaces, and . Together with the definitions of SJ and , this means that contains a vector with , ∀i ∈ {1,2, …, s}. Therefore,
()
By v ∈ Bs and the definition of u, we obtain
()
where the strict inequality follows from the facts that and u separates v from . The above string of inequalities is a contradiction, and hence the desired claim holds.

Using the above claim, we conclude that for every J ⊆ {1,2, …, r} with cardinality s, there exists an x ∈ SJ such that , for all i ∈ J. From the definition of SJ, we obtain that there exists y ∈ ℝp with such that

()
where if iJ, and if . Thus, we obtain that
()

To conclude the proof, we need to prove that the inequalities we established

()
both hold with equality. This is straightforward by an argument similar to the one in the proof of [24, Theorem 1]; we omit it for the sake of brevity.

We end this section with a simple argument which illustrates that, for a given pair (𝒜, s), γs(𝒜, β) = γs(𝒜) and γ̂s(𝒜, β) = γ̂s(𝒜) for all β large enough.

Proposition 6. Let 𝒜 : ℝm×n → ℝp be a linear transformation and β ∈ [0, +∞]. Assume that for some ρ > 0, the image of the unit ∥·∥*-ball in ℝm×n under the mapping X ↦ 𝒜X contains the ball B = {x ∈ ℝp : ∥x∥1 ≤ ρ}. Then for every s ∈ {1,2, …, r}

()

Proof. Fix s ∈ {1,2, …, r}. We only need to show the first implication. Let γ := γs(𝒜) < 1. Then for every matrix W ∈ ℝm×n with its SVD , there exists a vector y ∈ ℝp such that

()
where U = [Um×s Um×(r−s)], V = [Vn×s Vn×(r−s)] are orthogonal matrices, and
()
Clearly, ∥𝒜*y∥ ≤ 1. That is,
()
From the inclusion assumption, we obtain that
()
Combining the above two strings of relations, we derive the desired conclusion.

3. s-Goodness and G-Numbers

We first give the following characterization result of s-goodness of a linear transformation 𝒜 via the G-number γs(𝒜), which explains the importance of γs(𝒜) in LMR.

Theorem 7. Let 𝒜 : ℝm×n → ℝp be a linear transformation and let s ∈ {0,1, 2, …, r}. Then 𝒜 is s-good if and only if γs(𝒜) < 1.

Proof. Suppose 𝒜 is s-good. Let W ∈ ℝm×n be a matrix of rank s ∈ {1,2, …, r}. Without loss of generality, let W = Um×sWsVTn×s be its SVD, where Um×s ∈ ℝm×s, Vn×s ∈ ℝn×s are orthogonal matrices and Ws = Diag((σ1(W), …, σs(W))T). By the definition of s-goodness of 𝒜, W is the unique solution to the optimization problem (6). Using the first-order optimality conditions, we obtain that there exists y ∈ ℝp such that the function fy(X) = ∥X∥* − yT[𝒜X − 𝒜W] attains its minimum value over X ∈ ℝm×n at X = W. So 0 ∈ ∂fy(W), or 𝒜*y ∈ ∂∥W∥*. Using the fact (see, e.g., [27])

∂∥W∥* = {Um×sVTn×s + Z : Z ∈ ℝm×n, (Um×s)TZ = 0, ZVn×s = 0, ∥Z∥ ≤ 1},    (36)
it follows that there exist matrices Um×(r−s) ∈ ℝm×(r−s), Vn×(r−s) ∈ ℝn×(r−s) such that 𝒜*y = U Diag(σ(𝒜*y))VT, where U = [Um×s Um×(r−s)], V = [Vn×s Vn×(r−s)] are orthogonal matrices and
()
where J : = {i : σi(W) ≠ 0} and . Therefore, the optimal objective value of the optimization problem
()
is at most one. For the given W with its SVD , let
()
It is easy to see that Π is a subspace and its normal cone (in the sense of variational analysis; see, e.g., [28] for details) is specified by Π⊥. Thus, the above problem (38) is equivalent to the following convex optimization problem with a set constraint:
()
We will show that the optimal value is less than 1. For a contradiction, suppose that the optimal value is one. Then, by [28, Theorem 10.1 and Exercise 10.52], there exists a Lagrange multiplier D ∈ ℝm×n such that the function
()
has unconstrained minimum in (y, M) equal to 1, where δΠ(·) is the indicator function of Π. Let (y*, M*) be an optimal solution. Then, by the optimality condition 0 ∈ ∂L, we obtain that
()
Direct calculation yields that
()
Then there exist DJΠ and such that . Notice that [29, Corollary  6.4] implies that for , and . Therefore, and . Moreover, by the definition of the dual norm of ∥·∥. This together with the facts 𝒜D = 0, DJΠ and yields
()
Thus, the minimum value of  L(y, M) is attained, , when M*Π, . We obtain that . By assumption, . That is, . Without loss of generality, let SVD of the optimal M* be , where and . From the above arguments, we obtain that
  • (i)

    𝒜D = 0,

  • (ii)

    ,

  • (iii)

    .

Clearly, for every t, the matrices Xt : = W + tD are feasible in (6). Note that

()
Then, . From the above equations, we obtain that for all small enough t > 0 (since σi(W) > 0, i ∈ {1,2, …, s}). Noting that W is the unique optimal solution to (6), we have Xt = W, which means that for iJ. This is a contradiction, and hence the desired conclusion holds.

We next prove that 𝒜 is s-good if γs(𝒜) < 1. That is, we let W be an s-rank matrix and we show that W is the unique optimal solution to (6). Without loss of generality, let W be a matrix of rank s ≠ 0 with SVD W = Um×sWsVTn×s, where Um×s ∈ ℝm×s, Vn×s ∈ ℝn×s are orthogonal matrices and Ws = Diag((σ1(W), …, σs(W))T). It follows from Proposition 4 that . By the definition of γs(𝒜), there exists y ∈ ℝp such that 𝒜*y = U Diag(σ(𝒜*y))VT, where U = [Um×s Um×(r−s)], V = [Vn×s Vn×(r−s)] are orthogonal matrices, and

()
Now, we have the optimization problem of minimizing the function
f(X) := ∥X∥* − yT[𝒜X − 𝒜W]    (47)
over all X ∈ ℝm×n such that 𝒜X = 𝒜W. Note that 〈𝒜*y, X〉 ≤ ∥X∥* by ∥𝒜*y∥ ≤ 1 and the definition of the dual norm. So f(X) ≥ ∥X∥* − ∥X∥* + ∥W∥* = ∥W∥*, and this function attains its unconstrained minimum in X at X = W. Hence X = W is an optimal solution to (6). It remains to show that this optimal solution is unique. Let Z be another optimal solution to the problem. Then f(Z) − f(W) = ∥Z∥* − yT𝒜Z = ∥Z∥* − 〈𝒜*y, Z〉 = 0. This together with the fact ∥𝒜*y∥ ≤ 1 implies that there exist SVDs for 𝒜*y and Z such that
()
where and are orthogonal matrices, and σi(Z) = 0 if σi(𝒜*y) ≠ 1. Thus, for σi(𝒜*y) = 0, for all i ∈ {s + 1, …, r}, we must have σi(Z) = σi(W) = 0. By the two forms of SVDs of 𝒜*y as above, where , are the corresponding submatrices of , , respectively. Without loss of generality, let
()
where and for the corresponding index j ∈ {i : σi(𝒜*y) = 0,  i ∈ {s + 1, …, r}}. Then we have
()
From , we obtain that
()
Therefore, we deduce
()
Clearly, the rank of Ω is no less than rsrs. From the orthogonality property of U, V and , we easily derive that
()
Thus, we obtain ΩT(Z − W) = 0, which implies that the rank of the matrix Z − W is no more than s. Since γs(𝒜) < 1, there exists such that
()
Therefore, . Then Z = W.

For the G-number γ̂s(𝒜), we directly obtain the following equivalent characterization of s-goodness from Proposition 5 and Theorem 7.

Theorem 8. Let 𝒜 : ℝm×n → ℝp be a linear transformation and s ∈ {1,2, …, r}. Then 𝒜 is s-good if and only if γ̂s(𝒜) < 1/2.

4. s-Goodness, NSP, and RIP

This section deals with the connections between s-goodness, the null space property (NSP), and the restricted isometry property (RIP). We start with establishing the equivalence of NSP and the G-number γ̂s(𝒜). Here, we say that 𝒜 satisfies NSP if for every nonzero matrix X ∈ Null(𝒜) with SVD X = U Diag(σ(X))VT we have

σ1(X) + ⋯ + σs(X) < σs+1(X) + ⋯ + σr(X).    (55)
For further details, see, for example, [14, 19–21] and the references therein.

Proposition 9. For the linear transformation 𝒜, γ̂s(𝒜) < 1/2 if and only if 𝒜 satisfies NSP.

Proof. We first give an equivalent representation of the G-number γ̂s(𝒜, β). We define a compact convex set first:

Ps := Conv{X ∈ ℝm×n : X has s nonzero singular values, all equal to 1}.    (56)
Let Bβ := {y ∈ ℝp : ∥y∥d ≤ β} and B := {X ∈ ℝm×n : ∥X∥ ≤ 1}. By definition, γ̂s(𝒜, β) is the smallest γ such that the closed convex set Cγ,β := 𝒜*Bβ + γB contains all matrices with s nonzero singular values, all equal to 1. Equivalently, Cγ,β contains the convex hull of these matrices, namely, Ps. Note that γ satisfies the inclusion Ps ⊆ Cγ,β if and only if for every X ∈ ℝm×n
()
For the above, we adopt the convention that whenever β = +∞, β∥𝒜X∥ is defined to be +∞ or 0 depending on whether ∥𝒜X∥ > 0 or ∥𝒜X∥ = 0. Thus, Ps ⊆ Cγ,β if and only if . Using the homogeneity of this last relation with respect to X, the above is equivalent to
()
Therefore, we obtain . Furthermore,
()

For X ∈ ℝm×n with 𝒜X = 0, let X = U Diag(σ(X))VT be its SVD. Then, we obtain the sum of the s largest singular values of X as

()
From (59), we immediately obtain that γ̂s(𝒜) is the best upper bound on ∥X∥s,* over matrices X ∈ Null(𝒜) such that ∥X∥* ≤ 1. Therefore, γ̂s(𝒜) < 1/2 implies that the maximum value of the ∥·∥s,*-norms of matrices X ∈ Null(𝒜) with ∥X∥* = 1 is less than 1/2. That is, ∥X∥s,* < (1/2)∥X∥* for every nonzero X ∈ Null(𝒜). Thus, σ1(X) + ⋯ + σs(X) < σs+1(X) + ⋯ + σr(X), and hence 𝒜 satisfies NSP. Now, it is easy to see that 𝒜 satisfies NSP if and only if γ̂s(𝒜) < 1/2.
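
The representation used in this proof — γ̂s(𝒜) is the best upper bound on ∥X∥s,* over X ∈ Null(𝒜) with ∥X∥* ≤ 1 — suggests a simple Monte Carlo probe: sample matrices from Null(𝒜), normalize the nuclear norm, and record the largest ratio. The sketch below is an illustration only (it assumes 𝒜 is represented by a matrix acting on the vectorization of X); sampling yields a lower bound on γ̂s(𝒜), never a proof that γ̂s(𝒜) < 1/2, and it also serves as a sampled check of the NSP inequality.

```python
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(4)
m, n, p, s = 6, 6, 30, 1
A = rng.standard_normal((p, m * n))      # A(X) := A @ X.flatten()
N = null_space(A)                        # orthonormal basis of Null(A), shape (m*n, m*n - p)

best = 0.0
for _ in range(2000):
    X = (N @ rng.standard_normal(N.shape[1])).reshape(m, n)
    sigma = np.linalg.svd(X, compute_uv=False)
    best = max(best, sigma[:s].sum() / sigma.sum())   # ||X||_{s,*} / ||X||_*

# A sampled lower bound on gamma_hat_s(A); values below 1/2 are consistent with NSP.
print("sampled lower bound:", best)
```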

Next, we consider the connection between restricted isometry constants and the G-number of the linear transformation in LMR. It is well known that, for a nonsingular matrix (transformation) T ∈ ℝp×p, the RIP constants of 𝒜 and T𝒜 can be very different, as shown by Zhang [30] for the vector case. However, the s-goodness properties of 𝒜 and T𝒜 are always the same for a nonsingular transformation T ∈ ℝp×p (i.e., s-goodness properties enjoy scale invariance in this sense). Recall that the s-restricted isometry constant δs of a linear transformation 𝒜 is defined as the smallest constant such that the following holds for all s-rank matrices X ∈ ℝm×n:
()
In this case, we say 𝒜 possesses the RI(δs)-property (RIP), as in the CS context. For details, see [4, 31–34] and the references therein.
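
The constant δs involves a supremum over all s-rank matrices and is hard to compute; the sketch below (not from the paper) only probes it empirically by sampling random s-rank matrices and recording how far the ratio ∥𝒜X∥2/∥X∥F deviates from 1. It assumes the non-squared convention (1 − δs)∥X∥F ≤ ∥𝒜X∥2 ≤ (1 + δs)∥X∥F (conventions differ across the literature) and an assumed Gaussian 𝒜 normalized so that the ratio concentrates near 1; sampling yields at best a lower bound on δs.

```python
import numpy as np

rng = np.random.default_rng(5)
m, n, p, s = 6, 6, 30, 1
A = rng.standard_normal((p, m * n)) / np.sqrt(p)   # Gaussian map with E||A vec(X)||^2 = ||X||_F^2

ratios = []
for _ in range(5000):
    X = rng.standard_normal((m, s)) @ rng.standard_normal((s, n))   # random s-rank matrix
    ratios.append(np.linalg.norm(A @ X.flatten()) / np.linalg.norm(X))
ratios = np.array(ratios)

# max(1 - min, max - 1) is a sampled *lower* bound on delta_s under this convention.
print("empirical ratio range:", ratios.min(), ratios.max())
print("sampled lower bound on delta_s:", max(1 - ratios.min(), ratios.max() - 1))
```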

Proposition 10. Let 𝒜 : ℝm×n → ℝp be a linear transformation and s ∈ {0,1, 2, …, r}. For any nonsingular transformation T ∈ ℝp×p, γ̂s(T𝒜) = γ̂s(𝒜).

Proof. It follows from the nonsingularity of  T that {X : 𝒜X = 0} = {X : T𝒜X = 0}. Then, by the equivalent representation of the G-number in (59),

()
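
A quick numerical sanity check of this argument (illustrative only, with 𝒜 represented by a matrix acting on the vectorization of X): for nonsingular T, the null spaces of 𝒜 and T𝒜 coincide, so any quantity defined purely through Null(𝒜), such as the representation in (59), is unchanged.

```python
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(6)
m, n, p = 5, 5, 12
A = rng.standard_normal((p, m * n))   # A(X) := A @ vec(X)
T = rng.standard_normal((p, p))       # nonsingular with probability one
assert np.linalg.matrix_rank(T) == p

N1, N2 = null_space(A), null_space(T @ A)
# Equal subspaces <=> equal orthogonal projectors onto them.
print("Null(A) == Null(TA):", np.allclose(N1 @ N1.T, N2 @ N2.T))
```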

For the RIP constant δ2s, Oymak et al. [21] gave the current best bound, δ2s < 0.472, by proposing a general technique for translating results from SSR to LMR. Together with the above arguments, we immediately obtain the following theorem.

Theorem 11. Every linear transformation 𝒜 satisfying δ2s < 0.472 is s-good.

Proof. It follows from [21, Theorem  1], Proposition 9, and Theorems 7 and 8.

The above results show that s-goodness is a necessary and sufficient condition for recovering the low-rank solution exactly via nuclear norm minimization.

5. Conclusion

In this paper, we have shown that s-goodness of the linear transformation in LMR is a necessary and sufficient condition for exact s-rank matrix recovery via nuclear norm minimization, and that it is equivalent to the null space property. Our analysis is based on the two characteristic s-goodness constants, γs and γ̂s, and the variational property of the matrix norm in convex optimization. This shows that s-goodness is an elegant concept for low-rank matrix recovery, although γs and γ̂s may not be easy to compute. Development of efficiently computable bounds on these quantities is left to future work. Even though we develop and use techniques based on optimization, convex analysis, and geometry, we do not provide explicit analogues to the results of Donoho [35], where necessary and sufficient conditions for the vector recovery special case were derived based on the geometric notions of face preservation and neighborliness. The corresponding generalization to low-rank recovery is not known, the closest result currently being [22]. Moreover, it is also important to consider the semidefinite relaxation (SDR) for rank minimization with a positive semidefinite constraint, since the SDR convexifies nonconvex or discrete optimization problems by removing the rank-one constraint. Another future research topic is to extend the main results and the techniques in this paper to the SDR.

Acknowledgments

The authors thank two anonymous referees for their very useful comments. The work was supported in part by the National Basic Research Program of China (2010CB732501), the National Natural Science Foundation of China (11171018), a Discovery Grant from NSERC, a research grant from the University of Waterloo, ONR research Grant N00014-12-10049, and the Fundamental Research Funds for the Central Universities (2011JBM306, 2013JBM095).
