Volume 2012, Issue 1, Article ID 312985
Research Article
Open Access

Least Squares Problems with Absolute Quadratic Constraints

R. Schöne
Institute for Software Systems in Technical Applications of Computer Science (FORWISS), University of Passau, Innstraße 43, 94032 Passau, Germany

T. Hanning (Corresponding Author)
Department of Mathematics and Computer Science, University of Passau, Innstraße 43, 94032 Passau, Germany
First published: 15 September 2011
Academic Editor: Juan Manuel Peña

Abstract

This paper analyzes linear least squares problems with absolute quadratic constraints. We develop a generalized theory following Bookstein's conic fitting and Fitzgibbon's direct ellipse-specific fitting. Under simple preconditions, it can be shown that a minimum always exists and can be determined by a generalized eigenvalue problem. This problem is numerically reduced to an eigenvalue problem by multiplications with Givens rotations. Finally, four applications of this approach are presented.

1. Introduction

The least squares methods cover a wide range of applications in signal processing and system identification [1–5]. Many technical applications need robust and fast algorithms for fitting ellipses to given points in the plane. In the past, effective methods were Bookstein's conic fitting and Fitzgibbon's direct ellipse-specific fitting, in which an algebraic distance with a quadratic constraint is minimized [6, 7]. In this paper, we develop an extended theory of least squares minimization with a quadratic constraint based on the ideas of Bookstein and Fitzgibbon. We show the existence of a minimal solution and characterize it by the smallest positive generalized eigenvalue. In this way, arbitrary conic fitting problems with quadratic constraints can be treated.

Let A ∈ ℝ^{n×m} be a matrix with n ≥ m ≥ 2, let C ∈ ℝ^{m×m} be a symmetric matrix, and let d be a real value. We consider the problem of finding a vector x ∈ ℝ^m which minimizes the function F : ℝ^m → ℝ defined by

(1.1)  F(x) = ∥Ax∥² = x^t A^t A x,  subject to the side condition  x^t C x = d.

The side condition x^t C x = d introduces an absolute quadratic constraint. The problem (1.1) is not a special case of Gander's optimization as presented in [8], because in our case C is an arbitrary real symmetric matrix, whereas in the approach of Gander the side condition considers real symmetric matrices C^t C, which are positive definite. For our considerations, we require the following three assumptions.

Assumption 1.1. By replacing C with (−C) if necessary, we can restrict our considerations to d ≥ 0. For d = 0, the trivial solution x = 0 ∈ ℝ^m fulfills (1.1). Therefore, we demand d > 0.

Assumption 1.2. The set N := {x ∈ ℝ^m : x^t C x = d} is not empty, that is, the matrix C has at least one positive eigenvalue. If C had only nonpositive eigenvalues, it would be negative semidefinite and

(1.2)  x^t C x ≤ 0  for all x ∈ ℝ^m

would hold. With d > 0, it follows that the set N would be empty in this case.

Assumption 1.3. In the following, we set S = A^t A and assume that S is regular. S is sometimes called the scatter matrix.
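For illustration, problem (1.1) and Assumptions 1.1–1.3 can be set up numerically as in the following minimal NumPy sketch; the matrices A and C, the value d, and all variable names are illustrative placeholders and not part of the original method.

```python
# Minimal numerical setup of problem (1.1); A, C and d are placeholder data.
import numpy as np

rng = np.random.default_rng(0)

n, m = 20, 4
A = rng.standard_normal((n, m))      # data matrix with n >= m >= 2
C = np.diag([1.0, -1.0, 0.0, 0.0])   # symmetric constraint matrix
d = 1.0                              # constraint level, d > 0 (Assumption 1.1)

S = A.T @ A                          # scatter matrix (Assumption 1.3)

def F(x):
    """Objective of (1.1): F(x) = ||Ax||^2 = x^t S x."""
    return float(x @ S @ x)

# Assumption 1.2: C needs at least one positive eigenvalue, otherwise the
# feasible set {x : x^t C x = d} is empty for d > 0.
assert np.linalg.eigvalsh(C).max() > 0
# Assumption 1.3: S regular, i.e. A has full column rank.
assert np.linalg.matrix_rank(A) == m
```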

In the following two sections, we introduce the theoretical basics of this optimization. The main result is the solution of a generalized eigenvalue problem. Afterwards, we reduce this system numerically to an eigenvalue problem. In Section 5, we present four typical applications of conic fitting problems with quadratic constraints. These applications comprise the ellipse fitting of Fitzgibbon, the hyperbola fitting of O'Leary, the conic fitting of Bookstein, and an optical application of shrinked aspheres [6, 7, 9, 10].

2. Existence of a Minimal Solution

Theorem 2.1. If S ∈ ℝ^{m×m} is regular, then there exists a global minimum to the problem (1.1).

Proof. The real regular matrix S = A^t A is symmetric and positive definite. Therefore, a Cholesky decomposition S = R^t R exists with a regular upper triangular matrix R ∈ ℝ^{m×m}. In (1.1), we are looking for a solution x ∈ ℝ^m minimizing

(2.1)  F(x) = x^t S x = x^t R^t R x = ∥Rx∥²  subject to  x^t C x = d.

With R regular, we substitute x by R^{−1}y for y ∈ ℝ^m. Thus, we obtain a problem equivalent to (2.1), where we want to find a vector y ∈ ℝ^m minimizing

(2.2)  F(y) = ∥y∥²  subject to  y^t (R^{−1})^t C R^{−1} y = d.

Now, we define G : ℝ^m → ℝ with G(y) = y^t (R^{−1})^t C R^{−1} y − d and look for a solution y on the zero-set N_G of G with minimal distance to the point of origin. Let y0 ∈ N_G and let B denote the closed ball of ℝ^m around 0 with radius r0 = ∥y0∥. Because B is compact and G is continuous, the set

(2.3)  K := {y ∈ B : G(y) = 0}

is nonempty, closed, and bounded. Therefore, the continuous function F(y) = ∥y∥² attains a minimal value at some y_M ∈ K with

(2.4)  F(y_M) ≤ F(y)  for all y ∈ K.

For all y ∈ N_G∖K, it is ∥y∥ > r0 ≥ ∥y_M∥. So, y_M is a minimizer of F in N_G. By the equivalence of (2.1) and (2.2), the assertion follows.

3. Generalized Eigenvalue Problem

The minimization problem in (1.1) induces a generalized eigenvalue problem. The following theorem was already proved by Bookstein and Fitzgibbon for the special case of ellipse fitting [6, 7].

Theorem 3.1. If x_s is an extremum of F(x) subject to x^t C x = d, then a positive λ0 ∈ ℝ exists with

(3.1)  S x_s = λ0 C x_s,

that is, x_s is an eigenvector to the generalized eigenvalue λ0, and

(3.2)  F(x_s) = λ0 d

holds.

Proof. Let G : ℝ^m → ℝ be defined as G(x) := d − x^t C x. From G(x) = 0 and d > 0 it follows that x ≠ 0. Further, G is continuously differentiable with dG/dx = −2Cx ≠ 0 for all x in the zero-set of G. So, if x_s is a local extremum of F(x) subject to G(x) = 0, then rank((dG/dx)(x_s)) = 1. Since F is also a continuously differentiable function on ℝ^m with m > 1, it follows by using a Lagrange multiplier [11]: if x_s is a local extremum of F(x) subject to G(x) = 0, then a λ0 ∈ ℝ exists such that the Lagrange function ϕ : ℝ^{m+1} → ℝ given as

(3.3)  ϕ(x, λ) = F(x) + λ G(x) = x^t S x + λ (d − x^t C x)

has a critical point in (x_s, λ0). Therefore, x_s necessarily fulfills the equations:

(3.4)  S x − λ C x = 0,
(3.5)  x^t C x = d.

The first equation describes a generalized eigenvalue problem with

(3.6)  S x = λ C x.

With d > 0, x_s ≠ 0, and x_s fulfilling (3.6), λ must be a generalized eigenvalue and x_s a corresponding eigenvector to λ of (3.6), so that (S − λC) is a singular matrix. If λ0 is an eigenvalue and x0 ≠ 0 a corresponding eigenvector to λ0 of (3.6), then every vector αx0 is also a solution of (3.6) for λ0. Now, we are looking for α such that x_s = αx0 satisfies (3.5). For λ0 ≠ 0 and with (3.4), it follows that

(3.7)  α² = d / (x0^t C x0) = λ0 d / (x0^t S x0).

Because the left side and the numerator are positive, the denominator x0^t C x0 must also be chosen positive; with x0^t S x0 = λ0 x0^t C x0 > 0 it follows that only positive eigenvalues solve (3.4) and (3.5). By multiplication with λ0,

(3.8)  F(x_s) = x_s^t S x_s = α² x0^t S x0 = λ0 d

follows, and x_s = α·x0 fulfills the constraint G(x_s) = 0.

Remark 3.2. Let x0 be a generalized eigenvector to a positive eigenvalue λ0 of problem (3.6). Then

(3.9)  x_s = ±√(d / (x0^t C x0)) · x0

are solutions of (3.8).
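As a numerical illustration of Theorem 3.1 and Remark 3.2 (a sketch only, reusing A, C, S, d, and F from the setup after Assumption 1.3), every generalized eigenvector to a positive eigenvalue can be scaled to the constraint, and the scaled vector attains the objective value λ0·d of (3.2):

```python
# Sketch of Theorem 3.1 / Remark 3.2: scale generalized eigenvectors of (3.1)
# to the constraint x^t C x = d and check equation (3.2).
from scipy.linalg import eig

lam, X = eig(S, C)                         # generalized eigenpairs of S x = lambda C x
for lam_k, x0 in zip(lam, X.T):
    if not np.isfinite(lam_k) or lam_k.real <= 0:
        continue                           # skip infinite and non-positive eigenvalues
    lam0, x0 = lam_k.real, x0.real         # eigenvalues are real by Lemma 3.3
    alpha = np.sqrt(d / (x0 @ C @ x0))     # denominator positive, see (3.7)
    xs = alpha * x0                        # -xs is the second solution of (3.9)
    assert np.isclose(xs @ C @ xs, d)      # constraint (3.5)
    assert np.isclose(F(xs), lam0 * d)     # equation (3.2)
```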

Lemma 3.3. If S is regular and C is symmetric, then all eigenvalues of (3.1) are real-valued and different from zero.

Proof. With det(S) ≠ 0, we have λ0 ≠ 0 in (3.1). The Cholesky decomposition S = R^t R with a regular upper triangular matrix R turns (3.1) into

(3.10)  R^t R x_s = λ0 C x_s.

With R invertible and the substitutions μ0 = 1/λ0 and y_s = R x_s, we obtain an eigenvalue problem for the matrix (R^t)^{−1} C R^{−1}:

(3.11)  (R^t)^{−1} C R^{−1} y_s = μ0 y_s.

Furthermore, we have

(3.12)  ((R^t)^{−1} C R^{−1})^t = (R^{−1})^t C^t ((R^t)^{−1})^t = (R^t)^{−1} C R^{−1}.

Therefore, the matrix (R^t)^{−1} C R^{−1} is symmetric and all eigenvalues μ0 are real. With λ0 = 1/μ0 for μ0 ≠ 0, the proposition follows.

Remark 3.4. Because S is regular and λ ≠ 0 in

(3.13)  S x = λ C x,

we can consider, instead of (3.11), the equivalent problem with μ = 1/λ:

(3.14)  S^{−1} C x = μ x.

This system is called the inverse eigenvalue problem to (3.1). Here, the eigenspaces to the generalized eigenvalue λ0 in (3.1) and to the eigenvalue 1/λ0 in (3.14) are identical. Therefore, the generalized eigenvectors in (3.1) are perpendicular.
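The reduction behind Lemma 3.3 and Remark 3.4 can be sketched numerically as follows (again reusing S and C from the setup in Section 1); the Cholesky factor of S turns (3.1) into the symmetric eigenvalue problem (3.11), whose nonzero eigenvalues are the reciprocals of the generalized eigenvalues:

```python
# Sketch of (3.10)-(3.14): symmetric eigenvalue problem via the Cholesky factor of S.
import numpy as np
from scipy.linalg import cholesky, eigh, solve_triangular

R = cholesky(S)                                   # upper triangular with S = R^t R
Rinv = solve_triangular(R, np.eye(S.shape[0]))    # R^{-1}
M = Rinv.T @ C @ Rinv                             # symmetric matrix of (3.11)
mu, Y = eigh(M)                                   # real eigenvalues mu (Lemma 3.3)

nonzero = np.abs(mu) > 1e-12                      # drop the zero eigenvalues of Remark 3.6
lam = 1.0 / mu[nonzero]                           # generalized eigenvalues of (3.1)
X = Rinv @ Y[:, nonzero]                          # generalized eigenvectors, x = R^{-1} y
```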

Definition 3.5. The set of all eigenvalues of a matrix C is called the spectrum σ(C). We also call the set of all eigenvalues of the generalized eigenvalue problem in (3.1) a spectrum and denote it by σ(S, C). σ+(S, C) is defined as the set of all positive values in σ(S, C).

Remark 3.6. In the case of rg(C) < m = rg(S), the inverse problem (3.14) has the eigenvalue 0 with multiplicity rg(S) − rg(C). Apart from that, for μ ≠ 0 with μ ∈ σ(S^{−1}C), it follows that 1/μ ∈ σ(S, C). Analogously, for μ ∈ σ((R^t)^{−1}CR^{−1}) with μ ≠ 0, 1/μ ∈ σ(S, C).

The following lemma is a modified result of Fitzgibbon [7].

Lemma 3.7. The signs of the generalized eigenvalues of (3.1) are the same as those of C.

Proof. With S being nonsingular, every generalized eigenvalue λ0 of (3.1) is not zero. Therefore, it follows for the equivalent problem (3.11) that μ0 = 1/λ0 is an eigenvalue of (R^{−1})^t C R^{−1} with the same sign as λ0, where R is the upper triangular matrix of the Cholesky decomposition of S. By Sylvester's Inertia Law, we know that the signs of the eigenvalues of the matrix (R^{−1})^t C R^{−1} are the same as those of C.

For the following proofs, we need the lemma of Lagrange (see, e.g., [12]).

Lemma 3.8 (Lemma of Lagrange). For M ⊆ ℝ^n, f : M → ℝ, g = (g1, …, gk) : M → ℝ^k, and N_g = {x ∈ M : g(x) = 0 ∈ ℝ^k}, let λ ∈ ℝ^k be such that x_s ∈ N_g is a minimizer of the function Φ_λ : M → ℝ with

(3.15)  Φ_λ(x) = f(x) + λ^t g(x).

Then x_s is a minimal solution of f in N_g.

Definition 3.9. Let λ* be the smallest positive value of σ+(S, C) and x* a corresponding generalized eigenvector to λ* scaled to the constraint (x*)^t C x* = d.

Lemma 3.10. Let S − λC be a positive semidefinite matrix for a λ ∈ σ+(S, C). Then a generalized eigenvector x_s corresponding to λ with x_s^t C x_s = d is a local minimum of (1.1).

Proof. We consider Φ : ℝ^m → ℝ with

(3.16)  Φ(x) = F(x) + λ(d − x^t C x) = x^t S x + λ(d − x^t C x).

With grad_x Φ(x) = 2(Sx − λCx), it holds that grad_x Φ(x_s) = 0, and since the Hessian matrix 2(S − λC) of Φ is positive semidefinite, the vector x_s is a minimal solution of Φ. Then x_s also minimizes F(x) subject to x^t C x = d by Lemma 3.8.

Remark 3.11. In fact, in Lemma 3.10, we require only a semidefinite matrix, because x_s ≠ 0 fulfills (S − λC)x_s = 0 in (3.1).

Lemma 3.12. The matrix (S − λ*C) is positive semidefinite.

Proof. Let μ be an arbitrary eigenvalue of ((λ*)^{−1} I − (R^t)^{−1} C R^{−1}), where R is the upper triangular matrix of the Cholesky decomposition of S. With

(3.17)  det((λ*)^{−1} I − (R^t)^{−1} C R^{−1} − μ I) = det(((λ*)^{−1} − μ) I − (R^t)^{−1} C R^{−1}) = 0,

it follows that ((λ*)^{−1} − μ) is an eigenvalue of (R^t)^{−1} C R^{−1}. By (3.11), such an eigenvalue corresponds to the inverse of a generalized eigenvalue of problem (3.1). Since λ* is the smallest positive value in σ(S, C), the value (λ*)^{−1} is the largest eigenvalue of (R^t)^{−1} C R^{−1}. Furthermore, it yields

(3.18)  (λ*)^{−1} − μ ≤ max σ((R^t)^{−1} C R^{−1}) = (λ*)^{−1}.

So μ ≥ 0 follows, that is, ((λ*)^{−1} I − (R^t)^{−1} C R^{−1}) is positive semidefinite, and for y ∈ ℝ^m we obtain

(3.19)  y^t ((λ*)^{−1} I − (R^t)^{−1} C R^{−1}) y ≥ 0.

By setting y = Rx with regular R, we get

(3.20)  (λ*)^{−1} x^t S x − x^t C x ≥ 0.

With λ* > 0, it follows that x^t (S − λ*C) x ≥ 0 for all x ∈ ℝ^m, that is, S − λ*C is positive semidefinite.

Theorem 3.13. For the smallest value λ* in σ+(S, C), there exists a corresponding generalized eigenvector x* which minimizes F(x) subject to x^t C x = d.

Proof. The matrix (S − λ*C) is positive semidefinite by Lemma 3.12, and with Lemma 3.10 it follows that x* is a local minimum of problem (1.1). Furthermore, we know by Theorem 3.1 that, if x_s is a local extremum of F(x) subject to x^t C x = d, then a positive value λ_s exists with

(3.21)  S x_s = λ_s C x_s,

and it is F(x_s) = λ_s d. Because of the existence of a minimum x_E by Theorem 2.1, a value λ_E ∈ σ+(S, C) exists in problem (3.21) corresponding to x_E. On the other hand, for an arbitrary local minimum x_s,

(3.22)  F(x_E) = λ_E d ≤ F(x_s) = λ_s d

holds; in particular, λ_E ≤ λ*. Since λ* is the smallest value in σ+(S, C), λ* = λ_E follows, and x* is a minimum of F(x) subject to x^t C x = d.
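The statements above can be combined into a short numerical minimizer for problem (1.1). The following function is a sketch under Assumptions 1.1–1.3 (its name and structure are ours, not the paper's reference implementation); it uses SciPy's dense generalized eigensolver instead of the reduction presented in Section 4:

```python
# Sketch of Theorem 3.13: minimize ||Ax||^2 subject to x^t C x = d by taking the
# smallest positive generalized eigenvalue of S x = lambda C x.
import numpy as np
from scipy.linalg import eig

def min_lsq_quadratic_constraint(A, C, d):
    """Return (lambda_star, x_star) with F(x_star) = lambda_star * d, see (3.2)."""
    S = A.T @ A
    lam, X = eig(S, C)
    best = None
    for lam_k, x0 in zip(lam, X.T):
        if not np.isfinite(lam_k) or lam_k.real <= 0:
            continue
        if best is None or lam_k.real < best[0]:
            best = (lam_k.real, x0.real)
    if best is None:
        raise ValueError("sigma+(S, C) is empty; check Assumption 1.2")
    lam_star, x0 = best
    x_star = np.sqrt(d / (x0 @ C @ x0)) * x0   # scale to the constraint, cf. (3.9)
    return lam_star, x_star
```

For the data of the setup sketch in Section 1, min_lsq_quadratic_constraint(A, C, d) returns the smallest value of σ+(S, C) together with a feasible minimizer of (1.1).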

Example 3.14. We minimize F : ℝ² → ℝ with F(x) = ξ1² + ξ2² subject to ξ1² − ξ2² = 1. So, we have d = 1, S the identity matrix I2 ∈ ℝ^{2×2}, and C ∈ ℝ^{2×2} the diagonal matrix with the entries 1 and −1. Then, we get the following generalized eigenvalue problem:

(3.23)  I2 x = λ diag(1, −1) x

with the eigenvalues 1 and −1. Because of Theorem 3.1, we consider a generalized eigenvector (α, 0)^t with α ∈ ℝ∖{0} only for λ = 1. Then, (1, 0)^t and (−1, 0)^t are solutions subject to ξ1² − ξ2² = 1. This result conforms to the geometric interpretation, since we are looking for the point x = (ξ1, ξ2)^t on the hyperbola ξ1² − ξ2² = 1 with minimal distance to the origin.
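Using the sketch function given after Theorem 3.13, the example can be verified numerically (A is chosen such that A^t A equals the identity matrix):

```python
# Numerical check of Example 3.14: minimize xi1^2 + xi2^2 on the hyperbola xi1^2 - xi2^2 = 1.
import numpy as np

A = np.eye(2)                     # so that S = A^t A = I2
C = np.diag([1.0, -1.0])
lam_star, x_star = min_lsq_quadratic_constraint(A, C, d=1.0)
print(lam_star, x_star)           # 1.0 and (1, 0)^t or (-1, 0)^t
```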

4. Reduction to an Eigenvalue Problem of Dimension rg(C)

In numerical applications, a generalized eigenvalue problem is usually reduced to an eigenvalue problem, for example, by multiplication with S^{−1}. Thus, we obtain the inverse problem (3.14) from (3.1) (see, e.g., [13]). But S may be ill-conditioned, so that a solution of (3.14) may be numerically unstable. Therefore, we present another reduction of (3.1).

Many times, C is a sparse matrix with r := rank(C) ≤ rank(S). The symmetric matrix C is diagonalizable as C = P^t D P with P orthogonal and D diagonal. Further, we assume that the first r diagonal entries of D are different from 0. For the characteristic polynomial of (3.1), it follows that

(4.1)  p(λ) = det(S − λC) = det(P (S − λC) P^t) = det(PSP^t − λD).

The order of p is r. We decompose these matrices, in block notation, into

(4.2)  PSP^t = [ S1, S2 ; S2^t, S3 ],   D = [ D1, 0 ; 0, 0 ],

with S1, D1 ∈ ℝ^{r×r}, S2 ∈ ℝ^{r×(m−r)}, and S3 ∈ ℝ^{(m−r)×(m−r)}. Now, we eliminate the block S2 in PSP^t by multiplications with Givens rotations Gk ∈ ℝ^{m×m}, k = 1, …, l, so that it follows:

(4.3)  G_l ⋯ G_1 · PSP^t = [ Σ1, 0 ; Σ2, Σ3 ],   G_l ⋯ G_1 · D = [ Δ1, 0 ; Δ2, 0 ],

with Σ1, Δ1 ∈ ℝ^{r×r}, Σ2, Δ2 ∈ ℝ^{(m−r)×r}, and Σ3 ∈ ℝ^{(m−r)×(m−r)}. In (4.1), we achieve with the orthogonal Gk, k = 1, …, l,

(4.4)  p(λ) = det(G_l ⋯ G_1 (PSP^t − λD)) = det(Σ1 − λΔ1) · det(Σ3).

Because of p(0) = det(PSP^t) = det(S) ≠ 0 and p(0) = det(Σ1) det(Σ3), the submatrices Σ1 and Σ3 are regular, and the generalized eigenvalues determined by det(Σ1 − λΔ1) = 0 are different from zero. Moreover, since the order of p is r, the matrix Δ1 is regular. So, with y ∈ ℝ^r, the generalized eigenvalue problem

(4.5)  Σ1 y = λ Δ1 y

can be transformed to the equivalent eigenvalue problem

(4.6)  Δ1^{−1} Σ1 y = λ y.

This system can be solved by finding the matrix X with Δ1 X = Σ1 using Gaussian elimination and determining the eigenvalues of X with the QR algorithm [13]. Because all steps are equivalent, we have σ(S, C) = σ(Σ1, Δ1), that is, the eigenvalues of (3.1) and (4.6) are the same.
With Theorem 3.13, we are looking for the smallest value λ* ∈ σ+(S, C) and a corresponding generalized eigenvector x* of (3.1) in order to minimize the problem (1.1). So,

(4.7)  S x* = λ* C x*

yields. By the substitution z := P x* and multiplication with G_l ⋯ G_1 · P from the left, we obtain

(4.8)  [ Σ1, 0 ; Σ2, Σ3 ] z = λ* [ Δ1, 0 ; Δ2, 0 ] z.

We decompose z into the subvectors z1 ∈ ℝ^r and z2 ∈ ℝ^{m−r} with z = (z1^t, z2^t)^t. Then, z1 is a generalized eigenvector for λ* of the problems (4.5) and (4.6).
Let z1 be an eigenvector to the smallest positive eigenvalue λ* of (4.6). Since Σ3 is regular, it follows from (4.8) that

(4.9)  z2 = Σ3^{−1} (λ* Δ2 − Σ2) z1,

and a generalized eigenvector for λ* in (3.1) is given as

(4.10)  x* = P^t (z1^t, z2^t)^t = P^t ( z1 ; Σ3^{−1} (λ* Δ2 − Σ2) z1 ).
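The following sketch mirrors this reduction numerically (reusing S, C, and d from the setup in Section 1). Instead of explicit Givens rotations, the elimination of the block S2 is carried out by an equivalent orthogonal transformation built from a QR factorization and a row permutation; this substitution is ours, but the block structure of (4.3) is the same.

```python
# Sketch of Section 4: reduce S x = lambda C x to an eigenvalue problem of dimension rg(C).
import numpy as np
from scipy.linalg import eig, qr, solve

m = S.shape[0]
w, Q = np.linalg.eigh(C)                 # C = Q diag(w) Q^t
order = np.argsort(-np.abs(w))           # nonzero eigenvalues of C first
w, Q = w[order], Q[:, order]
r = int(np.sum(np.abs(w) > 1e-12))       # r = rg(C)
P = Q.T                                  # C = P^t D P with D = diag(w)

T, Dm = P @ S @ P.T, np.diag(w)

# Orthogonal elimination of the upper-right r x (m-r) block of P S P^t,
# replacing the Givens rotations G_1, ..., G_l of (4.3).
Qb, _ = qr(T[:, r:])                     # full QR of the last m - r columns
perm = np.r_[np.arange(m - r, m), np.arange(m - r)]
G = Qb.T[perm, :]                        # orthogonal, G T = [[Sigma1, 0], [Sigma2, Sigma3]]

GT, GD = G @ T, G @ Dm
Sigma1, Sigma2, Sigma3 = GT[:r, :r], GT[r:, :r], GT[r:, r:]
Delta1, Delta2 = GD[:r, :r], GD[r:, :r]

# r x r eigenvalue problem (4.6) and the smallest positive eigenvalue lambda*.
lam_r, Z = eig(solve(Delta1, Sigma1))
pos = [k for k in range(r) if np.isfinite(lam_r[k]) and lam_r[k].real > 0]
k = min(pos, key=lambda j: lam_r[j].real)
lam_star, z1 = lam_r[k].real, Z[:, k].real

# Back substitution (4.9), reconstruction (4.10), and scaling to the constraint.
z2 = solve(Sigma3, (lam_star * Delta2 - Sigma2) @ z1)
x_star = P.T @ np.concatenate([z1, z2])
x_star *= np.sqrt(d / (x_star @ C @ x_star))
```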

5. Applications in Conic Fitting

5.1. Fitzgibbon′s Ellipse Fitting

First, we would like to find an ellipse for a given set of points in ℝ². Generally, a conic in ℝ² is implicitly defined as the zero set of f : ℝ^6 × ℝ² → ℝ for a constant parameter a = (α1, …, α6)^t ∈ ℝ^6:

(5.1)  f(a, ξ, η) = α1 ξ² + α2 ξη + α3 η² + α4 ξ + α5 η + α6.

The equation f(a, ξ, η) = 0 can also be written with x = (ξ, η)^t as

(5.2)  f(a, x) = x^t A x + b^t x + α6 = 0,  with  A = [ α1, α2/2 ; α2/2, α3 ]  and  b = (α4, α5)^t.

The eigenvalues λ1, λ2 of A characterize a conic uniquely [14]. Thus, for ellipses we need λ1 λ2 > 0, that is, 4 α1 α3 − α2² > 0, in f(a, x) = 0. Furthermore, every scaled vector μa with μ ∈ ℝ∖{0} describes the same zero-set of f. So, we can impose the constraint 4 α1 α3 − α2² = 1 for ellipses. For n (n ≥ 6) given points (ξi, ηi)^t ∈ ℝ², we want to find a parameter a ∈ ℝ^6 which minimizes F : ℝ^6 → ℝ with

(5.3)  F(a) = ∑_{i=1}^{n} f(a, ξi, ηi)².
This ellipse fitting problem was established and solved by Fitzgibbon [7]. With the following matrices D ∈ ℝ^{n×6} and C ∈ ℝ^{6×6},

(5.4)  D = ( ξi², ξi ηi, ηi², ξi, ηi, 1 )_{i=1,…,n},   C with c13 = c31 = 2, c22 = −1, and all other entries 0,

and F(a) = ∥Da∥², we achieve the equivalent problem:

(5.5)  minimize ∥Da∥² subject to a^t C a = 4 α1 α3 − α2² = 1.

For S = D^t D, we have a special case of (1.1). Assuming S is a regular matrix and since the eigenvalues of C are −2, −1, 0, and 2, by Lemma 3.3 and Lemma 3.7 we know that the generalized eigenvalue problem

(5.6)  S a = λ C a

has exactly one positive solution λ*. Because of Theorem 3.13, a corresponding generalized eigenvector a* to λ* minimizes the problem (5.5), and a* consists of the coefficients of an implicitly given ellipse.

A numerically stable noniterative algorithm to solve this optimization problem is presented by Halir and Flusser [15]. In comparison with Section 4, their method uses a special block decomposition of the matrices D and C.
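A minimal sketch of the ellipse-specific fit in Python follows (the synthetic noisy ellipse points are chosen only for illustration); the generalized eigenvalue problem (5.6) is solved directly with SciPy rather than with the refined block scheme of Halir and Flusser:

```python
# Sketch of Fitzgibbon's ellipse fit (Section 5.1) on synthetic data.
import numpy as np
from scipy.linalg import eig

rng = np.random.default_rng(1)
t = np.linspace(0.0, 2.0 * np.pi, 50)
xi  = 3.0 * np.cos(t) + 1.0 + 0.05 * rng.standard_normal(t.size)
eta = 2.0 * np.sin(t) - 0.5 + 0.05 * rng.standard_normal(t.size)

D = np.column_stack([xi**2, xi * eta, eta**2, xi, eta, np.ones_like(xi)])
C = np.zeros((6, 6))
C[0, 2] = C[2, 0] = 2.0
C[1, 1] = -1.0                                     # a^t C a = 4 a1 a3 - a2^2

S = D.T @ D
lam, V = eig(S, C)
pos = [j for j in range(6) if np.isfinite(lam[j]) and lam[j].real > 0]
k = min(pos, key=lambda j: lam[j].real)            # the single positive eigenvalue
a = V[:, k].real
a /= np.sqrt(a @ C @ a)                            # enforce 4 a1 a3 - a2^2 = 1
print("ellipse coefficients:", a)
```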

5.2. Hyperbola Fitting

Instead of ellipses, O'Leary and Zsombor-Murray want to find a hyperbola for a set of scattered data xi ∈ ℝ² [9]. A hyperbola is a conic which can be uniquely characterized by λ1 λ2 < 0, that is, α2² − 4 α1 α3 > 0 [14]. So, we consider the constraint α2² − 4 α1 α3 = 1 and obtain the optimization problem:

(5.7)  minimize ∥Da∥² subject to a^t (−C) a = α2² − 4 α1 α3 = 1,

with D and C chosen as in Section 5.1. The matrix (−C) has two positive eigenvalues. In this case, a solution is given by a generalized eigenvector to the smallest value in σ+(S, −C). But O'Leary and Zsombor-Murray determine the best hyperbolic fit by evaluating the eigenvectors ai = (a_{i,1}, …, a_{i,6})^t associated with the positive values of σ+(S, −C).
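A hyperbola-specific sketch along the same lines (again with synthetic sample points and a direct SciPy eigensolver); only the sign of the constraint matrix changes compared with the ellipse sketch:

```python
# Sketch of the hyperbola fit (Section 5.2) with the constraint a2^2 - 4 a1 a3 = 1.
import numpy as np
from scipy.linalg import eig

rng = np.random.default_rng(2)
s = np.linspace(-1.0, 1.0, 40)
xi  = 2.0 * np.cosh(s) + 0.02 * rng.standard_normal(s.size)   # one hyperbola branch
eta = 1.5 * np.sinh(s) + 0.02 * rng.standard_normal(s.size)

D = np.column_stack([xi**2, xi * eta, eta**2, xi, eta, np.ones_like(xi)])
C = np.zeros((6, 6))
C[0, 2] = C[2, 0] = 2.0
C[1, 1] = -1.0

S = D.T @ D
lam, V = eig(S, -C)                                # constraint matrix -C
pos = [j for j in range(6) if np.isfinite(lam[j]) and lam[j].real > 0]
k = min(pos, key=lambda j: lam[j].real)            # smallest value in sigma+(S, -C)
a = V[:, k].real
a /= np.sqrt(a @ (-C) @ a)                         # enforce a2^2 - 4 a1 a3 = 1
print("hyperbola coefficients:", a)
```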

5.3. Bookstein′s Conic Fitting

In Bookstein's method, the conic constraint is restricted to

(5.8)  λ1² + λ2² = 1,

where λ1, λ2 are the eigenvalues of A in f [6]. There, λ1 and λ2 are real and at least one of them is different from 0. But the constraint (5.8) is not a restriction to a class of conics. Here, we determine an arbitrary conic which minimizes

(5.9)  F(a) = ∑_{i=1}^{n} f(a, ξi, ηi)²  subject to (5.8).

The resulting data matrix D ∈ ℝ^{n×6} is the same as for Fitzgibbon's problem. The constraint matrix C ∈ ℝ^{6×6} has a diagonal shape with the entries (2, 1, 2, 0, 0, 0), that is, all eigenvalues of C are nonnegative. In the case of a regular matrix S, the problem (5.9) is solved by a generalized eigenvector to the smallest value in σ+(S, C).
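For Bookstein's constraint, only the constraint matrix changes; a short sketch (reusing D, S, and the imports from the preceding sketches) reads:

```python
# Sketch of Bookstein's conic fit (Section 5.3) with C_B = diag(2, 1, 2, 0, 0, 0).
C_B = np.diag([2.0, 1.0, 2.0, 0.0, 0.0, 0.0])
lam_b, V_b = eig(S, C_B)
pos = [j for j in range(6) if np.isfinite(lam_b[j]) and lam_b[j].real > 0]
k = min(pos, key=lambda j: lam_b[j].real)          # smallest value in sigma+(S, C_B)
a_b = V_b[:, k].real
a_b /= np.sqrt(a_b @ C_B @ a_b)                    # any positive d yields the same conic
print("Bookstein conic coefficients:", a_b)
```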

5.4. Approximation of Shrinked Aspheres

After the molding process in optical applications, the shrinkage of rotation-symmetric aspheres is implicitly defined for x = (ξ, ζ)^t in
(5.10)
where r ∈ ℝ∖{0} and a = (a1, …, a4)^t are aspheric-specific constants [10]. For i = 1, …, n with n ≥ 4, the scattered data xi = (ξi, ζi)^t ∈ ℝ² of a shrinked asphere are given in this approximation problem. Here, we are looking for the conic parameter a = (α1, …, α4)^t for a fixed value rref, which minimizes
(5.11)
Analogously to Fitzgibbon, we have the matrices D ∈ ℝ^{n×4} and C ∈ ℝ^{4×4} with
(5.12)
and with F(a) = ∥Da∥², we get the following optimization problem:
(5.13)
This is also an application of (1.1). The matrix C has the eigenvalues −2, 0, 1, and 2. So, the generalized eigenvalue problem in (3.1) with regular S = D^t D ∈ ℝ^{4×4} has two positive values in σ+(S, C). With Theorem 3.13, a generalized eigenvector a* ∈ ℝ^4 to the smaller of both values solves (5.13).

The coefficients αi in the problems (5.5) and (5.13) do not correspond to the same monomials ξ^k ζ^l. Hence, we have different matrices D and C.

6. Conclusion

In this paper, we present a minimization problem of least squares subject to absolute quadratic constraints. We develop a self-contained theory with the main result that a minimum is a solution of a generalized eigenvalue problem corresponding to the smallest positive eigenvalue. Further, we show a reduction to an eigensystem for numerical calculations. Finally, we study four applications in conic approximation. We analyze Fitzgibbon's method for direct ellipse-specific fitting, O'Leary's direct hyperbola approximation, Bookstein's conic fitting, and an optical application of shrinked aspheres. All these systems can be attributed to the general optimization problem (1.1).
