The QP algorithm

The material of this section is based on the following reference: [Fle87].
We want to find the solution of:

$\displaystyle \min_{x} g^t x + \frac{1}{2}
 x^t H x \; \;$    subject to $\displaystyle \; \; A x \geq b$ (9.1)

where $ A \in \Re^{m \times n}$ and $ b \in \Re^m$.
We will assume in this chapter that $ H$ is positive definite, so that the solution is unique. We will first see how to handle the following simpler problem:

$\displaystyle \min_{x} g^t x + \frac{1}{2}
 x^t H x \; \;$    subject to $\displaystyle \; \; A x = b$ (9.2)


Equality constraints

Let $ Y$ and $ Z$ be $ n \times m$ and $ n \times (n-m)$ matrices respectively such that $ [Y:Z]$ is non-singular, and in addition let $ A Y=I$ and $ A Z=0$. The solution of the equation $ Ax=b$ is given by:

$\displaystyle x=Y b + Z y$ (9.3)

where $ y$ can be any vector. Indeed, we have: $ A x = A (Y b+ Z y)= A Y b+
A Z y= b$. The matrix $ Z$ has linearly independent columns $ z_1,
\ldots, z_{n-m}$ which lie in the null space of $ A$ and therefore act as basis vectors (or reduced coordinate directions) for the null space. At any point $ x$, any feasible correction $ \delta$ can be written as:

$\displaystyle \delta =Z y = \sum_{i=1}^{n-m} z_i
 y_i$ (9.4)

where $ y_1,\ldots,y_{n-m}$ are the components (or reduced variables) in each reduced coordinate direction (see Figure 9.1).
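The parameterization $ x = Yb + Zy$ can be checked numerically. The following is a minimal sketch assuming NumPy; the matrix values are illustrative, and $ Y$, $ Z$ are built from a QR factorization of $ A^t$ (the choice made explicit later in this section):

```python
import numpy as np

# Illustrative data: m = 2 constraints in n = 3 unknowns, A of full row rank.
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0]])
b = np.array([3.0, 1.0])
m, n = A.shape

# Build Y and Z from a QR factorization of A^t:
# A Y = I and A Z = 0 then hold by construction.
Q, R = np.linalg.qr(A.T, mode='complete')
Y = Q[:, :m] @ np.linalg.inv(R[:m, :].T)
Z = Q[:, m:]

assert np.allclose(A @ Y, np.eye(m))
assert np.allclose(A @ Z, np.zeros((m, n - m)))

# x = Y b + Z y solves A x = b for ANY reduced vector y.
y = np.array([0.7])
x = Y @ b + Z @ y
assert np.allclose(A @ x, b)
```

Whatever value of $ y$ is chosen, the resulting $ x$ stays on the constraint surface; $ y$ only moves $ x$ along the reduced coordinate directions $ z_i$.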

Figure 9.1: The null-space of A
\begin{figure}
\centering\epsfig{figure=figures/reducedspace.eps, width=9cm,
height=6.5cm}
\end{figure}

Substituting equation 9.3 into equation 9.2, we obtain the reduced quadratic function:

$\displaystyle \psi(y)= \frac{1}{2} y^t Z^t H Z y
 + (g+ H Y b)^t Z y + (g+ \frac{1}{2} H Y b)^t Y b$ (9.5)

If $ Z^t H Z$ is positive definite then a minimizer $ y^*$ exists which solves the linear system:

$\displaystyle (Z^t H Z) y = -Z^t (g+H Y b)$ (9.6)

The solution $ y^*$ is computed using a Cholesky factorization. Once $ y^*$ is known, we can compute $ x^*$ using equation 9.3 and $ g^*$ using the secant equation 13.29: $ g^*=H x^* +
g $. Let's recall equation 13.19:

$\displaystyle g(x) = \sum_{j\in
 E} \lambda_j \nabla c_j(x)$ (9.7)

Let's now compute $ \lambda^*$: From equation 9.7 we have:

$\displaystyle g^* = A^t \lambda^*$ (9.8)

Pre-multiplying 9.8 by $ Y^t$ and using $ Y^t A^t=I$, we obtain:

$\displaystyle Y^t g^* = \lambda^*$ (9.9)

Depending on the choice of $ Y$ and $ Z$, a number of methods exist. We will obtain $ Y$ and $ Z$ from a QR factorization of the matrix $ A^t$ (see the annex for more information about the QR factorization of a matrix). This can be written:

$\displaystyle A^t= Q \left[
 \begin{array}{c} R
 \\  0 \end{array} \right] = \left[ Q_1 \; Q_2 \right] \left[
 \begin{array}{c} R \\  0
 \end{array} \right] = Q_1 R$ (9.10)

where $ Q \in \Re^{n \times n}$ is an orthogonal matrix, $ R
\in \Re^{m \times m}$ is an upper triangular matrix, $ Q_1 \in
\Re^{n \times m}$ and $ Q_2 \in \Re^{n \times (n-m)}$. The choices

$\displaystyle Y=Q_1 (R^t)^{-1}, \; \; \; \; Z=Q_2$ (9.11)

have the correct properties. Moreover the vector $ Y b$ in equation 9.3 and figure 9.2 is orthogonal to the constraints. The reduced coordinate directions $ z_i$ are also mutually orthogonal. $ Y b$ is calculated by forward substitution in $ R^t u=b$ followed by forming $ Y b=Q_1 u$. The multipliers $ \lambda^*$ are calculated by backward substitution in $ R \lambda^* = Q_1^t g^*$.
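The whole equality-constrained solve described above (equations 9.3, 9.6, 9.9 and 9.11) can be sketched in NumPy. This is a minimal illustration with illustrative data, not the actual implementation:

```python
import numpy as np

def solve_equality_qp(H, g, A, b):
    """Solve min g^t x + 0.5 x^t H x  s.t.  A x = b  via the QR-based
    null-space method (equations 9.3, 9.6 and 9.9)."""
    m, n = A.shape
    Q, R = np.linalg.qr(A.T, mode='complete')
    Q1, Q2 = Q[:, :m], Q[:, m:]
    R = R[:m, :]

    # Y b: forward substitution in R^t u = b, then Y b = Q1 u.
    u = np.linalg.solve(R.T, b)
    Yb = Q1 @ u

    # Reduced system (Z^t H Z) y = -Z^t (g + H Y b), solved by Cholesky.
    Z = Q2
    L = np.linalg.cholesky(Z.T @ H @ Z)
    rhs = -Z.T @ (g + H @ Yb)
    y = np.linalg.solve(L.T, np.linalg.solve(L, rhs))

    x = Yb + Z @ y                  # equation 9.3
    gstar = H @ x + g               # gradient at the solution
    # Multipliers: backward substitution in R lambda = Q1^t g* (eq. 9.9).
    lam = np.linalg.solve(R, Q1.T @ gstar)
    return x, lam

# Illustrative problem: one equality constraint in three unknowns.
H = np.diag([2.0, 4.0, 2.0])
g = np.array([-1.0, 0.0, 1.0])
A = np.array([[1.0, 1.0, 1.0]])
b = np.array([1.0])
x, lam = solve_equality_qp(H, g, A, b)
assert np.allclose(A @ x, b)               # feasibility
assert np.allclose(H @ x + g, A.T @ lam)   # stationarity g* = A^t lambda*
```

The two assertions check exactly the properties derived above: the computed $ x^*$ is feasible, and its gradient satisfies $ g^* = A^t \lambda^*$ (equation 9.8).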

Figure 9.2: A search in the reduced space of the active constraints gives the result $ y$
\begin{figure}
\centering\epsfig{figure=figures/activeconstraint2.eps, width=9cm,
height=6.5cm}
\end{figure}

Active Set methods

Certain constraints, indexed by the active set $ \cal A$, are regarded as equalities whilst the rest are temporarily disregarded. The method adjusts this set in order to identify the correct active constraints at the solution of 9.1. Each iteration attempts to locate the solution to an Equality Problem (EP) in which only the active constraints occur. This is done by solving:

$\displaystyle \min_{x} g^t x + \frac{1}{2}
 x^t H x \; \;$    subject to $\displaystyle \; \; a_i x = b_i, \; i \in {\cal A}$ (9.12)

Let's define, as usual, $ x_{k+1}=x_k+\alpha_k
s_k$. If the full step $ x_k+s_k$ is infeasible, then the step length $ \alpha_k$ is chosen to solve:

$\displaystyle \alpha_k= \min \left( 1, \min_{\substack{i \notin {\cal A} \\ a_i s_k < 0}}
 \frac{b_i-a_i x_k}{a_i s_k}
 \right)$ (9.13)

If $ \alpha_k<1$ then a new constraint becomes active, defined by the index which achieves the min in 9.13, and this index is added to the active set $ {\cal A}$. Suppose we have the solution $ x^*$ of the EP. We will now make a test to determine if a constraint should become inactive. All the constraints which have a negative associated Lagrange multiplier $ \lambda $ can become inactive. To summarize the algorithm, we have:
  1. Set $ k=1, \;\; {\cal A}=\emptyset,\;\; x_1=$ a feasible point.
  2. Solve the EP.
  3. Compute the Lagrange multipliers $ \lambda_k$. If $ \lambda_k
\geq 0$ for all constraints then terminate. Otherwise, remove the constraints which have negative $ \lambda_k$.
  4. Solve 9.13 and activate a new constraint if necessary.
  5. Set $ k:=k+1$ and go to (2).
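The loop above can be sketched in code. The following is a simplified illustration assuming NumPy (feasible starting point, full-rank active sets, no degeneracy); for brevity the EP at each iteration is solved via its KKT system rather than the QR factorization described earlier, and all data are illustrative:

```python
import numpy as np

def active_set_qp(H, g, A, b, x, max_iter=50):
    """Primal active-set method for min g^t x + 0.5 x^t H x, A x >= b.
    x must be feasible; the working set starts empty."""
    m, n = A.shape
    W = []                                   # working (active) set
    for _ in range(max_iter):
        gk = H @ x + g
        if W:
            Aw = A[W]
            # EP step s and multipliers lam from the KKT system:
            # H s - Aw^t lam = -gk,  Aw s = 0.
            K = np.block([[H, -Aw.T],
                          [Aw, np.zeros((len(W), len(W)))]])
            sol = np.linalg.solve(K, np.concatenate([-gk, np.zeros(len(W))]))
            s, lam = sol[:n], sol[n:]
        else:
            s, lam = np.linalg.solve(H, -gk), np.array([])
        if np.allclose(s, 0.0):
            if lam.size == 0 or lam.min() >= -1e-10:
                return x, W                  # all multipliers >= 0: optimal
            W.pop(int(np.argmin(lam)))       # drop a negative-multiplier constraint
            continue
        # Ratio test (9.13) over inactive constraints with a_i s < 0.
        alpha, blocking = 1.0, None
        for i in range(m):
            if i not in W and A[i] @ s < -1e-12:
                t = (b[i] - A[i] @ x) / (A[i] @ s)
                if t < alpha:
                    alpha, blocking = t, i
        x = x + alpha * s
        if blocking is not None:
            W.append(blocking)               # a new constraint becomes active
    return x, W

# Illustrative QP: min 0.5 ||x||^2 - 2 x1 - 2 x2
# s.t. x1 + x2 <= 1 (written as -x1 - x2 >= -1) and x >= 0.
H = np.eye(2)
g = np.array([-2.0, -2.0])
A = np.array([[-1.0, -1.0], [1.0, 0.0], [0.0, 1.0]])
b = np.array([-1.0, 0.0, 0.0])
x, W = active_set_qp(H, g, A, b, x=np.zeros(2))
assert np.allclose(x, [0.5, 0.5])
```

On this example the first iteration takes a blocked step ($ \alpha_k < 1$) onto the constraint $ x_1 + x_2 = 1$, which enters the working set; the second EP solve then yields a zero step with a positive multiplier, so the method terminates.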
An illustration of the method for a simple QP problem is shown in figure 9.3. In this QP, the constraints are the bounds $ x \geq 0$ and a general linear constraint $ a^t x \geq b$.

Figure 9.3: Illustration of a simple QP
\begin{figure}
\centering\epsfig{figure=figures/QPillustration.eps, width=9cm,
height=6.5cm}
\end{figure}

Duality in QP

If $ H$ is positive definite, the dual of ( $ x \in \Re^n, b \in
\Re^m $)

$\displaystyle \min_x \frac{1}{2} x^t H x + x^t g \; \;$   subject to$\displaystyle \; Ax \geq b$ (9.14)

is given by

$\displaystyle \max_{x, \lambda} \frac{1}{2}
 x^t H x + x^t g - \lambda^t (A x-b) \; \;$   subject to$\displaystyle \;
 \begin{cases}H x + g - A^t \lambda= 0 \\  \lambda \geq 0
 \end{cases}$ (9.15)

By eliminating $ x$ from the first constraint equation, we obtain the bound constrained problem:

$\displaystyle \max_\lambda -\frac{1}{2} \lambda^t (A H^{-1} A^t) \lambda +
 \lambda^t ( b + A H^{-1} g ) - \frac{1}{2} g^t H^{-1} g \; \; \;
 \;$   subject to$\displaystyle \; \; \; \lambda \geq 0$ (9.16)

This problem can be solved by means of the gradient projection method, which normally identifies the active set more rapidly than classical active set methods. The matrix $ (A
H^{-1} A^t)$ is positive semi-definite when $ H$ is positive definite, which is good. Unfortunately, if the constraints are linearly dependent then the matrix $ (A
H^{-1} A^t)$ becomes singular (this is always the case when $ m>n$). This leads to difficulties when solving 9.16: for example, no Cholesky factorization is possible.
My first algorithm used gradient projection to identify the active set. It then projected the matrix $ (A
H^{-1} A^t)$ into the space of the active constraints (the projection is straightforward) and attempted a Cholesky factorization of the reduced matrix. This failed very often. When it failed, the algorithm fell back on steepest descent, which is far too slow to be useful. The final implemented QP works in the primal space and uses a QR factorization to do the projection.

A note on the implemented QP

The QP which has been implemented uses normalization techniques on the constraints to increase robustness. It is able to start from an infeasible point. When performing the QR factorization of $ A^t$, it uses pivoting techniques to improve numerical stability. It also has a very primitive technique to avoid cycling; cycling occurs when the algorithm returns to a previous active set in the sequence (because of rounding errors). The QP is also able to "warm start": you can give the QP an approximation of the optimal active set $ {\cal A}$. If the given active set and the optimal active set $ {\cal A^*}$ are close, the solution will be found very rapidly. This feature is particularly useful when doing SQP. The code can also compute very efficiently the "Second Order Correction" (SOC) steps which are needed for the SQP algorithm. The SOC step is illustrated in figure 9.4. The SOC step is perpendicular to the active constraints. The length of the step is based on $ \Delta b$, which is calculated by the SQP algorithm. The SOC step is simply computed using:

$\displaystyle SOC= -Y \Delta b$ (9.17)
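Given the QR-based choice of $ Y$ in equation 9.11, the SOC step costs only one forward substitution and one matrix-vector product. A minimal sketch assuming NumPy (the matrix and $ \Delta b$ values are illustrative):

```python
import numpy as np

# Active-constraint matrix A and a Delta b supplied by the SQP algorithm
# (illustrative values).
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0]])
delta_b = np.array([0.1, -0.05])

m = A.shape[0]
Q, R = np.linalg.qr(A.T, mode='complete')
Q1, R = Q[:, :m], R[:m, :]

# SOC = -Y Delta b: forward substitution in R^t u = Delta b, then -Q1 u.
u = np.linalg.solve(R.T, delta_b)
soc = -Q1 @ u

# Since A Y = I, the step changes the constraint values by exactly -Delta b.
assert np.allclose(A @ soc, -delta_b)
```

Because $ SOC$ lies in the range of $ Q_1$ (the range of $ A^t$), it is orthogonal to the null space of $ A$, i.e. perpendicular to the active constraints as stated above.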

Figure 9.4: The SOC step
\begin{figure}
\centering\epsfig{figure=figures/soc.eps, width=9.5cm,
height=6.5cm}
\end{figure}

The solution to the EP (equation 9.12) is computed in one step using a Cholesky factorization of $ H$. This is very fast but, for badly scaled problems, it can lead to large rounding errors in the solution. The technique which chooses the constraint entering the active set is very primitive (it is based on equation 9.13) and can also lead to large rounding errors. The algorithm which finds an initial feasible point when the given starting point is infeasible could be improved. All the linear algebra operations are performed with dense matrices.
This QP algorithm is very far from the state of the art. Numerical results show that the QP algorithm is really the weak point of the whole optimization code. Nevertheless, for most problems, this implementation gives sufficient results (see the numerical results).

Frank Vanden Berghen 2004-04-19