The SQP algorithm

The material of this section is based on the following references: [NW99,Fle87,PT93].
The SQP algorithm is:
  1. set $ k:=1$
  2. Solve the QP subproblem described in equation 8.53 to determine $ \delta_k$ and let $ \lambda_{k+1}$ be the vector of the Lagrange multipliers of the linear constraints obtained from the QP.
  3. Compute the length $ \alpha_k$ of the step and set $ x_{k+1}=x_k+ \alpha_k \delta_k$
  4. Compute $ H_{k+1}$ from $ H_k$ using a quasi-Newton formula
  5. Increment $ k$. Stop if $ \nabla_x \L (x_k,\lambda_k)\approx 0$. Otherwise, go to step 2.
We will now give a detailed discussion of each step. The QP subproblem has been described in the previous section.

Length of the step.

From the QP, we obtain a search direction $ \delta_k$. To make the code more robust, we apply the first Wolfe condition (see equation 13.4), which is recalled here:

$\displaystyle f(\alpha)
 \leq f(0) + \rho \alpha f'(0) \; \; \; \;\; \; \;\;\; \; \;\rho
 \in (0, \frac{1}{2})$ (9.18)

where $ f(\alpha)=f(x_k+\alpha \delta_k)$. This condition ensures a sufficient reduction of the objective function at each step. Unfortunately, in the constrained case, a sufficient reduction of the objective function is not enough: we must also ensure a reduction of the infeasibility. Therefore, we will use a modified form of the first Wolfe condition where $ f(\alpha)=\phi(x_k+\alpha
\delta_k)$ and $ \phi(x)$ is a merit function. We will use in the optimizer the $ l_1$ merit function:

$\displaystyle \phi (x)=f(x)+
 \frac{1}{\mu} \Vert c(x) \Vert _1$ (9.19)

where $ \mu
>0 $ is called the penalty parameter. A suitable value for the penalty parameter is obtained by choosing a constant $ K>0$ and defining $ \mu$ at every iteration to be

$\displaystyle \mu^{-1}= \Vert
 \lambda_{k+1} \Vert _\infty + K$ (9.20)
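As an illustration, here is a small Python sketch (not the optimizer's actual code) of the $ l_1$ merit function 9.19 and of the penalty-parameter rule 9.20. For inequality constraints $ c_i(x) \geq 0$, the $ l_1$ norm is taken over the violated constraints only; the functions `f`, `c` and the multiplier vector used below are hypothetical stand-ins.

```python
import numpy as np

def l1_merit(f, c, x, mu):
    """phi(x) = f(x) + (1/mu) * ||c(x)||_1, counting only the violated
    inequality constraints c_i(x) >= 0 (the set F_k defined below)."""
    violation = np.maximum(-c(x), 0.0)   # |c_i(x)| where c_i(x) < 0
    return f(x) + np.sum(violation) / mu

def penalty_parameter(lam, K=1.0):
    """mu^{-1} = ||lambda_{k+1}||_inf + K  (equation 9.20)."""
    return 1.0 / (np.linalg.norm(lam, np.inf) + K)

# toy problem: f(x) = x0^2 + x1^2 with one constraint c(x) = x0 - 1 >= 0
f = lambda x: float(x[0]**2 + x[1]**2)
c = lambda x: np.array([x[0] - 1.0])

mu = penalty_parameter(np.array([2.0]), K=1.0)   # mu = 1/3
print(l1_merit(f, c, np.zeros(2), mu))           # 3.0: f = 0, violation = 1, 1/mu = 3
```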

In equation 9.18, we must calculate $ f'(0)$: the directional derivative of $ \phi$ in the direction $ \delta_k$ at $ x_k$:

$\displaystyle f'(0)= D_{\delta_k} \phi (x_k)= \delta_k^t g_k - \frac{1}{\mu}
 \left( \sum_{i \in {\cal F}_k} \delta_k^t a_i + \sum_{i \in {\cal
 Z}_k} \max ( -\delta_k^t a_i, 0 ) \right)$ (9.21)

where $ {\cal F}_k=
\{ i: c_i(x_k)<0 \}$ , $ {\cal Z}_k= \{ i: c_i(x_k)=0 \}$ , $ g_k=
\nabla f(x_k)$ , $ a_i= \nabla c_i(x_k) $. The algorithm which computes the length of the step is thus:
  1. Set $ \alpha_k:=1$, $ x_k:=$ current point, $ \delta_k:=$ search direction
  2. Test the Wolfe condition of equation 9.18: is $ \phi(x_k+\alpha_k \delta_k) \leq \phi(x_k) +
\rho \alpha_k D_{\delta_k} \phi (x_k)$? If it fails, reduce $ \alpha_k$ (e.g., halve it) and go back to step 2.
  3. We have successfully computed the length $ \Vert \alpha_k
\delta_k\Vert$ of the step.
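The step-length loop above can be sketched in Python as a simple backtracking search. The halving factor, the value of $ \rho$ and the iteration cap are arbitrary choices, not fixed by the algorithm:

```python
import numpy as np

def step_length(phi, x, delta, dphi0, rho=1e-4, alpha=1.0, max_halvings=30):
    """Backtracking search: start from alpha = 1 and halve until the
    Wolfe-type condition (eq. 9.18) holds on the merit function phi.
    dphi0 is the directional derivative D_{delta_k} phi(x_k)."""
    phi0 = phi(x)
    for _ in range(max_halvings):
        if phi(x + alpha * delta) <= phi0 + rho * alpha * dphi0:
            return alpha        # sufficient decrease achieved
        alpha *= 0.5            # reject the step, shrink it
    return alpha

# toy check on phi(x) = x^2 from x = 1 along delta = -1 (dphi0 = -2):
alpha = step_length(lambda x: float(x**2), 1.0, -1.0, -2.0)
print(alpha)   # the full step alpha = 1 is accepted here
```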

Maratos effect: The SOC step.

Figure 9.5: Maratos effect.

The $ l_1$ merit function $ \phi (x)=f(x)+ \frac{1}{\mu} \Vert c(x)
\Vert _1 $ can sometimes reject steps (i.e., give $ \alpha_k\approx 0$) that are making good progress towards the solution. This phenomenon is known as the Maratos effect. It is illustrated by the following example:

$\displaystyle \min_{x,y} f(x,y)= 2 (x^2+y^2-1)-x, \; \; \; \;$    Subject to $\displaystyle \; \; x^2+y^2-1=0$ (9.22)

The optimal solution is $ x^*=(1,0)^t$. The situation is illustrated in figure 9.5. The SQP method moves ( $ \delta_1=(1,0)^t$) from $ x_1=(0,1)^t$ to $ x_2=(1,1)^t$.
In this example, a step $ \delta_k$ along the constraint will always be rejected ( $ \alpha_k\approx 0$) by the $ l_1$ merit function $ \phi$. If no measures are taken, the Maratos effect can dramatically slow down SQP methods. To avoid it, we can use a second-order correction step (SOC), in which we add to $ \delta_k$ a step $ {\delta_k}'$ which is computed at $ c(x_k+\delta_k)$ and which provides a sufficient decrease in the constraints. Another solution is to allow the merit function $ \phi$ to increase on certain iterations (watchdog, non-monotone strategy).
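The rejection is easy to verify numerically. In this Python check on example 9.22 ($ \mu=1$ is an arbitrary choice), the full step from $ x_1$ to $ x_2$ increases both $ f$ and the constraint violation, so the $ l_1$ merit function rejects it for any $ \mu>0$:

```python
import numpy as np

f   = lambda x: 2.0 * (x[0]**2 + x[1]**2 - 1.0) - x[0]   # objective of 9.22
c   = lambda x: x[0]**2 + x[1]**2 - 1.0                  # equality constraint
phi = lambda x, mu: f(x) + abs(c(x)) / mu                # l1 merit function

x1 = np.array([0.0, 1.0])
x2 = np.array([1.0, 1.0])            # x1 + delta_1 with delta_1 = (1, 0)
print(phi(x1, 1.0))                  # 0.0
print(phi(x2, 1.0))                  # 2.0 -> the full step is rejected
```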
Suppose we have found the SQP step $ \delta_k$. $ \delta_k$ is the solution of the following QP problem:

\begin{displaymath}\begin{array}{l} \displaystyle \min_\delta g_k^t \delta +
 \frac{1}{2} \delta^t H_k \delta \\
 \textrm{subject to: } \; c_i(x_k)+ \nabla c_i(x_k)^t \delta \geq
 0 \;\;\; i=1,\ldots,m
 \end{array}\end{displaymath} (9.23)

where we have used a linear approximation of the constraints at the point $ x_k$ to find $ \delta_k$. Suppose this first-order approximation of the constraints is poor. It would be better to replace $ \delta_k$ with $ s_k$, the solution of the following problem, in which a quadratic approximation of the constraints is used:

  $\displaystyle \min_s g_k^t s + \frac{1}{2} s^t H_k s$ (9.24)
  $\displaystyle \displaystyle$   subject to:$\displaystyle \; \; c_i(x_k)+ \nabla
 c_i(x_k)^t s + \frac{1}{2} s^t \nabla^2 c_i(x_k) s \geq 0 \;\;\;
 i=1,\ldots,m$ (9.25)

but this is not practical, even if the Hessians of the constraints are individually available, because the subproblem becomes very hard to solve. Instead, we evaluate the constraints at the new point $ x_k+\delta_k$ and make use of the following approximation. Ignoring third-order terms, we have:

$\displaystyle c_i(x_k+\delta_k)= c_i(x_k)+ \nabla c_i(x_k)^t \delta_k
 + \frac{1}{2} \delta_k^t \nabla^2 c_i(x_k) \delta_k$ (9.26)

We will assume that, near $ x_k$, we have:

$\displaystyle \delta_k^t \nabla^2 c_i(x_k) \delta_k \approx
 s^t \nabla^2 c_i(x_k) s \; \; \; \; \; \; \forall s \; \;$    small (9.27)

Using 9.27 inside 9.26, we obtain:

$\displaystyle \frac{1}{2} s^t \nabla^2 c_i(x_k) s
 \approx c_i(x_k+\delta_k)- c_i(x_k) - \nabla c_i(x_k)^t \delta_k$ (9.28)

Using 9.28 inside 9.25, we obtain:

$\displaystyle c_i(x_k)+$ $\displaystyle \nabla c_i(x_k)^t s +
 \frac{1}{2} s^t \nabla^2 c_i(x_k) s$    
  $\displaystyle \approx
 c_i(x_k)+ \nabla c_i(x_k)^t s + \Big( c_i(x_k+\delta_k)- c_i(x_k)
 - \nabla c_i(x_k)^t \delta_k \Big)$    
  $\displaystyle =
 \Big(c_i(x_k+\delta_k) - \nabla c_i(x_k)^t \delta_k \Big) + \nabla
 c_i(x_k)^t s$ (9.29)

Combining 9.24 and 9.29, we have:

\begin{displaymath}\begin{array}{l} \displaystyle \min_s
 g_k^t s + \frac{1}{2} s^t H_k s \\
 \textrm{subject to: } \; \Big(c_i(x_k+\delta_k) - \nabla c_i(x_k)^t \delta_k \Big) + \nabla c_i(x_k)^t s \geq 0
 \;\;\; i=1,\ldots,m \end{array}\end{displaymath} (9.30)

Let us define $ s_k$ as the solution to problem 9.30. What we really want is $ {\delta_k}'=s_k-\delta_k \Leftrightarrow
s_k=\delta_k+{\delta_k}'$. Using this last equation inside 9.30, we obtain that $ {\delta_k}'$ is the solution of

\begin{displaymath}\begin{array}{l} \displaystyle \min_{{\delta}'} g_k^t
 (\delta_k+{\delta}') + \frac{1}{2} (\delta_k+{\delta}')^t H_k (\delta_k+{\delta}') \\
 \textrm{subject to: } \; c_i(x_k+\delta_k) + \nabla c_i(x_k)^t {\delta}' \geq 0 \;\;\; i=1,\ldots,m
 \end{array}\end{displaymath}    

If we assume that $ \nabla f(x_k+\delta_k) \approx \nabla f(x_k)$ (which implies $ H_k \delta_k \approx 0 $, and is equivalent to assumption 9.27), we obtain:

\begin{displaymath}\begin{array}{l} \displaystyle \min_{{\delta}'}
 g_k^t {\delta}' + \frac{1}{2} {{\delta}'}^t H_k {\delta}' \\
 \textrm{subject to: } \; c_i(x_k+\delta_k) + \nabla c_i(x_k)^t {\delta}' \geq 0\;\;\; i=1,\ldots,m
 \end{array}\end{displaymath} (9.31)

which is similar to the original equation 9.23, except that the constant terms of the constraints are evaluated at $ x_k+\delta_k$ instead of $ x_k$. In other words, there has been a small shift of the constraints (see illustration 9.4). We will assume that the active sets of the QP described by 9.23 and of the QP described by 9.31 are the same. Using the notation of section 9.1.1, we have:

$\displaystyle {\delta_k}' =-Y b' \; \;\;$   where $\displaystyle {b_i}'= c_i(x_k+\delta_k) \;\;\; i=1,\ldots,m$ (9.32)

where $ Y$ is the matrix calculated during the first QP 9.23, which is used to compute $ \delta_k$. The SOC step is thus not computationally intensive: all we need is a new evaluation of the active constraints at $ x_k+\delta_k$. The SOC step is illustrated in figures 9.4 and 9.5. It is a shift perpendicular to the active constraints, with length proportional to the amplitude of the constraint violation. Using classical notation, the SOC step is:

$\displaystyle {\delta_k}'= -A_k^t (A_k
 A_k^t)^{-1} c(x_k+\delta_k)$ (9.33)

where $ A_k$ is the Jacobian matrix of the active constraints at $ x_k$.
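Equation 9.33 can be sketched in a few lines of Python (a generic illustration, not the optimizer's actual code). The numerical values below reuse example 9.22 at the point $ x_2=(1,1)^t$, where $ c(x_2)=1$ and $ \nabla c(x_2)=(2,2)^t$:

```python
import numpy as np

def soc_step(A, c_new):
    """Second-order correction (eq. 9.33):
    delta' = -A^t (A A^t)^{-1} c(x_k + delta_k),
    the minimum-norm shift that cancels the linearized violation of the
    active constraints.  A is the m x n Jacobian of the active
    constraints at x_k; c_new holds their values at x_k + delta_k."""
    y = np.linalg.solve(A @ A.T, c_new)   # solve (A A^t) y = c(x_k + delta_k)
    return -A.T @ y

# example 9.22 at x_2 = (1, 1): A = (2x, 2y) = (2, 2), c(x_2) = 1
A = np.array([[2.0, 2.0]])
print(soc_step(A, np.array([1.0])))       # -> (-0.25, -0.25), back towards the circle
```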

Update of $ H_k$

$ H_k$ is an approximation of the Hessian matrix of the Lagrangian of the optimization problem:

$\displaystyle H_k \approx \nabla_x^2 \L (x_k,\lambda_k)=\nabla_x^2 f(x_k)- \sum_i
 (\lambda_{k})_i \nabla^2 c_i(x_k)$ (9.34)

The QP problem assumes that $ H_k$ is positive definite. To obtain a positive definite approximation of $ \nabla_x^2 \L $ we will use the damped BFGS update for SQP (with $ H_1=I$):
$\displaystyle s_k$ $\displaystyle :=$ $\displaystyle x_{k+1}- x_k$  
$\displaystyle y_k$ $\displaystyle :=$ $\displaystyle \nabla_x \L (x_{k+1},\lambda_{k+1}) - \nabla_x \L (x_k,\lambda_{k+1})$  
$\displaystyle \theta_k$ $\displaystyle :=$ \begin{displaymath}\begin{cases}1 & \displaystyle \text{if } \; \;
s_k^t y_k \geq 0.2 \, s_k^t H_k s_k \\ \displaystyle \frac{0.8 \, s_k^t H_k s_k}{s_k^t H_k s_k - s_k^t
y_k} & \text{otherwise} \end{cases}\end{displaymath}  
$\displaystyle r_k$ $\displaystyle :=$ $\displaystyle \displaystyle \theta_k y_k + (1-\theta_k) H_k s_k$  
$\displaystyle H_{k+1}$ $\displaystyle :=$ $\displaystyle H_k - \frac{H_k s_k s_k^t H_k}{s_k^t H_k
s_k}+\frac{r_k r_k^t}{s_k^t r_k}$ (9.35)

The formula 9.35 is simply the standard BFGS update formula, with $ y_k$ replaced by $ r_k$. It guarantees that $ H_{k+1}$ is positive definite.
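The update 9.35 translates directly into Python. This sketch uses the constants 0.2 and 0.8 of the standard damped (Powell) BFGS formula; the toy vectors below are chosen so that $ s_k^t y_k < 0$, the case the damping is designed for:

```python
import numpy as np

def damped_bfgs(H, s, y):
    """Damped BFGS update (eq. 9.35): the standard BFGS formula with
    y_k replaced by r_k = theta_k y_k + (1 - theta_k) H_k s_k, which
    keeps H positive definite even when s^t y <= 0."""
    sHs = s @ H @ s
    sy = s @ y
    theta = 1.0 if sy >= 0.2 * sHs else 0.8 * sHs / (sHs - sy)
    r = theta * y + (1.0 - theta) * (H @ s)
    Hs = H @ s
    return H - np.outer(Hs, Hs) / sHs + np.outer(r, r) / (s @ r)

H = np.eye(2)
s = np.array([1.0, 0.0])
y = np.array([-1.0, 0.0])                   # s^t y < 0: damping kicks in
H1 = damped_bfgs(H, s, y)
print(np.all(np.linalg.eigvalsh(H1) > 0))   # True: H1 stays positive definite
```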

Stopping condition

All the tests are in the form:

$\displaystyle \Vert r \Vert \leq (1+ \Vert V \Vert)\tau$ (9.36)

where $ r$ is a residual and $ V$ is a related reference value.
We will stop in the following conditions:
  1. The length of the step is very small.
  2. The maximum number of iterations is reached.
  3. The current point is inside the feasible space, all the values of $ \lambda $ are positive or null, and the length of the step is small.
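As a sketch, the test of equation 9.36 is a one-liner; the tolerance $ \tau=10^{-6}$ below is an arbitrary choice, and `r` and `V` stand for whichever residual and reference value each of the three conditions above uses:

```python
import numpy as np

def small_enough(r, V, tau=1e-6):
    """Relative test of eq. 9.36: ||r|| <= (1 + ||V||) * tau.
    Behaves like an absolute test when V is near zero and like a
    relative test when V is large."""
    return np.linalg.norm(r) <= (1.0 + np.linalg.norm(V)) * tau

print(small_enough(np.array([1e-7]), np.array([0.0])))   # True (absolute regime)
print(small_enough(np.array([1e-3]), np.array([1e3])))   # True (relative regime)
```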

The SQP in detail

The SQP algorithm is:
  1. set $ k:=1$
  2. If termination test is satisfied then stop.
  3. Solve the QP subproblem described in equation 8.53 to determine $ \delta_k$ and let $ \lambda_{k+1}$ be the vector of the Lagrange multipliers of the linear constraints obtained from the QP.
  4. Choose $ \mu_k$ such that $ \delta_k$ is a descent direction for $ \phi$ at $ x_k$, using equation 9.20.
  5. Set $ \alpha_k:=1$
  6. Test the Wolfe condition of equation 9.18: is $ \phi(x_k+\alpha_k \delta_k) \leq \phi(x_k) + \rho \alpha_k D_{\delta_k} \phi (x_k)$? If it fails, reduce $ \alpha_k$ and repeat this step; otherwise set $ x_{k+1}=x_k+\alpha_k \delta_k$.
  7. Compute $ H_{k+1}$ from $ H_k$ using a quasi-Newton formula
  8. Increment $ k$. Stop if $ \nabla_x \L (x_k,\lambda_k)\approx 0$; otherwise, go to step 2.
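To show the loop end to end, here is a minimal Python sketch on example 9.22 (one equality constraint). It is a simplified illustration, not the optimizer's code: the QP subproblem collapses to its KKT linear system, $ H_k$ is frozen at the identity (which happens to equal the exact Lagrangian Hessian at $ x^*$ for this example), and full steps $ \alpha_k=1$ are taken; a production code would add the merit-function line search, the SOC step and the damped BFGS update described above.

```python
import numpy as np

# example 9.22: min 2(x^2 + y^2 - 1) - x  subject to  x^2 + y^2 - 1 = 0
f_grad = lambda x: np.array([4.0 * x[0] - 1.0, 4.0 * x[1]])
c_val  = lambda x: x[0]**2 + x[1]**2 - 1.0
c_grad = lambda x: np.array([2.0 * x[0], 2.0 * x[1]])

x = np.array([0.0, 1.0])                 # starting point x_1 of figure 9.5
for k in range(20):
    g, a, c = f_grad(x), c_grad(x), c_val(x)
    # KKT system of the QP subproblem (H_k = I):
    #   [ I   -a ] [ delta  ]   [ -g ]
    #   [ a^t  0 ] [ lambda ] = [ -c ]
    K = np.block([[np.eye(2), -a.reshape(2, 1)],
                  [a.reshape(1, 2), np.zeros((1, 1))]])
    sol = np.linalg.solve(K, np.append(-g, -c))
    x = x + sol[:2]                      # full step alpha_k = 1

print(np.round(x, 6))                    # -> close to x* = (1, 0)
```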

Frank Vanden Berghen 2004-04-19