- Duality

- A primal-dual algorithm
- Central path
- Link between Barrier method and Interior point method
- A final note on primal-dual algorithms.
- SQP Methods
- A small note about the H matrix in the SQP algorithm.
- A final note about the SQP algorithm.

We assume its domain is $\mathcal{D} \subseteq \mathbb{R}^n$. We define the *Lagrangian function* associated with the problem 8.9:

$$\mathcal{L}(x, \lambda) = f(x) - \lambda^T c(x) = f(x) - \sum_{i=1}^m \lambda_i c_i(x) \qquad (8.10)$$

We refer to the $\lambda_i$ as the *Lagrange multipliers*. The *(Lagrange) dual function* $q(\lambda)$ is then defined by:

$$q(\lambda) = \inf_{x} \mathcal{L}(x, \lambda) \qquad (8.11)$$

The *weak duality* property states:

$$\max_{\lambda \ge 0} q(\lambda) \;\le\; \min_{\{x \,:\, c(x) \ge 0\}} f(x) \qquad (8.12)$$

The proof follows. Suppose $\bar{x}$ is a feasible point of the problem 8.9 and $\lambda \ge 0$. Then we have:

$$\lambda^T c(\bar{x}) = \sum_{i=1}^m \lambda_i c_i(\bar{x}) \ge 0 \qquad (8.13)$$

since each term is non-negative. And therefore:

$$\mathcal{L}(\bar{x}, \lambda) = f(\bar{x}) - \lambda^T c(\bar{x}) \le f(\bar{x}) \qquad (8.14)$$

Hence

$$q(\lambda) = \inf_x \mathcal{L}(x, \lambda) \le \mathcal{L}(\bar{x}, \lambda) \le f(\bar{x}) \qquad (8.15)$$

Since $q(\lambda) \le f(\bar{x})$ holds for every feasible point $\bar{x}$ and every $\lambda \ge 0$, the inequality 8.12 follows.
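Weak duality is easy to check numerically. Below is a minimal sketch on a toy convex problem, $\min x^2$ subject to $x - 1 \ge 0$, chosen here purely for illustration (it is not a problem from the text):

```python
# Numerical check of weak duality on the toy convex problem
#   minimize f(x) = x^2   subject to   c(x) = x - 1 >= 0
# (illustrative example; the names f, q are local to this sketch).
def f(x):
    return x * x

def q(lam):
    # Dual function q(lam) = inf_x [ f(x) - lam * c(x) ].
    # The infimum is attained at x = lam / 2 (set the derivative to zero).
    x = lam / 2.0
    return f(x) - lam * (x - 1.0)

# Weak duality: q(lam) <= f(x_bar) for every lam >= 0 and feasible x_bar >= 1.
for lam in [0.0, 0.5, 2.0, 5.0]:
    for x_bar in [1.0, 1.5, 3.0]:
        assert q(lam) <= f(x_bar) + 1e-12

# This problem is convex, so strong duality holds: max_lam q(lam) = q(2) = f(1).
print(q(2.0))  # -> 1.0
```

The last line illustrates the convex case discussed next: the best dual bound coincides with the primal optimum.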

When the feasible domain is convex and when the objective function is convex, the problem is said to be convex. In this case, we have $\max_{\lambda \ge 0} q(\lambda) = \min_{\{x\,:\,c(x) \ge 0\}} f(x)$ (*strong duality*). The proof will be skipped. When the problem is not convex, we will define the *duality gap* as the difference $f(x) - q(\lambda)$, which may remain strictly positive at the optimum.

Consider the LP 8.16 in standard form:

$$\min_x \; c^T x \quad \text{subject to} \quad Ax = b, \; x \ge 0 \qquad (8.16)$$

The Lagrangian is

$$\mathcal{L}(x, \lambda, s) = c^T x - \lambda^T (Ax - b) - s^T x \qquad (8.17)$$

So the dual function is

$$q(\lambda, s) = \inf_x \mathcal{L}(x, \lambda, s) = b^T \lambda + \inf_x \,(c - A^T\lambda - s)^T x \qquad (8.18)$$

The infimum of a linear function is $-\infty$, except in the special case when it is identically zero, so the dual function is:

$$q(\lambda, s) = \begin{cases} b^T \lambda & \text{if } A^T\lambda + s = c \\ -\infty & \text{otherwise} \end{cases} \qquad (8.19)$$

The dual variable $(\lambda, s)$ is dual feasible if $s \ge 0$ and $A^T\lambda + s = c$. The Lagrange dual of the LP 8.16 is to maximize $q(\lambda, s)$ over all dual feasible $(\lambda, s)$. We can reformulate this by explicitly including the dual feasibility conditions as constraints, as in:

$$\max_{\lambda, s} \; b^T \lambda \quad \text{subject to} \quad A^T \lambda + s = c, \; s \ge 0 \qquad (8.20)$$

which is an LP in standard form.
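For any primal feasible $x$ and dual feasible $(\lambda, s)$, the duality gap of this LP pair is $c^T x - b^T \lambda = x^T s \ge 0$. A minimal numerical sketch (the data below is an arbitrary toy LP, not an example from the text):

```python
import numpy as np

# Weak duality for the standard-form LP pair:
#   primal: min c^T x    s.t.  A x = b, x >= 0
#   dual:   max b^T lam  s.t.  A^T lam + s = c, s >= 0
# Arbitrary illustrative data:
A = np.array([[1.0, 1.0, 1.0]])     # one equality constraint
b = np.array([2.0])
c = np.array([1.0, 2.0, 3.0])

x = np.array([2.0, 0.0, 0.0])       # primal feasible: A x = b, x >= 0
lam = np.array([0.5])               # dual feasible once s = c - A^T lam >= 0
s = c - A.T @ lam

assert np.allclose(A @ x, b) and np.all(x >= 0)
assert np.all(s >= -1e-12)

# The duality gap c^T x - b^T lam equals x^T s, hence it is nonnegative:
gap = c @ x - b @ lam
assert abs(gap - x @ s) < 1e-12
print(gap)  # -> 1.0
```

The identity `gap == x @ s` follows from $c^T x - b^T\lambda = x^T(c - A^T\lambda) = x^T s$ when $Ax = b$.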

Since the feasible domain is convex and the objective function is also convex, we have a convex problem. The optimal value of this dual problem is thus equal to the optimal value of the primal problem. To check this, let's rewrite the dual 8.20 as a minimization:

$$\min_{\lambda, s} \; -b^T \lambda \quad \text{subject to} \quad c - A^T\lambda - s = 0, \; s \ge 0$$

where $s$ is the vector of dual slack variables. We will now compute the dual problem of this LP. Since the problem is convex, we have ($x$ are the Lagrange multipliers associated to the constraints $c - A^T\lambda - s = 0$ and $z$ are the Lagrange multipliers associated to the constraints $s \ge 0$):

$$\inf_{\lambda, s}\,\Big[ -b^T\lambda - x^T(c - A^T\lambda - s) - z^T s \Big] = \begin{cases} -c^T x & \text{if } Ax = b \text{ and } x = z \\ -\infty & \text{otherwise} \end{cases}$$

The dual of the dual is therefore

$$\max_{x,\,z \ge 0} \; -c^T x \quad \text{subject to} \quad Ax = b, \; x = z$$

which, after eliminating $z$, is equivalent to

$$\min_x \; c^T x \quad \text{subject to} \quad Ax = b, \; x \ge 0$$

i.e. we recover the primal LP 8.16.

The associated KKT conditions are:

$$A^T \lambda + s = c \qquad (8.21)$$

$$Ax = b \qquad (8.22)$$

$$x_i s_i = 0, \quad i = 1, \dots, n \qquad (8.23)$$

$$(x, s) \ge 0 \qquad (8.24)$$

Primal-dual methods find solutions of this system by applying variants of Newton's method (see section 13.6 for Newton's method for non-linear equations) to the three equalities 8.21-8.23 and modifying the search direction and step length so that the inequalities 8.24 are satisfied strictly at every iteration. The nonnegativity condition 8.24 is the source of all the complications in the design and analysis of interior-point methods. Let's rewrite equations 8.21-8.24 in a slightly different form:

$$F(x, \lambda, s) = \begin{bmatrix} A^T\lambda + s - c \\ Ax - b \\ XSe \end{bmatrix} = 0 \qquad (8.25)$$

$$(x, s) \ge 0 \qquad (8.26)$$

where $X = \operatorname{diag}(x_1, \dots, x_n)$, $S = \operatorname{diag}(s_1, \dots, s_n)$ and $e = (1, \dots, 1)^T$. Primal-dual methods generate iterates $(x^k, \lambda^k, s^k)$ that satisfy the bounds 8.26 strictly. This property is the origin of the term *interior-point*.

As mentioned above, the search direction procedure has its origin in Newton's method for the set of nonlinear equations 8.25 (see section 13.6 for Newton's method for non-linear equations). We obtain the search direction by solving the following system of linear equations:

$$J(x, \lambda, s) \begin{bmatrix} \Delta x \\ \Delta \lambda \\ \Delta s \end{bmatrix} = -F(x, \lambda, s) \qquad (8.27)$$

where $J$ is the Jacobian of $F$. If the current point is strictly feasible, we have:

$$\begin{bmatrix} 0 & A^T & I \\ A & 0 & 0 \\ S & 0 & X \end{bmatrix} \begin{bmatrix} \Delta x \\ \Delta \lambda \\ \Delta s \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ -XSe \end{bmatrix} \qquad (8.28)$$

A full step along this direction is usually not permissible, since it would violate the bound $(x, s) \ge 0$. To avoid this difficulty, we perform a line search along the Newton direction so that the new iterate is

$$(x, \lambda, s) + \alpha\,(\Delta x, \Delta \lambda, \Delta s) \qquad (8.29)$$

for some line search parameter $\alpha \in (0, 1]$. Unfortunately, we often can only take a small step ($\alpha \ll 1$) along this direction (= the *affine scaling direction*) before violating the condition $(x, s) > 0$. Most primal-dual methods modify the basic Newton procedure in two important ways:

- They bias the search direction toward the interior of the nonnegative orthant $(x, s) \ge 0$, so that we can move further along the direction before one of the components of $(x, s)$ becomes negative.
- They keep the components of $(x, s)$ from moving "too close" to the boundary of the nonnegative orthant.
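In practice the step length $\alpha$ is computed with a ratio test that keeps the iterate strictly inside the nonnegative orthant. A minimal sketch (the damping factor `eta = 0.995` and the sample vectors are illustrative choices, not values from the text):

```python
import numpy as np

# Largest step alpha in (0, 1] along direction dx that keeps x + alpha*dx > 0.
# The step is held back by a fraction eta < 1 so the iterate stays strictly
# positive (a standard ratio test; eta = 0.995 is an illustrative choice).
def max_step(x, dx, eta=0.995):
    neg = dx < 0                       # only decreasing components limit alpha
    if not np.any(neg):
        return 1.0
    return min(1.0, eta * np.min(-x[neg] / dx[neg]))

x = np.array([1.0, 0.5, 2.0])
dx = np.array([-2.0, 1.0, -0.1])
alpha = max_step(x, dx)
assert np.all(x + alpha * dx > 0)      # iterate remains strictly interior
print(alpha)  # -> 0.4975
```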

These modified directions are related to the *central path* $\mathcal{C}$:

$$\mathcal{C} = \{ (x_\tau, \lambda_\tau, s_\tau) \;:\; \tau > 0 \} \qquad (8.30)$$

where each point $(x_\tau, \lambda_\tau, s_\tau)$ solves the following system, which is a perturbation of the original system 8.25:

$$F(x_\tau, \lambda_\tau, s_\tau) = \begin{bmatrix} 0 \\ 0 \\ \tau e \end{bmatrix}, \qquad (x_\tau, s_\tau) > 0 \qquad (8.31)$$

A plot of for a typical problem, projected into the space of primal variables , is shown in figure 8.5.

The equation 8.31 approximates 8.25 more and more closely as $\tau$ goes to zero. Primal-dual algorithms take Newton steps toward points on $\mathcal{C}$ for which $\tau > 0$, rather than pure Newton steps for $F$. Since these steps are biased toward the interior of the nonnegative orthant $(x, s) \ge 0$, it is usually possible to take longer steps along them than along pure Newton steps for $F$, before violating the positivity condition. To describe the biased search direction, we introduce a *duality measure* $\mu$:

$$\mu = \frac{1}{n} \sum_{i=1}^n x_i s_i = \frac{x^T s}{n} \qquad (8.33)$$

which measures the average value of the pairwise products $x_i s_i$. We also define a *centering parameter* $\sigma \in [0, 1]$; the generic step equations become:

$$\begin{bmatrix} 0 & A^T & I \\ A & 0 & 0 \\ S & 0 & X \end{bmatrix} \begin{bmatrix} \Delta x \\ \Delta \lambda \\ \Delta s \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ -XSe + \sigma\mu e \end{bmatrix} \qquad (8.34)$$

If $\sigma = 1$, the equations 8.34 define a *centering direction*, a Newton step toward the point $(x_\mu, \lambda_\mu, s_\mu) \in \mathcal{C}$ at which all the pairwise products $x_i s_i$ are equal to $\mu$. If $\sigma = 0$, we recover the pure Newton step 8.28.

The choice of the centering parameter $\sigma$ and of the step length $\alpha$ is crucial to the performance of the method. Techniques for controlling these parameters, directly and indirectly, give rise to a wide variety of methods with varying theoretical properties.
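A bare-bones path-following iteration for the standard-form LP can be sketched as follows. This is a minimal illustration under simplifying assumptions (fixed centering parameter, no infeasibility safeguards, dense linear algebra), not a production solver; the toy problem data at the bottom is arbitrary:

```python
import numpy as np

# Sketch of a primal-dual path-following method for the standard-form LP
#   min c^T x  s.t.  A x = b, x >= 0:
# Newton steps on the perturbed KKT system with right-hand side
# -X S e + sigma*mu*e, plus a ratio test to stay strictly positive.
def primal_dual_lp(A, b, c, sigma=0.2, tol=1e-8, max_iter=100):
    m, n = A.shape
    x, s, lam = np.ones(n), np.ones(n), np.zeros(m)
    for _ in range(max_iter):
        r_dual = A.T @ lam + s - c          # residual of A^T lam + s = c
        r_prim = A @ x - b                  # residual of A x = b
        mu = x @ s / n                      # duality measure
        if max(np.linalg.norm(r_dual), np.linalg.norm(r_prim), mu) < tol:
            break
        # Newton system on [dx, dlam, ds]:
        # [ 0  A^T  I ] [dx  ]   [ -r_dual              ]
        # [ A  0    0 ] [dlam] = [ -r_prim              ]
        # [ S  0    X ] [ds  ]   [ -X S e + sigma*mu*e  ]
        K = np.zeros((2 * n + m, 2 * n + m))
        K[:n, n:n + m] = A.T
        K[:n, n + m:] = np.eye(n)
        K[n:n + m, :n] = A
        K[n + m:, :n] = np.diag(s)
        K[n + m:, n + m:] = np.diag(x)
        rhs = np.concatenate([-r_dual, -r_prim, -x * s + sigma * mu])
        d = np.linalg.solve(K, rhs)
        dx, dlam, ds = d[:n], d[n:n + m], d[n + m:]
        # Ratio test: stay strictly inside the nonnegative orthant.
        alpha = 1.0
        for v, dv in ((x, dx), (s, ds)):
            neg = dv < 0
            if np.any(neg):
                alpha = min(alpha, 0.995 * np.min(-v[neg] / dv[neg]))
        x, lam, s = x + alpha * dx, lam + alpha * dlam, s + alpha * ds
    return x, lam, s

# Toy LP: min x1 + 2 x2 + 3 x3  s.t.  x1 + x2 + x3 = 2, x >= 0 (optimum (2,0,0)).
A = np.array([[1.0, 1.0, 1.0]])
b = np.array([2.0])
c = np.array([1.0, 2.0, 3.0])
x, lam, s = primal_dual_lp(A, b, c)
print(np.round(x, 6))  # -> close to [2, 0, 0]
```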

Consider again the general problem 8.35: minimize $f(x)$ subject to $c(x) \ge 0$, with $c : \mathbb{R}^n \to \mathbb{R}^m$. There exist a value $\lambda^*$ of the dual variable and a value $x^*$ of the primal variable which satisfy the KKT conditions:

$$\nabla f(x^*) - A(x^*)^T \lambda^* = 0 \qquad (8.36)$$

$$c(x^*) \ge 0 \qquad (8.37)$$

$$\lambda_i^*\, c_i(x^*) = 0, \quad i = 1, \dots, m \qquad (8.38)$$

$$\lambda^* \ge 0 \qquad (8.39)$$

where $A(x)$ is the Jacobian matrix of the constraints.

Equations 8.36-8.39 are comparable to equations 8.21-8.24. These are the base equations for the primal-dual iterations. In the previous section, we motivated the following perturbation of these equations:

$$\nabla f(x) - A(x)^T \lambda = 0 \qquad (8.40)$$

$$c(x) > 0 \qquad (8.41)$$

$$\lambda_i\, c_i(x) = \mu, \quad i = 1, \dots, m \qquad (8.42)$$

$$\lambda > 0 \qquad (8.43)$$

Let's now consider the following equivalent optimization problem (barrier problem):

$$\min_x \; f(x) - \mu \sum_{i=1}^m \log\big(c_i(x)\big) \qquad (8.44)$$

For a given $\mu$, at the optimum $x$ of equation 8.44, we have:

$$\nabla f(x) - \mu \sum_{i=1}^m \frac{1}{c_i(x)} \nabla c_i(x) = 0 \qquad (8.45)$$

If we define

$$\lambda_i = \frac{\mu}{c_i(x)}, \quad i = 1, \dots, m \qquad (8.46)$$

We will now prove that $\lambda$ is dual feasible. First, it's clear that $\lambda > 0$ because $\mu > 0$ and because $c_i(x) > 0$: $x$ is inside the feasible domain. We now have to prove that $q(\lambda)$ is finite. Combining 8.45 and 8.46 shows that $\nabla_x \mathcal{L}(x, \lambda) = 0$, so $x$ minimizes the (convex) Lagrangian $\mathcal{L}(\cdot, \lambda)$. Let's compute the value of the dual function of 8.35 at $\lambda$:

$$q(\lambda) = \mathcal{L}(x, \lambda) = f(x) - \sum_{i=1}^m \lambda_i c_i(x) = f(x) - m\mu \qquad (8.47)$$

When $\mu \to 0$, we will have $q(\lambda) \to f(x)$, which concludes the proof. $m\mu$ is the *duality gap* associated with the primal feasible point $x$ and the dual feasible point $\lambda$.

Let's define $p^*$, the minimum value of the original problem 8.35, and $x^*(\mu)$, the solution of 8.44 for a given value of $\mu$. From equation 8.47 and the weak duality property $q(\lambda) \le p^*$, we have the interesting following result: $f(x^*(\mu)) - p^* \le m\mu$. That is: $x^*(\mu)$ is no more than $m\mu$ suboptimal.

What's the link between equations 8.40-8.43, which are used inside the primal-dual algorithm, and the barrier method? We see that if we combine equations 8.45 and 8.46, we obtain 8.40. The equations 8.42 and 8.46 are the same, except that 8.46 is written $\lambda_i = \mu / c_i(x)$ rather than $\lambda_i c_i(x) = \mu$. The "perturbation parameter" $\tau$ in primal-dual algorithms is simply the barrier parameter $\mu$ in barrier algorithms. In fact, barrier methods and primal-dual methods are very close. The main difference is: in primal-dual methods, we update the primal variables AND the dual variables at each iteration; in barrier methods, we only update the primal variables.

1. Set $k := 0$, choose a barrier parameter $\mu_0 > 0$ and a strictly feasible starting point $x_0$.
2. Solve *approximatively* the barrier problem 8.44 for the current value $\mu_k$ of the barrier parameter (see equation 8.34 about the related centering parameter $\sigma$), using as starting point the solution of the previous iteration.
3. Update the barrier parameter. For example: $\mu_{k+1} = \mu_k / 10$.
4. Increase the iteration counter: $k := k + 1$. If not finished, go to step 2.
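The loop above, and the $m\mu$ suboptimality bound that follows from equation 8.47, can be illustrated on a toy problem, $\min x^2$ subject to $x - 1 \ge 0$ ($p^* = 1$, a single constraint so $m = 1$). The problem and the closed-form barrier minimizer below are chosen purely for illustration:

```python
import math

# Barrier-method loop on the toy problem  min x^2  s.t.  x - 1 >= 0
# (p* = 1, m = 1), illustrating the bound f(x*(mu)) - p* <= m * mu.
def barrier_min(mu):
    # Minimizer of B(x; mu) = x^2 - mu*log(x - 1):
    # setting the derivative to zero gives 2x(x - 1) = mu.
    return (1.0 + math.sqrt(1.0 + 2.0 * mu)) / 2.0

p_star, m = 1.0, 1
mu = 1.0
for _ in range(10):
    x = barrier_min(mu)
    assert x > 1.0                              # strictly feasible iterate
    assert x * x - p_star <= m * mu + 1e-12     # m*mu suboptimality bound
    mu /= 10.0                                  # decrease the barrier parameter
print(barrier_min(1e-9) - 1.0)  # -> tiny: x*(mu) -> x* = 1 as mu -> 0
```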

- The choice of the centering parameter, which governs the decrease of the barrier parameter $\mu$, is crucial to the performance of the method. If the decrease is too slow, we will lose our time evaluating the objective function far from the real optimum. For "*approach 2*", where these evaluations are very expensive (in time), it's a major drawback.
- In step 2 above, we have to solve *approximatively* an optimization problem. What are the tolerances? What's the meaning of *approximatively*?
- The feasible space should be convex. Extension to an arbitrary feasible space is possible but not straightforward.
- The starting point should be feasible. Extension to an arbitrary starting point is possible but not straightforward.
- The "switching mechanism" from the unconstrained steps, when no constraints are violated, to the constrained case is difficult to implement (if we want to keep the actual mechanism for step computation when no constraints are active).

SQP stands for ``Sequential Quadratic Programming''.

As in the case of Newton's method in unconstrained optimization, we will make a quadratic approximation of the function to optimize and search for the minimum of this quadratic. The function to be approximated will be the Lagrangian function $\mathcal{L}(x, \lambda)$.

The quadratic approximation of $\mathcal{L}$ around the current point $(x_k, \lambda_k)$ is:

$$\mathcal{L}(x_k + \delta, \lambda_k) \approx \mathcal{L}(x_k, \lambda_k) + \nabla_x \mathcal{L}(x_k, \lambda_k)^T \delta + \frac{1}{2}\, \delta^T H_k\, \delta \qquad (8.49)$$

With $A(x)$, the Jacobian matrix of the constraints evaluated at $x$: we have $A(x) = [\nabla c_1(x), \dots, \nabla c_m(x)]^T$ or, componentwise, $A_{ij}(x) = \partial c_i(x) / \partial x_j$,

and $H_k$, the Hessian matrix of the Lagrangian function: $H_k = \nabla^2_{xx} \mathcal{L}(x_k, \lambda_k)$.

The full Hessian of $\mathcal{L}$ is thus:

$$\nabla^2_{xx} \mathcal{L}(x, \lambda) = \nabla^2 f(x) - \sum_{i=1}^m \lambda_i \nabla^2 c_i(x) \qquad (8.50)$$

If we are on the boundary of the constraints, we will have $c_i(x) = 0$, thus we can write:

$$\mathcal{L}(x, \lambda) = f(x) - \sum_{i=1}^m \lambda_i c_i(x) = f(x) \qquad (8.51)$$

We want to find the step $\delta$ which minimizes $\mathcal{L}(x_k + \delta, \lambda_k)$, subject to the constraints $c(x_k + \delta) \ge 0$.

From equation 8.49 and using equation 8.51, we obtain:

$$\mathcal{L}(x_k + \delta, \lambda_k) \approx f(x_k) + \nabla_x \mathcal{L}(x_k, \lambda_k)^T \delta + \frac{1}{2}\, \delta^T H_k\, \delta \qquad (8.52)$$

Using a first order approximation of the constraints around $x_k$, we have the QP subproblem:

$$\min_\delta \; \nabla f(x_k)^T \delta + \frac{1}{2}\, \delta^T H_k\, \delta \quad \text{subject to} \quad A(x_k)\, \delta + c(x_k) \ge 0 \qquad (8.53)$$

DEFINITION: a *QP* (Quadratic Programming problem) is the minimization of a quadratic objective subject to linear constraints, as in equation 8.53.

Note that $\nabla_x \mathcal{L}(x_k, \lambda_k) = \nabla f(x_k) - A(x_k)^T \lambda_k$: on the linearized constraint surface, the objectives of 8.52 and 8.53 differ only by a constant.

We can now define the SQP algorithm:

- Solve the QP subproblem described in equation 8.53 to determine $\delta_k$, and let $\lambda_{k+1}$ be the vector of the Lagrange multipliers of the linear constraints obtained from the QP.
- Set $x_{k+1} = x_k + \delta_k$.
- Increment $k$. Stop if $\|\delta_k\| < \epsilon$. Otherwise, go to step 1.
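The iteration above can be sketched for a small problem with a single constraint treated as active (so the linearized constraint is imposed as an equality and the QP step reduces to one linear KKT solve). The toy problem, starting point, and the use of the exact Hessian of the Lagrangian are illustrative assumptions; a real implementation would add an active-set or line-search mechanism:

```python
import numpy as np

# SQP sketch for the toy problem
#   min f(x) = -x1   s.t.  c(x) = x1^2 + x2^2 - 1 = 0  (constraint kept active),
# whose solution x* = (1, 0) exists only thanks to the constraint curvature.
# Each iteration solves the KKT system of the QP subproblem
#   [ H        -grad_c ] [ delta ]   [ -grad_f ]
#   [ grad_c^T    0    ] [ lam+  ] = [ -c      ]
# with H the Hessian of the Lagrangian L = f - lam * c.
def sqp(x, lam, n_iter=20):
    for _ in range(n_iter):
        grad_f = np.array([-1.0, 0.0])
        c = x[0] ** 2 + x[1] ** 2 - 1.0
        grad_c = 2.0 * x
        H = -lam * 2.0 * np.eye(2)          # Hess f = 0, Hess c = 2 I
        K = np.block([[H, -grad_c[:, None]],
                      [grad_c[None, :], np.zeros((1, 1))]])
        rhs = np.concatenate([-grad_f, [-c]])
        sol = np.linalg.solve(K, rhs)
        x = x + sol[:2]                      # x_{k+1} = x_k + delta_k
        lam = sol[2]                         # multiplier from the QP
    return x, lam

x, lam = sqp(np.array([0.8, 0.6]), -0.5)
print(np.round(x, 6), round(lam, 6))  # -> close to [1, 0] and -0.5
```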

A more robust implementation of the SQP algorithm adjusts the length of the steps and performs "second order correction steps" (or "SOC steps").

The inclusion of the second order constraint terms $\lambda_i \nabla^2 c_i(x)$ in the subproblem (via the matrix $H_k$) is important in that otherwise second order convergence for nonlinear constraints would not be obtained. This is well illustrated by the following type of problem:

$$\min_x \; -x_1 \quad \text{subject to} \quad 1 - x_1^2 - x_2^2 \ge 0 \qquad (8.54)$$

in which the objective function is linear, so that it is only the curvature of the constraint which causes a solution to exist. In this case the sequence followed by the SQP algorithm is only well-defined if the constraint curvature terms are included.

We can obtain a good approximation of the matrix $H$ using an extension of the BFGS update to the constrained case.

| Problem | Extrapolated barrier function | Multiplier penalty function | SQP method (Powell, 1978a) [Pow77] |
|---------|------------------------------|-----------------------------|------------------------------------|
| TP1     | 177                          | 47                          | 6                                  |
| TP2     | 245                          | 172                         | 17                                 |
| TP3     | 123                          | 73                          | 3                                  |