next up previous contents
Next: Explanation of the Hard Up: The Trust-Region subproblem Previous: The Trust-Region subproblem   Contents

$ H(\lambda ^*)$ must be positive definite.

The solution we are seeking lies either interior to the trust region ( $ \Vert s \Vert _2 < \Delta$) or on the boundary.
If it lies on the interior, the trust region may as well not have been there and therefore $ s^*$ is the unconstrained minimizer of $ q(s)$. We have seen in Equation 2.4 ($ H s = -g$) how to find it. We have seen in Equation 2.6 that

$\displaystyle H$    must be definite positive $\displaystyle \quad ( s^T H s > 0 \quad
 \forall s)$ (4.2)

in order to be able to apply 2.4.
If we found a value of $ s^*$ using 2.4 which lies outside the trust region, it means that $ s^*$ lies on the trust region boundary. Let's take a closer look to this case:
Theorem 1

\begin{figure}
\centering\fbox{\hspace{0.2cm}\parbox[t][3cm][b]{15.7cm}{
{\e...
...is
positive definite, $s^*$\ is unique.}\\
} }\vspace{-0.1cm}
\end{figure}

First, we rewrite the constraints $ \Vert s \Vert _2 = \Delta$ as $ \displaystyle c(s)=\frac{1}{2}\Delta-\frac{1}{2}\Vert s \Vert _2=0 $. Now, we introduce a Lagrange multiplier $ \lambda $ for the constraint and use first-order optimality conditions (see Annexe, section 13.3 ). This gives:

$\displaystyle \L (s,\lambda)=q(s)-\lambda c(s)$ (4.3)

Using first part of Equation 13.22,we have

$\displaystyle \nabla_s \L (s^*,\lambda^*)= \nabla
 q(s^*)-\lambda^* \nabla_s c(s^*)= H s^* + g + \lambda^* s^* = 0$ (4.4)

which is 4.3.
We will now proof that $ H(\lambda ^*)$ must be positive (semi)definite.
Suppose $ s^F$ is a feasible point ( $ \Vert s^F \Vert = \Delta$), we obtain:

$\displaystyle q(s^F)=q(s^*)+\langle s^F-s^*, g(s^*)\rangle+
 \frac{1}{2} \langle s^F-s^*, H (s^F-s^*)\rangle$ (4.5)

Using the secant equation (see Annexes, Section 13.4), $ g''-g'=H(x''-x')= H s \quad (s=x''-x')$, we can rewrite 4.5 into $ g(s^*)= -\lambda^* s^*$. This and the restriction that $ (\Vert s^F \Vert = \Vert s^* \Vert = \Delta)$ implies that:

$\displaystyle \langle s^F-s^*, g(s^*)\rangle =$ $\displaystyle \langle s^*-s^F, s^*\rangle \lambda^*$    
$\displaystyle =$ $\displaystyle (\Delta^2- \langle s^F,s^*\rangle) \lambda^*$    
$\displaystyle =$ $\displaystyle [\frac{1}{2}(\langle s^F,s^F\rangle+\langle s^*,s^*\rangle)-\langle s^F,s^*\rangle] \lambda^*$    
$\displaystyle =$ $\displaystyle [\frac{1}{2}(\langle s^F,s^F\rangle+\langle s^*,s^*\rangle)-
 \frac{1}{2}\langle s^F,s^*\rangle-\frac{1}{2}\langle s^F,s^*\rangle] \lambda^*$    
$\displaystyle =$ $\displaystyle \frac{1}{2}[\langle s^F,s^F\rangle-\langle s^F,s^*\rangle+\langle s^*,s^*\rangle-\langle s^F,s^*\rangle] \lambda^*$    
$\displaystyle =$ $\displaystyle \frac{1}{2}[\langle s^F,s^F-s^*\rangle+\langle s^*-s^F,s^*\rangle] \lambda^*$    
$\displaystyle =$ $\displaystyle \frac{1}{2}[\langle s^F,s^F-s^*\rangle-\langle s^*,s^F-s^*\rangle] \lambda^*$    
$\displaystyle \langle s^F-s^*, g(s^*)\rangle =$ $\displaystyle \frac{1}{2}\langle
 s^F-s^*,s^F-s^*\rangle \lambda^*$ (4.6)

Combining 4.6 and 4.7

$\displaystyle q(s^F)=$ $\displaystyle q(s^*)+\frac{1}{2}\langle
 s^F-s^*,s^F-s^*\rangle\lambda^*+\frac{1}{2}\langle s^F-s^*, H
 (s^F-s^*)\rangle$    
$\displaystyle =$ $\displaystyle q(s^*)+\frac{1}{2}\langle
 s^F-s^*,( H+\lambda^* I )(s^F-s^*)\rangle$    
$\displaystyle =$ $\displaystyle q(s^*)+\frac{1}{2}\langle s^F-s^*, H(\lambda^*) (s^F-s^*)\rangle$ (4.7)

Let's define a line $ s^*+ \alpha v$ as a function of the scalar $ \alpha $. This line intersect the constraints $ \Vert s\Vert=\Delta$ for two values of $ \alpha $: $ \alpha=0$ and $ \alpha=\alpha^F\neq 0$ at which $ s=s^F$. So $ s^F-s^*=\alpha^F v$, and therefore, using 4.8, we have that

$\displaystyle q(s^F)= q(s^*)+\frac{1}{2}(\alpha^F)^2 \langle v,H(\lambda^*) v
\rangle $

Finally, as we are assuming that $ s^*$ is a global minimizer, we must have that $ s^F\geq s^*$, and thus that $ \langle
v,H(\lambda^*) v \rangle \geq 0 \quad \forall v$, which is the same as saying that $ H(\lambda)$ is positive semidefinite.
If $ H(\lambda ^*)$ is positive definite, then $ \langle s^F-s^*,
H(\lambda^*) (s^F-s^*)\rangle >0 $ for any $ s^F \neq s^*$, and therefore 4.8 shows that $ q(s^F)>q(s^*)$ whenever $ s^F$ is feasible. Thus $ s^*$ is the unique global minimizer.
Using 4.2 (which is concerned about an interior minimizer) and the previous paragraph (which is concerned about a minimizer on the boundary of the trust region), we can state:
Theorem 2:

\begin{figure}
\centering\fbox{\hspace{0.2cm}\parbox[t][3.5cm][b]{15.7cm}{
{...
...is
positive definite, $s^*$\ is unique.}\\
} }\vspace{-0.1cm}
\end{figure}

The justification of $ \lambda^* ( \Vert s^* \Vert - \Delta ) =0 $ is simply the complementarity condition (see Section 13.3 for explanation, Equation 13.22).
The parameter $ \lambda $ is said to ``regularized'' or ``modify'' the model such that the modified model is convex and so that its minimizer lies on or within the trust region boundary.
next up previous contents
Next: Explanation of the Hard Up: The Trust-Region subproblem Previous: The Trust-Region subproblem   Contents
Frank Vanden Berghen 2004-04-19