If the objective function is locally quadratic and if we know the Hessian matrix $H$ exactly, then we will find the optimum in one iteration of the algorithm.
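As a small illustration (a sketch of ours, not from the text), a single Newton step on a purely quadratic function lands exactly on the minimizer; the matrix $A$ and vector $b$ below are arbitrary example data:

```python
import numpy as np

# Quadratic objective f(x) = 0.5 x^T A x - b^T x, whose Hessian is H = A.
# (A and b are arbitrary example data; A is symmetric positive definite.)
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

def grad(x):
    return A @ x - b  # gradient of the quadratic

x0 = np.array([5.0, -7.0])
# One Newton step: x1 = x0 - H^{-1} g(x0). Since f is exactly quadratic
# and H = A is exact, x1 is the minimizer (its gradient vanishes).
x1 = x0 - np.linalg.solve(A, grad(x0))
print(np.allclose(grad(x1), 0.0))  # True: optimum reached in one iteration
```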
Unfortunately, we usually don't have $H$, but only an approximation of it. For example, if this approximation is constructed using several ``BFGS updates'', it becomes close to the real value of the Hessian only after $n$ updates (where $n$ is the dimension of the search space). It means we will have to wait at least $n$ iterations before going in ``one step'' to the optimum. In fact, the algorithm becomes ``only'' super-linearly convergent.
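For reference, one ``BFGS update'' of the Hessian approximation can be sketched as follows (a sketch of ours; variable names are illustrative). Each update forces the secant condition $H_{k+1} s_k = y_k$ along one step direction, which is why roughly $n$ independent updates are needed before the approximation resembles an $n \times n$ Hessian:

```python
import numpy as np

def bfgs_update(H, s, y):
    """One BFGS update of the Hessian approximation H, where
    s = x_{k+1} - x_k and y = g_{k+1} - g_k (requires y^T s > 0)."""
    Hs = H @ s
    return (H
            - np.outer(Hs, Hs) / (s @ Hs)   # remove old curvature along s
            + np.outer(y, y) / (y @ s))     # insert the measured curvature

# Start from the identity and apply one update (example data).
H0 = np.eye(2)
s = np.array([1.0, 0.5])
y = np.array([2.0, 1.5])      # y @ s > 0, so the update is well defined
H1 = bfgs_update(H0, s, y)
print(np.allclose(H1 @ s, y))  # True: the secant condition holds
```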
How to improve Newton's method: the Zoutendijk theorem.
We have seen that Newton's method is very fast (quadratic convergence) but has no global convergence property ($H$ can be negative definite). We must search for an alternative method which has a global convergence property. One way to prove the global convergence of a method is to use the Zoutendijk theorem.
Let us define a general method:
1. find a search direction $s_k$;
2. search for the minimum in this direction using 1D-search techniques and find $\alpha_k$ with
   $x_{k+1} = x_k + \alpha_k s_k$;    (13.3)
3. increment $k$. Stop if $\|g_k\| \approx 0$, otherwise go to step 1.
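The three steps above can be sketched as follows (a minimal sketch of ours, using the steepest descent direction and a simple backtracking search that enforces only a sufficient-decrease condition, not the full pair of conditions discussed next):

```python
import numpy as np

def descent(f, grad, x, tol=1e-8, max_iter=1000):
    """Generic descent method: direction, 1D-search, update, stop test."""
    for k in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:       # stop if the gradient vanishes
            break
        s = -g                            # step 1: search direction
        alpha = 1.0                       # step 2: backtracking 1D-search
        while f(x + alpha * s) > f(x) + 1e-4 * alpha * (g @ s):
            alpha *= 0.5
        x = x + alpha * s                 # x_{k+1} = x_k + alpha_k s_k (13.3)
    return x

# Example: minimize a simple quadratic (example data).
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x_star = descent(lambda x: 0.5 * x @ A @ x - b @ x,
                 lambda x: A @ x - b,
                 np.zeros(2))
print(np.allclose(x_star, np.linalg.solve(A, b)))  # True
```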
The 1D-search must respect the Wolfe conditions. If we define $g_k = \nabla f(x_k)$, we have:
$f(x_k + \alpha_k s_k) \le f(x_k) + c_1 \, \alpha_k \, g_k^T s_k$    (13.4)
$g(x_k + \alpha_k s_k)^T s_k \ge c_2 \, g_k^T s_k$    (13.5)
with $0 < c_1 < c_2 < 1$.
Figure 13.1: bounds on $\alpha_k$: Wolfe conditions
The objective of the Wolfe conditions is to give a lower bound (Equation 13.5) and an upper bound (Equation 13.4) on the value of $\alpha_k$, such that the 1D-search algorithm is easier: see Figure 13.1. Equation 13.4 expresses that the objective function must be reduced sufficiently. Equation 13.5 prevents too small steps. The parameter $c_2$ defines the precision of the 1D-line search:
exact line search: $c_2 \approx 0.1$
inexact line search: $c_2 \approx 0.9$
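The two conditions can be written directly as a test (a sketch of ours; the default values of $c_1$ and $c_2$ below are common choices respecting $0 < c_1 < c_2 < 1$, not prescriptions of the text):

```python
import numpy as np

def satisfies_wolfe(f, grad, x, s, alpha, c1=1e-4, c2=0.9):
    """Check the two Wolfe conditions for a step alpha along direction s."""
    g0_s = grad(x) @ s                                        # g_k^T s_k
    decrease = f(x + alpha * s) <= f(x) + c1 * alpha * g0_s   # Eq. 13.4
    curvature = grad(x + alpha * s) @ s >= c2 * g0_s          # Eq. 13.5
    return decrease and curvature

# Example on a 1D quadratic f(x) = x^2 (example data): from x = 1 along
# s = -grad(x), the exact minimizing step alpha = 0.5 satisfies both
# conditions, while the far-too-long step alpha = 2 violates Eq. 13.4.
f = lambda x: float(x @ x)
grad_f = lambda x: 2.0 * x
x = np.array([1.0])
s = -grad_f(x)
print(satisfies_wolfe(f, grad_f, x, s, alpha=0.5))  # True
print(satisfies_wolfe(f, grad_f, x, s, alpha=2.0))  # False
```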
We must also define $\theta_k$, which is the angle between the steepest descent direction ($-g_k$) and the current search direction ($s_k$):
$\cos \theta_k = \dfrac{-g_k^T s_k}{\|g_k\| \, \|s_k\|}$    (13.6)
Under the following assumptions:
$f$ is bounded below;
$f$ is continuously differentiable in a neighborhood $\mathcal{N}$ of the level set $\{x : f(x) \le f(x_0)\}$, with a Lipschitz continuous gradient on $\mathcal{N}$;
the Zoutendijk theorem states that
$\sum_{k \ge 0} \cos^2 \theta_k \, \|g_k\|^2 < \infty$.    (13.8)
We know that $f$ is bounded below; we also know, from Equation 13.12, that the Wolfe conditions imply
$f_{k+1} \le f_k - c \, \cos^2 \theta_k \, \|g_k\|^2$ for some constant $c > 0$.
So, for a given large value of $k$ (and for all the values above), we will have $f_k - f_{k+1} \approx 0$. The sum on the left side of Equation 13.14 converges and is finite:
$\sum_{k \ge 0} \cos^2 \theta_k \, \|g_k\|^2 \le \dfrac{f_0 - \inf f}{c} < \infty$.
Thus, we have:
$\lim_{k \to \infty} \cos^2 \theta_k \, \|g_k\|^2 = 0$.
To make an algorithm globally convergent, we can use what is called an ``angle test''. It consists of always choosing a search direction such that $\cos \theta_k \ge \delta > 0$ for some fixed $\delta$. This means that the search direction does not tend to become perpendicular to the gradient. Using the Zoutendijk theorem (see Equation 13.8), we obtain:
$\lim_{k \to \infty} \|g_k\| = 0$,
which means that the algorithm is globally convergent.
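The angle test can be coded as a small guard on the search direction (a sketch of ours; the threshold $\delta = 10^{-3}$ is an illustrative choice, not a value from the text):

```python
import numpy as np

def angle_test(g, s, delta=1e-3):
    """Return True if cos(theta_k) = -g^T s / (||g|| ||s||) >= delta,
    i.e. the direction s is not close to perpendicular to the gradient."""
    cos_theta = -(g @ s) / (np.linalg.norm(g) * np.linalg.norm(s))
    return cos_theta >= delta

g = np.array([1.0, 0.0])
print(angle_test(g, np.array([-1.0, 0.0])))  # True: steepest descent, cos = 1
print(angle_test(g, np.array([0.0, 1.0])))   # False: perpendicular, cos = 0
```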
Applying the ``angle test'' to Newton's method prevents quadratic convergence, since it can reject the Newton direction near the optimum. We must not use it.
If, regularly (let's say every $n$ iterations), we make a ``steepest descent'' step, we will have, for this step, $\cos \theta_k = 1$. It will be impossible to have $\cos \theta_k \to 0$. So, using the Zoutendijk theorem, the only possibility left is that
$\liminf_{k \to \infty} \|g_k\| = 0$.
The algorithm is now globally convergent.
Frank Vanden Berghen
2004-04-19