

Some advice on how to use the code.

Each step $ s_k$ of the unconstrained algorithm is the solution of the following minimization problem (see Chapter 4):

$\displaystyle \min_{s \in \Re^n} q_k(s) = f_k + g_k^t s + \frac{1}{2} s^t H_k s \;\;$   subject to $\displaystyle \Vert s \Vert _2 < \Delta$ (12.2)

where $ q_k(s)$ is the local model of the objective function around $ x_k$, $ f_k = f(x_k)$, $ g_k \approx \nabla f(x_k)$ and $ H_k \approx \nabla^2 f(x_k)$. The size of the steps is limited by the trust region radius $ \Delta$, which represents the domain of validity of the local model $ q_k(s)$. It is assumed that the validity of the local model at the point $ x_k+s$ depends only on the distance $ \Vert s \Vert _2$ and not on the direction of $ s$. This assumption can be false: in some directions the model can be badly wrong even for small $ \Vert s \Vert _2$, while in other directions it can remain valid for large $ \Vert s \Vert _2$.
Currently the trust region is a simple ball (because we are using the L2-norm $ \Vert s \Vert _2$). If we were using the H-norm, the trust region would be an ellipsoid (see Section 12.4 on the H-norm). The H-norm allows us to link the validity of the model to both the norm and the direction of $ s$.
Since we are using the L2-norm, it is very important to scale the variables correctly. An example of bad scaling is given in Table 12.1.
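
As a rough illustration of problem (12.2), the sketch below evaluates a quadratic model and tests whether a step lies inside the ball-shaped trust region. It assumes the usual second-order form $ q_k(s) = f_k + g_k^t s + \frac{1}{2} s^t H_k s$ with a dense symmetric $ H_k$; the names `quadraticModel`, `norm2` and `insideTrustRegion` are illustrative and are not taken from the optimizer's code.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // dense storage, assumed square and symmetric

// Quadratic model q(s) = f + g^t s + 0.5 * s^t H s (assumed form).
double quadraticModel(double f, const Vec& g, const Mat& H, const Vec& s) {
    double q = f;
    for (std::size_t i = 0; i < s.size(); ++i) {
        q += g[i] * s[i];
        for (std::size_t j = 0; j < s.size(); ++j)
            q += 0.5 * s[i] * H[i][j] * s[j];
    }
    return q;
}

// L2-norm of a step.
double norm2(const Vec& s) {
    double sum = 0.0;
    for (double v : s) sum += v * v;
    return std::sqrt(sum);
}

// True when the step lies strictly inside the ball of radius delta.
// Only the length of s matters here, not its direction.
bool insideTrustRegion(const Vec& s, double delta) {
    return norm2(s) < delta;
}
```

Note that `insideTrustRegion` depends on the length of the step only, which is exactly the direction-independence assumption discussed above.
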

Table 12.1: Illustration of bad scaling

| Name | Normal Rosenbrock | Bad scale Rosenbrock | Bad scale Rosenbrock corrected using CorrectScaleOF |
|---|---|---|---|
| Objective function | $ \displaystyle 100*(x_2-x_1^2)^2+(1-x_1)^2$ | $ \displaystyle 100*(\frac{x_2}{1000}-x_1^2)^2+(1-x_1)^2$ | $ \displaystyle 100*(\frac{x_2}{1000}-x_1^2)^2+(1-x_1)^2$ |
| Starting point | $ (-1.2 \;\; 1)^t$ | $ (-1200 \;\; 1)^t$ | $ (-1200 \;\; 1)^t$ |
| $ \rho_{start}$ | 0.1 | 0.1 | 0.1 |
| $ \rho_{end}$ | 1e-5 | 1e-3 | 1e-5 |
| Number of function evaluations | 100 (89) | 376 (360) | 100 (89) |
| Best value found | 1.048530e-13 | 5.543738e-13 | 1.048569e-13 |
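
A short derivation may help to see why scaling interacts with a ball-shaped trust region. It assumes a diagonal rescaling $ x = D y$ with $ D = \mathrm{diag}(d_1, \ldots, d_n)$ and $ d_i > 0$; the symbols $ D$, $ y$, $ d_i$ and $ \tilde{s}$ are introduced here for illustration only. A step $ \tilde{s}$ in the scaled variables $ y$ corresponds to the step $ s = D \tilde{s}$ in the original variables, so that

$\displaystyle \Vert \tilde{s} \Vert _2 < \Delta \;\; \Longleftrightarrow \;\; \sum_{i=1}^n \frac{s_i^2}{d_i^2} < \Delta^2 $

Seen from the original variables, the ball is an axis-aligned ellipsoid with semi-axes $ d_i \Delta$. With an L2-norm trust region, choosing the scale of each variable therefore also chooses how far the model is trusted along that variable.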


The optimizer is fastest when all the variables are of the same order of magnitude. For example, do not mix angles expressed in radians (values around 1) with the height of a house expressed in millimeters (values around 10000). In such cases you have to scale or normalize the variables. The code includes a C++ class, ``CorrectScaleOF'', that can do this scaling automatically. The scaling factors used in CorrectScaleOF are either based on the values of the components of the starting point or given by the user.

The same advice (scaling) applies to the constraints: the evaluation of a constraint should give results of the same order of magnitude as the evaluation of the objective function.
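
One possible way to follow this advice is sketched below, under the assumption that a constraint is given as a function $ c(x)$ whose sign decides feasibility. Multiplying $ c$ by a positive factor does not change the feasible set but changes the magnitude of its values. The helpers `scaleConstraint` and `matchingFactor` are hypothetical names, not part of the optimizer.

```cpp
#include <cmath>
#include <functional>
#include <utility>
#include <vector>

using Vec = std::vector<double>;
using Func = std::function<double(const Vec&)>;

// Hypothetical helper: multiply a constraint by a positive factor.
// The sign of c(x), and hence the feasible set, is unchanged.
Func scaleConstraint(Func c, double factor) {
    return [c = std::move(c), factor](const Vec& x) { return factor * c(x); };
}

// One possible choice of factor (an assumption, not the author's rule):
// match the magnitudes of the objective f and the constraint c at a
// reference point x0, for instance the starting point.
double matchingFactor(const Func& f, const Func& c, const Vec& x0) {
    const double fc = std::fabs(c(x0));
    const double ff = std::fabs(f(x0));
    return (fc > 0.0 && ff > 0.0) ? ff / fc : 1.0;
}
```

The factor is computed once, before the optimization starts, so the constraint seen by the optimizer stays a fixed function.
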
Frank Vanden Berghen 2004-04-19