By Alan Weiss, MathWorks
Most Optimization Toolbox™ solvers run faster and more accurately when your objective and constraint function files include derivative calculations. Some solvers also benefit from second derivatives, or Hessians. While calculating a derivative is straightforward, it is also quite tedious and error-prone. Calculating second derivatives is even more tedious and fraught with opportunities for error. How can you get your solver to run faster and more accurately without the pain of computing derivatives manually?
This article demonstrates how Symbolic Math Toolbox™ eases the calculation and use of gradients and Hessians. The techniques apply to almost any optimization problem whose objective and constraint functions can be defined analytically, that is, as closed-form expressions rather than simulations or black-box functions.
Suppose we want to minimize the function x + y + cosh(x – 1.1y) + sinh(z/4) over the region defined by the implicit equation z² = sin(z – x²y²), –1 ≤ x ≤ 1, –1 ≤ y ≤ 1, 0 ≤ z ≤ 1.
The region is shown in Figure 1.
The fmincon solver from Optimization Toolbox solves nonlinear optimization problems with nonlinear constraints. To formulate our problem for fmincon, we first write the objective and constraint functions symbolically.
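As a minimal sketch of this step, the objective and the implicit constraint might be written symbolically as follows. The variable names x1, x2, x3 (standing for x, y, z), obj, and w are illustrative assumptions rather than the article's original listing; the constraint is rewritten as w = 0 so that it can serve as a nonlinear equality constraint.

```matlab
% Requires Symbolic Math Toolbox.
% x1, x2, x3 stand for x, y, z in the text (names are assumptions).
syms x1 x2 x3 real
x = [x1; x2; x3];                 % column vector of decision variables

% Objective: x + y + cosh(x - 1.1*y) + sinh(z/4)
obj = x1 + x2 + cosh(x1 - 1.1*x2) + sinh(x3/4);

% Implicit constraint z^2 = sin(z - x^2*y^2), rearranged so that w = 0
w = x3^2 - sin(x3 - x1^2*x2^2);
```

Collecting the variables in a single column vector makes it straightforward to differentiate with respect to all of them at once later on.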
We then generate function handles for numerical computation with matlabFunction from Symbolic Math Toolbox.
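A sketch of how the handles could be generated and passed to fmincon is shown below. The starting point x0 and the names objective, constraint, and xfinal are assumptions for illustration, and the options are set with optimoptions, which is available in recent releases (a 2010 listing would likely have used optimset). Iteration and evaluation counts depend on the release and the starting point, so they may differ from those reported in this article.

```matlab
% Requires Symbolic Math Toolbox and Optimization Toolbox.
syms x1 x2 x3 real
x   = [x1; x2; x3];
obj = x1 + x2 + cosh(x1 - 1.1*x2) + sinh(x3/4);
w   = x3^2 - sin(x3 - x1^2*x2^2);

% 'vars',{x} makes each handle accept one column-vector argument,
% which is the calling convention fmincon uses.
objective = matlabFunction(obj, 'vars', {x});

% fmincon expects the constraint function to return [c, ceq].
% There are no nonlinear inequalities here, so c is empty.
constraint = matlabFunction([], w, 'vars', {x});

lb = [-1; -1; 0];                 % bounds from the problem statement
ub = [ 1;  1; 1];
x0 = [0; 0; 0.5];                 % hypothetical starting point

options = optimoptions('fmincon', 'Algorithm', 'interior-point');
[xfinal, fval, exitflag, output] = fmincon(objective, x0, ...
    [], [], [], [], lb, ub, constraint, options);
```

The output structure returned as the fourth argument holds the iteration and function-evaluation counts.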
The returned output structure shows that it took fmincon 20 iterations and 99 function evaluations to solve the problem. The solution point x (the yellow sphere in the plot in Figure 3) is [-0.8013; -0.6122; 0.4077].
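To include derivatives of the objective and constraint functions in the calculation, we perform three steps:
1. Compute the derivatives with the jacobian function from Symbolic Math Toolbox.
2. Generate objective and constraint function handles that also return the derivatives, using matlabFunction.
3. Set fmincon options to use the derivatives.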
The following code shows how to include gradients for the example.
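The gradient version might look like the sketch below. The names gradobj and gradw follow the text; obj, w, objective, constraint, and the starting point x0 are illustrative assumptions. The option names SpecifyObjectiveGradient and SpecifyConstraintGradient assume a recent release; older releases used GradObj and GradConstr set to 'on'.

```matlab
% Requires Symbolic Math Toolbox and Optimization Toolbox.
syms x1 x2 x3 real
x   = [x1; x2; x3];
obj = x1 + x2 + cosh(x1 - 1.1*x2) + sinh(x3/4);
w   = x3^2 - sin(x3 - x1^2*x2^2);

% jacobian returns a row vector for a scalar function;
% the .' transpose turns it into a 3-by-1 column gradient.
gradobj = jacobian(obj, x).';
gradw   = jacobian(w, x).';

% Objective handle returns [f, gradf].
objective = matlabFunction(obj, gradobj, 'vars', {x});

% Constraint handle returns [c, ceq, gradc, gradceq];
% the inequality outputs are empty.
constraint = matlabFunction([], w, [], gradw, 'vars', {x});

lb = [-1; -1; 0];
ub = [ 1;  1; 1];
x0 = [0; 0; 0.5];                 % hypothetical starting point

options = optimoptions('fmincon', 'Algorithm', 'interior-point', ...
    'SpecifyObjectiveGradient', true, ...
    'SpecifyConstraintGradient', true);
[xfinal, fval, exitflag, output] = fmincon(objective, x0, ...
    [], [], [], [], lb, ub, constraint, options);
```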
Notice that the jacobian function is followed by .'. This transpose ensures that gradw and gradobj are column vectors, the preferred orientation for Optimization Toolbox solvers. matlabFunction creates a function handle for evaluating both the function and its gradient. Notice, too, that we were able to calculate the gradient of the constraint function even though the function is implicit.
The output structure shows that fmincon computed the solution in 20 iterations, just as it did without gradients. fmincon with gradients evaluated the nonlinear functions at 36 points, compared to 99 points without gradients.
A Hessian function lets us solve the problem even more efficiently. For the interior-point algorithm, we write a function that is the Hessian of the Lagrangian. This means that if ƒ is the objective function, c is the vector of nonlinear inequality constraints, ceq is the vector of nonlinear equality constraints, and λ is the vector of associated Lagrange multipliers, the Hessian H is
$$H = \nabla^2 f(x) + \sum_i \lambda_i \nabla^2 c_i(x) + \sum_j \lambda_j \nabla^2 ceq_j(x).$$
Here ∇²u represents the matrix of second derivatives with respect to x of the function u.
fmincon generates the Lagrange multipliers in a MATLAB® structure. The relevant multipliers are lambda.ineqnonlin and lambda.eqnonlin, corresponding to indices i and j in the equation for H. We include multipliers in the Hessian function, and then run the optimization [1].
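One way to assemble such a Hessian function is sketched below. Because this problem has a single nonlinear equality constraint and no nonlinear inequalities, only lambda.eqnonlin(1) appears. The names hessobj, hessw, hessfinal, and the starting point x0 are assumptions for illustration, and the HessianFcn option name assumes a recent release (older releases used the Hessian and HessFcn options).

```matlab
% Requires Symbolic Math Toolbox and Optimization Toolbox.
syms x1 x2 x3 real
x   = [x1; x2; x3];
obj = x1 + x2 + cosh(x1 - 1.1*x2) + sinh(x3/4);
w   = x3^2 - sin(x3 - x1^2*x2^2);

gradobj = jacobian(obj, x).';     % 3-by-1 gradients
gradw   = jacobian(w, x).';

% Differentiating the gradient once more gives the 3-by-3 Hessians.
hessobj = jacobian(gradobj, x);
hessw   = jacobian(gradw, x);

objective  = matlabFunction(obj, gradobj, 'vars', {x});
constraint = matlabFunction([], w, [], gradw, 'vars', {x});
hessfobj   = matlabFunction(hessobj, 'vars', {x});
hessfw     = matlabFunction(hessw, 'vars', {x});

% Hessian of the Lagrangian: objective Hessian plus the multiplier
% times the equality-constraint Hessian.
hessfinal = @(v, lambda) hessfobj(v) + lambda.eqnonlin(1)*hessfw(v);

lb = [-1; -1; 0];
ub = [ 1;  1; 1];
x0 = [0; 0; 0.5];                 % hypothetical starting point

options = optimoptions('fmincon', 'Algorithm', 'interior-point', ...
    'SpecifyObjectiveGradient', true, ...
    'SpecifyConstraintGradient', true, ...
    'HessianFcn', hessfinal);
[xfinal, fval, exitflag, output] = fmincon(objective, x0, ...
    [], [], [], [], lb, ub, constraint, options);
```

The Hessian function takes the current point and the multiplier structure as its two arguments, which is the signature the interior-point algorithm expects.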
The output structure shows that including a Hessian results in fewer iterations (10 instead of 20), a lower function count (11 instead of 36), and a better first-order optimality measure (2e-8 instead of 8e-8).
[1] For nonlinear equality constraints in Optimization Toolbox version 9b or earlier, you must subtract, not add, the Lagrange multiplier. See bug report.
Published 2010 - 91801v00