Optimization Toolbox 4.3
Using Symbolic Mathematics with Optimization Toolbox™ Solvers
Optimization Toolbox™ solvers are usually more accurate and efficient when you supply gradients and Hessians of the objective and constraint functions. This demo shows how to use the Symbolic Math Toolbox™ functions named jacobian and matlabFunction to provide these derivatives to optimization solvers.
Additional Requirements:
- Symbolic Math Toolbox
There are several considerations in using symbolic calculations with optimization functions:
1. Optimization objective and constraint functions should be defined in terms of a vector, say x. However, symbolic variables are scalar or complex-valued, not vector-valued. This requires you to translate between vectors and scalars.
2. Optimization gradients, and sometimes Hessians, are supposed to be calculated within the body of the objective or constraint functions. This means that a symbolic gradient or Hessian has to be placed in the appropriate place in the objective or constraint function file or function handle.
3. Calculating gradients and Hessians symbolically can be time-consuming. Therefore you should perform this calculation only once, and generate code, via matlabFunction, to call during execution of the solver.
4. Evaluating symbolic expressions with the subs function is time-consuming. It is much more efficient to use matlabFunction.
5. matlabFunction generates code that depends on the orientation of input vectors. Since fmincon calls the objective function with column vectors, you must be careful to call matlabFunction with column vectors of symbolic variables.
Contents
First Example: Unconstrained Minimization with Hessian
The objective function to minimize is:

This function is positive, with a unique minimum value of zero attained at x1 = 4/3, x2 =(4/3)^3 - 4/3 = 1.0370...
We write the independent variables as x1 and x2 because in this form they can be used as symbolic variables. As components of a vector x they would be written x(1) and x(2). The function has a twisty valley as depicted in the plot below.
syms x1 x2 real x = [x1;x2]; % column vector of symbolic variables f = log(1 + 3*(x2 - (x1^3 - x1))^2 + (x1 - 4/3)^2); ezsurfc(f,[-2 2]) view(127,38)
Compute the gradient and Hessian of f:
gradf = jacobian(f,x); hessf = jacobian(gradf,x); pretty(gradf) pretty(hessf)
+-
-+
| 2 3 8
|
| - 2 x1 + 6 (3 x1 - 1) (- x1 + x1 + x2) + - 3
|
| 3 - 6 x1 + 6 x1
+ 6 x2 |
| - --------------------------------------------, -----------------------
--------------- |
| / 4 \2 3 2 / 4 \2 3
2 |
| | x1 - - | + 3 (- x1 + x1 + x2) + 1 | x1 - - | + 3 (- x1
+ x1 + x2) + 1 |
| \ 3 / \ 3 /
|
+-
-+
+-
-+
| / 2 3 8 \2
3 / 2 3
8 \ |
| | - 2 x1 + 6 (3 x1 - 1) (- x1 + x1 + x2) + - |
3 2 2 2
(- 6 x1 + 6 x1 + 6 x2) | - 2 x1 + 6 (3 x1 - 1) (- x1 + x1 +
x2) + - | |
| \ 3 / - 36 x1 (
- x1 + x1 + x2) + 6 (3 x1 - 1) + 2 18 x1 - 6
\
3 / |
| - ------------------------------------------------- + ---------
-------------------------------------, - ----------------------------
---------- + ---------------------------------------------------------------
--------- |
| / / 4 \2 3 2 \2 /
4 \2 3 2 / 4 \2 3
2 / / 4 \2 3 2 \2
|
| | | x1 - - | + 3 (- x1 + x1 + x2) + 1 | | x1
- - | + 3 (- x1 + x1 + x2) + 1 | x1 - - | + 3 (- x1 + x1
+ x2) + 1 | | x1 - - | + 3 (- x1 + x1 + x2) + 1 |
|
| \ \ 3 / / \
3 / \ 3 /
\ \ 3 / /
|
|
|
| 3 /
2 3 8 \
|
| 2 (- 6 x1 + 6 x1 + 6 x2) | -
2 x1 + 6 (3 x1 - 1) (- x1 + x1 + x2) + - |
3 2
|
| 18 x1 - 6 \
3 /
6 (- 6 x1 + 6 x1 + 6 x2)
|
| - -------------------------------------- + ----------------------------
--------------------------------------------, --------------
------------------------ - -------------------------------------------
|
| / 4 \2 3 2 / / 4 \2
3 2 \2 / 4 \2
3 2 / / 4 \2 3 2 \2
|
| | x1 - - | + 3 (- x1 + x1 + x2) + 1 | | x1 - - |
+ 3 (- x1 + x1 + x2) + 1 | | x1 - - | +
3 (- x1 + x1 + x2) + 1 | | x1 - - | + 3 (- x1 + x1 + x2) + 1 |
|
| \ 3 / \ \ 3 /
/ \ 3 /
\ \ 3 / /
|
+-
-+
The fminunc solver expects to pass in a vector x, and, with the GradObj and Hessian options set to 'on', expects a list of three outputs: [f(x),gradf(x),hessf(x)]
matlabFunction generates exactly this list of three outputs from a list of three inputs. Furthermore, using the vars option, matlabFunction accepts vector inputs.
fh = matlabFunction(f,gradf,hessf,'vars',{x});
Now solve the minimization problem starting at the point [-1,2]:
options = optimset('GradObj','on','Hessian','on','Display','final'); [xfinal fval exitflag output] = fminunc(fh,[-1;2],options)
Local minimum possible.
fminunc stopped because the final change in function value relative to
its initial value is less than the default value of the function tolerance.
xfinal =
1.333332064357340
1.037030117814729
fval =
7.662315226723625e-012
exitflag =
3
output =
iterations: 14
funcCount: 15
cgiterations: 11
firstorderopt: 3.439061368233648e-005
algorithm: 'large-scale: trust-region Newton'
message: [1x475 char]
Compare this with the number of iterations using no gradient or Hessian information. This requires the medium-scale algorithm, obtained by setting the LargeScale option to 'off':
options = optimset('Display','final','LargeScale','off'); fh2 = matlabFunction(f,'vars',{x}); % objective with no gradient or Hes sian [xfinal fval exitflag output2] = fminunc(fh2,[-1;2],options)
Local minimum found.
Optimization completed because the size of the gradient is less than
the default value of the function tolerance.
xfinal =
1.333337991777384
1.037057531377053
fval =
2.198508042259553e-011
exitflag =
1
output2 =
iterations: 18
funcCount: 81
stepsize: 1
firstorderopt: 2.458683006702789e-006
algorithm: 'medium-scale: Quasi-Newton line search'
message: [1x441 char]
The number of iterations is lower when using gradients and Hessians, and there are dramatically fewer function evaluations:
sprintf(['There were %d iterations using gradient and Hessian, but' ... ' %d without them.'],output.iterations,output2.iterations) sprintf(['There were %d function evaluations using gradient and Hessian,' .. . ' but %d without them.'],output.funcCount,output2.funcCount)
ans = There were 14 iterations using gradient and Hessian, but 18 without them. ans = There were 15 function evaluations using gradient and Hessian, but 81 withou t them.
Interpreting the Solutions
You may have noticed that fminunc stopped with exitflag 3 when using gradients and Hessians, and with exitflag 5 when using no derivative information. You might expect to obtain an exitflag of 1 at an optimum point. Yet both runs arrived at the correct minimum point. The reason the solver is not certain that the final points are optimal is the objective function f has a poorly-conditoned Hessian. (The objective function was specifically designed to be hard to optimize numerically.) The objective function is quite flat in one direction, but sharply curved in the other direction. The solver cannot be certain that it arrived at a true optimum. The condition number is over 1000, which is fairly large for a two-dimensional problem.
hessfh = matlabFunction(hessf,'vars',{x});
cond(hessfh(xfinal))
ans =
1.211417181851912e+003
Second Example: Constrained Minimization Using the fmincon Interior-Point Algorithm
We consider the same objective function and starting point, but now have two nonlinear constraints:


The constraints keep the optimization away from the global minimum point [1.333,1.037]. We plot the two constraints, with a small red circle around the optimal point:
[X,Y] = meshgrid(-2:.01:3); Z = (5*sinh(Y./5) >= X.^4); % Z=1 where the first constraint is satisfied , =0 otherwise Z = Z+ 2*(5*tanh(X./5) >= Y.^2 - 1); % Z=2 where the second is satisfied, =3 where both are surf(X,Y,Z,'LineStyle','none'); set(gcf,'Color','w') % white background view(0,90) hold on plot3(.4396, .0373, 4,'o','MarkerEdgeColor','r','MarkerSize',8); % best poin t hold off
Here is a plot of the objective function over the feasible region, the region that satisfies both constraints, pictured above in dark red, along with a small red circle around the optimal point:
W = log(1 + 3*(Y - (X.^3 - X)).^2 + (X - 4/3).^2); % the objective function W(Z < 3) = nan; % plot only where the constraints are satisfied surf(X,Y,W,'LineStyle','none'); view(68,20) hold on plot3(.4396, .0373, .8152,'o','MarkerEdgeColor','r','MarkerSize',8); % best point hold off
The nonlinear constraints must be written in the form c(x) <= 0. We compute all the symbolic constraints and their derivatives, and place them in a function handle using matlabFunction.
The gradients of the constraints should be column vectors; they must be placed in the objective function as a matrix, with each column of the matrix representing the gradient of one constraint function. This is the transpose of the form generated by jacobian, so we take the transpose below.
We place the nonlinear constraints into a function handle. fmincon expects the nonlinear constraints and gradients to be output in the order [c ceq gradc gradceq]. Since there are no nonlinear equality constraints, we output [ ] for ceq and gradceq.
c1 = x1^4 - 5*sinh(x2/5); c2 = x2^2 - 5*tanh(x1/5) - 1; c = [c1 c2]; gradc = jacobian(c,x).'; % transpose to put gradient in correct form constraint = matlabFunction(c,[],gradc,[],'vars',{x});
The interior-point algorithm requires its Hessian function to be written as a separate function, instead of being part of the objective function. This is because a nonlinearly constrained function needs to include those constraints in its Hessian. Its Hessian is the Hessian of the Lagrangian; see the User's Guide for more information.
The Hessian function takes two input arguments: the position vector x, and the Lagrange multiplier structure lambda. The parts of the lambda structure that you use for nonlinear constraints are lambda.ineqnonlin and lambda.eqnonlin. For the current constraint, there are no linear equalities, so we use the two multipliers lambda.ineqnonlin(1) and lambda.ineqnonlin(2).
We calculated the Hessian of the objective function in the first example. Now we calculate the Hessians of the two constraint functions, and make function handle versions with matlabFunction.
hessc1 = jacobian(gradc(:,1),x); % first constraint is first column of c hessc2 = jacobian(gradc(:,2),x); hessc1h = matlabFunction(hessc1,'vars',{x}); hessc2h = matlabFunction(hessc2,'vars',{x});
To make the final Hessian, we put the three Hessians together, adding the appropriate Lagrange multipliers to the constraint functions.
myhess = @(x,lambda)(hessfh(x) + ... lambda.ineqnonlin(1)*hessc1h(x) + ... lambda.ineqnonlin(2)*hessc2h(x));
Set the options to use the interior-point algorithm, the gradient, and the Hessian, have the objective function return two outputs, and run the solver:
options = optimset('Algorithm','interior-point','GradObj','on',... 'GradConstr','on','Hessian','user-supplied','HessFcn',myhess,... 'Display','final'); fh2 = matlabFunction(f,gradf,'vars',{x}); % objective without Hessian [xfinal fval exitflag output] = fmincon(fh2,[-1;2],[],[],[],[],[],[],... constraint,options)
Local minimum found that satisfies the constraints.
Optimization completed because the objective function is non-decreasing in
feasible directions, to within the default value of the function tolerance,
and constraints were satisfied to within the default value of the constraint
tolerance.
xfinal =
0.439569087135554
0.037334020327260
fval =
0.815247080686200
exitflag =
1
output =
iterations: 10
funcCount: 13
constrviolation: 0
stepsize: 1.916032009774339e-006
algorithm: 'interior-point'
firstorderopt: 1.921701378582164e-008
cgiterations: 0
message: [1x791 char]
Again, the solver makes many fewer iterations and function evaluations with gradient and Hessian supplied than when they are not:
options = optimset('Algorithm','interior-point','Display','final'); fh3 = matlabFunction(f,'vars',{x}); % objective without gradient or Hes sian constraint = matlabFunction(c,[],'vars',{x}); % constraint without grad ient [xfinal fval exitflag output2] = fmincon(fh3,[-1;2],[],[],[],[],[],[],... constraint,options) sprintf(['There were %d iterations using gradient and Hessian, but' ... ' %d without them.'],output.iterations,output2.iterations) sprintf(['There were %d function evaluations using gradient and Hessian,' .. . ' but %d without them.'],output.funcCount,output2.funcCount)
Local minimum found that satisfies the constraints.
Optimization completed because the objective function is non-decreasing in
feasible directions, to within the default value of the function tolerance,
and constraints were satisfied to within the default value of the constraint
tolerance.
xfinal =
0.439568846240534
0.037334303592384
fval =
0.815247460667077
exitflag =
1
output2 =
iterations: 17
funcCount: 54
constrviolation: 0
stepsize: 4.241701937721570e-006
algorithm: 'interior-point'
firstorderopt: 3.843306486672748e-007
cgiterations: 0
message: [1x791 char]
ans =
There were 10 iterations using gradient and Hessian, but 17 without them.
ans =
There were 13 function evaluations using gradient and Hessian, but 54 withou
t them.
Cleaning Up Symbolic Variables
The symbolic variables used in this demo were assumed to be real. To clear this assumption from the symbolic engine workspace, it is not sufficient to delete the variables. You must either clear the variables using the syntax
syms x1 x2 clear
or reset the symbolic engine using the command
% reset(symengine) % uncomment this line to reset the engine
After resetting the symbolic engine you should clear all symbolic variables from the MATLAB® workspace:
% clear % uncomment this line to clear the variables
Store