Writing Scalar Objective Functions

Function Files

A scalar objective function file accepts one input, say x, and returns one real scalar output, say f. The input x can be a scalar, vector, or matrix. A function file can return more outputs (see Including Gradients and Hessians).

For example, suppose your objective is a function of three variables, x, y, and z:

f(x) = 3*(xy)4 + 4*(x + z)2 / (1 + x2 + y2 + z2) + cosh(x – 1) + tanh(y + z).

  1. Write this function as a file that accepts the vector xin = [x;y;z] and returns f:

    function f = myObjective(xin)
    f = 3*(xin(1)-xin(2))^4 + 4*(xin(1)+xin(3))^2/(1+norm(xin)^2) ...
        + cosh(xin(1)-1) + tanh(xin(2)+xin(3));
  2. Save it as a file named myObjective.m to a folder on your MATLAB® path.

  3. Check that the function evaluates correctly:

    myObjective([1;2;3])
    
    ans =
        9.2666

For information on how to include extra parameters, see Passing Extra Parameters. For more complex examples of function files, see Minimization with Gradient and Hessian Sparsity Pattern or Minimization with Bound Constraints and Banded Preconditioner.

Local Functions and Nested Functions

Functions can exist inside other files as local functions (MATLAB) or nested functions (MATLAB). Using local functions or nested functions can lower the number of distinct files you save. Using nested functions also lets you access extra parameters, as shown in Nested Functions.

For example, suppose you want to minimize the myObjective.m objective function, described in Function Files, subject to the ellipseparabola.m constraint, described in Nonlinear Constraints. Instead of writing two files, myObjective.m and ellipseparabola.m, write one file that contains both functions as local functions:

function [x fval] = callObjConstr(x0,options)
% Using a local function for just one file

if nargin < 2
    options = optimoptions('fmincon','Algorithm','interior-point');
end

[x fval] = fmincon(@myObjective,x0,[],[],[],[],[],[], ...
    @ellipseparabola,options);

function f = myObjective(xin)
f = 3*(xin(1)-xin(2))^4 + 4*(xin(1)+xin(3))^2/(1+sum(xin.^2)) ...
    + cosh(xin(1)-1) + tanh(xin(2)+xin(3));

function [c,ceq] = ellipseparabola(x)
c(1) = (x(1)^2)/9 + (x(2)^2)/4 - 1;
c(2) = x(1)^2 - x(2) - 1;
ceq = [];

Solve the constrained minimization starting from the point [1;1;1]:

[x fval] = callObjConstr(ones(3,1))

Local minimum found that satisfies the constraints.

Optimization completed because the objective function is 
non-decreasing in feasible directions, to within the default 
value of the function tolerance, and constraints are satisfied 
to within the default value of the constraint tolerance.

x =
    1.1835
    0.8345
   -1.6439

fval =
    0.5383

Anonymous Function Objectives

Use anonymous functions to write simple objective functions. For more information about anonymous functions, see What Are Anonymous Functions? (MATLAB). Rosenbrock's function is simple enough to write as an anonymous function:

anonrosen = @(x)(100*(x(2) - x(1)^2)^2 + (1-x(1))^2);
Check that anonrosen evaluates correctly at [-1 2]:
anonrosen([-1 2])

ans =
   104
Minimizing anonrosen with fminunc yields the following results:
options = optimoptions(@fminunc,'Algorithm','quasi-newton');
[x fval] = fminunc(anonrosen,[-1;2],options)

Local minimum found.

Optimization completed because the size of the gradient
is less than the default value of the function tolerance.

x =
    1.0000
    1.0000

fval =
  1.2266e-10

Including Gradients and Hessians

Provide Derivatives For Solvers

For fmincon and fminunc, you can include gradients in the objective function. Generally, solvers are more robust, and can be slightly faster when you include gradients. See Benefits of Including Derivatives. To also include second derivatives (Hessians), see Including Hessians.

The following table shows which algorithms can use gradients and Hessians.

SolverAlgorithmGradientHessian
fminconactive-setOptionalNo
interior-pointOptionalOptional (see Hessian for fmincon interior-point algorithm)
sqpOptionalNo
trust-region-reflectiveRequiredOptional (see Hessian for fminunc trust-region or fmincon trust-region-reflective algorithms)
fminuncquasi-newtonOptionalNo
trust-regionRequiredOptional (see Hessian for fminunc trust-region or fmincon trust-region-reflective algorithms)

How to Include Gradients

  1. Write code that returns:

    • The objective function (scalar) as the first output

    • The gradient (vector) as the second output

  2. Set the SpecifyObjectiveGradient option to true using optimoptions. If appropriate, also set the SpecifyConstraintGradient option to true.

  3. Optionally, check if your gradient function matches a finite-difference approximation. See Checking Validity of Gradients or Jacobians.

Tip

For most flexibility, write conditionalized code. Conditionalized means that the number of function outputs can vary, as shown in the following example. Conditionalized code does not error depending on the value of the SpecifyObjectiveGradient option. Unconditionalized code requires you to set options appropriately.

For example, consider Rosenbrock's function

f(x)=100(x2x12)2+(1x1)2,

which is described and plotted in Solve a Constrained Nonlinear Problem, Solver-Based. The gradient of f(x) is

f(x)=[400(x2x12)x12(1x1)200(x2x12)],

rosentwo is a conditionalized function that returns whatever the solver requires:

function [f,g] = rosentwo(x)
% Calculate objective f
f = 100*(x(2) - x(1)^2)^2 + (1-x(1))^2;

if nargout > 1 % gradient required
    g = [-400*(x(2)-x(1)^2)*x(1)-2*(1-x(1));
        200*(x(2)-x(1)^2)];
    
end

nargout checks the number of arguments that a calling function specifies. See Find Number of Function Arguments (MATLAB).

The fminunc solver, designed for unconstrained optimization, allows you to minimize Rosenbrock's function. Tell fminunc to use the gradient and Hessian by setting options:

options = optimoptions(@fminunc,'Algorithm','trust-region',...
    'SpecifyObjectiveGradient',true);

Run fminunc starting at [-1;2]:

[x fval] = fminunc(@rosentwo,[-1;2],options)
Local minimum found.

Optimization completed because the size of the gradient
is less than the default value of the function tolerance.

x =
    1.0000
    1.0000

fval =
  1.9886e-17

If you have a Symbolic Math Toolbox™ license, you can calculate gradients and Hessians automatically, as described in Symbolic Math Toolbox Calculates Gradients and Hessians.

Including Hessians

You can include second derivatives with the fmincon 'trust-region-reflective' and 'interior-point' algorithms, and with the fminunc 'trust-region' algorithm. There are several ways to include Hessian information, depending on the type of information and on the algorithm.

You must also include gradients (set SpecifyObjectiveGradient to true and, if applicable, SpecifyConstraintGradient to true) in order to include Hessians.

Hessian for fminunc trust-region or fmincon trust-region-reflective algorithms.  These algorithms either have no constraints, or have only bound or linear equality constraints. Therefore the Hessian is the matrix of second derivatives of the objective function.

Include the Hessian matrix as the third output of the objective function. For example, the Hessian H(x) of Rosenbrock’s function is (see How to Include Gradients)

H(x)=[1200x12400x2+2400x1400x1200].

Include this Hessian in the objective:

function [f, g, H] = rosenboth(x)
% Calculate objective f
f = 100*(x(2) - x(1)^2)^2 + (1-x(1))^2;

if nargout > 1 % gradient required
    g = [-400*(x(2)-x(1)^2)*x(1)-2*(1-x(1));
        200*(x(2)-x(1)^2)];
    
    if nargout > 2 % Hessian required
        H = [1200*x(1)^2-400*x(2)+2, -400*x(1);
            -400*x(1), 200];  
    end

end

Set HessianFcn to 'objective'. For example,

options = optimoptions('fminunc','Algorithm','trust-region',...
    'SpecifyObjectiveGradient',true,'HessianFcn','objective');

Hessian for fmincon interior-point algorithm.  The Hessian is the Hessian of the Lagrangian, where the Lagrangian L(x,λ) is

L(x,λ)=f(x)+λg,igi(x)+λh,ihi(x).

g and h are vector functions representing all inequality and equality constraints respectively (meaning bound, linear, and nonlinear constraints), so the minimization problem is

minxf(x) subject to g(x)0, h(x)=0.

For details, see Constrained Optimality Theory. The Hessian of the Lagrangian is

xx2L(x,λ)=2f(x)+λg,i2gi(x)+λh,i2hi(x).(1)

To include a Hessian, write a function with the syntax

hessian = hessianfcn(x,lambda)

hessian is an n-by-n matrix, sparse or dense, where n is the number of variables. If hessian is large and has relatively few nonzero entries, save running time and memory by representing hessian as a sparse matrix. lambda is a structure with the Lagrange multiplier vectors associated with the nonlinear constraints:

lambda.ineqnonlin
lambda.eqnonlin

fmincon computes the structure lambda and passes it to your Hessian function. hessianfcn must calculate the sums in Equation 1. Indicate that you are supplying a Hessian by setting these options:

options = optimoptions('fmincon','Algorithm','interior-point',...
    'SpecifyObjectiveGradient',true,'SpecifyConstraintGradient',true,...
    'HessianFcn',@hessianfcn);

For example, to include a Hessian for Rosenbrock’s function constrained to the unit disk x12+x221, notice that the constraint function g(x)=x12+x2210 has gradient and second derivative matrix

g(x)=[2x12x2]Hg(x)=[2002].

Write the Hessian function as

function Hout = hessianfcn(x,lambda)
% Hessian of objective
H = [1200*x(1)^2-400*x(2)+2, -400*x(1);
            -400*x(1), 200];
% Hessian of nonlinear inequality constraint
Hg = 2*eye(2);
Hout = H + lambda.ineqnonlin*Hg;

Save hessianfcn on your MATLAB path. To complete the example, the constraint function including gradients is

function [c,ceq,gc,gceq] = unitdisk2(x)
c = x(1)^2 + x(2)^2 - 1;
ceq = [ ];

if nargout > 2
    gc = [2*x(1);2*x(2)];
    gceq = [];
end

Solve the problem including gradients and Hessian.

fun = @rosenboth;
nonlcon = @unitdisk2;
x0 = [-1;2];
options = optimoptions('fmincon','Algorithm','interior-point',...
    'SpecifyObjectiveGradient',true,'SpecifyConstraintGradient',true,...
    'HessianFcn',@hessianfcn);
[x,fval,exitflag,output] = fmincon(fun,x0,[],[],[],[],[],[],@unitdisk2,options);

For other examples using an interior-point Hessian, see fmincon Interior-Point Algorithm with Analytic Hessian and Symbolic Math Toolbox Calculates Gradients and Hessians.

Hessian Multiply Function.  Instead of a complete Hessian function, both the fmincon interior-point and trust-region-reflective algorithms allow you to supply a Hessian multiply function. This function gives the result of a Hessian-times-vector product, without computing the Hessian directly. This can save memory. The SubproblemAlgorithm option must be 'cg' for a Hessian multiply function to work; this is the trust-region-reflective default.

The syntaxes for the two algorithms differ.

  • For the interior-point algorithm, the syntax is

    W = HessMultFcn(x,lambda,v);

    The result W should be the product H*v, where H is the Hessian of the Lagrangian at x (see Equation 1), lambda is the Lagrange multiplier (computed by fmincon), and v is a vector of size n-by-1. Set options as follows:

    options = optimoptions('fmincon','Algorithm','interior-point','SpecifyObjectiveGradient',true,... 
        'SpecifyConstraintGradient',true,'SubproblemAlgorithm','cg','HessianMultiplyFcn',@HessMultFcn);

    Supply the function HessMultFcn, which returns an n-by-1 vector, where n is the number of dimensions of x. The HessianMultiplyFcn option enables you to pass the result of multiplying the Hessian by a vector without calculating the Hessian.

  • The trust-region-reflective algorithm does not involve lambda:

    W = HessMultFcn(H,v);

    The result W = H*v. fmincon passes H as the value returned in the third output of the objective function (see Hessian for fminunc trust-region or fmincon trust-region-reflective algorithms). fmincon also passes v, a vector or matrix with n rows. The number of columns in v can vary, so write HessMultFcn to accept an arbitrary number of columns. H does not have to be the Hessian; rather, it can be anything that enables you to calculate W = H*v.

    Set options as follows:

    options = optimoptions('fmincon','Algorithm','trust-region-reflective',... 
        'SpecifyObjectiveGradient',true,'HessianMultiplyFcn',@HessMultFcn);

    For an example using a Hessian multiply function with the trust-region-reflective algorithm, see Minimization with Dense Structured Hessian, Linear Equalities.

Benefits of Including Derivatives

If you do not provide gradients, solvers estimate gradients via finite differences. If you provide gradients, your solver need not perform this finite difference estimation, so can save time and be more accurate, although a finite-difference estimate can be faster for complicated derivatives. Furthermore, solvers use an approximate Hessian, which can be far from the true Hessian. Providing a Hessian can yield a solution in fewer iterations. For example, see Compare to Optimization Without Gradients and Hessians.

For constrained problems, providing a gradient has another advantage. A solver can reach a point x such that x is feasible, but, for this x, finite differences around x always lead to an infeasible point. Suppose further that the objective function at an infeasible point returns a complex output, Inf, NaN, or error. In this case, a solver can fail or halt prematurely. Providing a gradient allows a solver to proceed. To obtain this benefit, you might also need to include the gradient of a nonlinear constraint function, and set the SpecifyConstraintGradient option to true. See Nonlinear Constraints.

Choose Input Hessian Approximation for interior-point fmincon

The fmincon interior-point algorithm has many options for selecting an input Hessian approximation. For syntax details, see Hessian as an Input. Here are the options, along with estimates of their relative characteristics.

HessianRelative Memory UsageRelative Efficiency
'bfgs' (default)High (for large problems)High
'lbfgs'Low to ModerateModerate
'fin-diff-grads'LowModerate
'HessianMultiplyFcn'Low (can depend on your code)Moderate
'HessianFcn'? (depends on your code)High (depends on your code)

Use the default 'bfgs' Hessian unless you

The reason 'lbfgs' has only moderate efficiency is twofold. It has relatively expensive Sherman-Morrison updates. And the resulting iteration step can be somewhat inaccurate due to the 'lbfgs' limited memory.

The reason 'fin-diff-grads' and HessianMultiplyFcn have only moderate efficiency is that they use a conjugate gradient approach. They accurately estimate the Hessian of the objective function, but they do not generate the most accurate iteration step. For more information, see fmincon Interior Point Algorithm, and its discussion of the LDL approach and the conjugate gradient approach to solving Equation 36.

Related Topics