Documentation

This is machine translation

Translated by Microsoft
Mouseover text to see original. Click the button below to return to the English verison of the page.

Note: This page has been translated by MathWorks. Please click here
To view all translated materals including this page, select Japan from the country navigator on the bottom of this page.

First-Order Optimality Measure

What Is First-Order Optimality Measure?

First-order optimality is a measure of how close a point x is to optimal. Most Optimization Toolbox™ solvers use this measure, though it has different definitions for different algorithms. First-order optimality is a necessary condition, but it is not a sufficient condition. In other words:

  • The first-order optimality measure must be zero at a minimum.

  • A point with first-order optimality equal to zero is not necessarily a minimum.

For general information about first-order optimality, see Nocedal and Wright [31]. For specifics about the first-order optimality measures for Optimization Toolbox solvers, see Unconstrained Optimality, Constrained Optimality Theory, and Constrained Optimality in Solver Form.

Stopping Rules Related to First-Order Optimality

The OptimalityTolerance tolerance relates to the first-order optimality measure. Typically, if the first-order optimality measure is less than OptimalityTolerance, solver iterations end.

Some solvers or algorithms use relative first-order optimality as a stopping criterion. Solver iterations end if the first-order optimality measure is less than μ times OptimalityTolerance, where μ is either:

  • The infinity norm (maximum) of the gradient of the objective function at x0

  • The infinity norm (maximum) of inputs to the solver, such as f or b in linprog or H in quadprog

A relative measure attempts to account for the scale of a problem. Multiplying an objective function by a very large or small number does not change the stopping condition for a relative stopping criterion, but does change it for an unscaled one.

Solvers with enhanced exit messages state, in the stopping criteria details, when they use relative first-order optimality.

Unconstrained Optimality

For a smooth unconstrained problem,

minxf(x),

the first-order optimality measure is the infinity norm (meaning maximum absolute value) of f(x), which is:

first-order optimality measure = maxi|(f(x))i|=f(x).

This measure of optimality is based on the familiar condition for a smooth function to achieve a minimum: its gradient must be zero. For unconstrained problems, when the first-order optimality measure is nearly zero, the objective function has gradient nearly zero, so the objective function could be near a minimum. If the first-order optimality measure is not small, the objective function is not minimal.

Constrained Optimality Theory

This section summarizes the theory behind the definition of first-order optimality measure for constrained problems. The definition as used in Optimization Toolbox functions is in Constrained Optimality in Solver Form.

For a smooth constrained problem, let g and h be vector functions representing all inequality and equality constraints respectively (meaning bound, linear, and nonlinear constraints):

minxf(x) subject to g(x)0, h(x)=0.

The meaning of first-order optimality in this case is more complex than for unconstrained problems. The definition is based on the Karush-Kuhn-Tucker (KKT) conditions. The KKT conditions are analogous to the condition that the gradient must be zero at a minimum, modified to take constraints into account. The difference is that the KKT conditions hold for constrained problems.

The KKT conditions use the auxiliary Lagrangian function:

L(x,λ)=f(x)+λg,igi(x)+λh,ihi(x).(3-1)
The vector λ, which is the concatenation of λg and λh, is the Lagrange multiplier vector. Its length is the total number of constraints.

The KKT conditions are:

xL(x,λ)=0,(3-2)
λg,igi(x)=0 i,(3-3)
{g(x)0,h(x)=0,λg,i0.(3-4)
Solvers do not use the three expressions in Equation 3-4 in the calculation of optimality measure.

The optimality measure associated with Equation 3-2 is

xL(x,λ=f(x)+λg,igi(x)+λh,ihh,i(x).(3-5)
The optimality measure associated with Equation 3-3 is
λgg(x),(3-6)
where the norm in Equation 3-6 means infinity norm (maximum) of the vector λg,igi(x).

The combined optimality measure is the maximum of the values calculated in Equation 3-5 and Equation 3-6. Solvers that accept nonlinear constraint functions report constraint violations g(x) > 0 or |h(x)| > 0 as ConstraintTolerance violations. See Tolerances and Stopping Criteria.

Constrained Optimality in Solver Form

Most constrained toolbox solvers separate their calculation of first-order optimality measure into bounds, linear functions, and nonlinear functions. The measure is the maximum of the following two norms, which correspond to Equation 3-5 and Equation 3-6:

xL(x,λ=f(x)+ATλineqlin+AeqTλeqlin                       +λineqnonlin,ici(x)+λeqnonlin,iceqi(x),(3-7)
|lixi|λlower,i,|xiui|λupper,i,|(Axb)i|λineqlin,i,|ci(x)|λineqnonlin,i,(3-8)

where the norm of the vectors in Equation 3-7 and Equation 3-8 is the infinity norm (maximum). The subscripts on the Lagrange multipliers correspond to solver Lagrange multiplier structures. See Lagrange Multiplier Structures. The summations in Equation 3-7 range over all constraints. If a bound is ±Inf, that term is not constrained, so it is not part of the summation.

Linear Equalities Only

For some large-scale problems with only linear equalities, the first-order optimality measure is the infinity norm of the projected gradient. In other words, the first-order optimality measure is the size of the gradient projected onto the null space of Aeq.

Bounded Least-Squares and Trust-Region-Reflective Solvers

For least-squares solvers and trust-region-reflective algorithms, in problems with bounds alone, the first-order optimality measure is the maximum over i of |vi*gi|. Here gi is the ith component of the gradient, x is the current point, and

vi={|xibi|if the negative gradient points toward bound bi1otherwise.

If xi is at a bound, vi is zero. If xi is not at a bound, then at a minimizing point the gradient gi should be zero. Therefore the first-order optimality measure should be zero at a minimizing point.

Was this topic helpful?