Note: This page has been translated by MathWorks. Please click here

To view all translated materals including this page, select Japan from the country navigator on the bottom of this page.

To view all translated materals including this page, select Japan from the country navigator on the bottom of this page.

First-order optimality is a measure of how close a point *x* is
to optimal. Most Optimization
Toolbox™ solvers use this measure,
though it has different definitions for different algorithms. First-order
optimality is a necessary condition, but it is not a sufficient condition.
In other words:

The first-order optimality measure must be zero at a minimum.

A point with first-order optimality equal to zero is not necessarily a minimum.

For general information about first-order optimality, see Nocedal and Wright [31]. For specifics about the first-order optimality measures for Optimization Toolbox solvers, see Unconstrained Optimality, Constrained Optimality Theory, and Constrained Optimality in Solver Form.

The `OptimalityTolerance`

tolerance relates
to the first-order optimality measure. Typically, if the first-order
optimality measure is less than `OptimalityTolerance`

,
solver iterations end.

Some solvers or algorithms use *relative* first-order
optimality as a stopping criterion. Solver iterations end if the first-order
optimality measure is less than *μ* times `OptimalityTolerance`

,
where *μ* is either:

The infinity norm (maximum) of the gradient of the objective function at

`x0`

The infinity norm (maximum) of inputs to the solver, such as

`f`

or`b`

in`linprog`

or`H`

in`quadprog`

A relative measure attempts to account for the scale of a problem. Multiplying an objective function by a very large or small number does not change the stopping condition for a relative stopping criterion, but does change it for an unscaled one.

Solvers with enhanced exit messages state, in the stopping criteria details, when they use relative first-order optimality.

For a smooth unconstrained problem,

$$\underset{x}{\mathrm{min}}f(x),$$

$$\text{first-orderoptimalitymeasure=}\underset{i}{\mathrm{max}}\left|{\left(\nabla f(x)\right)}_{i}\right|={\Vert \nabla f(x)\Vert}_{\infty}.$$

This section summarizes the theory behind the definition of first-order optimality measure for constrained problems. The definition as used in Optimization Toolbox functions is in Constrained Optimality in Solver Form.

For a smooth constrained problem, let *g* and *h* be
vector functions representing all inequality and equality constraints
respectively (meaning bound, linear, and nonlinear constraints):

$$\underset{x}{\mathrm{min}}f(x)\text{subjectto}g(x)\le 0,\text{}h(x)=0.$$

The KKT conditions use the auxiliary Lagrangian function:

$$L(x,\lambda )=f(x)+{\displaystyle \sum {\lambda}_{g,i}{g}_{i}(x)}+{\displaystyle \sum {\lambda}_{h,i}{h}_{i}(x)}.$$ | (3-1) |

The KKT conditions are:

$${\nabla}_{x}L(x,\lambda )=0,$$ | (3-2) |

$${\lambda}_{g,i}{g}_{i}(x)=0\text{}\forall i,$$ | (3-3) |

$$\{\begin{array}{c}g(x)\le 0,\\ h(x)=0,\\ {\lambda}_{g,i}\ge 0.\end{array}$$ | (3-4) |

The optimality measure associated with Equation 3-2 is

$$\Vert {\nabla}_{x}L(x,\lambda \Vert =\Vert \nabla f(x)+{\displaystyle \sum {\lambda}_{g,i}\nabla {g}_{i}(x)+{\displaystyle \sum {\lambda}_{h,i}\nabla {h}_{h,i}(x)}}\Vert .$$ | (3-5) |

$$\Vert \overrightarrow{{\lambda}_{g}g}(x)\Vert ,$$ | (3-6) |

The combined optimality measure is the maximum of the values
calculated in Equation 3-5 and Equation 3-6. Solvers that
accept nonlinear constraint functions report constraint violations *g*(*x*) > 0 or |*h*(*x*)| > 0 as `ConstraintTolerance`

violations.
See Tolerances and Stopping Criteria.

Most constrained toolbox solvers separate their calculation of first-order optimality measure into bounds, linear functions, and nonlinear functions. The measure is the maximum of the following two norms, which correspond to Equation 3-5 and Equation 3-6:

$$\begin{array}{l}\Vert {\nabla}_{x}L(x,\lambda \Vert =\Vert \nabla f(x)+{A}^{T}{\lambda}_{ineqlin}+Ae{q}^{T}{\lambda}_{eqlin}\\ \text{}+{\displaystyle \sum {\lambda}_{ineqnonlin,i}\nabla {c}_{i}(x)+{\displaystyle \sum {\lambda}_{eqnonlin,i}\nabla ce{q}_{i}(x)}}\Vert ,\end{array}$$ | (3-7) |

$$\Vert \overrightarrow{\left|{l}_{i}-{x}_{i}\right|{\lambda}_{lower,i}},\overrightarrow{\left|{x}_{i}-{u}_{i}\right|{\lambda}_{upper,i}},\overrightarrow{\left|{(Ax-b)}_{i}\right|{\lambda}_{ineqlin,i}},\overrightarrow{\left|{c}_{i}(x)\right|{\lambda}_{ineqnonlin,i}}\Vert ,$$ | (3-8) |

where the norm of the
vectors in Equation 3-7 and Equation 3-8 is the infinity
norm (maximum). The subscripts on the Lagrange multipliers correspond
to solver Lagrange multiplier structures. See Lagrange Multiplier Structures. The summations in Equation 3-7 range over all
constraints. If a bound is ±`Inf`

, that term
is not constrained, so it is not part of the summation.

For some large-scale problems with only linear equalities, the
first-order optimality measure is the infinity norm of the *projected* gradient.
In other words, the first-order optimality measure is the size of
the gradient projected onto the null space of `Aeq`

.

For least-squares solvers and trust-region-reflective algorithms,
in problems with bounds alone, the first-order optimality measure
is the maximum over *i* of |*v _{i}**

$${v}_{i}=\{\begin{array}{ll}\left|{x}_{i}-{b}_{i}\right|\hfill & \text{ifthenegativegradientpointstowardbound}{b}_{i}\hfill \\ 1\hfill & \text{otherwise}\text{.}\hfill \end{array}$$

If *x _{i}* is at a bound,

Was this topic helpful?