
First-order optimality is a measure of how close a point *x* is
to optimal. Most Optimization
Toolbox™ solvers use this measure,
though it has different definitions for different algorithms. First-order
optimality is a necessary condition, but it is not a sufficient condition.
In other words:

The first-order optimality measure must be zero at a minimum.

A point with first-order optimality equal to zero is not necessarily a minimum.

For general information about first-order optimality, see Nocedal and Wright [31]. For specifics about the first-order optimality measures for Optimization Toolbox solvers, see Unconstrained Optimality, Constrained Optimality Theory, and Constrained Optimality in Solver Form.

The `OptimalityTolerance` tolerance relates to the first-order optimality measure. Typically, if the first-order optimality measure is less than `OptimalityTolerance`, solver iterations end.

Some solvers or algorithms use *relative* first-order optimality as a stopping criterion. Solver iterations end if the first-order optimality measure is less than *μ* times `OptimalityTolerance`, where *μ* is either:

The infinity norm (maximum) of the gradient of the objective function at `x0`

The infinity norm (maximum) of inputs to the solver, such as `f` or `b` in `linprog` or `H` in `quadprog`

A relative measure attempts to account for the scale of a problem. Multiplying an objective function by a very large or small number does not change the stopping condition for a relative stopping criterion, but does change it for an unscaled one.

Solvers with enhanced exit messages state, in the stopping criteria details, when they use relative first-order optimality.
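The effect of scaling on the two stopping tests can be sketched numerically. The following Python/NumPy snippet (an illustration, not solver code; the quadratic objective, points, and tolerance are hypothetical) shows that multiplying the objective by a large constant flips the absolute test but leaves the relative test unchanged:

```python
import numpy as np

# Hypothetical objective f(x) = scale * (x1^2 + x2^2); its gradient is scale * 2x.
def grad(x, scale=1.0):
    return scale * 2.0 * x

tol = 1e-6                      # plays the role of OptimalityTolerance
x0 = np.array([3.0, -4.0])      # hypothetical start point
x = np.array([1e-7, -2e-7])     # hypothetical current iterate, near the minimum

for scale in (1.0, 1e6):
    measure = np.linalg.norm(grad(x, scale), np.inf)
    mu = np.linalg.norm(grad(x0, scale), np.inf)  # relative scale factor
    print(scale, measure < tol, measure < mu * tol)
```

The absolute test `measure < tol` passes only for the unscaled problem, while the relative test passes for both scalings.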

For a smooth unconstrained problem,

$$\underset{x}{\mathrm{min}}f(x),$$

the first-order optimality
measure is the infinity norm (meaning maximum absolute value) of ∇*f*(*x*),
which is:

$$\text{first-order optimality measure}=\underset{i}{\mathrm{max}}\left|{\left(\nabla f(x)\right)}_{i}\right|={\Vert \nabla f(x)\Vert}_{\infty}.$$

This measure of optimality is based on the familiar condition for a smooth function to achieve a minimum: its gradient must be zero. For unconstrained problems, when the first-order optimality measure is nearly zero, the objective function has gradient nearly zero, so the objective function could be near a minimum. If the first-order optimality measure is not small, the objective function is not minimal.
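As a concrete sketch in Python/NumPy (the objective here is a hypothetical test function, not toolbox code), the unconstrained measure is just the maximum absolute gradient component:

```python
import numpy as np

# First-order optimality measure ||grad f(x)||_inf for the hypothetical
# unconstrained objective f(x) = x1^2 + 2*x2^2, with gradient (2*x1, 4*x2).
def gradf(x):
    return np.array([2.0 * x[0], 4.0 * x[1]])

def first_order_opt(x):
    # Infinity norm: maximum absolute component of the gradient
    return np.linalg.norm(gradf(x), np.inf)

print(first_order_opt(np.array([0.5, -0.25])))  # nonzero away from the minimum
print(first_order_opt(np.array([0.0, 0.0])))    # zero at the minimum x = (0, 0)
```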

This section summarizes the theory behind the definition of first-order optimality measure for constrained problems. The definition as used in Optimization Toolbox functions is in Constrained Optimality in Solver Form.

For a smooth constrained problem, let *g* and *h* be
vector functions representing all inequality and equality constraints
respectively (meaning bound, linear, and nonlinear constraints):

$$\underset{x}{\mathrm{min}}f(x)\text{ subject to }g(x)\le 0,\text{ }h(x)=0.$$

The meaning of first-order optimality in this case is more complex than for unconstrained problems. The definition is based on the Karush-Kuhn-Tucker (KKT) conditions, which are analogous to the condition that the gradient must be zero at a minimum, modified to take the constraints into account.

The KKT conditions use the auxiliary Lagrangian function:

$$L(x,\lambda )=f(x)+{\displaystyle \sum {\lambda}_{g,i}{g}_{i}(x)}+{\displaystyle \sum {\lambda}_{h,i}{h}_{i}(x)}.$$ | (1) |

The KKT conditions are:

$${\nabla}_{x}L(x,\lambda )=0,$$ | (2) |

$${\lambda}_{g,i}{g}_{i}(x)=0\quad \forall i,$$ | (3) |

$$\{\begin{array}{c}g(x)\le 0,\\ h(x)=0,\\ {\lambda}_{g,i}\ge 0.\end{array}$$ | (4) |

The optimality measure associated with Equation 2 is

$$\Vert {\nabla}_{x}L(x,\lambda )\Vert =\Vert \nabla f(x)+{\displaystyle \sum {\lambda}_{g,i}\nabla {g}_{i}(x)}+{\displaystyle \sum {\lambda}_{h,i}\nabla {h}_{i}(x)}\Vert .$$ | (5) |

The optimality measure associated with Equation 3 is

$$\Vert \overrightarrow{{\lambda}_{g}g}(x)\Vert ,$$ | (6) |

where $\overrightarrow{{\lambda}_{g}g}(x)$ denotes the vector whose *i*th component is *λ _{g,i}* *g _{i}*(*x*).

The combined optimality measure is the maximum of the values calculated in Equation 5 and Equation 6. Solvers that accept nonlinear constraint functions report constraint violations *g*(*x*) > 0 or |*h*(*x*)| > 0 as `ConstraintTolerance` violations. See Tolerances and Stopping Criteria.
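The combined measure can be sketched on a tiny example. The following Python snippet (an illustration under stated assumptions, not toolbox code) uses the hypothetical one-dimensional problem min *x*² subject to *g*(*x*) = 1 − *x* ≤ 0, whose KKT point is *x** = 1 with multiplier *λ _{g}* = 2:

```python
# Combined KKT optimality measure for the hypothetical problem
#   min x^2  subject to  g(x) = 1 - x <= 0.
def combined_measure(x, lam_g):
    grad_f = 2.0 * x          # gradient of the objective x^2
    grad_g = -1.0             # gradient of the constraint g(x) = 1 - x
    g = 1.0 - x
    stationarity = abs(grad_f + lam_g * grad_g)   # Equation 5
    complementarity = abs(lam_g * g)              # Equation 6
    return max(stationarity, complementarity)

print(combined_measure(1.0, 2.0))  # zero at the KKT point x* = 1, lambda = 2
print(combined_measure(1.5, 2.0))  # nonzero away from it
```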

Most constrained toolbox solvers separate their calculation of first-order optimality measure into bounds, linear functions, and nonlinear functions. The measure is the maximum of the following two norms, which correspond to Equation 5 and Equation 6:

$$\begin{array}{l}\Vert {\nabla}_{x}L(x,\lambda )\Vert =\Vert \nabla f(x)+{A}^{T}{\lambda}_{ineqlin}+{Aeq}^{T}{\lambda}_{eqlin}\\ \quad +{\displaystyle \sum {\lambda}_{ineqnonlin,i}\nabla {c}_{i}(x)}+{\displaystyle \sum {\lambda}_{eqnonlin,i}\nabla ce{q}_{i}(x)}\Vert ,\end{array}$$ | (7) |

$$\Vert \overrightarrow{\left|{l}_{i}-{x}_{i}\right|{\lambda}_{lower,i}},\overrightarrow{\left|{x}_{i}-{u}_{i}\right|{\lambda}_{upper,i}},\overrightarrow{\left|{(Ax-b)}_{i}\right|{\lambda}_{ineqlin,i}},\overrightarrow{\left|{c}_{i}(x)\right|{\lambda}_{ineqnonlin,i}}\Vert ,$$ | (8) |

where the norm of the vectors in Equation 7 and Equation 8 is the infinity norm (maximum). The subscripts on the Lagrange multipliers correspond to solver Lagrange multiplier structures. See Lagrange Multiplier Structures. The summations in Equation 7 range over all constraints. If a bound is ±`Inf`, that term is not constrained, so it is not part of the summation.

For some large-scale problems with only linear equalities, the first-order optimality measure is the infinity norm of the *projected* gradient. In other words, the first-order optimality measure is the size of the gradient projected onto the null space of `Aeq`.
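The projection can be sketched in Python/NumPy as follows (the constraint matrix and gradient are hypothetical, and a dense SVD is used here only for clarity; large-scale solvers use sparse factorizations instead):

```python
import numpy as np

Aeq = np.array([[1.0, 1.0, 0.0]])   # hypothetical linear equality constraints
g = np.array([3.0, -1.0, 1.0])      # hypothetical gradient of f at the current point

# Orthonormal basis Z for the null space of Aeq, from the SVD:
# rows of Vt beyond rank(Aeq) span null(Aeq).
_, s, Vt = np.linalg.svd(Aeq)
rank = int(np.sum(s > 1e-12))
Z = Vt[rank:].T                     # columns of Z span null(Aeq)

projected = Z @ (Z.T @ g)           # gradient projected onto null(Aeq)
measure = np.linalg.norm(projected, np.inf)
print(measure)
```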

For least-squares solvers and trust-region-reflective algorithms, in problems with bounds alone, the first-order optimality measure is the maximum over *i* of |*v _{i}*·*g _{i}*|, where *g _{i}* is the *i*th component of the gradient ∇*f*(*x*), and

$${v}_{i}=\{\begin{array}{ll}\left|{x}_{i}-{b}_{i}\right|\hfill & \text{if the negative gradient points toward bound }{b}_{i}\hfill \\ 1\hfill & \text{otherwise}\text{.}\hfill \end{array}$$

If *x _{i}* is at a bound,
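This bound-aware measure can be sketched in Python/NumPy, reading the definition above as max over *i* of |*v _{i}*·*g _{i}*|; the point, gradient, and bounds below are hypothetical:

```python
import numpy as np

# Sketch of the bounds-only measure max_i |v_i * g_i|, where g is the
# gradient of f at x and lb/ub are the lower/upper bounds (hypothetical data).
def bound_measure(x, g, lb, ub):
    v = np.ones_like(x)                         # default: v_i = 1
    toward_lower = (-g < 0) & np.isfinite(lb)   # -g points toward the lower bound
    toward_upper = (-g > 0) & np.isfinite(ub)   # -g points toward the upper bound
    v[toward_lower] = np.abs(x - lb)[toward_lower]
    v[toward_upper] = np.abs(x - ub)[toward_upper]
    return np.max(np.abs(v * g))

x = np.array([0.5, 2.0])
g = np.array([2.0, -3.0])          # gradient at x
lb = np.array([0.0, -np.inf])      # second variable is unbounded below
ub = np.array([np.inf, np.inf])    # no finite upper bounds
print(bound_measure(x, g, lb, ub))
```

Here the first component uses *v*₁ = |0.5 − 0| because the negative gradient points toward the lower bound 0, while the second uses *v*₂ = 1 because its descent direction points toward an infinite bound.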