
Unconstrained minimization is the problem of finding a vector *x* that is a local minimizer of a scalar function *f*(*x*):

$$\underset{x}{\mathrm{min}}f(x)$$

The term *unconstrained* means that no restriction is placed on the range of *x*.
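For example, a minimal call to `fminunc` has this shape (a sketch; the objective and starting point here are illustrative, not from any particular example):

```matlab
% Minimize a smooth scalar function of two variables with fminunc.
fun = @(x) x(1)^2 + 3*x(1)*x(2) + 4*x(2)^2 + 5;  % illustrative objective
x0  = [1; 1];                                    % illustrative starting point
[x, fval] = fminunc(fun, x0)
```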

**`fminunc` `trust-region` Algorithm**

Many of the methods used in Optimization Toolbox™ solvers are based on *trust regions*, a simple yet powerful concept in optimization.

To understand the trust-region approach to optimization, consider the unconstrained minimization problem, minimize *f*(*x*), where the function takes vector arguments and returns scalars. Suppose you are at a point *x* in *n*-space and you want to improve, i.e., move to a point with a lower function value. The basic idea is to approximate *f* with a simpler function *q*, which reasonably reflects the behavior of function *f* in a neighborhood *N* around the point *x*. This neighborhood is the trust region. A trial step *s* is computed by minimizing (or approximately minimizing) over *N*. This is the trust-region subproblem,

$$\underset{s}{\mathrm{min}}\left\{q(s),\text{ }s\in N\right\}.$$ | (6-1) |

The current point is updated to be *x* + *s* if *f*(*x* + *s*) < *f*(*x*); otherwise, the current point remains unchanged and *N*, the region of trust, is shrunk and the trial step computation is repeated.

The key questions in defining a specific trust-region approach to minimizing *f*(*x*) are how to choose and compute the approximation *q* (defined at the current point *x*), how to choose and modify the trust region *N*, and how accurately to solve the trust-region subproblem.

In the standard trust-region method ([48]), the quadratic approximation *q* is defined by the first two terms of the Taylor approximation to *f* at *x*; the neighborhood *N* is usually spherical or ellipsoidal in shape. Mathematically, the trust-region subproblem is typically stated

$$\mathrm{min}\left\{\frac{1}{2}{s}^{T}Hs+{s}^{T}g\text{ such that }\Vert Ds\Vert \le \Delta \right\},$$ | (6-2) |

where *g* is the gradient of *f* at the current point *x*, *H* is the Hessian matrix (the symmetric matrix of second derivatives), *D* is a diagonal scaling matrix, Δ is a positive scalar, and ‖·‖ is the 2-norm. Good algorithms exist for solving Equation 6-2 (see [48]); such algorithms typically involve the computation of all eigenvalues of *H* and a Newton process applied to the secular equation

$$\frac{1}{\Delta}-\frac{1}{\Vert s\Vert}=0.$$

Such algorithms provide an accurate solution to Equation 6-2. However, they require time proportional to several factorizations of *H*. Therefore, for large-scale problems a different approach is needed. Several approximation and heuristic strategies, based on Equation 6-2, have been proposed in the literature ([42] and [50]). The approximation approach followed in Optimization Toolbox solvers is to restrict the trust-region subproblem to a two-dimensional subspace *S* ([39] and [42]). Once the subspace *S* has been computed, the work to solve Equation 6-2 is trivial even if full eigenvalue/eigenvector information is needed (since in the subspace, the problem is only two-dimensional). The dominant work has now shifted to the determination of the subspace.

The two-dimensional subspace *S* is determined with the aid of a preconditioned conjugate gradient process described below. The solver defines *S* as the linear space spanned by *s*_{1} and *s*_{2}, where *s*_{1} is in the direction of the gradient *g*, and *s*_{2} is either an approximate Newton direction, i.e., a solution to

$$H\cdot {s}_{2}=-g,$$ | (6-3) |

or a direction of negative curvature,

$${s}_{2}^{T}\cdot H\cdot {s}_{2}<0.$$ | (6-4) |

The philosophy behind this choice of *S* is to force global convergence (via the steepest descent direction or negative curvature direction) and achieve fast local convergence (via the Newton step, when it exists).
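To make the dimension reduction concrete, here is a minimal sketch (not the toolbox implementation) that restricts Equation 6-2 to span{*s*_{1}, *s*_{2}}, assuming `s1` and `s2` are linearly independent column vectors and taking *D* = *I*:

```matlab
% Sketch: restrict the trust-region subproblem (6-2) to span{s1, s2}.
V  = orth([s1, s2]);       % orthonormal basis of the 2-D subspace
Hs = V' * H * V;           % 2x2 projected Hessian
gs = V' * g;               % projected gradient
% Minimize 0.5*a'*Hs*a + gs'*a subject to ||a|| <= Delta by checking the
% interior stationary point and, if needed, sampling the boundary circle.
a = -Hs \ gs;                            % unconstrained minimizer in 2-D
if norm(a) > Delta || any(eig(Hs) < 0)   % infeasible or not a minimum
    th = linspace(0, 2*pi, 360);
    A  = Delta * [cos(th); sin(th)];     % points on the boundary
    q  = 0.5*sum(A .* (Hs*A), 1) + gs'*A;
    [~, i] = min(q);
    a = A(:, i);
end
s = V * a;                 % trial step lifted back to the full space
```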

A sketch of unconstrained minimization using trust-region ideas is now easy to give:

1. Formulate the two-dimensional trust-region subproblem.

2. Solve Equation 6-2 to determine the trial step *s*.

3. If *f*(*x* + *s*) < *f*(*x*), then *x* = *x* + *s*.

4. Adjust Δ.

These four steps are repeated until convergence. The trust-region
dimension Δ is adjusted according to standard rules. In particular,
it is decreased if the trial step is not accepted, i.e., *f*(*x* + *s*) ≥ *f*(*x*).
See [46] and [49] for a discussion of this aspect.
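The four steps map onto a loop of roughly the following shape. This is a sketch with simplified acceptance and Δ-update rules; `fun`, `gradf`, `hessf`, `solveSub`, `x0`, and `tol` are illustrative names, not toolbox functions:

```matlab
% Sketch of the trust-region outer loop (simplified update rules).
x = x0;  Delta = 1;                     % illustrative initial radius
for k = 1:100
    g = gradf(x);  H = hessf(x);        % derivatives at the current point
    s = solveSub(H, g, Delta);          % trial step from Equation 6-2
    if fun(x + s) < fun(x)
        x = x + s;                      % accept the trial step
        Delta = 2*Delta;                % expand the region of trust
    else
        Delta = Delta/2;                % reject the step: shrink N
    end
    if norm(g) < tol, break; end        % simple convergence test
end
```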

Optimization Toolbox solvers treat a few important special cases of *f* with specialized functions: nonlinear least-squares, quadratic functions, and linear least-squares. However, the underlying algorithmic ideas are the same as for the general case. These special cases are discussed in later sections.

A popular way to solve large, symmetric, positive definite systems of linear equations *Hp* = –*g* is the method of Preconditioned Conjugate Gradients (PCG). This iterative approach requires the ability to calculate matrix-vector products of the form *H*·*v*, where *v* is an arbitrary vector. The symmetric positive definite matrix *M* is a preconditioner for *H*. That is, *M* = *C*², where *C*^{–1}*HC*^{–1} is a well-conditioned matrix or a matrix with clustered eigenvalues.

In a minimization context, you can assume that the Hessian matrix *H* is symmetric. However, *H* is guaranteed to be positive definite only in the neighborhood of a strong minimizer. The PCG algorithm exits when it encounters a direction of negative (or zero) curvature. The PCG output direction *p* is either a direction of negative curvature or an approximate solution to the Newton system *Hp* = –*g*. In either case, *p* helps to define the two-dimensional subspace used in the trust-region approach discussed above.
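MATLAB's `pcg` function illustrates the mechanics; the test matrix, preconditioner, and tolerances below are illustrative choices, not what the solver uses internally:

```matlab
% Illustrative PCG solve of H*p = -g with a diagonal (Jacobi) preconditioner.
n = 100;
H = gallery('tridiag', n, -1, 4, -1);   % symmetric positive definite test matrix
g = ones(n, 1);
M = diag(diag(H));                      % simple preconditioner M
[p, flag] = pcg(H, -g, 1e-8, 200, M);   % flag == 0 indicates convergence
```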

**`fminunc` `quasi-newton` Algorithm**

Although a wide spectrum of methods exists for unconstrained optimization, methods can be broadly categorized in terms of the derivative information that is, or is not, used. Search methods that use only function evaluations (e.g., the simplex search of Nelder and Mead [30]) are most suitable for problems that are not smooth or have a number of discontinuities. Gradient methods are generally more efficient when the function to be minimized is continuous in its first derivative. Higher order methods, such as Newton's method, are only really suitable when the second-order information is readily and easily calculated, because calculation of second-order information, using numerical differentiation, is computationally expensive.

Gradient methods use information about the slope of the function to dictate a direction of search where the minimum is thought to lie. The simplest of these is the method of steepest descent, in which a search is performed in the direction –∇*f*(*x*), where ∇*f*(*x*) is the gradient of the objective function. This method is very inefficient when the function to be minimized has long narrow valleys as, for example, is the case for Rosenbrock's function

$$f(x)=100{\left({x}_{2}-{x}_{1}^{2}\right)}^{2}+{(1-{x}_{1})}^{2}.$$ | (6-5) |

The minimum of this function is at *x* = [1,1], where *f*(*x*) = 0. A contour map of this function is shown in Figure 6-1, along with the solution path to the minimum for a steepest descent implementation starting at the point [-1.9,2]. The optimization was terminated after 1000 iterations, still a considerable distance from the minimum.

**Figure 6-1. Steepest Descent Method on Rosenbrock's Function**

This function, also known as the banana function, is notorious in unconstrained examples because of the way the curvature bends around the origin. Rosenbrock's function is used throughout this section to illustrate the use of a variety of optimization techniques. The contours have been plotted in exponential increments because of the steepness of the slope surrounding the U-shaped valley.

For a more complete description of this figure, including scripts that generate the iterative points, see Banana Function Minimization.
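As a hedged sketch of minimizing Rosenbrock's function with `fminunc` (starting point as in the figure; the options shown are one reasonable choice, not necessarily the exact setup used to produce the figure):

```matlab
% Minimize Rosenbrock's function (6-5) with the quasi-Newton fminunc algorithm.
banana = @(x) 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;
x0   = [-1.9, 2];                                  % starting point from the figure
opts = optimoptions('fminunc', 'Algorithm', 'quasi-newton');
[x, fval, exitflag, output] = fminunc(banana, x0, opts);
% x approaches [1 1]; output.funcCount reports the function evaluations used.
```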

Of the methods that use gradient information, the most favored are the quasi-Newton methods. These methods build up curvature information at each iteration to formulate a quadratic model problem of the form

$$\underset{x}{\mathrm{min}}\frac{1}{2}{x}^{T}Hx+{c}^{T}x+b,$$ | (6-6) |

where the Hessian matrix, *H*, is a positive definite symmetric matrix, *c* is a constant vector, and *b* is a constant. The optimal solution for this problem occurs when the partial derivatives of *x* go to zero, i.e.,

$$\nabla f({x}^{*})=H{x}^{*}+c=0.$$ | (6-7) |

The optimal solution point, *x*\*, can be written as

$${x}^{*}=-{H}^{-1}c.$$ | (6-8) |
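A quick numeric check of Equation 6-8 on an arbitrary positive definite example (the data here is illustrative):

```matlab
% Verify that x* = -H^{-1}*c makes the gradient H*x + c vanish.
H = [3 1; 1 2];            % symmetric positive definite example
c = [-1; 4];
xstar = -H \ c;            % Equation 6-8 (backslash avoids explicit inversion)
grad  = H*xstar + c        % equals [0; 0] up to rounding
```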

Newton-type methods (as opposed to quasi-Newton methods) calculate *H* directly and proceed in a direction of descent to locate the minimum after a number of iterations. Calculating *H* numerically involves a large amount of computation. Quasi-Newton methods avoid this by using the observed behavior of *f*(*x*) and ∇*f*(*x*) to build up curvature information to make an approximation to *H* using an appropriate updating technique.

A large number of Hessian updating methods have been developed. However, the formula of Broyden [3], Fletcher [12], Goldfarb [20], and Shanno [37] (BFGS) is thought to be the most effective for use in a general purpose method.

The formula given by BFGS is

$${H}_{k+1}={H}_{k}+\frac{{q}_{k}{q}_{k}^{T}}{{q}_{k}^{T}{s}_{k}}-\frac{{H}_{k}{s}_{k}{s}_{k}^{T}{H}_{k}^{T}}{{s}_{k}^{T}{H}_{k}{s}_{k}},$$ | (6-9) |

where

$$\begin{array}{c}{s}_{k}={x}_{k+1}-{x}_{k},\\ {q}_{k}=\nabla f\left({x}_{k+1}\right)-\nabla f\left({x}_{k}\right).\end{array}$$

As a starting point, *H*_{0} can be set to any symmetric positive definite matrix, for example, the identity matrix *I*. To avoid the inversion of the Hessian *H*, you can derive an updating method that avoids the direct inversion of *H* by using a formula that makes an approximation of the inverse Hessian *H*^{–1} at each update. A well-known procedure is the DFP formula of Davidon [7], Fletcher, and Powell [14]. This uses the same formula as the BFGS method (Equation 6-9), except that *q*_{k} is substituted for *s*_{k}.
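Equation 6-9 transcribes directly into code; a sketch (here `H` is the current approximation, and `gradf`, `xk`, `xk1` are illustrative names for the gradient handle and successive iterates):

```matlab
% One BFGS update of the Hessian approximation, per Equation 6-9.
sk = xk1 - xk;                      % step between iterates
qk = gradf(xk1) - gradf(xk);        % change in gradient
Hs = H * sk;                        % H is symmetric, so H*s*s'*H' = Hs*Hs'
H  = H + (qk*qk')/(qk'*sk) - (Hs*Hs')/(sk'*Hs);
```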

The gradient information is either supplied through analytically calculated gradients, or derived by partial derivatives using a numerical differentiation method via finite differences. This involves perturbing each of the design variables, *x*, in turn and calculating the rate of change in the objective function.
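A minimal forward-difference sketch of that perturbation scheme (the step size and names are illustrative; the toolbox chooses perturbation steps more carefully):

```matlab
% Forward-difference approximation of the gradient of fun at x.
function g = fdgrad(fun, x)
    h  = sqrt(eps);                  % illustrative perturbation size
    f0 = fun(x);
    g  = zeros(size(x));
    for i = 1:numel(x)
        xp = x;  xp(i) = xp(i) + h;  % perturb one design variable
        g(i) = (fun(xp) - f0) / h;   % rate of change of the objective
    end
end
```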

At each major iteration, *k*, a line search is performed in the direction

$$d=-{H}_{k}^{-1}\cdot \nabla f\left({x}_{k}\right).$$ | (6-10) |

The quasi-Newton method is illustrated by the solution path on Rosenbrock's function in Figure 6-2, BFGS Method on Rosenbrock's Function. The method is able to follow the shape of the valley and converges to the minimum after 140 function evaluations using only finite difference gradients.

**Figure 6-2. BFGS Method on Rosenbrock's Function**

For a more complete description of this figure, including scripts that generate the iterative points, see Banana Function Minimization.

*Line search* is a search method that is used as part of a larger optimization algorithm. At each step of the main algorithm, the line-search method searches along the line containing the current point, *x*_{k}, parallel to the *search direction*, which is a vector determined by the main algorithm. That is, the method finds the next iterate *x*_{k+1} of the form

$${x}_{k+1}={x}_{k}+{\alpha}^{*}{d}_{k},$$ | (6-11) |

where *x*_{k} denotes the current iterate, *d*_{k} is the search direction, and *α*\* is a scalar step length parameter.

The line search method attempts to decrease the objective function along the line *x*_{k} + *α*\**d*_{k} by repeatedly minimizing polynomial interpolation models of the objective function. The line search procedure has two main steps:

1. The *bracketing* phase determines the range of points on the line $${x}_{k+1}={x}_{k}+{\alpha}^{*}{d}_{k}$$ to be searched. The *bracket* corresponds to an interval specifying the range of values of *α*.

2. The *sectioning* step divides the bracket into subintervals, on which the minimum of the objective function is approximated by polynomial interpolation.

The resulting step length α satisfies the Wolfe conditions:

$$f\left({x}_{k}+\alpha {d}_{k}\right)\le f\left({x}_{k}\right)+{c}_{1}\alpha \nabla {f}_{k}^{T}{d}_{k},$$ | (6-12) |

$$\nabla f{\left({x}_{k}+\alpha {d}_{k}\right)}^{T}{d}_{k}\ge {c}_{2}\nabla {f}_{k}^{T}{d}_{k},$$ | (6-13) |

where *c*_{1} and *c*_{2} are
constants with 0 < *c*_{1} < *c*_{2} <
1.

The first condition (Equation 6-12) requires that *α*_{k} sufficiently decreases the objective function. The second condition (Equation 6-13) ensures that the step length is not too small. Points that satisfy both conditions (Equation 6-12 and Equation 6-13) are called *acceptable points*.

The line search method is an implementation of the algorithm described in Section 2-6 of [13]. See also [31] for more information about line search.
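A sketch of checking the two Wolfe conditions for a candidate step (the handles `fun` and `gradf` and the constants are illustrative; `c1 = 1e-4`, `c2 = 0.9` are commonly used values, not necessarily the solver's):

```matlab
% Check the Wolfe conditions (6-12) and (6-13) for step length alpha.
c1 = 1e-4;  c2 = 0.9;                  % constants with 0 < c1 < c2 < 1
fk  = fun(xk);      gk = gradf(xk);    % objective and gradient at xk
xk1 = xk + alpha*dk;
sufficientDecrease = fun(xk1) <= fk + c1*alpha*(gk'*dk);   % (6-12)
curvature          = gradf(xk1)'*dk >= c2*(gk'*dk);        % (6-13)
acceptable = sufficientDecrease && curvature;
```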

Many of the optimization functions determine the direction of search by updating the Hessian matrix at each iteration, using the BFGS method (Equation 6-9). The function `fminunc` also provides an option to use the DFP method given in Quasi-Newton Methods (set `HessUpdate` to `'dfp'` in `options` to select the DFP method). The Hessian, *H*, is always maintained to be positive definite so that the direction of search, *d*, is always in a descent direction. This means that for some arbitrarily small step *α* in the direction *d*, the objective function decreases in magnitude. You achieve positive definiteness of *H* by ensuring that *H* is initialized to be positive definite and that thereafter $${q}_{k}^{T}{s}_{k}$$ (from Equation 6-14) is always positive. The term $${q}_{k}^{T}{s}_{k}$$ is a product of the line search step length parameter *α*_{k} and a combination of the search direction *d* with past and present gradient evaluations,

$${q}_{k}^{T}{s}_{k}={\alpha}_{k}\left(\nabla f{\left({x}_{k+1}\right)}^{T}d-\nabla f{\left({x}_{k}\right)}^{T}d\right).$$ | (6-14) |

You always achieve the condition that $${q}_{k}^{T}{s}_{k}$$ is positive by performing a sufficiently accurate line search. This is because the search direction, *d*, is a descent direction, so that *α*_{k} and the negative gradient term –∇*f*(*x*_{k})^{T}*d* are always positive. Thus, the possible negative term –∇*f*(*x*_{k+1})^{T}*d* can be made as small in magnitude as required by increasing the accuracy of the line search.
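For example, to select the DFP update in place of the default BFGS (a minimal sketch; the objective and starting point are illustrative):

```matlab
% Select the DFP Hessian update instead of the default BFGS.
opts = optimoptions('fminunc', 'Algorithm', 'quasi-newton', 'HessUpdate', 'dfp');
fun  = @(x) 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;   % Rosenbrock again
x = fminunc(fun, [-1.9, 2], opts);
```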
