Given a set of *n* nonlinear functions *F _{i}*(

`fsolve`

attempts to solve
systems of equations by minimizing the sum of squares of the components.
If the sum of squares is zero, the system of equation is solved. `fsolve`

has
three algorithms:

Trust-region

Trust-region dogleg

Levenberg-Marquardt

All algorithms are large-scale; see Large-Scale vs. Medium-Scale Algorithms.

The `fzero`

function solves
a single one-dimensional equation.

The `mldivide`

function
solves systems of linear equations.

Many of the methods used in Optimization Toolbox™ solvers
are based on *trust regions,* a simple yet powerful
concept in optimization.

To understand the trust-region approach to optimization, consider
the unconstrained minimization problem, minimize *f*(*x*),
where the function takes vector arguments and returns scalars. Suppose
you are at a point *x* in *n*-space
and you want to improve, i.e., move to a point with a lower function
value. The basic idea is to approximate *f* with
a simpler function *q*, which reasonably reflects
the behavior of function *f* in a neighborhood *N* around
the point *x*. This neighborhood is the trust region.
A trial step *s* is computed by minimizing (or approximately
minimizing) over *N*. This is the trust-region subproblem,

$$\underset{s}{\mathrm{min}}\left\{q(s),\text{}s\in N\right\}.$$ | (11-1) |

The current point is updated to be *x* + *s* if *f*(*x* + *s*) < *f*(*x*);
otherwise, the current point remains unchanged and *N*,
the region of trust, is shrunk and the trial step computation is repeated.

The key questions in defining a specific trust-region approach
to minimizing *f*(*x*) are how to
choose and compute the approximation *q* (defined
at the current point *x*), how to choose and modify
the trust region *N*, and how accurately to solve
the trust-region subproblem. This section focuses on the unconstrained
problem. Later sections discuss additional complications due to the
presence of constraints on the variables.

In the standard trust-region method ([48]), the quadratic approximation *q* is
defined by the first two terms of the Taylor approximation to *F* at *x*;
the neighborhood *N* is usually spherical or ellipsoidal
in shape. Mathematically the trust-region subproblem is typically
stated

$$\mathrm{min}\left\{\frac{1}{2}{s}^{T}Hs+{s}^{T}g\text{suchthat}\Vert Ds\Vert \le \Delta \right\},$$ | (11-2) |

where *g* is the gradient of *f* at
the current point *x*, *H* is the
Hessian matrix (the symmetric matrix of second derivatives), *D* is
a diagonal scaling matrix, Δ is a positive scalar, and ∥
. ∥ is the 2-norm. Good algorithms exist for solving Equation 11-2 (see [48]); such algorithms typically involve the
computation of a full eigensystem and a Newton process applied to
the secular equation

$$\frac{1}{\Delta}-\frac{1}{\Vert s\Vert}=0.$$

Such algorithms provide an accurate solution to Equation 11-2. However, they
require time proportional to several factorizations of *H*.
Therefore, for trust-region problems a different approach is needed.
Several approximation and heuristic strategies, based on Equation 11-2, have been
proposed in the literature ([42] and [50]). The approximation approach followed
in Optimization Toolbox solvers is to restrict the trust-region
subproblem to a two-dimensional subspace *S* ([39] and [42]).
Once the subspace *S* has been computed, the work
to solve Equation 11-2 is
trivial even if full eigenvalue/eigenvector information is needed
(since in the subspace, the problem is only two-dimensional). The
dominant work has now shifted to the determination of the subspace.

The two-dimensional subspace *S* is
determined with the aid of a preconditioned
conjugate gradient process described below. The solver defines *S* as
the linear space spanned by *s*_{1} and *s*_{2},
where *s*_{1} is in the direction
of the gradient *g*, and *s*_{2} is
either an approximate Newton direction, i.e.,
a solution to

$$H\cdot {s}_{2}=-g,$$ | (11-3) |

or a direction of negative curvature,

$${s}_{2}^{T}\cdot H\cdot {s}_{2}<0.$$ | (11-4) |

The philosophy behind this choice of *S* is
to force global convergence (via the steepest descent direction or
negative curvature direction) and achieve fast local convergence (via
the Newton step, when it exists).

A sketch of unconstrained minimization using trust-region ideas is now easy to give:

Formulate the two-dimensional trust-region subproblem.

Solve Equation 11-2 to determine the trial step

*s*.If

*f*(*x*+*s*) <*f*(*x*), then*x*=*x*+*s*.Adjust Δ.

These four steps are repeated until convergence. The trust-region
dimension Δ is adjusted according to standard rules. In particular,
it is decreased if the trial step is not accepted, i.e., *f*(*x* + *s*)
≥ *f*(*x*).
See [46] and [49] for a discussion of this aspect.

Optimization Toolbox solvers treat a few important special
cases of *f* with specialized functions: nonlinear
least-squares, quadratic functions, and linear least-squares. However,
the underlying algorithmic ideas are the same as for the general case.
These special cases are discussed in later sections.

A popular way to solve large symmetric positive definite systems
of linear equations *Hp* = –*g* is the
method of Preconditioned Conjugate Gradients (PCG). This iterative
approach requires the ability to calculate matrix-vector products
of the form *H·v* where *v* is
an arbitrary vector. The symmetric positive definite matrix *M *is
a* preconditioner* for *H*.
That is, *M* = *C*^{2},
where *C*^{–1}*HC*^{–1} is
a well-conditioned matrix or a matrix with clustered eigenvalues.

In a minimization context, you can assume that the Hessian matrix *H* is
symmetric. However, *H* is guaranteed to be positive
definite only in the neighborhood of a strong minimizer. Algorithm
PCG exits when a direction of negative (or zero) curvature is encountered,
i.e., *d ^{T}Hd* ≤ 0. The
PCG output direction,

Another approach is to solve a linear system of equations to
find the search direction, namely, Newton's method says to solve for
the search direction *d _{k}* such
that

*J*(*x _{k}*)

where *J*(*x _{k}*)
is the

$$J\left({x}_{k}\right)=\left[\begin{array}{c}\nabla {F}_{1}{\left({x}_{k}\right)}^{T}\\ \nabla {F}_{2}{\left({x}_{k}\right)}^{T}\\ \vdots \\ \nabla {F}_{n}{\left({x}_{k}\right)}^{T}\end{array}\right].$$

Newton's method can run into difficulties. *J*(*x _{k}*)
may be singular, and so the Newton step

Using trust-region techniques (introduced in Trust-Region Methods for Nonlinear Minimization) improves
robustness when starting far from the solution and handles the case
when *J*(*x _{k}*)
is singular. To use a trust-region strategy, a merit function is needed
to decide if

$$\underset{d}{\mathrm{min}}f(d)=\frac{1}{2}F{\left({x}_{k}+d\right)}^{T}F\left({x}_{k}+d\right).$$

But a minimum of * f*(*d*)
is not necessarily a root of *F*(*x*).

The Newton step *d _{k}* is
a root of

*M*(*x _{k}* +

and so it is also a minimum of *m*(*d*),
where

$$\begin{array}{c}\underset{d}{\mathrm{min}}m(d)=\frac{1}{2}{\Vert M\left({x}_{k}+d\right)\Vert}_{2}^{2}=\frac{1}{2}{\Vert F\left({x}_{k}\right)+J\left({x}_{k}\right)d\Vert}_{2}^{2}\\ =\frac{1}{2}F{\left({x}_{k}\right)}^{T}F\left({x}_{k}\right)+{d}^{T}J{\left({x}_{k}\right)}^{T}F\left({x}_{k}\right)+\frac{1}{2}{d}^{T}J{\left({x}_{k}\right)}^{T}J\left({x}_{k}\right)d.\end{array}$$ | (11-5) |

Then *m*(*d*) is a better
choice of merit function than *f*(*d*),
and so the trust-region subproblem is

$$\underset{d}{\mathrm{min}}\left[\frac{1}{2}F{\left({x}_{k}\right)}^{T}F\left({x}_{k}\right)+{d}^{T}J{\left({x}_{k}\right)}^{T}F\left({x}_{k}\right)+\frac{1}{2}{d}^{T}J{\left({x}_{k}\right)}^{T}J\left({x}_{k}\right)d\right],$$ | (11-6) |

such that ∥*D*·*d*∥
≤ Δ. This subproblem can be efficiently
solved using a dogleg strategy.

For an overview of trust-region methods, see Conn [4], and Nocedal [31].

The key feature of this algorithm is the use of the Powell dogleg
procedure for computing the step *d*, which minimizes Equation 11-6. For a detailed description, see Powell [34].

The step *d* is constructed from a convex combination
of a Cauchy step (a step along the steepest descent direction) and
a Gauss-Newton step for *f*(*x*).
The Cauchy step is calculated as

*d _{C}* = –

where *α* is chosen to minimize Equation 11-5.

The Gauss-Newton step is calculated by solving

*J*(*x _{k}*)·

using the MATLAB^{®} `mldivide`

(matrix
left division) operator.

The step *d* is chosen so that

*d* = *d _{C}* +

where *λ* is the largest value in the
interval [0,1] such that ∥*d*∥
≤ Δ. If *J _{k}* is
(nearly) singular,

The dogleg algorithm is efficient since it requires only one linear solve per iteration (for the computation of the Gauss-Newton step). Additionally, it can be more robust than using the Gauss-Newton method with a line search.

The Levenberg-Marquardt [25], and [27] method uses a search direction that is a solution of the linear set of equations

$$\left(J{\left({x}_{k}\right)}^{T}J\left({x}_{k}\right)+{\lambda}_{k}I\right){d}_{k}=-J{\left({x}_{k}\right)}^{T}F\left({x}_{k}\right),$$ | (11-7) |

or, optionally, of the equations

$$\left(J{\left({x}_{k}\right)}^{T}J\left({x}_{k}\right)+{\lambda}_{k}diag\left(J{\left({x}_{k}\right)}^{T}J\left({x}_{k}\right)\right)\right){d}_{k}=-J{\left({x}_{k}\right)}^{T}F\left({x}_{k}\right),$$ | (11-8) |

where the scalar *λ _{k}* controls
both the magnitude and direction of

`ScaleProblem`

to `'none'`

to
choose Equation 11-7,
and set `ScaleProblem`

to `'Jacobian'`

to
choose Equation 11-8.When *λ _{k}* is zero,
the direction

This algorithm is described in the MATLAB arithmetic operators
section for `mldivide`

.

`fzero`

attempts to find the root of a scalar
function *f* of a scalar variable *x*.

`fzero`

looks for an interval around an initial
point such that *f*(*x*) changes
sign. If you give an initial interval instead of an initial point, `fzero`

checks
to make sure *f*(*x*) has different
signs at the endpoints of the interval. The initial interval must
be finite; it cannot contain ±`Inf`

.

`fzero`

uses a combination of interval bisection,
linear interpolation, and inverse quadratic interpolation in order
to locate a root of *f*(*x*). See `fzero`

for more information.

Was this topic helpful?