In the example Nonlinear Equations with Analytic Jacobian, the function `nlsf1`

computes
the Jacobian `J`

, a sparse matrix, along with the
evaluation of `F`

. What if the code to compute the
Jacobian is not available? By default, if you do not indicate that
the Jacobian can be computed in `nlsf1`

(by setting
the `'SpecifyObjectiveGradient'`

option in `options`

to `true`

), `fsolve`

, `lsqnonlin`

,
and `lsqcurvefit`

instead uses
finite differencing to approximate the Jacobian.

In order for this finite differencing to be as efficient as
possible, you should supply the sparsity pattern of the Jacobian, by
setting `JacobPattern`

to a sparse matrix `Jstr`

in `options`

.
That is, supply a sparse matrix `Jstr`

whose nonzero
entries correspond to nonzeros of the Jacobian for all *x*.
Indeed, the nonzeros of `Jstr`

can correspond to
a superset of the nonzero locations of `J`

; however,
in general the computational cost of the sparse finite-difference
procedure will increase with the number of nonzeros of `Jstr`

.

Providing the sparsity pattern can drastically reduce the time needed to compute the finite differencing on large problems. If the sparsity pattern is not provided (and the Jacobian is not computed in the objective function either) then, in this problem with 1000 variables, the finite-differencing code attempts to compute all 1000-by-1000 entries in the Jacobian. But in this case there are only 2998 nonzeros, substantially less than the 1,000,000 possible nonzeros the finite-differencing code attempts to compute. In other words, this problem is solvable if you provide the sparsity pattern. If not, most computers run out of memory when the full dense finite-differencing is attempted. On most small problems, it is not essential to provide the sparsity structure.

Suppose the sparse matrix `Jstr`

, computed
previously, has been saved in file `nlsdat1.mat`

.
The following driver calls `fsolve`

applied
to `nlsf1a`

, which is `nlsf1`

without
the Jacobian. Sparse finite-differencing is used to estimate the sparse
Jacobian matrix as needed.

function F = nlsf1a(x) % Evaluate the vector function n = length(x); F = zeros(n,1); i = 2:(n-1); F(i) = (3-2*x(i)).*x(i)-x(i-1)-2*x(i+1) + 1; F(n) = (3-2*x(n)).*x(n)-x(n-1) + 1; F(1) = (3-2*x(1)).*x(1)-2*x(2) + 1;

xstart = -ones(1000,1); fun = @nlsf1a; load nlsdat1 % Get Jstr options = optimoptions(@fsolve,'Display','iter','JacobPattern',Jstr,... 'Algorithm','trust-region','SubproblemAlgorithm','cg'); [x,fval,exitflag,output] = fsolve(fun,xstart,options);

In this case, the output displayed is

Norm of First-order Iteration Func-count f(x) step optimality 0 5 1011 19 1 10 16.1942 7.91898 2.35 2 15 0.0228025 1.33142 0.291 3 20 0.00010336 0.0433327 0.0201 4 25 7.37924e-07 0.0022606 0.000946 5 30 4.02301e-10 0.000268382 4.12e-05 Equation solved, inaccuracy possible. The vector of function values is near zero, as measured by the default value of the function tolerance. However, the last step was ineffective.

Alternatively, it is possible to choose a sparse direct linear
solver (i.e., a sparse QR factorization) by indicating a "complete"
preconditioner. For example, if you set `PrecondBandWidth`

to `Inf`

,
then a sparse direct linear solver is used instead of a preconditioned
conjugate gradient iteration:

xstart = -ones(1000,1); fun = @nlsf1a; load nlsdat1 % Get Jstr options = optimoptions(@fsolve,'Display','iter','JacobPattern',Jstr,... 'Algorithm','trust-region','SubproblemAlgorithm','factorization'); [x,fval,exitflag,output] = fsolve(fun,xstart,options);

and the resulting display is

Norm of First-order Iteration Func-count f(x) step optimality 0 5 1011 19 1 10 15.9018 7.92421 1.89 2 15 0.0128161 1.32542 0.0746 3 20 1.73502e-08 0.0397923 0.000196 4 25 1.10727e-18 4.55495e-05 2.74e-09 Equation solved. fsolve completed because the vector of function values is near zero as measured by the default value of the function tolerance, and the problem appears regular as measured by the gradient.

When using the sparse direct solver, there are no CG iterations.
Notice that the final optimality and *f*(*x*)
value (which for `fsolve`

, *f*(*x*),
is the sum of the squares of the function values) are closer to zero
than using the PCG method, which is often the case.

Was this topic helpful?