Products & Services Solutions Academia Support User Community Company

Learn more about Optimization Toolbox   

Equation Solving Examples

Example: Nonlinear Equations with Analytic Jacobian

This example demonstrates the use of the default large-scale fsolve algorithm (see Large-Scale vs. Medium-Scale Algorithms). It is intended for problems where

The example uses fsolve to obtain the minimum of the banana (or Rosenbrock) function by deriving and then solving an equivalent system of nonlinear equations. The Rosenbrock function, which has a minimum of F(x) = 0, is a common test problem in optimization. It has a high degree of nonlinearity and converges extremely slowly if you try to use steepest descent type methods. It is given by

First generalize this function to an n-dimensional function, for any positive, even value of n:

This function is referred to as the generalized Rosenbrock function. It consists of n squared terms involving n unknowns.

Before you can use fsolve to find the values of x such that F(x) = 0, i.e., obtain the minimum of the generalized Rosenbrock function, you must rewrite the function as the following equivalent system of nonlinear equations:

This system is square, and you can use fsolve to solve it. As the example demonstrates, this system has a unique solution given by xi = 1, i = 1,...,n.

Step 1: Write an M-file bananaobj.m to compute the objective function values and the Jacobian.

function [F,J] = bananaobj(x)
% Evaluate the vector function and the Jacobian matrix for 
% the system of nonlinear equations derived from the general 
% n-dimensional Rosenbrock function.
% Get the problem size
n = length(x);  
if n == 0, error('Input vector, x, is empty.'); end
if mod(n,2) ~= 0, 
   error('Input vector, x ,must have an even number of components.');
end
% Evaluate the vector function
odds  = 1:2:n;
evens = 2:2:n;
F = zeros(n,1);
F(odds,1)  = 1-x(odds);
F(evens,1) = 10.*(x(evens)-x(odds).^2); 
% Evaluate the Jacobian matrix if nargout > 1
if nargout > 1
   c = -ones(n/2,1);    C = sparse(odds,odds,c,n,n);
   d = 10*ones(n/2,1);  D = sparse(evens,evens,d,n,n);
   e = -20.*x(odds);    E = sparse(evens,odds,e,n,n);
   J = C + D + E;
end

Step 2: Call the solve routine for the system of equations.

n = 64;  
x0(1:n,1) = -1.9; 
x0(2:2:n,1) = 2;
options=optimset('Display','iter','Jacobian','on');
[x,F,exitflag,output,JAC] = fsolve(@bananaobj,x0,options);

Use the starting point x(i) = –1.9 for the odd indices, and x(i) = 2 for the even indices. Set Display to 'iter' to see the solver's progress. Set Jacobian to 'on' to use the Jacobian defined in bananaobj.m. The fsolve function generates the following output:

                                Norm of  First-order Trust-region
Iteration Func-count   f(x)        step   optimality       radius
    0        1      4281.92                    615            1
    1        2      1546.86           1        329            1
    2        3      112.552         2.5       34.8          2.5
    3        4       106.24        6.25       34.1         6.25
    4        5       106.24        6.25       34.1         6.25
    5        6      51.3854      1.5625       6.39         1.56
    6        7      51.3854     3.90625       6.39         3.91
    7        8      43.8722    0.976562       2.19        0.977
    8        9      37.0713     2.44141       6.27         2.44
    9       10      37.0713     2.44141       6.27         2.44
   10       11      26.2485    0.610352       1.52         0.61
   11       12      20.6649     1.52588       4.63         1.53
   12       13      17.2558     1.52588       6.97         1.53
   13       14      8.48582     1.52588       4.69         1.53
   14       15      4.08398     1.52588       3.77         1.53
   15       16      1.77589     1.52588       3.56         1.53
   16       17     0.692381     1.52588       3.31         1.53
   17       18     0.109777     1.16206       1.66         1.53
   18       19            0   0.0468565          0         1.53

Equation solved.

fsolve completed because the vector of function values is near zero
as measured by the default value of the function tolerance, and
the problem appears regular as measured by the gradient.

Example: Nonlinear Equations with Finite-Difference Jacobian

In the preceding example, the function bananaobj evaluates F and computes the Jacobian J. What if the code to compute the Jacobian is not available? By default, if you do not indicate that the Jacobian can be computed in the objective function (by setting the Jacobian option in options to 'on'), fsolve, lsqnonlin, and lsqcurvefit instead use finite differencing to approximate the Jacobian. This is the default Jacobian option. You can select finite differencing by setting Jacobian to 'off' using optimset.

This example uses bananaobj from the preceding example as the objective function, but sets Jacobian to 'off' so that fsolve approximates the Jacobian and ignores the second bananaobj output.

n = 64;  
x0(1:n,1) = -1.9; 
x0(2:2:n,1) = 2;
options=optimset('Display','iter','Jacobian','off');
[x,F,exitflag,output,JAC] = fsolve(@bananaobj,x0,options);

The example produces the following output:

                                Norm of  First-order Trust-region
Iteration Func-count   f(x)        step   optimality       radius
    0       65      4281.92                    615            1
    1      130      1546.86           1        329            1
    2      195      112.552         2.5       34.8          2.5
    3      260       106.24        6.25       34.1         6.25
    4      261       106.24        6.25       34.1         6.25
    5      326      51.3854      1.5625       6.39         1.56
    6      327      51.3854     3.90625       6.39         3.91
    7      392      43.8722    0.976562       2.19        0.977
    8      457      37.0713     2.44141       6.27         2.44
    9      458      37.0713     2.44141       6.27         2.44
   10      523      26.2485    0.610352       1.52         0.61
   11      588      20.6649     1.52588       4.63         1.53
   12      653      17.2558     1.52588       6.97         1.53
   13      718      8.48582     1.52588       4.69         1.53
   14      783      4.08398     1.52588       3.77         1.53
   15      848      1.77589     1.52588       3.56         1.53
   16      913     0.692381     1.52588       3.31         1.53
   17      978     0.109777     1.16206       1.66         1.53
   18     1043            0   0.0468565          0         1.53

Equation solved.

fsolve completed because the vector of function values is near zero
as measured by the default value of the function tolerance, and
the problem appears regular as measured by the gradient.

The finite-difference version of this example requires the same number of iterations to converge as the analytic Jacobian version in the preceding example. It is generally the case that both versions converge at about the same rate in terms of iterations. However, the finite-difference version requires many additional function evaluations. The cost of these extra evaluations might or might not be significant, depending on the particular problem.

Example: Nonlinear Equations with Jacobian

Consider the problem of finding a solution to a system of nonlinear equations whose Jacobian is sparse. The dimension of the problem in this example is 1000. The goal is to find x such that F(x) = 0. Assuming n = 1000, the nonlinear equations are

To solve a large nonlinear system of equations, F(x) = 0, you can use the trust-region reflective algorithm available in fsolve, a large-scale algorithm (Large-Scale vs. Medium-Scale Algorithms).

Step 1: Write an M-file nlsf1.m that computes the objective function values and the Jacobian.

function [F,J] = nlsf1(x)
% Evaluate the vector function
n = length(x);
F = zeros(n,1);
i = 2:(n-1);
F(i) = (3-2*x(i)).*x(i)-x(i-1)-2*x(i+1) + 1;
F(n) = (3-2*x(n)).*x(n)-x(n-1) + 1;
F(1) = (3-2*x(1)).*x(1)-2*x(2) + 1;
% Evaluate the Jacobian if nargout > 1
if nargout > 1
   d = -4*x + 3*ones(n,1); D = sparse(1:n,1:n,d,n,n);
   c = -2*ones(n-1,1); C = sparse(1:n-1,2:n,c,n,n);
   e = -ones(n-1,1); E = sparse(2:n,1:n-1,e,n,n);
   J = C + D + E;
end

Step 2: Call the solve routine for the system of equations.

xstart = -ones(1000,1);
fun = @nlsf1; 
options = optimset('Display','iter',...
    'Algorithm','trust-region-reflective',...
    'Jacobian','on','PrecondBandWidth',0);
[x,fval,exitflag,output] = fsolve(fun,xstart,options);

A starting point is given as well as the function name. The default method for fsolve is trust-region-dogleg, so it is necessary to specify 'Algorithm' as 'trust-region-reflective' in the options argument in order to select the trust-region-reflective algorithm. Setting the Display option to 'iter' causes fsolve to display the output at each iteration. Setting Jacobian to 'on', causes fsolve to use the Jacobian information available in nlsf1.m.

The commands display this output:

                                        Norm of      First-order 
Iteration  Func-count     f(x)          step          optimality   CG-iterations
    0          1            1011                            19
    1          2         16.1942        7.91898           2.35            3
    2          3       0.0228027        1.33142          0.291            3
    3          4     0.000103359      0.0433329         0.0201            4
    4          5     7.3792e-007      0.0022606       0.000946            4
    5          6    4.02299e-010    0.000268381      4.12e-005            5

Equation solved, inaccuracy possible.

The vector of function values is near zero, as measured by the default value
of the function tolerance. However, the last step was ineffective.

A linear system is (approximately) solved in each major iteration using the preconditioned conjugate gradient method. Setting PrecondBandWidth to 0 in options means a diagonal preconditioner is used. (PrecondBandWidth specifies the bandwidth of the preconditioning matrix. A bandwidth of 0 means there is only one diagonal in the matrix.)

From the first-order optimality values, fast linear convergence occurs. The number of conjugate gradient (CG) iterations required per major iteration is low, at most five for a problem of 1000 dimensions, implying that the linear systems are not very difficult to solve in this case (though more work is required as convergence progresses).

If you want to use a tridiagonal preconditioner, i.e., a preconditioning matrix with three diagonals (or bandwidth of one), set PrecondBandWidth to the value 1:

options = optimset('Display','iter','Jacobian','on',...
   'Algorithm','trust-region-reflective','PrecondBandWidth',1);
[x,fval,exitflag,output] = fsolve(fun,xstart,options);

In this case the output is

                                        Norm of      First-order 
Iteration  Func-count     f(x)          step          optimality   CG-iterations
    0          1            1011                            19
    1          2         16.0839        7.92496           1.92            1
    2          3       0.0458181         1.3279          0.579            1
    3          4     0.000101184      0.0631898         0.0203            2
    4          5    3.16615e-007     0.00273698        0.00079            2
    5          6    9.72481e-010     0.00018111      5.82e-005            2

Equation solved, inaccuracy possible.

The vector of function values is near zero, as measured by the default value
of the function tolerance. However, the last step was ineffective.

Note that although the same number of iterations takes place, the number of PCG iterations has dropped, so less work is being done per iteration. See Preconditioned Conjugate Gradient Method.

Setting PrecondBandWidth to Inf (this is the default) means that the solver uses Cholesky factorization rather than PCG.

Example: Nonlinear Equations with Jacobian Sparsity Pattern

In the preceding example, the function nlsf1 computes the Jacobian J, a sparse matrix, along with the evaluation of F. What if the code to compute the Jacobian is not available? By default, if you do not indicate that the Jacobian can be computed in nlsf1 (by setting the Jacobian option in options to 'on'), fsolve, lsqnonlin, and lsqcurvefit instead uses finite differencing to approximate the Jacobian.

In order for this finite differencing to be as efficient as possible, you should supply the sparsity pattern of the Jacobian, by setting JacobPattern to 'on' in options. That is, supply a sparse matrix Jstr whose nonzero entries correspond to nonzeros of the Jacobian for all x. Indeed, the nonzeros of Jstr can correspond to a superset of the nonzero locations of J; however, in general the computational cost of the sparse finite-difference procedure will increase with the number of nonzeros of Jstr.

Providing the sparsity pattern can drastically reduce the time needed to compute the finite differencing on large problems. If the sparsity pattern is not provided (and the Jacobian is not computed in the objective function either) then, in this problem nlsfs1, the finite-differencing code attempts to compute all 1000-by-1000 entries in the Jacobian. But in this case there are only 2998 nonzeros, substantially less than the 1,000,000 possible nonzeros the finite-differencing code attempts to compute. In other words, this problem is solvable if you provide the sparsity pattern. If not, most computers run out of memory when the full dense finite-differencing is attempted. On most small problems, it is not essential to provide the sparsity structure.

Suppose the sparse matrix Jstr, computed previously, has been saved in file nlsdat1.mat. The following driver calls fsolve applied to nlsf1a, which is the same as nlsf1 except that only the function values are returned; sparse finite-differencing is used to estimate the sparse Jacobian matrix as needed.

Step 1: Write an M-file nlsf1a.m that computes the objective function values.

function F = nlsf1a(x)
% Evaluate the vector function
n = length(x);
F = zeros(n,1);
i = 2:(n-1);
F(i) = (3-2*x(i)).*x(i)-x(i-1)-2*x(i+1) + 1;
F(n) = (3-2*x(n)).*x(n)-x(n-1) + 1;
F(1) = (3-2*x(1)).*x(1)-2*x(2) + 1;

Step 2: Call the system of equations solve routine.

xstart = -ones(1000,1); 
fun = @nlsf1a;
load nlsdat1   % Get Jstr
options = optimset('Display','iter','JacobPattern',Jstr,...
    'Algorithm','trust-region-reflective','PrecondBandWidth',1);
[x,fval,exitflag,output] = fsolve(fun,xstart,options);

In this case, the output displayed is

                                        Norm of      First-order 
Iteration  Func-count     f(x)          step          optimality   CG-iterations
    0          5            1011                            19
    1         10         16.0839        7.92496           1.92            1
    2         15       0.0458179         1.3279          0.579            1
    3         20     0.000101184      0.0631896         0.0203            2
    4         25    3.16616e-007     0.00273698        0.00079            2
    5         30    9.72483e-010     0.00018111      5.82e-005            2

Equation solved, inaccuracy possible.

The vector of function values is near zero, as measured by the default value
of the function tolerance. However, the last step was ineffective.

Alternatively, it is possible to choose a sparse direct linear solver (i.e., a sparse QR factorization) by indicating a "complete" preconditioner. For example, if you set PrecondBandWidth to Inf, then a sparse direct linear solver is used instead of a preconditioned conjugate gradient iteration:

xstart = -ones(1000,1);
fun = @nlsf1a;
load nlsdat1   % Get Jstr
options = optimset('Display','iter','JacobPattern',Jstr,...
 'Algorithm','trust-region-reflective','PrecondBandWidth',inf);
[x,fval,exitflag,output] = fsolve(fun,xstart,options);

and the resulting display is

                                        Norm of      First-order 
Iteration  Func-count     f(x)          step          optimality   CG-iterations
    0          5            1011                            19
    1         10         15.9018        7.92421           1.89            0
    2         15       0.0128161        1.32542         0.0746            0
    3         20    1.73502e-008      0.0397923       0.000196            0
    4         25    1.10732e-018   4.55495e-005      2.74e-009            0

Equation solved.

fsolve completed because the vector of function values is near zero
as measured by the default value of the function tolerance, and
the problem appears regular as measured by the gradient.

When the sparse direct solvers are used, the CG iteration is 0 for that (major) iteration, as shown in the output under CG-Iterations. Notice that the final optimality and f(x) value (which for fsolve, f(x), is the sum of the squares of the function values) are closer to zero than using the PCG method, which is often the case.

  


Recommended Products

Includes the most popular MATLAB recorded presentations with Q&A sessions led by MATLAB experts.

 © 1984-2009- The MathWorks, Inc.    -   Site Help   -   Patents   -   Trademarks   -   Privacy Policy   -   Preventing Piracy   -   RSS