Products & Services Solutions Academia Support User Community Company

Learn more about Optimization Toolbox   

Least Squares (Model Fitting) Examples

Example: Using lsqnonlin With a Simulink Model

Suppose that you want to optimize the control parameters in the Simulink model optsim.mdl. (This model can be found in the optim folder. Note that Simulink must be installed on your system to load this model.) The model includes a nonlinear process plant modeled as a Simulink block diagram.

Plant with Actuator Saturation

nonlinear process plant modeled as a Simulink block diagram

The plant is an under-damped third-order model with actuator limits. The actuator limits are a saturation limit and a slew rate limit. The actuator saturation limit cuts off input values greater than 2 units or less than -2 units. The slew rate limit of the actuator is 0.8 units/sec. The closed-loop response of the system to a step input is shown in Closed-Loop Response. You can see this response by opening the model (type optsim at the command line or click the model name), and selecting Start from the Simulation menu. The response plots to the scope.

Closed-Loop Response

Plot of closed loop response

The problem is to design a feedback control loop that tracks a unit step input to the system. The closed-loop plant is entered in terms of the blocks where the plant and actuator have been placed in a hierarchical Subsystem block. A Scope block displays output trajectories during the design process.

Closed-Loop Model

Simulink block diagram of closed-loop model. Tunable variables are PID gains, K sub p, K i, and K d.

One way to solve this problem is to minimize the error between the output and the input signal. The variables are the parameters of the Proportional Integral Derivative (PID) controller. If you only need to minimize the error at one time unit, it would be a single objective function. But the goal is to minimize the error for all time steps from 0 to 100, thus producing a multiobjective function (one function for each time step).

The routine lsqnonlin is used to perform a least-squares fit on the tracking of the output. The tracking is performed via an M-file function tracklsq, which returns the error signal yout, the output computed by calling sim, minus the input signal 1. The code for tracklsq, shown below, is contained in the file runtracklsq.m, which is included with Optimization Toolbox software.

The function runtracklsq sets up all the needed values and then calls lsqnonlin with the objective function tracklsq, which is nested inside runtracklsq. The variable options passed to lsqnonlin defines the criteria and display characteristics. In this case you ask for output, use the medium-scale algorithm, and give termination tolerances for the step and objective function on the order of 0.001.

To run the simulation in the model optsim, the variables Kp, Ki, Kd, a1, and a2 (a1 and a2 are variables in the Plant block) must all be defined. Kp, Ki, and Kd are the variables to be optimized. The function tracklsq is nested inside runtracklsq so that the variables a1 and a2 are shared between the two functions. The variables a1 and a2 are initialized in runtracklsq.

The objective function tracklsq must run the simulation. The simulation can be run either in the base workspace or the current workspace, that is, the workspace of the function calling sim, which in this case is the workspace of tracklsq. In this example, the simset command is used to tell sim to run the simulation in the current workspace by setting 'SrcWorkspace' to 'Current'. You can also choose a solver for sim using the simset function. The simulation is performed using a fixed-step fifth-order method to 100 seconds.

When the simulation is completed, the variables tout, xout, and yout are now in the current workspace (that is, the workspace of tracklsq). The Outport block in the block diagram model puts yout into the current workspace at the end of the simulation.

The following is the code for runtracklsq:

function [Kp,Ki,Kd] = runtracklsq
% RUNTRACKLSQ demonstrates using LSQNONLIN with Simulink.

optsim                       % Load the model
pid0 = [0.63 0.0504 1.9688]; % Set initial values
a1 = 3; a2 = 43;             % Initialize model plant variables
options = optimset('Algorithm','levenberg-marquardt',...
   'Display','iter','TolX',0.001,'TolFun',0.001);
pid = lsqnonlin(@tracklsq, pid0, [], [], options);
Kp = pid(1); Ki = pid(2); Kd = pid(3); 

    function F = tracklsq(pid)
      % Track the output of optsim to a signal of 1
        
      % Variables a1 and a2 are needed by the model optsim.
      % They are shared with RUNTRACKLSQ so do not need to be
      % redefined here.
      Kp = pid(1);
      Ki = pid(2);
      Kd = pid(3);

      % Compute function value
      simopt = simset('solver','ode5',...
                      'SrcWorkspace','Current');  
      % Initialize sim options
      [tout,xout,yout] = sim('optsim',[0 100],simopt);
      F = yout-1;

    end
end

When you run runtracklsq, the optimization gives the solution for the proportional, integral, and derivative (Kp, Ki, Kd) gains of the controller after 64 function evaluations:

[Kp, Ki, Kd] = runtracklsq

                                        First-Order                    Norm of 
 Iteration  Func-count    Residual       optimality      Lambda           step
     0           4         8.66531           0.833         0.01
     1           8         6.86978             1.2        0.001        1.98067
     2          12          5.6588           0.922       0.0001        3.15662
     3          16         4.57909           0.169       1e-005        6.15155
     4          24          4.4727           0.164          0.1       0.589688
     5          28         4.47066         0.00534         0.01      0.0475163

Local minimum possible.

lsqnonlin stopped because the final change in the sum of squares relative to 
its initial value is less than the selected value of the function tolerance.

Kp =
    2.9633

Ki =
    0.1436

Kd =
   13.1386

Here is the resulting closed-loop step response.

Closed-Loop Response Using lsqnonlin

Example: Nonlinear Least-Squares with Full Jacobian Sparsity Pattern

The trust-region reflective algorithm in lsqnonlin, lsqcurvefit, and fsolve can be used with small- to medium-scale problems without computing the Jacobian in fun or providing the Jacobian sparsity pattern. (This example also applies to the case of using fmincon or fminunc without computing the Hessian or supplying the Hessian sparsity pattern.) How small is small- to medium-scale? No absolute answer is available, as it depends on the amount of virtual memory available in your computer system configuration.

Suppose your problem has m equations and n unknowns. If the command J = sparse(ones(m,n)) causes an Out of memory error on your machine, then this is certainly too large a problem. If it does not result in an error, the problem might still be too large, but you can only find out by running it and seeing if MATLAB is able to run within the amount of virtual memory available on your system.

Let's say you have a small problem with 10 equations and 2 unknowns, such as finding x that minimizes

starting at the point x = [0.3,0.4].

Because lsqnonlin assumes that the sum of squares is not explicitly formed in the user function, the function passed to lsqnonlin should instead compute the vector valued function

for k = 1 to 10 (that is, F should have 10 components).

Step 1: Write an M-file myfun.m that computes the objective function values.

function F = myfun(x)
k = 1:10;
F = 2 + 2*k-exp(k*x(1))-exp(k*x(2));

Step 2: Call the nonlinear least-squares routine.

x0 = [0.3 0.4]            % Starting guess
[x,resnorm] = lsqnonlin(@myfun,x0)     % Invoke optimizer

Because the Jacobian is not computed in myfun.m, and no Jacobian sparsity pattern is provided by the JacobPattern option in options, lsqnonlin calls the trust-region reflective algorithm with JacobPattern set to Jstr = sparse(ones(10,2)). This is the default for lsqnonlin. Note that the Jacobian option in options is set to 'off' by default.

When the finite-differencing routine is called the first time, it detects that Jstr is actually a dense matrix, i.e., that no speed benefit is derived from storing it as a sparse matrix. From then on the finite-differencing routine uses Jstr = ones(10,2) (a full matrix) for the optimization computations.

After about 24 function evaluations, this example gives the solution

x = 
     0.2578   0.2578
resnorm     % Residual or sum of squares
resnorm = 
     124.3622

Most computer systems can handle much larger full problems, say into the hundreds of equations and variables. But if there is some sparsity structure in the Jacobian (or Hessian) that can be taken advantage of, the large-scale methods will always run faster if this information is provided.

Example: Linear Least-Squares with Bound Constraints

Many situations give rise to sparse linear least-squares problems, often with bounds on the variables. The next problem requires that the variables be nonnegative. This problem comes from fitting a function approximation to a piecewise linear spline. Specifically, particles are scattered on the unit square. The function to be approximated is evaluated at these points, and a piecewise linear spline approximation is constructed under the condition that (linear) coefficients are not negative. There are 2000 equations to fit on 400 variables:

load particle   % Get C, d
lb = zeros(400,1);
[x,resnorm,residual,exitflag,output] = ...
              lsqlin(C,d,[],[],[],[],lb);

The default diagonal preconditioning works fairly well:

exitflag =
     3
resnorm =
   22.5794
output = 
       iterations: 10
        algorithm: 'large-scale: trust-region reflective Newton'
    firstorderopt: 2.7870e-005
     cgiterations: 42
          message: [1x123 char]

For bound constrained problems, the first-order optimality is the infinity norm of v.*g, where v is defined as in Box Constraints, and g is the gradient.

You can improve (decrease) the first-order optimality measure by using a sparse QR factorization in each iteration. To do this, set PrecondBandWidth to inf:

options = optimset('PrecondBandWidth',inf);
[x,resnorm,residual,exitflag,output] = ... 
              lsqlin(C,d,[],[],[],[],lb,[],[],options);

The first-order optimality measure decreases:

exitflag =
     1
resnorm =
   22.5794
output = 
       iterations: 12
        algorithm: 'large-scale: trust-region reflective Newton'
    firstorderopt: 5.5907e-015
     cgiterations: 0
          message: [1x104 char]

Example: Jacobian Multiply Function with Linear Least Squares

You can solve a least-squares problem of the form

such that A·x ≤ b, Aeq·x = beq, lb ≤ x ≤ ub, for problems where C is very large, perhaps too large to be stored, by using a Jacobian multiply function.

For example, consider the case where C is a 2n-by-n matrix based on a circulant matrix. This means the rows of C are shifts of a row vector v. This example has the row vector v with elements of the form (–1)k+1/k:

v = [1, –1/2, 1/3, –1/4, ... , –1/n],

cyclically shifted:

This least-squares example considers the problem where

d = [n – 1; n – 2; ...; –n],

and the constraints are –5 ≤ x(i) ≤ 5 for i = 1, ..., n.

For large enough n, the dense matrix C does not fit into computer memory. (n = 10,000 is too large on one tested system.)

A Jacobian multiply function has the following syntax:

w = jmfcn(Jinfo,Y,flag)

Jinfo is a matrix the same size as C, used as a preconditioner. If C is too large to fit into memory, Jinfo should be sparse. Y is a vector or matrix sized so that C*Y or C'*Y makes sense. flag tells jmfcn which product to form:

Since C is such a simply structured matrix, it is easy to write a Jacobian multiply function in terms of the vector v; i.e., without forming C. Each row of C*Y is the product of a shifted version of v times Y. The following matrix performs one step of the shift: v shifts to v*T, where

To compute C*Y, compute v*Y to find the first row, then shift v and compute the second row, and so on.

To compute C'*Y, perform the same computation, but use a shifted version of temp, the vector formed from the first row of C':

temp = [fliplr(v)*T,fliplr(v)*T];

To compute C'*C*Y, simply compute C*Y using shifts of v, and then compute C' times the result using shifts of fliplr(v).

The dolsqJac function in the following code sets up the vector v and matrix T, and calls the solver lsqlin:

function [x,resnorm,residual,exitflag,output] = dolsqJac(n)
%
r = 1:n-1; % index for making vectors

T = spalloc(n,n,n); % making a sparse circulant matrix
for m = r
    T(m,m+1)=1;
end
T(n,1) = 1;

v(n) = (-1)^(n+1)/n; % allocating the vector v
v(r) =( -1).^(r+1)./r;

% Now C should be a 2n-by-n circulant matrix based on v,
% but that might be too large to fit into memory.

r = 1:2*n;
d(r) = n-r;

Jinfo = [speye(n);speye(n)]; % sparse matrix for preconditioning
% This matrix is a required input for the solver;
% preconditioning is not really being used in this example

% Pass the matrix T and vector v so they don't need to be
% computed in the Jacobian multiply function
options = optimset('JacobMult',...
    @(Jinfo,Y,flag)lsqcirculant(Jinfo,Y,flag,T,v));

lb = -5*ones(1,n);
ub = 5*ones(1,n);

[x,resnorm,residual,exitflag,output] = ...
    lsqlin(Jinfo,d,[],[],[],[],lb,ub,[],options);

The Jacobian multiply function lsqcirculant is as follows:

function w = lsqcirculant(Jinfo,Y,flag,T,v)
% This function computes the Jacobian multiply functions
% for a 2n-by-n circulant matrix example

if flag > 0
    w = Jpositive(Y);
elseif flag < 0
    w = Jnegative(Y);
else
    w = Jnegative(Jpositive(Y));
end

    function a = Jpositive(q)
        % Calculate C*q
        temp = v;

        a = zeros(size(q)); % allocating the matrix a
        a = [a;a]; % the result is twice as tall as the input

        for r = 1:size(a,1)
            a(r,:) = temp*q; % compute the rth row
            temp = temp*T; % shift the circulant
        end
    end

    function a = Jnegative(q)
        % Calculate C'*q
        temp = fliplr(v)*T; % the circulant for C'

        len = size(q,1)/2; % the returned vector is half as long
        % as the input vector
        a = zeros(len,size(q,2)); % allocating the matrix a

        for r = 1:len
            a(r,:) = [temp,temp]*q; % compute the rth row
            temp = temp*T; % shift the circulant
        end
    end
end

When n = 3000, C is an 18,000,000-element dense matrix. Here are the results of the dolsqJac function for n = 3000 at selected values of x, and the output structure:

[x,resnorm,residual,exitflag,output] = dolsqJac(3000);

Optimization terminated: relative function value changing by
 less than OPTIONS.TolFun.

x(1)
ans =
    5.0000

x(1500)
ans =
   -0.5201

x(3000)
ans =
   -5.0000

output
output = 
       iterations: 16
        algorithm: 'large-scale: trust-region reflective Newton'
    firstorderopt: 5.9351e-005
     cgiterations: 36
          message: [1x87 char]

Example: Nonlinear Curve Fitting with lsqcurvefit

lsqcurvefit enables you to fit parameterized nonlinear functions to data easily. You can use lsqnonlin as well; lsqcurvefit is simply a convenient way to call lsqnonlin for curve fitting.

In this example, the vector xdata represents 100 data points, and the vector ydata represents the associated measurements. Generate the data using the following script:

stream = RandStream('mt19937ar','Seed',5489); % reproducible
xdata = -2*log(rand(stream,100,1));
ydata = (ones(100,1) + .1*randn(stream,100,1)) + (3*ones(100,1)+...
    0.5*randn(stream,100,1)).*exp((-(2*ones(100,1)+...
   .5*randn(stream,100,1))).*xdata);

The modeled relationship between xdata and ydata is

(6-112)

The script generates xdata from 100 independent samples from an exponential distribution with mean 2. It generates ydata from Equation 6-112 using a = [1;3;2], perturbed by adding normal deviates with standard deviations [0.1;0.5;0.5].

The goal is to find parameters , i = 1, 2, 3, for the model that best fit the data.

In order to fit the parameters to the data using lsqcurvefit, you need to define a fitting function. Define the fitting function predicted as an anonymous function:

a = [1;3;2];
predicted = @(a,xdata) a(1)*ones(100,1)+a(2)*exp(-a(3)*xdata); 

To fit the model to the data, lsqcurvefit needs an initial estimate a0 of the parameters. Enter

a0 = [2;2;2];

Run the solver lsqcurvefit as follows:

[ahat,resnorm,residual,exitflag,output,lambda,jacobian] =...
   lsqcurvefit(predicted,a0,xdata,ydata);

lsqcurvefit calculates the least-squares estimate of :

ahat =
    1.0169
    3.1444
    2.1596

The fitted values ahat are within 8% of a = [1;3;2].

If you have Statistics Toolbox™ software, use the nlparci function to generate confidence intervals for the ahat estimate.

  


Recommended Products

Includes the most popular MATLAB recorded presentations with Q&A sessions led by MATLAB experts.

 © 1984-2009- The MathWorks, Inc.    -   Site Help   -   Patents   -   Trademarks   -   Privacy Policy   -   Preventing Piracy   -   RSS