Model fit using lsqcurvefit (non-linear least squares fitting)

Hi,
I've been trying to use lsqcurvefit for a simple equation: y = a*x(1) + b*x(2) + c*x(3), where a, b and c are the unknown constants and I have the data vectors y, x(1), x(2) and x(3).
For now, I have an independent reference value for 'a', so I can tell whether the values I get from the model are more or less correct.
When I tried linear least-squares functions, the results were very unstable and 'a' could take almost any value depending on the value of 'c'.
However, using lsqcurvefit I got pretty good estimates of 'a', but I still have stability/sensitivity issues depending on the initial conditions and boundaries.
Does anyone know the best options to set for lsqcurvefit when solving such a model?
Also, I don't understand why lsqcurvefit yields better results than the simpler linear fitting functions.
Are there better alternatives, for example 'fminsearch' or something else?
Thanks!
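For reference, here is a minimal sketch of the setup being described, assuming the data are in column vectors x1, x2, x3 and y (these names, the starting point and the bounds are placeholders, not values from the question):
% Minimal sketch of the setup described above (placeholder names and bounds).
X = [x1(:) x2(:) x3(:)];            % m-by-3 matrix of predictors
model = @(p, X) X*p(:);             % y = a*x1 + b*x2 + c*x3 with p = [a; b; c]
p0 = [1; 1; 1];                     % initial guess (placeholder)
lb = [0; 0; 0];                     % lower bounds (placeholder)
ub = [10; 10; 10];                  % upper bounds (placeholder)
p = lsqcurvefit(model, p0, X, y(:), lb, ub);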

3 Comments

  1. Which data have measurement noise in them, x,y or both?
  2. Show how you performed the linear least squares fit.
  3. How can 'a' depend on 'c'? In what way are you controlling 'c'?
  4. Maybe your data is bad.
1. Both have measurement noise. x(2) and x(3) have the largest noise; y and x(1) are much less noisy.
2. Using regress, lsqlin, etc. You mean there might be an error in the code?
3. If I put a number in place of 'c'.
4. So why do I get good results using lsqcurvefit? This holds for ~10 different data sets.
I can't explain why lsqcurvefit does better, based on your comments so far, but it sounds like you are doing ordinary least squares instead of total least squares fitting. The latter is better when you have noise in x. The code in my answer below implements it, so you might want to give it a try.


Answers (2)

The only suggestion I have is to explore the possibilities of optimset.
'When I tried linear least squares functions the results were very unstable and 'a' can get almost every value depending on the value of 'c'.'
This suggests to me that at least x(1) and x(3) and perhaps y could be highly correlated. You may want to explore these possibilities with corrcoef and related functions. See what the results of:
[R,P] = corrcoef([x(1) x(2) x(3)]);
are. (I assume here that x(1), x(2), and x(3) are column vectors.) Since the R values may be difficult to interpret, look at the P values. A value of P <= 0.05 means that the correlation between the corresponding columns of x is statistically significant.
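For example, with hypothetical column vectors x1, x2 and x3 holding the three predictors, the check might look like this:
% Collinearity check on hypothetical column vectors x1, x2, x3.
Xmat = [x1(:) x2(:) x3(:)];
[R, P] = corrcoef(Xmat);   % R: pairwise correlation coefficients, P: p-values
disp(R)                    % entries near +/-1 indicate strong correlation
disp(P)                    % small p-values flag statistically significant correlation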

4 Comments

Hi,
I know that as the values in x(1) get larger, the values in x(2) and x(3) get smaller; y is more or less constant. The pattern/rate at which they get larger (for x(1)) or smaller (for x(2) and x(3)) differs between data sets, and that determines a, b and c for each set.
If the columns are correlated, does that mean I can't converge to a single value?
You might want to consider experimenting with the various tolerances that Optimization Options offers.
One thing I didn't ask about previously was the possibility that you could have scaling problems, with some of your variables several orders of magnitude different from others. This frequently causes problems in linear and nonlinear regression because it challenges machine precision limits. The solution in that situation is relatively easy, since your function is linear: normalize your x vectors (multiplying them by individual scale factors) to roughly the same orders of magnitude (preferably to within about 100 times each other or less), then rescale your estimated parameters by those factors when your regression has converged. The nonlinear parameter estimation functions are relatively more robust to scaling problems than the linear parameter estimation functions are, so lsqcurvefit might have worked when the linear functions failed.
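As a rough sketch of that rescaling idea, assuming the data are in column vectors x1, x2, x3 and y (the choice of scale factors here is just one possibility):
% Rough sketch: scale the columns to comparable magnitudes, fit, then undo
% the scaling on the estimated parameters.
Xmat = [x1(:) x2(:) x3(:)];
s  = max(abs(Xmat), [], 1);              % one scale factor per column (placeholder choice)
Xs = bsxfun(@rdivide, Xmat, s);          % scaled predictors with comparable magnitudes
ps = Xs \ y(:);                          % fit in the scaled space
p  = ps ./ s(:);                         % undo the scaling: y ~ Xmat*p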
If scaling isn't the problem, I suggest that you explore the possibility that your variables could be correlated.
Hi,
Thank you for your answer! I don't understand why it is a problem if the variables (x(1) and x(3), for example) are correlated.
If they're correlated, the columns are essentially linearly dependent (or close to it), and it is then difficult to determine the coefficients uniquely. The problem arises in a statement like B = X\Y, which is conceptually equivalent to B = inv(X)*Y (for illustration only; please don't use inv). In this situation, X with correlated columns is singular or close to it. That could explain your inability to uniquely estimate some of your coefficients.
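A small synthetic illustration (not the original data): when one column is an exact copy of another, backslash warns about rank deficiency and returns a basic solution with one coefficient set to zero, much as described in the question.
% Synthetic example: x3 is an exact copy of x1, so 'a' and 'c' cannot be
% separated; backslash returns a basic solution with one coefficient zero.
rng(0);
n  = 100;
x1 = linspace(1, 10, n)';
x2 = rand(n, 1);
x3 = x1;                                      % perfectly collinear with x1
y  = 2*x1 + 3*x2 + 4*x3 + 0.01*randn(n, 1);   % "true" a = 2, b = 3, c = 4
p  = [x1 x2 x3] \ y;                          % only a + c (about 6) and b are identifiable
disp(p')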


See if this does what you need. It's from a recent Newsgroup thread.
function [x0, a, d, normd] = lsplane(X)
% ---------------------------------------------------------------------
% LSPLANE.M   Least-squares plane (orthogonal distance regression).
%
% Version 1.0
% Last amended   I M Smith   27 May 2002
% Created        I M Smith   08 Mar 2002
% ---------------------------------------------------------------------
% Input
%   X      Array [x y z] where x, y and z are vectors of the x-, y- and
%          z-coordinates, respectively.
%          Dimension: m x 3.
%
% Output
%   x0     Centroid of the data = point on the best-fit plane.
%          Dimension: 3 x 1.
%
%   a      Direction cosines of the normal to the best-fit plane.
%          Dimension: 3 x 1.
%
% <Optional...
%   d      Residuals.
%          Dimension: m x 1.
%
%   normd  Norm of residual errors.
%          Dimension: 1 x 1.
% ...>
%
% [x0, a <, d, normd >] = lsplane(X)
% ---------------------------------------------------------------------
% check number of data points
m = size(X, 1);
if m < 3
error('At least 3 data points required.')
end
%
% calculate centroid
x0 = mean(X)';
%
% form matrix A of translated points
A = [(X(:, 1) - x0(1)) (X(:, 2) - x0(2)) (X(:, 3) - x0(3))];
%
% calculate the SVD of A
[U, S, V] = svd(A, 0);
%
% find the smallest singular value in S and extract from V the
% corresponding right singular vector
[s, i] = min(diag(S));
a = V(:, i);
%
% calculate residual distances, if required
if nargout > 2
d = U(:, i)*s;
normd = norm(d);
end
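A quick usage sketch with synthetic data, just to show the calling convention (the best-fit plane passes through the centroid x0 and has normal a, i.e. a(1)*(x - x0(1)) + a(2)*(y - x0(2)) + a(3)*(z - x0(3)) = 0):
% Usage sketch for lsplane with synthetic data (not the original data).
rng(0);
xy = rand(200, 2);
z  = 2*xy(:,1) - 3*xy(:,2) + 5 + 0.01*randn(200, 1);   % noisy plane
[x0, a, d, normd] = lsplane([xy z]);
disp(x0'), disp(a'), disp(normd)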

5 Comments

Thanks! I'll try that!
Let me try to re-phrase my issue after working through this today:
I have a simple equation: y = a*x1+b*x2+c*x3, where x1, x2, x3, y are data vectors and a,b,c are the unknown parameters that I would like to estimate in Matlab.
a,b,c represent physical parameters - one of which (a) is measurable by an independent method. The data vectors x1 and x3 are indeed linearly dependent - that is, x1 and x3 change in very similar ways throughout the time series. Because of this, any linear least squares regression I perform (backslash operator, regress, lsqlin) will set c to zero and spit out a basic solution for parameters a and b. This makes sense because essentially there is not enough information for the regression to uniquely determine each parameter. The end result is that the Matlab solution for 'a' does not agree with my independent measurement of 'a'.
However, when I use some of the iterative nonlinear regression options in Matlab (for example, lsqcurvefit with algorithm: 'large-scale: trust-region reflective Newton'), the optimization gives realistic values for a,b,c. The Matlab estimate of 'a' agrees well with the independent method of measuring 'a'. The Matlab solution does depend on my choice of the initial condition, however. Also, only some of the nonlinear algorithms give the correct solution for 'a', while others don't.
I am trying to understand why these iterative optimization schemes seem to work better than linear least squares regression for collinear data. What unique features of LSQCURVEFIT allow it to overcome collinearity? Or am I just getting the right value of 'a' with lsqcurvefit by chance, because I happened to pick a lucky initial condition?
Thanks in advance for your continued help in understanding this!
As I understand it, the linear least squares solvers use simple matrix division to calculate the parameters (although they do it in a linear least squares sense). The lsqcurvefit and other nonlinear parameter estimation routines use an iterative gradient-based algorithm, calculating the Jacobian at each step. There may be enough variation in your x(1), x(2), and x(3) vectors to allow the nonlinear parameter estimation algorithms to uniquely estimate your parameters. If you have the Statistics Toolbox, it might be interesting to use nlparci to see what the confidence intervals are on the parameters lsqcurvefit estimates, and whether they include zero. You might also see what nlinfit gives. It uses the Levenberg-Marquardt algorithm. That option is also available in lsqcurvefit.
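A hedged sketch of that check, assuming the Statistics Toolbox is available and the data are in column vectors x1, x2, x3 and y (names are placeholders):
% Sketch: fit with lsqcurvefit, then inspect confidence intervals with nlparci.
X = [x1(:) x2(:) x3(:)];
model = @(p, X) X*p(:);
[p, ~, resid, ~, ~, ~, J] = lsqcurvefit(model, [1; 1; 1], X, y(:));
ci = nlparci(p, resid, 'jacobian', full(J));   % 95% confidence intervals
disp([p ci])                                   % very wide intervals straddling zero point to collinearity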
What unique features of LSQCURVEFIT allow it to overcome collinearity? Or, am I just getting the right value of 'a' with lsqcurvefit by chance because I happened to pick a lucky initial condition?
Yes, it's just a matter of luck. There are obviously infinitely many solutions if your x1, x2, x3 are linearly dependent. If you initialize at any of these solutions, any normal solver will refuse to progress and offer that point as the solution.
Hi,
Using
[x,resnorm,~,exitflag,output] = lsqcurvefit(f,x0,E,F3g)
how do I set limits on the independent variable, E? I need to do the fitting from E = 0 to E = 17.
Thanks in advance


Asked: 27 Oct 2012
Commented: 19 Feb 2020
