Fit power function to data (optimization)

I am trying to fit a general power function ydata = a*xdata^b (where a and b are fitting parameters, and xdata and ydata are the given data).
I am using lsqcurvefit to optimize for the parameters a and b, but I don't believe I am getting the best fit possible. The fit could be much better; maybe lsqcurvefit is not the best choice for fitting a power function.
MATLAB script:
--------------------------------------------------------------------------------------------------------------------------------------------------------------
clear; clc; close all;
xdata = [0.02 0.05 0.1 0.2 0.3 0.4 0.5];
ydata = [1.7935e-006 5.6359e-006 10.8276e-006 38.1193e-006 60.4152e-006 65.8997e-006 62.0615e-006];
fun = @(x,xdata) x(1)*xdata.^x(2);
x0 = [1,0.5]; %initial guess
x = lsqcurvefit(fun,x0,xdata,ydata)
loglog(xdata,ydata,'ro', xdata, fun(x,xdata),'r-')
--------------------------------------------------------------------------------------------------------------------------------------------------------------
As you can see, the fitted line could be much better and pass closer to more of the data points.
If you have any suggestions on how I can improve my code to obtain a better fit, I am open to them and I appreciate it.

 Accepted Answer

Your problem is, you are trying to use LEAST squares. But look carefully at your data.
xdata = [0.02 0.05 0.1 0.2 0.3 0.4 0.5];
ydata = [1.7935e-006 5.6359e-006 10.8276e-006 38.1193e-006 60.4152e-006 65.8997e-006 62.0615e-006];
Do you see that some of your data is nearly 2 powers of 10 larger than the rest? You plot it on a log scale, so you don't really notice it. It looks like just a straight line on those axes.
So what happens is that errors in some of your data points are WAY more important than errors in the others. This is a case where you do NOT want to use a simple least squares fit to estimate those coefficients. You might decide to use a weighted least squares. A simpler alternative is to fit a straight line to the log of your data.
Why is that a good idea? Because in the case you have, you really do not want to use an error structure that is additive, with a uniform variance on all data points. Instead, you want to use a proportional error structure, i.e., multiplicative errors. What happens when you take the log? It turns what you have (a lognormally distributed error) into a normally distributed, additive error.
xlog = log(xdata);
ylog = log(ydata);
P1 = fit(xlog',ylog','poly1')
P1 =
Linear model Poly1:
     P1(x) = p1*x + p2
Coefficients (with 95% confidence bounds):
       p1 =       1.189  (0.9811, 1.397)
       p2 =      -8.535  (-9, -8.071)
plot(P1,xlog,ylog,'o')
I used fit there, because it makes the plotting really easy.
But you should see how well it worked here, giving you the model you were hoping to find.
Now we need to transform the model back into a and b.
b = P1.p1
b = 1.1890
a = exp(P1.p2)
a = 1.9645e-04
The distinction I pointed out here is an important one.
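[Editor's note] The weighted least squares alternative mentioned in the answer can also be sketched in a few lines. This is a rough illustration (not from the thread), assuming the Optimization Toolbox's lsqnonlin, with 1/ydata weights standing in for the proportional error structure:

```matlab
% Sketch: weighted nonlinear least squares for y = a*x^b.
% Dividing each residual by ydata minimises *relative* error, a rough
% stand-in for the multiplicative error structure described above.
xdata = [0.02 0.05 0.1 0.2 0.3 0.4 0.5];
ydata = [1.7935e-6 5.6359e-6 10.8276e-6 38.1193e-6 60.4152e-6 65.8997e-6 62.0615e-6];
fun  = @(x,xd) x(1)*xd.^x(2);
wres = @(x) (fun(x,xdata) - ydata)./ydata;   % relative residuals
x = lsqnonlin(wres, [1e-4 1]);               % start near the log-fit scale
a = x(1); b = x(2);
```

With these weights the result should land close to the log-space fit (b near 1.19), since both schemes penalise relative rather than absolute error.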

5 Comments

John's answer above works by linearizing a nonlinear problem. In most cases this approach is effective, but sometimes it introduces large error.
For the original power function y = p2*x^p1, direct (nonlinear) fitting gives:
p1 = 0.811686752153411
p2 = 0.00012767558643822
while John's final result above, transformed back, is:
p1 = 1.18896049632509
p2 = 0.000196453171300679
The differences in p1 and p2 are 46.48% and 53.87%, respectively, so at least in this case linearization is not a good approach.
Not sure why alfredo gets such a bad result; it is actually a simple fitting problem.
Root Mean Square Error (RMSE): 7.82894795818432E-6
Sum of Squared Residual: 4.2904698292371E-10
Correlation Coef. (R): 0.958030219786875
R-Square: 0.917821902024887
Parameter Best Estimate
---------- -------------
p1 0.811686748880694
p2 0.000127675586145436
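[Editor's note] The "direct fitting" numbers quoted above can be approximated in MATLAB itself with lsqcurvefit; this is a sketch, not Alex Sha's actual code. The key is an initial guess of the right order of magnitude (the original x0 = [1, 0.5] starts a about four orders of magnitude too large):

```matlab
% Sketch: direct (untransformed) least-squares power fit in MATLAB.
xdata = [0.02 0.05 0.1 0.2 0.3 0.4 0.5];
ydata = [1.7935e-6 5.6359e-6 10.8276e-6 38.1193e-6 60.4152e-6 65.8997e-6 62.0615e-6];
fun = @(x,xd) x(1)*xd.^x(2);
x0  = [1e-4 1];                   % sensible scale for a; exponent near 1
x = lsqcurvefit(fun, x0, xdata, ydata);
% x should come out close to the direct-fit values quoted above,
% roughly [1.28e-4, 0.81], rather than the log-fit's [1.96e-4, 1.19]
```

The difference from the log-space fit is not a bug in either method; direct least squares weights the large-y points most heavily, while the log fit weights relative errors.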
I see. Would you mind showing in more detail how you obtained the plot and the parameters?
In your MATLAB code, don't use a log scale for your chart, and it should look better.
Yes, but you mention "by direct fitting"; how did you do the fitting?
Alex Sha uses a completely different (and expensive) optimisation package, not MATLAB.
The point is that a power law fit does not work with the available data. It's simply not an appropriate model.


More Answers (1)

The problem is the model. It may not be appropriate for these data.
A logistic curve fits much better —
xdata = [0.02 0.05 0.1 0.2 0.3 0.4 0.5];
ydata = [1.7935e-006 5.6359e-006 10.8276e-006 38.1193e-006 60.4152e-006 65.8997e-006 62.0615e-006];
% fun = @(x,xdata) x(1)*xdata.^x(2);
fun = @(b,x) b(1)./(1+exp(b(2).*(x-b(3))));
% x0 = [1,0.5]; %initial guess
x0 = rand(1,3).*[1E-5 -15 median(xdata)];
x = lsqcurvefit(fun,x0,xdata,ydata)
Local minimum possible. lsqcurvefit stopped because the final change in the sum of squares relative to its initial value is less than the value of the function tolerance.
x = 1×3
0.0001 -23.1434 0.1804
figure
plot(xdata,ydata,'ro', xdata, fun(x,xdata),'r-')
figure
loglog(xdata,ydata,'ro', xdata, fun(x,xdata),'r-')

Release: R2019b
