How can I find the values for Settting Start Points when fitting point to a curve?

40 views (last 30 days)
Hello:
I'm trying to fit some points
NUMBERof = transpose([8 12 36 37 15 152 21 182 107 169 235 324 394 ...
462 545 707 655 769 832 838 812 849 864 950 932 879 604 637 743 ...
757 683 605 510 619 517 567 523 551 585 565 410 399 430 435 440 ...
367 378]);
DAY = transpose(1:length(NUMBERof));
with a Gaussian model:
fittype('a1*exp(-((x-b1)/c1)^2) + a2*exp(-((x-b2)/c2)^2) + a3*exp(-((x-b3)/c3)^2)',...
'dependent',{'y'},...
'independent',{'x'},...
'coefficients', {'a1','b1','c1','a2','b2','c2','a3','b3','c3'});
myprevof = fit(DAY, NUMBERof,...
'a1*exp(-((x-b1)/c1)^2) + a2*exp(-((x-b2)/c2)^2) + a3*exp(-((x-b3)/c3)^2)',...
'start',[175 20 1 550 15 5 550 30 15]);
As you can see I've choosed a set of starting points (and I've tried several more) for the fitting, but the curve I always get is very bad.
I'd like to know if there's a method to calculate the better (or at least a good one) value for Star Points, and what is "Start Point" meaning to understand better the fitting.
Thank you in advance for your help.
Sergio.

Accepted Answer

John D'Errico
John D'Errico on 25 Apr 2020
Edited: John D'Errico on 25 Apr 2020
First, you don't understand how to use fittype. fittype does not set something that fit can use, UNLESS YOU PASS IN THE RESULTS OF FITTYPE INTO FIT.
Second, your data really does not merit three terms.
Finally, you can just use the gauss2 model form anyway.
F = fittype('gauss2')
F =
General model Gauss2:
F(a1,b1,c1,a2,b2,c2,x) = a1*exp(-((x-b1)/c1)^2) + a2*exp(-((x-b2)/c2)^2)
Now, call fit. But what parameter start estimates make sense there? PLOT YOUR DATA. LOOK AT THE PLOT. THINK ABOUT WHAT YOU SEE. ALWAYS DO THESE THREE THINGS.
plot(DAY,NUMBERof,'o')
grid on
xlabel DAY
ylabel casualties
Now you clearly have a peak around day 23. How wide is it? The width of that main peak is probably around 25-30 days, if we guess where the 10% points on the peak are. Divide that width by 4, to give a decent scale parameter in this form. How high is the main peak? I'll guess 800 at the center.
The second term is off to the right. Say it peaks around day 40. The width is wider, maybe I'll guess a width of 60 days. (Again, divide by 4 to be the parameter estimate.) How high? Say 300.
We don't need perfect numbers here. How do we use fit now? PASS IN F!
mdl = fit(DAY,NUMBERof,F,'start',[ 800, 23, 7.5, 300, 40, 15])
mdl =
General model Gauss2:
mdl(x) = a1*exp(-((x-b1)/c1)^2) + a2*exp(-((x-b2)/c2)^2)
Coefficients (with 95% confidence bounds):
a1 = 641.3 (247.6, 1035)
b1 = 20.62 (19.74, 21.5)
c1 = 8.827 (6.169, 11.48)
a2 = 522.1 (422.6, 621.7)
b2 = 36.25 (28.74, 43.76)
c2 = 17.25 (8.059, 26.44)
My initial estimates were not dead on, but reasonably close. The secondary peak is disappointingly higher than I would like to see. But then people are too often foolish.
plot(mdl)
hold on
plot(DAY,NUMBERof,'bo')
xlabel DAY
ylabel casualties
grid on
Again, the fit seems entirely reasonable. The noise around the curve seems pretty random at this point, with little apparent lack of fit. That tells me any attempt to estimate a THIRD gaussian peak from that data would be a complete joke. You would be kidding yourself if you tried that, without considerably more and considerably better data.
In order to choose intelligent estimates for starting points, you need to understand what each parameter in the model does, and how it impacts the shape of the curve. Sorry, but there is no magical way to do that without understanding the model you are using.
  6 Comments
Sergio Sanz
Sergio Sanz on 26 Apr 2020
Hello again:
It's easy to understand the behaviour of the curve when changing the coefficients, but my troubles are about the STARTING VALUES to plot the graphics in Matlab.
I can't see the relationship between the two things.
Sorry for questioning again.
John D'Errico
John D'Errico on 26 Apr 2020
Edited: John D'Errico on 26 Apr 2020
Here I'm a bit confused as to what you are asking. You only need to supply starting values to perform the fit. Essentially, the fitting routine needs to know a guess as to reasonable set of parmeters for the model. Then it refines those estimates. But you don't need starting values for the plot.
For example, here is the model I fit to your data.
F = fittype('gauss2');
mdl = fit(DAY,NUMBERof,F,'start',[ 800, 23, 7.5, 300, 40, 15])
mdl =
General model Gauss2:
mdl(x) = a1*exp(-((x-b1)/c1)^2) + a2*exp(-((x-b2)/c2)^2)
Coefficients (with 95% confidence bounds):
a1 = 641.3 (247.6, 1035)
b1 = 20.62 (19.74, 21.5)
c1 = 8.827 (6.169, 11.48)
a2 = 522.1 (422.6, 621.7)
b2 = 36.25 (28.74, 43.76)
c2 = 17.25 (8.059, 26.44)
plot(mdl)
hold on
plot(DAY,NUMBERof,'bo')
xlabel DAY
ylabel casualties
grid on
As you can see, I supplied starting values. But there were needed for the fit only. After the fit is performed, that model structure contains the fitted parameters. Here we can see it does contain the fitted values:
mdl.a1
ans =
641.28
At this point, no starting values are needed. You can do the plot directly from the model.
So are you talking about the prediction intervals around the model as you plottted it? The predint code does allow you to supply some additional information to it, but the default values will be reasonabe to use as they are. I could explain some of them, if that is your question. I'm just not sure what you mean about starting values for the plot.

Sign in to comment.

More Answers (0)

Products


Release

R2020a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!