Error trying to fit data to gaussian function: Inf computed by model function, fitting cannot continue

I am trying to find out whether my data fit to a gaussian distribution or not and am using the fit function for that.
When I try to execute this line:
[~,gof]=fit((1:length(median(norm_spikes_flight_control,2,'omitnan')))', ...
median(norm_spikes_flight_control,2,'omitnan'),'gauss1');
I get the following error message:
Error using fit>iFit
Inf computed by model function, fitting cannot continue.
Try using or tightening upper and lower bounds on coefficients.
Error in fit (line 116)
[fitobj, goodness, output, convmsg] = iFit( xdatain, ydatain, fittypeobj, ...
I have attached the 3001x1 double that results from "median(norm_spikes_flight_control,2,'omitnan')" for one dataset that triggers the error and one dataset where the code above works just fine.
If anyone knows what is happening here and how I can track down my error, I would be extremely grateful.

Answers (1)

d1=readmatrix('data_working.txt');
d2=readmatrix('data_error.txt');
plot([d1 d2]), xlim([1 numel(d1)]), legend('Working','Error')
Neither of those looks anything at all like a single-peak Gaussian when plotted sequentially, which is what the fit() algorithm tries to match.
You can use the Statistics Toolbox function fitdist to consider the data as a whole, without regard to the given order; histfit does the same and shows a figure when done...
histfit(d1)
fitdist(d1,'normal')
ans =
  NormalDistribution

  Normal distribution
       mu = -0.0737158   [-0.0755425, -0.0718891]
    sigma =  0.0510356   [0.0497764, 0.0523606]
[mean(d1) std(d1)]
ans = 1×2
-0.0737 0.0510
The result, as you see, is simply the mean and standard deviation with error estimates.
histfit(d2)
What is the purpose behind trying to fit a Gaussian model to those data as shown originally? That model simply doesn't fit the data in that form and the histograms show the overall distributions are multi-mode and also far from being simply a single Gaussian distribution.
ADDENDUM
I forgot to add the look at what the fitted model actually did...
x=[1:numel(d1)].';
[f,gof]=fit(x,d1,'gauss1')
f =
General model Gauss1:
  f(x) = a1*exp(-((x-b1)/c1)^2)
Coefficients (with 95% confidence bounds):
  a1 = 0.01526   (-0.02586, 0.05637)
  b1 = 607.3     (559.1, 655.5)
  c1 = 21.89     (-46.23, 90.02)
gof = struct with fields:
         sse: 24.1150
     rsquare: -2.0862
         dfe: 2998
  adjrsquare: -2.0882
        rmse: 0.0897
plot(f,x,d1,'fit','residuals')
subplot(2,1,1), ylabel('D1'), xlim([1 numel(d1)])
legend location southwest
subplot(2,1,2), ylabel('Residuals'), xlim([1 numel(d1)])
legend location southwest
As the above shows, the fitted model is totally inadequate; an adjusted R-square of -2 is absurd. If one looks very carefully, one can see the fitted-model red line overlay the blue data at the peak around x=600; otherwise the model prediction is essentially zero. The plot of the model residuals is almost indistinguishable from the data plot, although the magnitude of the peak at x=607 is almost removed; everywhere else, farther from that location parameter value, the exponential term rapidly vanishes. The model simply doesn't represent the data at all in that form.
figure, hAx=subplot(2,1,1);
xlim([1 numel(x)])
plot(f)
xlim([1 numel(x)])
legend off
title('Model Full Range')
subplot(2,1,2)
xlim([400 800])
plot(f)
legend off
title('Model About Central Peak')
This illustrates the model output itself without the original data; it shows that the model is, indeed, a Gaussian, but that away from the one central location its value is essentially zero because the exponential vanishes.
Again, would need to know what the end purpose here is in order to have any idea of what might be feasible/make sense; simply throwing data at a model without exploratory analysis is exceedingly dangerous.

4 Comments

d=readmatrix('data_error.txt');
plot(d), xlim([1 numel(d)]), legend('Error')
x=[1:numel(d)].';
[f,gof]=fit(x,d,'gauss2')
f =
General model Gauss2:
  f(x) = a1*exp(-((x-b1)/c1)^2) + a2*exp(-((x-b2)/c2)^2)
Coefficients (with 95% confidence bounds):
  a1 = -0.07336  (-0.0742, -0.07251)
  b1 = 883.1     (876.3, 889.8)
  c1 = 559.7     (550.1, 569.2)
  a2 = -0.09688  (-0.09839, -0.09537)
  b2 = 2978      (2935, 3020)
  c2 = 1048      (1003, 1093)
gof = struct with fields:
         sse: 0.2379
     rsquare: 0.8834
         dfe: 2995
  adjrsquare: 0.8832
        rmse: 0.0089
plot(f,x,d,'fit','residuals')
subplot(2,1,1), ylabel('D'), xlim([1 max(x)])
legend location southwest
subplot(2,1,2), ylabel('Residuals'), xlim([1 max(x)])
legend location southwest
I got to wondering how it would do if I did try to model the larger patterns; then it doesn't do too badly. The residuals plot shows quite a lot of finer structure remaining, but the overall representation isn't too bad; the adjusted R-square is almost 0.9.
Now, whether it means anything or not is another question; just because you can do something doesn't mean you should.
Okay, first of all thank you very much for that very detailed answer!
It seems I did a poor job putting into text what I wanted to do.
So to explain better why I'm doing this:
These data show normalized neuronal activity from an animal performing a turn, and I want to find out whether that activity increases leading up to the turn. The data cover 500 ms before and after the highest velocity of the turn (x=1500). So I am not exactly trying to find a good fit for the data; I want to find out whether they roughly fit a Gaussian distribution, yes or no. Some of my data will fit quite well when a given neuron encodes turns, and data from neurons that do not will not fit at all. Does this explain it a bit better?
The part that confused me most was getting the error message only sometimes when trying the fit. Would I be right in the assumption that the reason I get that error comes from an absolutely terrible fit to the gaussian distribution?
That would be one conclusion to be drawn, yes. I haven't dug into the details of the solution algorithm as to the exact cause of the Inf being returned; somewhere it's dividing by zero, which most likely comes from underflow in the tails of the normal pdf being evaluated. But I'd contend that's certainly not a reliable test; it's simply a heuristic occurrence that depends upon some specific condition within a dataset.
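One way to follow the error message's own suggestion is to bound the coefficients so the solver can't wander into the region that produces the Inf. A minimal sketch, where the specific bound values are illustrative assumptions, not values derived from your data:

```matlab
% Hedged sketch: constrain the gauss1 coefficients so the width c1 stays
% away from zero and the center b1 stays inside the data range.
% The bound values here are illustrative assumptions for a 3001-sample trace.
opts = fitoptions('gauss1');
opts.Lower = [0    1    1];      % a1 >= 0, b1 >= 1, c1 >= 1
opts.Upper = [Inf  3001 3001];   % b1, c1 within the sample range
[f,gof] = fit(x, d1, 'gauss1', opts);
```

Whether a bounded fit is then meaningful is a separate question, but it should at least stop the solver from dying mid-run.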
For your case, given the sample data, I'd say neither is a normal distribution and that your analysis/observation of your data should be prepared to observe/find the multi-modal cases and identify them as opposed to a case that actually does show a normal-like distribution.
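If what you ultimately want is a yes/no screen per neuron, a more robust approach than relying on the error itself might be to wrap the fit in try/catch and judge each trace by its goodness of fit. A sketch, where the 0.5 cutoff is an arbitrary assumption you would need to calibrate against neurons you know do and don't encode turns:

```matlab
% Hedged sketch: screen one trace d (column vector) for a Gaussian-like
% peak. The adjusted R-square threshold of 0.5 is an assumed cutoff.
x = (1:numel(d)).';
try
    [f,gof] = fit(x, d, 'gauss1');
    encodesTurn = gof.adjrsquare > 0.5;
catch
    encodesTurn = false;   % fit failed outright (e.g., the Inf error)
end
```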
I <answered a previous Q? some time back> on testing for normality, I still haven't implemented an m-file to do the Shapiro-Wilk, however.
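Until a Shapiro-Wilk routine is in hand, the Statistics Toolbox does ship other normality tests that could serve in the meantime; a sketch, assuming the data file name from above:

```matlab
% Hedged sketch: built-in normality tests as stand-ins for Shapiro-Wilk.
% h = 1 means the test rejects normality at the default 5% level.
d = readmatrix('data_error.txt');
[h_lillie, p_lillie] = lillietest(d);   % Lilliefors test
[h_ad, p_ad] = adtest(d);               % Anderson-Darling test
```

Given the multi-modal histograms above, I'd expect both to reject normality for these datasets.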
Thank you again for taking the time with this!
I will give this some thought and discuss it with some people in our group to see how we want to tackle this in the future. Definitely something to keep in mind.
When it comes to the Shapiro-Wilk test, there is a function by Ahmed Ben Saïda in the community file exchange that I could use for testing for normality.


Release: R2023a
Asked: 10 Aug 2023
Commented: 14 Sep 2023
