R2 between histogram and (weibull) distribution

Looking for the following:
The R2 between the histogram of parameter 'Weibull.Wh_Wp' and the Weibull distribution fitted to 'Weibull.Wh_Wp'.
[parmHat, parmCI]=wblfit(Weibull.Wh_Wp);
X=linspace(min(Weibull.Wh_Wp),max(Weibull.Wh_Wp));
plot(X,wblpdf(X,parmHat(1),parmHat(2)),'Color','k', 'LineWidth',1.3)
hold on
histogram(Weibull.Wh_Wp,'BinWidth',10,'FaceColor',[0.4660 0.6740 0.1880], 'EdgeColor', [0.45 0.45 0.45],'Normalization','pdf')
Below is the figure for the parameter 'Weibull.Wh_Wp', showing its histogram and the fitted Weibull distribution.

 Accepted Answer

Try something like this —
wblfcn = @(p,x) wblpdf(x,p(1),p(2));
Weibull.Wh_Wp = wblrnd(150, 3, 1, 100);
X=linspace(min(Weibull.Wh_Wp),max(Weibull.Wh_Wp));
[parmHat,parmCI] = wblfit(Weibull.Wh_Wp);
plot(X,wblpdf(X,parmHat(1),parmHat(2)),'Color','k', 'LineWidth',1.3)
hold on
h1 = histogram(Weibull.Wh_Wp,'BinWidth',10,'FaceColor',[0.4660 0.6740 0.1880], 'EdgeColor', [0.45 0.45 0.45],'Normalization','pdf');
ctrs = h1.BinEdges(1:end-1) + diff(h1.BinEdges(1:2))/2;
y = h1.Values;
wblmdl = fitnlm(ctrs, h1.Values, wblfcn, parmHat)
wblmdl =
Nonlinear regression model:
    y ~ wblpdf(x,p1,p2)

Estimated Coefficients:
          Estimate      SE        tStat       pValue
          ________    _______    ______    __________
    p1     151.42      9.9346    15.241    1.7925e-12
    p2     2.6423     0.36199    7.2994    4.6607e-07

Number of observations: 22, Error degrees of freedom: 20
Root Mean Squared Error: 0.00251
R-Squared: 0.373, Adjusted R-Squared 0.341
F-statistic vs. zero model: 41.9, p-value = 7.05e-08
I’m not certain that an R2 value for a distribution fit obtained by maximum likelihood is statistically correct; however, this will provide it if desired. See the documentation for fitnlm for details.
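Note that fitnlm re-estimates the two parameters by least squares, so the R2 it reports belongs to that refit and not to the wblfit estimates. If an R2 for the maximum likelihood fit itself is wanted, a minimal sketch could compute it directly from the bin heights. This assumes the same synthetic stand-in data as above; the variable names (data, vals, ctrs, fitted, R2) are illustrative only, and the same statistical caveat applies.

```matlab
% Synthetic data standing in for Weibull.Wh_Wp (assumption, as in the answer)
rng(1);                                  % fixed seed so the sketch is repeatable
data = wblrnd(150, 3, 1, 100);

parmHat = wblfit(data);                  % maximum likelihood estimates [scale, shape]

% Bin heights normalized as a pdf, obtained without drawing a figure
[vals, edges] = histcounts(data, 'BinWidth', 10, 'Normalization', 'pdf');
ctrs = edges(1:end-1) + diff(edges)/2;   % bin centers

% Fitted Weibull pdf evaluated at the bin centers
fitted = wblpdf(ctrs, parmHat(1), parmHat(2));

SSres = sum((vals - fitted).^2);         % residual sum of squares
SStot = sum((vals - mean(vals)).^2);     % total sum of squares about the mean
R2 = 1 - SSres/SStot;                    % coefficient of determination

fprintf('R2 of the MLE pdf against the histogram: %.3f\n', R2);
```

The value depends on the bin width, and it can be negative when the fitted curve follows the bin heights worse than their mean does, which is one more reason to treat it as descriptive only.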

4 Comments

Thank you so much for your quick response and the script.
Now that I think about it more, I think you're right that using an R2 value here is not statistically correct.
Do you have an idea how I can prove that a certain distribution fits well?
My pleasure.
The problem is that the independent variable here is artificial (the bin locations), and the dependent variable data are the histogram bin values.
Probably the most reliable fit information here is ‘parmCI’, since it tells how well the parameters are estimated. If the confidence limits for a parameter have the same sign, the parameter is significant and the fit is appropriate. If they have opposite signs for any parameter, the confidence interval includes zero, and that parameter is not required in the estimation (so the fit is likely not good).
So I would use (or cite) the parameter confidence limits as the only necessary measure of goodness-of-fit here.
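A minimal sketch of that check, assuming the same synthetic stand-in data as in the answer. wblfit returns parmCI as a 2-by-2 matrix with the lower limits in the first row and the upper limits in the second. The names sameSign and relWidth are illustrative, and the relative interval width is an extra not mentioned above.

```matlab
% Synthetic data standing in for Weibull.Wh_Wp (assumption)
rng(1);                                  % fixed seed so the sketch is repeatable
data = wblrnd(150, 3, 1, 100);

% parmCI: row 1 = lower limits, row 2 = upper limits (95% by default)
[parmHat, parmCI] = wblfit(data);

% An interval excludes zero when both of its limits have the same sign
sameSign = sign(parmCI(1,:)) == sign(parmCI(2,:));

% Interval width relative to the estimate (illustrative addition)
relWidth = (parmCI(2,:) - parmCI(1,:)) ./ parmHat;

names = {'scale', 'shape'};
for k = 1:2
    fprintf('%s: %.4g  [%.4g, %.4g]  excludes zero: %d  relative width: %.2f\n', ...
        names{k}, parmHat(k), parmCI(1,k), parmCI(2,k), sameSign(k), relWidth(k));
end
```

Because the Weibull scale and shape parameters are positive by definition, the sign check may be less informative than the interval widths, so it can help to report the limits themselves alongside the estimates.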
Thank you so much for explaining and taking the time. It has helped a lot.
As always, my pleasure!
Thank you!

Release

R2020a