Edges and Observed values of Chi2gof

Nathan on 26 Dec 2025 at 17:16
Commented: William Rose on 28 Dec 2025 at 6:40
I am trying to run a chi-squared test on the means of this data. The data are already sorted, so that is not the issue here. However, the observed values that chi2gof reports are just four ones, even though I passed in the means I calculated. Does anyone know why this happens and how to fix it? Your help would be appreciated.
a=readmatrix("DIP Project- Population.xlsx");
population=a(:,1);
production=a(:,7);
% Column-7 values for each of the four groups; NaNs are replaced with 0
gr1gen36=a(61:90,7);
gr1gen36(isnan(gr1gen36))=0;
m1=mean(gr1gen36);
gr2gen36=a(151:180,7);
gr2gen36(isnan(gr2gen36))=0;
m2=mean(gr2gen36);
gr3gen36=a(241:270,7);
gr3gen36(isnan(gr3gen36))=0;
m3=mean(gr3gen36);
gr4gen36=a(331:360,7);
gr4gen36(isnan(gr4gen36))=0;
m4=mean(gr4gen36);
% The four group means are passed to chi2gof as if they were observations
gen36=[m1,m2,m3,m4];
expected=mean(gen36);
ex=[expected,expected,expected,expected];
[h,p,tbl]=chi2gof(gen36,'Expected',ex,'Alpha',0.05)
tbl =
  struct with fields:
    chi2stat: 66.0168
          df: 3
       edges: [9.3667 13.5667 17.7667 21.9667 26.1667]
           O: [1 1 1 1]
           E: [18.4500 18.4500 18.4500 18.4500]
Nathan on 26 Dec 2025 at 20:28
Edited: Nathan on 26 Dec 2025 at 20:30
The input to chi2gof is [21.7000 26.1667 16.5667 9.3667].
Edit: If chi2gof() won't work, I can always calculate the chi-squared value manually like so:
OE=(gen36-expected).^2/expected
sum(OE)
chi2inv(0.95,3) % critical value for alpha = 0.05 with 3 degrees of freedom
Your help is greatly appreciated :)
Nathan on 26 Dec 2025 at 22:53
I have solved my issue by doing the chi-squared goodness of fit test manually to get a more reasonable value. Your help has been greatly appreciated and I thank you very much :)


Accepted Answer

the cyclist on 26 Dec 2025 at 21:04
Edited: the cyclist on 26 Dec 2025 at 21:40
Here's what is happening. chi2gof() is expecting the raw, observed data. It is not expecting you to have precalculated the means. Therefore, what is it doing with your inputs?
It thinks that all of your observed data points are
x = [21.7000 26.1667 16.5667 9.3667]; % Total of four data points, to be binned
You then effectively tell it that you have four bins, because that is the length of the vector ex.
So, what does chi2gof do with this information? It puts one value of x into each of the four bins. This is why tbl.O = [1 1 1 1]. And then it (correctly) calculates the chi^2 stat, based on one observation in each bin, when it was told to expect 18.45 observations per bin.
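The binning described above can be reproduced with base MATLAB functions. This is a minimal sketch, not chi2gof's internal code: it assumes the four bins are equal-width between the smallest and largest input (which is consistent with the values in tbl.edges) and it uses the rounded means quoted in the question.

```matlab
% Values the question passed to chi2gof (the four group means, rounded)
x = [21.7000 26.1667 16.5667 9.3667];

% Assumption: four equal-width bins spanning min(x) to max(x),
% which matches tbl.edges in the question's output
edges = linspace(min(x), max(x), 5);

% Count how many values fall into each bin
O = histcounts(x, edges);

% Expected count per bin, as reported in tbl.E
E = 18.45 * ones(1, 4);

% Pearson chi-squared statistic: sum over bins of (O - E)^2 / E
chi2stat = sum((O - E).^2 ./ E);

fprintf('O = [%s], chi2stat = %.4f\n', num2str(O), chi2stat);
```

Under these assumptions each bin receives exactly one value, and chi2stat comes out at about 66.02, in line with tbl.chi2stat.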
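You should have fed chi2gof() the raw observations (or, if your data are already counts per category, passed those counts through the 'Frequency' option) rather than the precalculated means.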
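If the underlying data really are counts per category, one way to hand them to chi2gof is through its 'Ctrs', 'Frequency', and 'Expected' options. The sketch below needs the Statistics and Machine Learning Toolbox. The counts in obsCounts are made up for illustration (they are not the question's data), and the equal-probability null hypothesis is an assumption.

```matlab
% Hypothetical observed counts for four categories (not the question's data)
obsCounts = [22 26 17 9];
cats = 1:4;                     % category labels, also used as bin centers

% Assumed null hypothesis: all four categories are equally likely
expCounts = sum(obsCounts) / numel(obsCounts) * ones(1, 4);

% Supply the counts via 'Frequency' so chi2gof does not bin the values itself
[h, p, st] = chi2gof(cats, 'Ctrs', cats, ...
    'Frequency', obsCounts, 'Expected', expCounts, 'Alpha', 0.05);

disp(st.O)   % observed counts per bin, as used in the test
```

With counts supplied this way, st.O should echo obsCounts rather than a row of ones.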
the cyclist on 27 Dec 2025 at 2:21
Well, a complication is that the test is for categorical or nominal variables, and yours is continuous. So in some ways it is just not the right test. You could bin the data to put them into a contingency table, and then it looks like you can use crosstab to report on the test. Maybe an AI can help you write the code. Depending on how critical the result is, you might want to solicit someone with greater expertise, if you can. I don't want you to rely on my quick thoughts on a Friday evening.
That being said, a bigger issue is that the appropriate research question and statistical tests should really be determined before the results are seen. I think ANOVA (or perhaps kruskalwallis), by itself, is likely the best test, full stop.
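For reference, a one-way ANOVA and a Kruskal-Wallis test on the raw group values might look like the sketch below. It needs the Statistics and Machine Learning Toolbox and runs on randomly generated stand-in data, because the spreadsheet is not available here. The group centers and spread are assumptions chosen only to make the example run.

```matlab
% Hypothetical stand-in data: 30 observations in each of 4 groups.
% With the question's data this could instead be, for example:
%   groups = [a(61:90,7), a(151:180,7), a(241:270,7), a(331:360,7)];
rng(0);                                            % reproducible example
groups = 5 * randn(30, 4) + [21.7 26.2 16.6 9.4];  % one column per group

% One-way ANOVA: tests whether all group means are equal
pAnova = anova1(groups, [], 'off');                % 'off' suppresses the figures

% Kruskal-Wallis: rank-based alternative that does not assume normality
pKW = kruskalwallis(groups, [], 'off');

fprintf('ANOVA p = %.3g, Kruskal-Wallis p = %.3g\n', pAnova, pKW);
```

Both functions treat NaN entries as missing, so whether to zero-fill NaNs first, as the question's code does, is a separate decision about the data.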
William Rose on 28 Dec 2025 at 6:40
@Nathan, I think @the cyclist is providing you with excellent advice: do the ANOVA or Kruskal-Wallis, then stop. And 100% for "the appropriate research question and statistical tests should really be determined before the results are seen".



Release

R2025a
