Analyzing data from dataset structures using kruskalwallis function and grouping variable
Hi,
I'm having trouble finding the most elegant way to analyze data from my dataset structure. The problem may hinge on my (lack of) understanding of the grouping variable I pass to 'kruskalwallis', or the grouping variable may just be implemented in a clunky way.
The essence of the problem is this: I have a dataset with many rows and columns, and I want to be able to perform analysis (using anova1 or kruskalwallis) on any grouping of the data simply by specifying different groups (specifically, more than one group at a time) in my grouping variable when I invoke the specific analysis function. The problem is that different functions appear to respond differently to the same grouping variable, some accepting it, some rejecting it.
For example, if I call 'boxplot' with the following notation, referencing three of my columns ('A1', 'txt_length' and 'gxt'):
    boxplot(A_15_acute_chronic.A1,{A_15_acute_chronic.txt_length,A_15_acute_chronic.gxt},'notch','on');
... everything works fine. The boxplot shows results for 8 different groups (2 (from txt_length) x 4 (from gxt) = 8 total).
However, if I call 'kruskalwallis' with the following notation:
    [p,t,s] = kruskalwallis(A_15_acute_chronic.A1,{A_15_acute_chronic.txt_length,A_15_acute_chronic.gxt});
I get this error:
    Error using anova1 (line 80)
    X and GROUP must have the same length.
    Error in kruskalwallis (line 44)
    [p,anovatab,stats] = anova1('kruskalwallis', varargin{:});
Yet the grouping variables are the same between the calls, and they're taken from the same dataset as the observations so they're definitely the same length.
I know a clunky way of dealing with this problem, and that is to specify a new column/variable in the dataset which simply encodes the combination of the two lower variables, but this seems ridiculously inelegant and tedious. Can anyone tell me what I might be doing wrong?
Thanks for your time, Chris
Answers (2)
Peter Perkins
on 12 Mar 2012
Chris, I think you should be using FRIEDMAN, not KRUSKALWALLIS. The latter is for a one-way test, and you're using two grouping variables.
Peter Perkins
on 12 Mar 2012
Kruskal-Wallis as I understand it is a one-way test. Accepting two grouping variables would not make sense. Nothing to do with MATLAB.
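If the intent is to treat each combination of the two factors as one level of a single factor (the 8 groups the boxplot shows), the combined grouping variable mentioned in the question can be built on the fly rather than stored as a new dataset column. The following is only a sketch under assumptions: the names y, f1, f2 and combined are hypothetical stand-ins for the columns in the question, the data are made up, and it relies on grp2idx from the Statistics Toolbox. It runs a one-way test across the combinations and does not test the two factors separately.

```matlab
% Made-up example data standing in for the dataset columns in the question
% (y ~ A1, f1 ~ txt_length, f2 ~ gxt); all names here are hypothetical.
rng(0);                                        % reproducible random numbers
y  = randn(80, 1);                             % observations
f1 = repmat({'short'; 'long'}, 40, 1);         % 2-level factor (cell array of strings)
f2 = repmat([1; 1; 2; 2; 3; 3; 4; 4], 10, 1);  % 4-level factor (numeric)

% Convert each factor to integer group indices.
g1 = grp2idx(f1);
g2 = grp2idx(f2);

% Give every (g1, g2) pair its own integer code, so the pair acts as a
% single one-way grouping variable.
combined = (g1 - 1) * max(g2) + g2;

% One-way Kruskal-Wallis across the combined groups; 'off' suppresses figures.
[p, tbl, stats] = kruskalwallis(y, combined, 'off');
```

With real data, y, f1 and f2 would be replaced by the corresponding dataset columns. Rows where either factor is missing get a NaN code from grp2idx, so it is worth checking how many observations actually enter the test.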
Friedman's test is for balanced data, which is why the FRIEDMAN function doesn't have the same signature as, say, BOXPLOT or ANOVAN.
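To illustrate the different signature: FRIEDMAN takes a matrix in which columns are the levels of one factor and rows are blocks of replicates for the levels of the other, instead of a response vector plus grouping variables. A minimal sketch with made-up balanced data follows; the variable names, the 4 x 2 layout and the 5 replicates per cell are assumptions chosen for illustration, not the asker's actual design.

```matlab
% Made-up balanced layout: 4 column levels, 2 row levels, 5 replicates
% per cell. Rows 1-5 belong to row level 1, rows 6-10 to row level 2.
rng(0);                      % reproducible random numbers
reps = 5;                    % replicates per (row level, column level) cell
X = randn(2 * reps, 4);      % number of rows must be a multiple of reps

% Test for column effects after adjusting for row effects;
% 'off' suppresses the table figure.
p = friedman(X, reps, 'off');
```

Data arranged as a response vector with grouping columns would have to be reshaped into such a matrix first, which is only possible when every cell holds the same number of observations.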
It does seem that the error messages you saw are not all that helpful though. I've made a note to look into improving them.