Effective Categorical Coding Scheme?

dpb on 17 Nov 2014
Commented: dpb on 20 Nov 2014
Background -- I have a dataset of the various accounts, income, and expenses for an educational foundation. Within one group, say the "Designated" funds, there are some 40-50 individual funds that can be used for a sizable number of objectives. I'd like to code these into a (relatively) small number of categories to look at general trends by class of program rather than by each individual accounting category.
Difficulty/question -- I can define a hierarchy of the classes I think are interesting, but it grows geometrically and I haven't yet come up with an easy-to-use storage/coding scheme.
Example--Take the general top-level program areas. These can be considered as one of three--
Scholarships
Grants
Programs
Now within Scholarships there are two general classes
Academic
Athletic
Within each of these there are a couple of others; for Athletic at least
In-State
Out-of-State
are interesting as they have significantly different cost structure associated with each.
So, already there are a total of five conditions for Athletic Scholarships--this gets much larger as the rest of the areas are enumerated.
Does anybody have a good way to code this so that one can reasonably effectively pick out the various conditions desired and lump subcategories together or pick combinations of them?
One idea is a logical variable for each end condition, but that gets awkward through sheer numbers.
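For illustration only, here is a minimal sketch of one alternative to a logical per end condition: one categorical variable per hierarchy level, combined with logical indexing. It assumes a release with `table` and `categorical` (R2013b or later). The variable names (`Area`, `Class`, `Resid`, `Amount`), the `'None'` placeholder for levels that do not apply, and all amounts are made up for the example.

```matlab
% Hypothetical fund records: one row per fund, one column per hierarchy level.
% 'None' marks a level that does not apply to that fund (an assumption of this sketch).
Area   = categorical({'Scholarships';'Scholarships';'Scholarships';'Grants';'Programs'});
Class  = categorical({'Athletic';'Athletic';'Academic';'None';'None'});
Resid  = categorical({'In-State';'Out-of-State';'In-State';'None';'None'});
Amount = [12000; 21000; 8000; 5000; 3000];   % made-up figures
funds  = table(Area, Class, Resid, Amount);

% Lump subcategories: all athletic scholarships, regardless of residency.
isAthletic    = funds.Area == 'Scholarships' & funds.Class == 'Athletic';
athleticTotal = sum(funds.Amount(isAthletic));

% Pick a combination: out-of-state athletic scholarships only.
isOutOfState    = isAthletic & funds.Resid == 'Out-of-State';
outOfStateTotal = sum(funds.Amount(isOutOfState));
```

With this layout the number of variables grows with the depth of the hierarchy rather than with the number of end conditions, and any lumping is just a logical expression over the level columns.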
  1 Comment
dpb on 20 Nov 2014
Interesting NO responses...I seem to have a knack for it. :)
I went ahead with the job and built a set of nominal/ordinal variables that pseudo-replicate a dummy-variable solution, except that it's hierarchical. It works but is somewhat clunky to use, although grpstats comes to the rescue for the most part.
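As a rough sketch of what that looks like (not the actual variables used for the job), the following groups a made-up amount vector by two hierarchy levels stored as nominal arrays. It assumes the Statistics Toolbox is available; the names `Area`, `Class`, and `Amount`, the `'None'` placeholder level, and the figures are all hypothetical.

```matlab
% Hypothetical data: one nominal grouping variable per hierarchy level.
Area   = nominal({'Scholarships';'Scholarships';'Scholarships';'Grants';'Programs'});
Class  = nominal({'Athletic';'Athletic';'Academic';'None';'None'});
Amount = [12000; 21000; 8000; 5000; 3000];   % made-up figures

% Totals and counts for every observed Area/Class combination.
% names has one row per group and one column per grouping variable.
[totals, counts, names] = grpstats(Amount, {Area, Class}, {@sum, 'numel', 'gname'});

% Dropping a level from the grouping list lumps its subcategories together.
byArea = grpstats(Amount, Area, @sum);
```

Choosing which level variables go into the grouping list is the closest thing here to a BY statement: fewer levels give coarser classes, more levels give finer ones.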
MATLAB needs a BY keyword a la SAS and a more effective converter from structures to datasets. Addressing the data this way ends up being very verbose, but it got me where I needed to go for the time being.
The idea broached earlier about top-level dynamic structure naming would have made it much easier in the structure approach taken.
When I started I thought it would be simple to simply (so to speak :) ) use struct2dataset, but it doesn't traverse the structure depth and so doesn't really help. Being pressed for time, I just went ahead with brute force at that point; perhaps with more time-in-grade with MATLAB structures and datasets I'd have pressed forward and found a better representation.
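One possible way around the depth limitation, sketched here under assumptions, is to walk the nested structure recursively and emit one row per leaf with a dot-separated path. `flattenFunds` is a hypothetical helper, not a MATLAB function, and it assumes every level is a scalar struct whose leaves are numeric scalars. The field names and amounts are invented. Local functions in a script need R2016b or later; on older releases the function would go in its own file, flattenFunds.m.

```matlab
% Hypothetical nested structure mirroring the hierarchy (made-up amounts).
s.Scholarships.Athletic.InState    = 12000;
s.Scholarships.Athletic.OutOfState = 21000;
s.Scholarships.Academic            = 8000;
s.Grants                           = 5000;

rows = flattenFunds(s);   % N-by-2 cell array: {path, value}

% Lump everything under one branch by matching the path prefix.
prefix        = 'Scholarships.Athletic';
inBranch      = strncmp(rows(:, 1), prefix, numel(prefix));
athleticTotal = sum([rows{inBranch, 2}]);

function rows = flattenFunds(s, path)
% FLATTENFUNDS  Return one {path, value} row per leaf of a nested scalar struct.
    if nargin < 2
        path = '';
    end
    rows  = cell(0, 2);
    names = fieldnames(s);
    for k = 1:numel(names)
        child = s.(names{k});
        if isempty(path)
            childPath = names{k};
        else
            childPath = [path '.' names{k}];
        end
        if isstruct(child)
            % Descend one level and append the leaves found below it.
            rows = [rows; flattenFunds(child, childPath)]; %#ok<AGROW>
        else
            % Leaf: record its full path and value.
            rows = [rows; {childPath, child}];             %#ok<AGROW>
        end
    end
end
```

The resulting two-column cell array is flat, so it can be handed to a converter such as cell2table or cell2dataset, and the path strings can be split into per-level grouping variables if needed.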
