Converting categorical data to prespecified numbers

Hi,
I have a categorical array : ['SC' 'SC' 'SC' 'SC' 'SC' 'SC' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'MI' 'MI' 'MI' 'MI' 'MI' 'MI' 'MI' 'MI' 'MN' 'MN' 'MN' 'MN' 'MN' 'MN'];
I want to convert each category to a number of my choice, for instance SC = 23, GA = 10, MI = 13, MN = 15, etc. How can I do so?

Accepted Answer

keys = categorical({'SC', 'GA', 'MI', 'MN'});
values = [23, 10, 13, 15];
[found, where] = ismember(yourcategoricalarray, keys); % found: entry is one of keys; where: its position in keys (0 if absent)
correspondingvalues = nan(size(yourcategoricalarray));
correspondingvalues(found) = values(where(found));
correspondingvalues will be NaN for any entry whose category is not in keys, i.e. the entries you don't care about.
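A minimal self-contained sketch of the steps above, assuming a small made-up input (the variable names states and mapped and the category 'XX' are hypothetical, chosen only to show what happens to an entry that is not in keys):

```matlab
% Hypothetical sample input; 'XX' is deliberately not listed in keys.
states = categorical({'SC', 'GA', 'XX', 'MI', 'MN', 'GA'});

keys   = categorical({'SC', 'GA', 'MI', 'MN'});
values = [23, 10, 13, 15];

% found(i) is true when states(i) is one of keys;
% where(i) is the position of states(i) in keys, or 0 if it is absent.
[found, where] = ismember(states, keys);

% Start from all NaN so that unmatched entries stay NaN.
mapped = nan(size(states));
mapped(found) = values(where(found));

disp(mapped)
```

Under these assumptions the 'XX' entry should come out as NaN while every other entry takes the number paired with its key.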

More Answers (1)

You can do this as follows:
data = {'SC' 'SC' 'SC' 'SC' 'SC' 'SC' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA'...
'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA'...
'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA'...
'GA' 'GA' 'GA' 'GA' 'MI' 'MI' 'MI' 'MI' 'MI' 'MI' 'MI' 'MI' 'MN' 'MN'...
'MN' 'MN' 'MN' 'MN'};
dataCategorical = categorical(data);
tablePattern = categorical({'SC', 'GA', 'MI', 'MN'}); % list of all unique patterns
valueValue = [23, 10, 13, 15]; % list of the values corresponding to the unique patterns
index = (dataCategorical' == tablePattern)*(1:4)';
data2Value = valueValue(index);

6 Comments

@Danielle Leblance's answer moved here:
dataCategorical = categorical(StateNo);
tablePattern = categorical({'AL' 'AK' 'AZ' 'AR' ...
'CA' 'CO' 'CT' ...
'DE' 'DC'...
'FL' 'GA' 'HI' ...
'ID' 'IL' 'IN' 'IA' ...
'KS' 'KY' 'LA' ...
'ME' 'MD' 'MA' 'MI' 'MN' 'MS' 'MO' 'MT' ...
'NE' 'NV' 'NH' 'NJ' 'NM' 'NY' 'NC' 'ND' ...
'OH' 'OK' 'OR' 'PA' 'RI' ...
'SC' 'SD' 'TN' 'TX' ...
'UT' 'VT' 'VA' ...
'WA' 'WV' 'WI' 'WY' 'WY'});
valueValue = [1 2 3 4 5 6 7 9 48 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 50 51];
index = (dataCategorical' == tablePattern)*(1:52)';
data2Value = valueValue(index);
I am receiving an error:
Matrix dimensions must agree.
Error in == (line 41)
t = (acodes == bcodes) & (acodes ~= 0);
Error in ConvertStateToMyNumber (line 10)
index = (dataCategorical' == tablePattern)*(1:52)';
Could it be because there are categories that are not included, as I simply decided to ignore them? For example, I have a category XX that I don't care about. There are many of these.
Yes, it will cause a problem if some of the patterns are not included in tablePattern. data2Value will then contain fewer entries than dataCategorical, which might cause problems.
Is there a way to go about it without including the other categories?
What do you want to do with those categories? Do you want to just remove them or make them equal to zero? If you just remove them, you will have no way of knowing which element in data2Value corresponds to which element in data, because several values are missing. It depends on what you want to do with those values.
Make them NaN. I will look at Guillaume's answer.
Note that you could have used the same method as in my answer (construct a NaN matrix, then fill it with valueValue(index)) with Ameer's method. However, using ismember is probably faster and certainly a lot less demanding in memory than the 2D array generated by the implicit expansion of ==.
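A minimal sketch of that combination, assuming a small made-up input with an unlisted category 'XX', a MATLAB release in which == on categorical arrays supports implicit expansion, and no duplicate entries in tablePattern. The names match and hit are hypothetical. Rows that match nothing get index 0, so they are skipped when filling:

```matlab
% Hypothetical sample input as a column vector; 'XX' is not in tablePattern.
data = categorical({'SC'; 'GA'; 'XX'; 'MN'});

tablePattern = categorical({'SC', 'GA', 'MI', 'MN'});  % row vector, no duplicates
valueValue   = [23, 10, 13, 15];

% N-by-4 logical matrix: match(i, j) is true when data(i) equals tablePattern(j).
match = (data == tablePattern);

% Each row has at most one true entry, so the product gives the matching
% column number, or 0 when the category is not listed.
index = match * (1:numel(valueValue))';

% Fill only the matched rows; the others stay NaN.
data2Value = nan(size(data));
hit = index > 0;
data2Value(hit) = valueValue(index(hit));

disp(data2Value)
```

The match matrix has one column per listed category, which is the memory cost mentioned above; the ismember version never builds it.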
