Converting categorical data to prespecified numbers
Hi,
I have a categorical array : ['SC' 'SC' 'SC' 'SC' 'SC' 'SC' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'MI' 'MI' 'MI' 'MI' 'MI' 'MI' 'MI' 'MI' 'MN' 'MN' 'MN' 'MN' 'MN' 'MN'];
I want to convert each category to a number of my choice: for instance, SC should be 23, GA = 10, MI = 13, MN = 15, etc. How can I do so?
Accepted Answer
More Answers (1)
Ameer Hamza
on 8 May 2018
You can do this as follows:
data = {'SC' 'SC' 'SC' 'SC' 'SC' 'SC' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA'...
'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA'...
'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA' 'GA'...
'GA' 'GA' 'GA' 'GA' 'MI' 'MI' 'MI' 'MI' 'MI' 'MI' 'MI' 'MI' 'MN' 'MN'...
'MN' 'MN' 'MN' 'MN'};
dataCategorical = categorical(data);
tablePattern = categorical({'SC', 'GA', 'MI', 'MN'}); % list of all unique patterns
valueValue = [23, 10, 13, 15]; % values corresponding to the unique patterns, in the same order
index = (dataCategorical' == tablePattern)*(1:4)';
data2Value = valueValue(index);
6 Comments
Ameer Hamza
on 8 May 2018
Edited: Walter Roberson
on 8 May 2018
@Danielle Leblance's answer moved here:
dataCategorical = categorical(StateNo);
tablePattern = categorical({'AL' 'AK' 'AZ' 'AR' ...
'CA' 'CO' 'CT' ...
'DE' 'DC'...
'FL' 'GA' 'HI' ...
'ID' 'IL' 'IN' 'IA' ...
'KS' 'KY' 'LA' ...
'ME' 'MD' 'MA' 'MI' 'MN' 'MS' 'MO' 'MT' ...
'NE' 'NV' 'NH' 'NJ' 'NM' 'NY' 'NC' 'ND' ...
'OH' 'OK' 'OR' 'PA' 'RI' ...
'SC' 'SD' 'TN' 'TX' ...
'UT' 'VT' 'VA' ...
'WA' 'WV' 'WI' 'WY' 'WY'});
valueValue = [1 2 3 4 5 6 7 9 48 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 50 51];
index = (dataCategorical' == tablePattern)*(1:52)';
data2Value = valueValue(index);
I am receiving an error:
Matrix dimensions must agree.
Error in == (line 41)
t = (acodes == bcodes) & (acodes ~= 0);
Error in ConvertStateToMyNumber (line 10)
index = (dataCategorical' == tablePattern)*(1:52)';
Could it be because there are categories that are not included, as I simply decided to ignore them? For example, I have a category XX that I don't care about. There are many of these.
Ameer Hamza
on 8 May 2018
Yes, it will create a problem if some of the patterns are not included in tablePattern. data2Value will then contain fewer entries than dataCategorical, which might cause problems.
Danielle Leblance
on 8 May 2018
Ameer Hamza
on 8 May 2018
What do you want to do about those categories? Do you want to remove them or set them to zero? If you remove them, you will have no way of knowing which element in data2Value corresponds to which element in data, because several values will be missing. It depends on what you want to do with those values.
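One way to keep the output aligned with the input is to give unlisted categories a placeholder instead of dropping them. The following is only a sketch of that option built on the == approach from the answer above: the sample data, the 'XX' category, and the choice of NaN (rather than zero) as the placeholder are assumptions made for illustration.

```matlab
% Hypothetical sample data; 'XX' stands for a category that is not of interest.
data = categorical({'SC' 'GA' 'XX' 'MI' 'MN' 'XX'});
tablePattern = categorical({'SC', 'GA', 'MI', 'MN'}); % entries must be unique
valueValue = [23, 10, 13, 15];                        % same order as tablePattern

% N-by-K logical matrix of matches; relies on implicit expansion (R2016b or later).
match = (data(:) == tablePattern);
found = any(match, 2);                     % true for rows that matched a pattern
index = match * (1:numel(tablePattern))';  % position in tablePattern, 0 if no match

% Start from NaN so unlisted categories stay visible in the result.
data2Value = nan(numel(data), 1);
data2Value(found) = valueValue(index(found));
```

Because data2Value has one entry per element of data, each value can still be traced back to its source element. Note that a duplicated entry in tablePattern would make the weighted sum add two positions together, so the pattern list has to be free of repeats.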
Danielle Leblance
on 8 May 2018
Guillaume
on 8 May 2018
Note that you could have used the same method as in my answer (construct a NaN matrix, then fill it with valueValue(index)) with Ameer's method. However, using ismember is probably faster and certainly a lot less demanding in memory than the 2D array generated by the implicit expansion of ==.
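As a rough illustration of the ismember variant mentioned here, the sketch below looks up each element's position in the pattern list and fills a NaN array with the matching values. It is not a copy of the original answer; the sample data and the 'XX' category are assumptions chosen to show how unlisted categories are handled.

```matlab
% Hypothetical sample data; 'XX' is a category absent from the lookup list.
data = categorical({'SC' 'GA' 'XX' 'MI' 'MN' 'XX'});
tablePattern = categorical({'SC', 'GA', 'MI', 'MN'});
valueValue = [23, 10, 13, 15];             % same order as tablePattern

% found(k) is true when data(k) is listed; index(k) is its position (0 otherwise).
[found, index] = ismember(data, tablePattern);

% NaN marks categories that have no assigned number.
data2Value = nan(size(data));
data2Value(found) = valueValue(index(found));
```

This works on vectors of positions only, so it never builds the N-by-K comparison matrix.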