MATLAB Answers

Categorical input to numerical array

Asked by Elien Bellon on 10 Nov 2018 at 14:20
Latest activity: commented on by dpb on 11 Nov 2018 at 15:21
Hi,
I have imported data from an Excel file with 2 columns of numerical info using the Import Wizard. In the Excel file, the first column contains the subject numbers (e.g., 1, 2, 3, 4, ...). The second column contains several digits per cell that refer to the names of the tasks the subject did (e.g., [3 4 8 9] for subject 1, [4 5 6 8 9] for subject 2).
My goal is to have 2 numeric variables in MATLAB, i.e., "SubjectID" and "Name_of_runs". So I want both columns to be separate in my workspace, and I want them both to be numeric.
Using the Import Wizard, the first column is created as I want: I can create a numeric variable.
However, the second column with the names of the tasks is not numeric but categorical. I want it to be numerical. For example, I get [3 4 8 9] as my ans for subject 1, a 1x1 categorical array. I want a 1x4 numerical array.
How can I solve this?
Thanks in advance!

  9 Comments

dpb
on 10 Nov 2018 at 17:15
"..., I use the num2str function..."
cv=categorical(1:3);
fn=sprintf('yourfile%03d.txt',cv(1))
fn =
'yourfile001.txt'
Or, of course, you can wrap inside double and leave num2str.
If there are really no other uses for these variables in looking for cases or other ways that are related to the variables as categories, then there may not be any strong advantage.
Thank you for your response. How can I "wrap inside double" so that I can still use the num2str?
dpb
on 11 Nov 2018 at 15:21
Well, actually, it's a conversion to char() that is needed before using str2num, not double().
But, the problem is more fundamental in this case owing to the structure that the variable is a composite one, not the single value until you convert it as either Walter or I showed...
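
To illustrate the char() versus double() point, here is a minimal sketch. It assumes the imported cell behaves like a 1x1 categorical whose single category is the text '[3 4 8 9]'; the variable names c, code, and runs are made up for the example.

```matlab
% Hypothetical stand-in for one imported cell: a 1x1 categorical whose
% only category is the text '[3 4 8 9]'.
c = categorical({'[3 4 8 9]'});

% double() returns the category index, not the numbers inside the text.
code = double(c);

% char() recovers the category text, which str2num can then parse
% into a numeric row vector.
runs = str2num(char(c));
```

Here code is 1 (the position of the category), while runs is the 1x4 double [3 4 8 9]. Since str2num evaluates its input, it is probably best reserved for text you trust.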


2 Answers

Answer by Walter Roberson
on 10 Nov 2018 at 17:08

t = readtable('names_SubjectID_Runs.xlsx');
SubjectID = t.SubjectID;
Name_of_runs = cellfun(@str2double,regexp(t.name_of_runs,'\d+', 'match'),'uniform', 0);
Name_of_runs cannot be numeric because you have a different number of runs for different rows. Instead it is a cell array of numeric vectors.
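
The same regexp approach can be tried without the spreadsheet. This is only a sketch on made-up sample data: the variable raw stands in for t.name_of_runs, and it is assumed here to arrive as categorical (as the Import Wizard produced), so it is converted to text first.

```matlab
% Made-up sample standing in for t.name_of_runs; assumed to be categorical.
raw = categorical({'[3 4 8 9]'; '[4 5 6 8 9]'});

% regexp needs text, so convert categorical values to a cell array of char.
if iscategorical(raw)
    raw = cellstr(raw);
end

% Pull out every run of digits per row, then convert each match to a number.
tokens = regexp(raw, '\d+', 'match');
Name_of_runs = cellfun(@str2double, tokens, 'UniformOutput', false);
```

Name_of_runs is then a 2x1 cell array: the first cell holds [3 4 8 9] and the second holds [4 5 6 8 9].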

  2 Comments

Thank you. And can I make MATLAB realise that each cell contains several different numerical values? For example, when I manually type in name_of_runs = [3 4 5 7 8], MATLAB realises these are different numerical values. Can I do the same with this Excel file?
In the above code, MATLAB already knows it. It already knows that, for example, Name_of_runs{2} is a numeric row vector of length 4, and that Name_of_runs{17} is a numeric row vector of length 3.
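
A short sketch of working with such a cell array of numeric vectors, using made-up values rather than the real file (nRuns and didTask8 are illustrative names):

```matlab
% Made-up example of the converted data: one numeric row vector per subject.
Name_of_runs = {[3 4 8 9]; [4 5 6 8 9]; [1 2 3]};

% Brace indexing returns the numeric vector stored for one subject.
second = Name_of_runs{2};

% Number of runs per subject.
nRuns = cellfun(@numel, Name_of_runs);

% Logical flag per subject: did this subject do task 8?
didTask8 = cellfun(@(r) any(r == 8), Name_of_runs);
```

Each cell can be used like any typed-in vector, for example second(3) is 6, and the cellfun calls give per-subject summaries without needing equal lengths.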



Answer by dpb
on 10 Nov 2018 at 17:37

The problem is that your spreadsheet is structured such that the runs variable is stored as a text string of a series of values enclosed in square brackets, each array in a single cell, instead of as numbers, one per cell. MATLAB did the best it knew how to retrieve it.
>> t=readtable('names_SubjectID_Runs.xlsx');
>> cellfun(@str2num,t.name_of_runs,'uni',0)
ans =
26×1 cell array
{1×4 double}
{1×4 double}
...
{1×4 double}
{1×3 double}
{1×3 double}
{1×3 double}
{1×3 double}
>>
Unfortunately, there aren't the same number of observations in each row, so you'll have to either pad the shorter rows with NaN or use a cell array to hold the values.
But the above shows how to convert what you have. If you could change the way the data are saved in Excel, you could instead solve the problem there if you wanted to, although the conversion is simple enough once you know what the issue actually is.
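
For the NaN-padding option, one possible sketch follows. The data are made up, and the names runs, n, and padded are illustrative only; it is one way to do the padding, not the only one.

```matlab
% Made-up converted data: rows of unequal length held in a cell array.
runs = {[3 4 8 9]; [4 5 6 8 9]; [1 2 3]};

% Length of each row, and a NaN-filled matrix wide enough for the longest.
n = cellfun(@numel, runs);
padded = NaN(numel(runs), max(n));

% Copy each row into the left part of its matrix row; the rest stays NaN.
for k = 1:numel(runs)
    padded(k, 1:n(k)) = runs{k};
end
```

The result is a 3x5 double matrix in which the third row reads 1 2 3 NaN NaN. Functions that skip NaN, such as mean(padded, 2, 'omitnan'), can then operate on it directly.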
