How to extract values from structure as array and assign numeric values to them?

So I have a structure called data that looks like this:
data.gender = 'female'
    .age = '25'
    .reproductiontime = 100x2 double
I would like to extract the values of age and gender from all my data, which is around 200 files, all in the same format, and assign a numeric value to gender, such as female = 1, male = 2, non-binary = 0.
Can I assign values to the gender field, or to a sub-structure perhaps?
And do I then just put these in the output of my function? As in function [age, gender] = m(data)?
Thanks in advance,
sorry if this is horribly basic, but I feel like I'm overcomplicating it and there might be a quick and easy answer I am not seeing.
Mau Dudas on 26 Nov 2019
Well, I tried to use
load('repro_*.mat'), since repro_*.mat is the naming pattern for all 200 files,
and then
files = dir('repro_*.mat')
which is as far as I've got, as it isn't quite right.
The data structure in each repro_*.mat is 1x1 with the three fields mentioned above.
Does that help?


Accepted Answer

Ridwan Alam on 26 Nov 2019
% assuming your 200 'data' structs are in the workspace as data001, data002, ...
varnames = whos('data*');
% 'age' and 'gender' are both char arrays; ages are collected in a cell array
% so that strings of different lengths (e.g. '9' and '25') can be stacked
age = {};
gender = [];
for v = 1:length(varnames)
    curData = eval(varnames(v).name);
    age = [age; {curData.age}];
    % to replace the gender string with a number ('male' -> 2, anything else -> 1):
    gender = [gender; double(strcmp(curData.gender,'male'))+1];
    % else, to just extract the gender string, use this instead:
    % gender = [gender; {curData.gender}];
end
Ridwan Alam on 26 Nov 2019
Edited: Ridwan Alam on 26 Nov 2019
varnames contains all the variables whose names start with data.
The assumption was that you had loaded the variables from the 200 files into your workspace as data001, data002, ... data200.
Given your filenames, this is the updated code:
filenames = dir('repro_*.mat');
% since both 'age' and 'gender' properties are strings, string arrays are sufficient;
% can also use cell arrays
age = [];
gender = [];
for f = 1:length(filenames)
    data = load(filenames(f).name);
    curData = data;
    age = [age; curData.age];
    % if you want to replace the gender string with numbers 1 and 2:
    gender = [gender; double(strcmp(curData.gender,'male'))+1];
    % else to just extract the gender string use this:
    % gender = [gender; curData.gender];
    clear data;
end
Hope it works!
Turlough Hughes on 27 Nov 2019
Edited: Turlough Hughes on 27 Nov 2019
If Mau has described the contents of the repro_*.mat files correctly, then the above won't work, because
data = load(filenames(f).name);
in fact nests the 1x1 structure stored in each repro_*.mat file inside another structure called data. To extract the values you would need to replace:
age = [age; curData.age];
gender = [gender; double(strcmp(curData.gender,'male'))+1];
with
age = [age; curData.data.age];
gender = [gender; double(strcmp(curData.data.gender,'male'))+1];
If the original code did happen to work, it is because each repro_*.mat file in fact contains the three variables gender, age and reproductiontime directly.
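The nesting described above can be checked on a throwaway file. This is only a sketch: the temporary file and its contents are made up for illustration and stand in for one of the repro_*.mat files.

```matlab
% Hypothetical demo data; the field values are assumptions for illustration.
data = struct('gender', 'female', 'age', '25');
demoFile = [tempname '.mat'];   % temporary file name, not one of the real files
save(demoFile, 'data');         % the file now holds one variable named 'data'

S = load(demoFile);             % S is a wrapper struct, one field per saved variable
varsInFile = fieldnames(S);     % names of the variables found in the file
nestedAge = S.data.age;         % the saved struct sits one level down

delete(demoFile);               % remove the temporary file
```

If the file had instead been saved with gender and age as separate variables, S.age would work directly, which is the second case mentioned above.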


More Answers (1)

Turlough Hughes on 26 Nov 2019
Edited: Turlough Hughes on 26 Nov 2019
Assuming that when you load, the name of the variable is data as you mention in the question, then the following should do the job.
files = dir('repro_*.mat');
gender = cell(200,1);
age = cell(200,1);
gen_num = nan(200,1);
for c = 1:length(files)
    load(files(c).name); % expects each file to contain a struct variable named data
    gender{c,1} = data.gender;
    % female -> 1, male -> 2, non-binary -> 3 (0 if none of them match)
    gen_num(c,1) = strcmp(data.gender,'female')*1+strcmp(data.gender,'male')*2+strcmp(data.gender,'non-binary')*3;
    age{c,1} = data.age;
end
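If you want the exact codes from the question (female = 1, male = 2, non-binary = 0), one possible alternative is a lookup with ismember once the gender cell array has been filled. This is a rough sketch under that assumption: the example input, including the 'unknown' entry, is invented for illustration, and unrecognised labels are marked NaN so they cannot be confused with the 0 used for non-binary.

```matlab
% Assumed label set and codes, taken from the question.
labels = {'female', 'male', 'non-binary'};
codes  = [1; 2; 0];

% Example input; in practice this would be the gender cell array built in the loop.
gender = {'male'; 'female'; 'non-binary'; 'unknown'};

[found, idx] = ismember(gender, labels);   % idx is 0 where a label is not recognised
gen_num = nan(numel(gender), 1);           % NaN marks unrecognised labels
gen_num(found) = codes(idx(found));        % look up the code for each recognised label
```

Both gender and gen_num could then be returned from a function, along the lines of the [age, gender] output suggested in the question.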
Mau Dudas on 26 Nov 2019
Edited: Mau Dudas on 26 Nov 2019
Just one more question.
Could you tell me what values belong to the 'cells' in the code?
As that is literally the only one I can't figure out...
Also it keeps asking to define 'data' for some reason
Turlough Hughes on 27 Nov 2019
Could you tell me what values belong to the 'cells' in the code?
Are you referring to these?
age = cell(200,1);
gender = cell(200,1);
What I do there is preallocate memory for storing the age and gender on each iteration of the for loop. I'm guessing you know this already but it's good practice to preallocate memory when you know the size the variables will be in advance.
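As a small illustration of the preallocation idea (a sketch only: the loop below fills in made-up ages rather than reading real files, and n stands in for the number of files, which could be taken from length(files) instead of being hard-coded):

```matlab
n = 5;                                 % assumed number of files for this sketch
age = cell(n, 1);                      % preallocated: n-by-1 cell array of empty cells
for c = 1:n
    age{c} = sprintf('%d', 20 + c);    % stand-in for data.age read from file c
end
age_num = str2double(age);             % convert the char ages to a numeric column
```

Since the ages are stored as text such as '25', str2double is one way to get numeric values out of the cell array afterwards.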
Also it keeps asking to define 'data' for some reason
By this, do you mean that you are getting the error: Undefined variable "data" or class "data.gender"? If so, it is because the .mat files ('repro_001.mat', 'repro_002.mat', etc.) do not in fact contain a structure variable called data. You could try this:
fils = dir('repro_*.mat');
gender = cell(200,1);
age = cell(200,1);
gen_num = nan(200,1);
for c = 1:length(fils)
    temp = load(fils(c).name);         % struct with one field per variable in the file
    field_name = fieldnames(temp);
    data = temp.(field_name{1,1});     % take the first variable, whatever its name is
    gender{c,1} = data.gender;
    % female -> 1, male -> 2, non-binary -> 3 (0 if none of them match)
    gen_num(c,1) = strcmp(data.gender,'female')*1+strcmp(data.gender,'male')*2+strcmp(data.gender,'non-binary')*3;
    age{c,1} = data.age;
    clear data temp field_name
end
If that doesn't work, then write:
clear all
fils=dir('repro_*.mat');
load(fils(1).name);
whos
and let me know the result of whos.
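As an aside, the variables in a file can also be listed without loading it, using the '-file' option of whos. A minimal sketch, with a temporary file and made-up variables standing in for one of the repro_*.mat files:

```matlab
% Hypothetical demo file; its contents are assumptions for illustration.
age = '25';
gender = 'female';
demoFile = [tempname '.mat'];
save(demoFile, 'age', 'gender');

info = whos('-file', demoFile);   % struct array: one element per variable in the file
varNames = {info.name};           % variable names, without loading their values
varClasses = {info.class};        % class of each variable, e.g. 'char' or 'struct'

delete(demoFile);                 % remove the temporary file
```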
