Advice on accessing a cell array containing structures

Question


I have a cell array "wrkspcs" containing names of cell arrays as shown below (some entries omitted for brevity)

    wrkspcs =
      13×1 cell array
        {'ZLH_151210_WrkSpc'}
        {'MXG_151210_WrkSpc'}
        {'LF_151223_WrkSpc' }

each entry refers to a 6x6 cell array in the workspace. Using eval, I can see the referenced cell array

>> eval(wrkspcs{1})
ZLH_151210_WrkSpc =
  6×6 cell array
    {1×1 struct}    {[1]}    {'16557'}    {'1210'}    {'zlh_1a'}    {'ZLH_151210'}
    {1×1 struct}    {[2]}    {'16557'}    {'1213'}    {'zlh_2a'}    {'ZLH_151210'}
    {1×1 struct}    {[3]}    {'16557'}    {'1216'}    {'zlh_3a'}    {'ZLH_151210'}
    {1×1 struct}    {[1]}    {'16676'}    {'1210'}    {'zlh_1b'}    {'ZLH_151210'}
    {1×1 struct}    {[2]}    {'16676'}    {'1213'}    {'zlh_2b'}    {'ZLH_151210'}
    {1×1 struct}    {[3]}    {'16676'}    {'1216'}    {'zlh_3b'}    {'ZLH_151210'}

If I want to get the number of entries in the first workspace "ZLH_151210_WrkSpc", I can do

>> tmp=eval(wrkspcs{1});n = length(tmp(:,1))
n =
     6

but if I try to eliminate the creation of the temporary variable "tmp" and access the length directly, I get the following error:

>> n = length(eval(wrkspcs{1})(:,1))
Error: ()-indexing must appear last in an index expression.

However, if I try the following everything is fine.

>> eval(['n = length(',wrkspcs{iloop},'(:,1))'])
n =
     6

So, I am trying to understand which syntax rule I am violating in the second case and what is the "proper" way of obtaining the length without either creating the variable "tmp" or including the assignment in the 'eval' statement (which the Matlab documentation states I should try to avoid).

Any comments, insights, or suggested alternatives would be appreciated.

5 Comments

Stephen23 on 4 May 2018

Edited: Stephen23 on 4 May 2018


Rather than using this slow, complex, and buggy way of storing your data, it would be simpler to store all of the data in one structure. Then accessing the data is simple using its fieldnames: no ugly eval is required.

Note that the MATLAB documentation specifically advises against what you are doing: "A frequent use of the eval function is to create sets of variables such as A1, A2, ..., An, but this approach does not use the array processing power of MATLAB and is not recommended. The preferred method is to store related data in a single array". Note that either creating or accessing variable names dynamically suffers from the same disadvantages described in the documentation.

The important question is: how did you get all of those variables into your workspace? Usually beginners do this by calling load without an output argument, and spamming lots of variables into the workspace, which are then very difficult to process (and so they resort to writing ugly, slow, complex, buggy code using eval, and getting the variable names using slow who or whos). There is most likely an easy way around what you are doing, e.g. by simply calling load with an output argument:

S = load(...);

You might also like to read this discussion of the topic:

https://www.mathworks.com/matlabcentral/answers/304528-tutorial-why-variables-should-not-be-named-dynamically-eval
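As a rough illustration of calling load with an output argument (not code from the thread): the sketch below first writes a placeholder .mat file to tempdir so that it runs on its own. The file location and the empty 6x6 cell array are assumptions standing in for the real data.

```matlab
% Hypothetical setup: create a small .mat file so the sketch is self-contained.
ZLH_151210_WrkSpc = cell(6, 6);               % placeholder for the real 6x6 cell array
fname = fullfile(tempdir, 'ZLH_151210_WrkSpc.mat');
save(fname, 'ZLH_151210_WrkSpc');
clear ZLH_151210_WrkSpc                       % nothing is left in the workspace

% Load into a struct instead of the workspace: one field per saved variable.
S = load(fname);
names = fieldnames(S);                        % variable names stored in the file
data = S.(names{1});                          % dynamic fieldname access, no eval
n = size(data, 1);                            % number of rows in the cell array

delete(fname);                                % remove the placeholder file
```

Because the variable name only ever appears as a field name, the rest of the code never needs who, whos, or eval to find it.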

Dave on 8 May 2018


OK, I'm taking your suggestions to heart...eliminate the "eval"s and "poofing" variables into my workspace in my code. But I have a question:

Each workspace is saved to its own *.mat file with a single variable (a 6x6 cell array) with the structure shown above. If I want to replace the code:

load([wrkspcs{iwrk},'.mat']) %creates variables in workspace matching the file name

(which "poofs" the variable 'ZLH_151210_WrkSpc' into the workspace) with the preferred "load into structure" replacement statement

>> tmp=load([wrkspcs{1},'.mat'])
tmp = 
  struct with fields:
      ZLH_151210_WrkSpc: {6×6 cell}

it works fine. I can then copy the cell array from the 'tmp' struct to myStruct

>> myStruct.(wrkspcs{1})=tmp.(wrkspcs{1})
myStruct = 
struct with fields:
    ZLH_151210_WrkSpc: {6×6 cell}

But how can I do the load directly into 'myStruct' struct? When I try:

>> myStruct.(wrkspcs{1})=load([wrkspcs{1},'.mat'])
myStruct = 
  struct with fields:
      ZLH_151210_WrkSpc: [1×1 struct]
  >> myStruct.ZLH_151210_WrkSpc
  ans = 
    struct with fields:
      ZLH_151210_WrkSpc: {6×6 cell}
  >> myStruct.ZLH_151210_WrkSpc.ZLH_151210_WrkSpc
  ans =
    6×6 cell array
      {1×1 struct}    {[1]}    {'16557'}    {'1210'}    {'zlh_1a'}    {'ZLH_151210'}
      {1×1 struct}    {[1]}    {'16676'}    {'1210'}    {'zlh_1b'}    {'ZLH_151210'}
      {1×1 struct}    {[2]}    {'16557'}    {'1213'}    {'zlh_2a'}    {'ZLH_151210'}
      {1×1 struct}    {[2]}    {'16676'}    {'1213'}    {'zlh_2b'}    {'ZLH_151210'}
      {1×1 struct}    {[3]}    {'16557'}    {'1216'}    {'zlh_3a'}    {'ZLH_151210'}
      {1×1 struct}    {[3]}    {'16676'}    {'1216'}    {'zlh_3b'}    {'ZLH_151210'}

and my 6x6 cell array gets buried 2 levels deep in 'myStruct'. Now I have no idea how to directly get the 6x6 cell array to be "loaded" directly as the first level field in 'myStruct' -- same as the result I get when going through a 'tmp' struct as shown above. Specifying the variable name in the 'load' statements above makes no difference.

Again, I need some help understanding this behavior and any workaround to avoid going thru loading to a temporary structure and copying the desired field to my structure variable.

Stephen23 on 8 May 2018

Edited: Stephen23 on 8 May 2018


"But how can I do the load directly into 'myStruct' struct?"

You can't. Not in the way that you are trying to do it, without a temporary variable. load returns a scalar structure and you will have to allocate that to a temporary variable and then access its fields, as Ameer Hamza already explained. This is quite efficient and does not waste memory, so there is no reason to avoid it.

"...any workaround to avoid going thru loading to a temporary structure..."

As Ameer Hamza wrote, MATLAB does not allow arbitrary indexing/fieldname access to be suffixed onto function calls, so it is quite normal in MATLAB to allocate data to a temporary variable before doing some simple indexing, or accessing fields. This is standard MATLAB practice, wastes no memory whatsoever, and you have not explained why you need to avoid it.
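For completeness, here is one way to sidestep the syntax restriction, offered as a sketch and not as something proposed in this thread: the field access can be written as a function call with getfield, so nothing is suffixed onto load. The placeholder file in tempdir and the empty 6x6 cell array are assumptions made so the example runs on its own; the temporary-variable idiom above remains the more common and more readable choice.

```matlab
% Hypothetical setup: one placeholder .mat file named after its single variable.
wrkspcs = {'ZLH_151210_WrkSpc'};
ZLH_151210_WrkSpc = cell(6, 6);
fname = fullfile(tempdir, [wrkspcs{1}, '.mat']);
save(fname, wrkspcs{1});
clear ZLH_151210_WrkSpc

% getfield(S, name) returns S.(name); because it is an ordinary function
% call, the struct returned by load can be passed straight into it.
myStruct = struct();
myStruct.(wrkspcs{1}) = getfield(load(fname), wrkspcs{1});

delete(fname);                                % remove the placeholder file
```

The cell array then sits at the first level of myStruct, the same result as going through tmp.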

Alternative 1: using a non-scalar structure has advantages also, when you try to process/access the data. You might like to consider doing something like this:

tmp = load([wrkspcs{k},'.mat']);
myStruct(k).data = tmp.(wrkspcs{k});
myStruct(k).name = wrkspcs{k};

The trick is to think of meta-data as data in their own right. Storing data in this way will make your code much simpler, more robust, and more generalized, which means that you can spend more time on actually processing your data rather than worrying about fieldnames and variable names and mat files and ...
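A self-contained sketch of Alternative 1 as a full loop might look like the following. The two placeholder files written to tempdir, and their empty 6x6 contents, are assumptions for illustration only; with real data the setup loop would be dropped and the path adjusted.

```matlab
wrkspcs = {'ZLH_151210_WrkSpc'; 'MXG_151210_WrkSpc'};

% Hypothetical setup: write one placeholder file per name, each holding a
% single variable named after the file (saved via '-struct', so no eval).
for k = 1:numel(wrkspcs)
    s = struct();
    s.(wrkspcs{k}) = cell(6, 6);
    save(fullfile(tempdir, [wrkspcs{k}, '.mat']), '-struct', 's');
end

% Build the non-scalar structure: data and meta-data side by side.
clear myStruct
for k = numel(wrkspcs):-1:1                   % reverse order preallocates myStruct
    tmp = load(fullfile(tempdir, [wrkspcs{k}, '.mat']));
    myStruct(k).data = tmp.(wrkspcs{k});
    myStruct(k).name = wrkspcs{k};
end

allNames = {myStruct.name};                   % all names as one cell array

% Remove the placeholder files.
for k = 1:numel(wrkspcs)
    delete(fullfile(tempdir, [wrkspcs{k}, '.mat']));
end
```

Each data set is then addressed by index, e.g. myStruct(2).data, while the original name stays available as ordinary data in myStruct(2).name.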

Alternative 2: if each .mat file contains exactly one field/variable, then there is no real advantage to using a structure anyway, and you could easily use a cell array for all of your data. If the fields are the same size then it could even be a 3D array and then there would be no nesting of cell arrays:

out = cell(6,6,numel(wrkspcs));
for k = 1:numel(wrkspcs)
    tmp = load([wrkspcs{k},'.mat']);
    out(:,:,k) = tmp.(wrkspcs{k});
end

Alternative 3: Note that most of the complication here comes from bad data design anyway: contrary to what some beginners think, it is much easier to process data when the variable names do not change (yes, even the ones inside .mat files). If each .mat file simply had exactly the same variables, e.g. data and name, then you really could import the files in exactly the way that you requested, without any temporary variable:

for k = numel(wrkspcs):-1:1
    S(k) = load([wrkspcs{k},'.mat']);
end

and you would get one non-scalar structure containing all of your data, without any nesting:

S(1).data
S(1).name

or all of the names in a cell array:

{S.name}

etc
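To try Alternative 3 end to end, a sketch along these lines could be used. The file names (with a made-up "_same" suffix), the tempdir location, and the empty 6x6 contents are all assumptions; the only point carried over from the text is that every file stores the same two variables, data and name.

```matlab
ids = {'ZLH_151210'; 'MXG_151210'};           % hypothetical data set identifiers

% Hypothetical setup: every file stores the same two variables.
for k = 1:numel(ids)
    data = cell(6, 6);                        % placeholder for the real cell array
    name = ids{k};
    save(fullfile(tempdir, [ids{k}, '_same.mat']), 'data', 'name');
end

% Identical variable names mean load's output can go straight into S(k).
clear S
for k = numel(ids):-1:1
    S(k) = load(fullfile(tempdir, [ids{k}, '_same.mat']));
end

allNames = {S.name};                          % names of all loaded data sets

% Remove the placeholder files.
for k = 1:numel(ids)
    delete(fullfile(tempdir, [ids{k}, '_same.mat']));
end
```

The direct assignment works only because each load call returns a struct with the same fields in the same order; files with differing variable names would make S(k) = load(...) fail.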

Dave on 8 May 2018

Thanks for your clear answer. (I reposted this as a new "Ask" since it really is a different question from the original post...but you answered it.)


Answer 1

Ameer Hamza on 4 May 2018


Unlike C++ or Python, MATLAB does not let you directly index the output of a function. You first need to store the data in a separate variable and then index that variable to access the required data. So what is happening here:

1) In the first case: eval() is a MATLAB function and you are trying to index its output further, which, as already stated, is not supported in MATLAB.

2) In the second case: you are effectively running the following command

n = length(ZLH_151210_WrkSpc(:,1))

i.e. indexing into a cell array. This is perfectly valid MATLAB syntax. Even in the first case, the following line will work

n = length(eval([wrkspcs{1}, '(:,1)']))

because, again, I am indexing into the cell array itself, not into the output of a function.
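A related option, sketched here under the assumption that only the row count is needed: size accepts the dimension as a second argument, so the eval output is passed to a function and never indexed. The empty 6x6 cell array below is a stand-in for the real workspace variable.

```matlab
% Stand-in data: a 6x6 cell array named like the one in the question.
ZLH_151210_WrkSpc = cell(6, 6);
wrkspcs = {'ZLH_151210_WrkSpc'};

% size(X, 1) takes the eval output as an argument, so nothing is suffixed
% onto the function call and the indexing error cannot occur.
n = size(eval(wrkspcs{1}), 1);

% The form from the answer: the indexing happens inside the eval string.
m = numel(eval([wrkspcs{1}, '(:,1)']));
```

Both forms still rely on eval, so the note below about avoiding it applies to them as well.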

Note: Accessing variables using eval is a very bad idea. It makes your code slow and difficult to debug. For better coding practice, you should look into storing all the variables in a struct and then access the required data using field names.

2 Comments

Dave on 5 May 2018

Ameer's answer was to the point and confirmed my testing results.

While I truly appreciate everyone's comments about avoiding "eval" and using structured input from "load" -- sometimes you need to deal with external data files, naming conventions, and existing code base. I need to deal with many data sets collected with a certain organization and file names and write general code that works with, and tracks, an arbitrary number of files and names. I wish I was clever enough to do this without invoking "eval". Even if I was, there is existing code that I must insert my functions and results into without major rewrites...so it is a balance (manage the dangers and inefficiency of "eval" against ease of integration with existing data and code).

Thanks for your comments

Stephen23 on 8 May 2018

Edited: Stephen23 on 8 May 2018


"While I truly appreciate everyone's comments about avoiding "eval" and using structured input from "load" -- sometimes you need to deal with external data files, naming conventions, and existing code base."

The names of external files are irrelevant to this issue. The only topic that might be relevant is the "existing code base".

"...so it is a balance (manage the dangers and inefficiency of "eval" against ease of integration with existing data and code)."

You missed one of the other main points about eval: code that has to dynamically access variable names is code that wastes the programmer's time: it makes code complex and hard to debug. Your question, and the days that you have spent fighting the task of simply importing data, are an example of this.

Using lots of different names in the .mat files is really the design decision that has made this so complicated for you: if the .mat files used exactly the same field/variable names (e.g. data and name) then your code would be trivially simple (and yes, you could load them without any intermediate variable):

for k = ...
   S(k) = load(...);
end

and that would be all! Better code through better data design: never underestimate the importance of designing your data well!
