MATLAB Answers

Operation on variables whose names contain a certain string.

Asked by Julien Gorenflot on 14 Sep 2017
Latest activity: commented on by Jan on 20 Sep 2017
Hi!
I've been trying a bunch of ways to treat some data, so I've ended up with a few variables called "treated_data_blablabla".
Let's say I want to do an operation on all of those variables (plot them, normalize them, etc.). How can I access them all, or loop through them?
I can get the list of them with
who treated_data_*
but that's only displayed. I managed to get the list in a variable with
test = who('-regexp','treated_data*');
But this is a matrix of cells, not even strings. So I still don't know how to access the variables whose names are stored in test.

  7 Comments

"Simpler would be to put your data into an array. Like an ND array, a cell array, a non-scalar structure, or a table. Then you can access the data using simple, neat, and efficient indexing."
Well... simple indexing would not tell me the difference between the different approaches I've tried. I could add some "description" attributes, but then I would have to open them to know which index corresponds to which approach when I come back to my work with an idea for a new approach, let's say one week later. Having the description in the variable part of my variable name is a quick and dirty way. I see it immediately from the list in the workspace.
But indeed, as you say, it is not quite compatible with the spirit of matlab. Only matlab is already so good that the efficiency of the code is not an issue for me. The issue is more to understand what physical approach I used in the first attempt, in the second attempt, in the third attempt, and why one describes the actual experimental results better.
Well. I'll still take your point. I'll try to make things clean next time. Having a description attribute could actually be a good way, because I could make it longer than just a few keywords in variable names. Plus, I indeed do suffer from the number of my variables. I'm just naturally messy... It takes some iterations before I start doing things the clean way. But clearly putting everything in a structure would help, and I'll try that in the future. Only I haven't done it so far and I still need to handle my old data, and sometimes you need the result in one hour, so it's too late to reorganize everything...
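A minimal sketch of the "description attribute" idea, assuming a struct array called results with hypothetical fields description and data (none of these names come from the thread, and the numbers are placeholders). It uses contains, which needs R2016b or later:

```matlab
% Hypothetical container: one element per approach, each with a free-text description.
results = struct('description', {}, 'data', {});
results(end+1) = struct('description', 'baseline subtraction, no smoothing', 'data', [1 2 4]);
results(end+1) = struct('description', 'moving-average smoothing, window 5', 'data', [2 4 8 16]);

% Loop over all approaches with plain indexing, e.g. normalize each data set to its maximum.
for k = 1:numel(results)
    results(k).normalized = results(k).data / max(results(k).data);
end

% List every description at once, instead of reading variable names in the workspace.
descriptions = {results.description};
disp(descriptions(:))

% Look up one approach by part of its description.
isSmoothed = contains(descriptions, 'smoothing, window');
smoothed = results(isSmoothed);
```

The description can be as long as needed, and the index only matters inside loops, so there is no need to remember which number belongs to which approach.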
And btw "why beginners always do blablabla"... because it's more straightforward. Beginners like me use matlab as a calculator, except that we can store the result and use more complicated functions, so it's straightforward to just play with the name where you store the result.
"Beginners like me use matlab as a calculator, just we can store the result and use more complicated functions, so it's straitforward to just play with the name where you store the result."
If you continue to write badly-designed code then you will always find it very hard to use MATLAB as anything more than a big expensive calculator. What a waste of a good opportunity to learn some transferable skills.
"I indeed do suffer from the number of my variables. I'm just naturally messy"
Messy code is buggy code. Luckily that is your problem, not mine. All we can do is point out how you could make your own life much simpler and easier, which, if you read the links that I gave, is advice that has been given a thousand times before. Have you ever wondered why experienced users give the same advice? It is not a conspiracy against you; it is because they have learned (through experience, through knowledge, through explanations, through research) that there are actually good coding practices, and that these really make a difference to their own productivity.
"so it's straitforward to just play with the name where you store the result."
No, it isn't. You just said that you have not tried other methods, so your comparison is one complex-and-buggy way to write code against nothing. What an interesting comparison! Actually it is more complex to construct and evaluate a string than to use indexing (including looking up some attributes) or to use the fieldnames of a structure. It is also slower, buggier, harder to debug, and it obfuscates the code's intent, etc.
The fact that you are even asking this question is proof that it is not "straightforward" at all. Instead of being "straightforward" you have been stuck for nearly a week doing something that should really be totally trivial: accessing your data! Seriously, accessing data is such a basic starting point for doing work, yet your "straightforward" method can't even do it without wasting a huge amount of your time. Yep, clearly it is very "straightforward" and a very good way to be productive.
"I see it immediately from the list in the workspace"
Many beginners like the feeling of seeing data in the workspace: it is comforting somehow - I can remember that feeling. But as you get more experienced you learn to trust your code more: experienced users write functions that are well designed, have clear input and output specifications, and are well tested. My base workspace is essentially empty most of the time, and it certainly plays absolutely no part in any of the calculations I do on thousands of text files or other data: all data is imported and handled inside functions and classes, which have appropriate warnings, errors, etc. to alert me when something is not working as expected.
"it is not quite compatible with the spirit of matlab"
Not just MATLAB but many, many other programming languages too. You obviously missed reading the points about static code checking, searching for variables in code (possibly in multiple files), variable highlighting, tab completion, etc. Perhaps you don't know what these things are for (or even what they are), but suffice it to say, they were invented precisely because programmers (in lots of languages) learned that to use any language as more than just a big calculator, it really helps to have some tools to help you write your code. Big projects need those tools, and experienced users use those tools all the time. But what you are doing now, with your badly designed code, means that none of these tools work. You simply limit yourself to using MATLAB as a very big expensive calculator.
This link has links to more information on all of those tools:
Good luck with "straightforward", and see you in another week with the same problem.
@Julien: I hope you see the chance to improve the programming style for your future projects. Many existing projects suffer from a bad design that dates from the early phases of development. Code often starts as a short hack: a bunch of scripts, global variables, and input and output through MAT files. Then the code grows and the programmer hesitates to replace the clumsy methods, because they are already working. Finally the software reaches a limit where further modifications get exponentially harder due to the bad foundation.
The standard solution is to refactor the code: use the experience you have collected and rewrite the complete code from scratch. And in the next project, start by implementing the parts such that they are not affected by a modification of the underlying data structure.
But what is the best suggestion for your current problem? If this is the final phase of your project and nobody will use the code after the presentation, then a complete refactoring is a waste of time. The awkward who('-regexp','treated_data*') method will work: it is easy to access a matrix of cells. But I think Donald's suggestion is very useful and can be applied directly.
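For an existing workspace, the who-based route could look roughly like the sketch below. It assumes the variables live in the current workspace and calls eval once per name, only to copy each value into a structure. The two example variables and the anchored pattern '^treated_data_' are illustrative assumptions, not taken from the question:

```matlab
% Example variables standing in for the existing workspace (hypothetical names and values).
treated_data_fit_a = [1 2 3];
treated_data_fit_b = [4 5 6];

% who returns a cell array of character vectors; '^' anchors the pattern to the start of the name.
names = who('-regexp', '^treated_data_');

% Copy each variable into a field of one structure, so that later code needs no eval.
S = struct();
for k = 1:numel(names)
    S.(names{k}) = eval(names{k});
end

% From here on, ordinary dynamic field access is enough.
for k = 1:numel(names)
    fprintf('%s: max = %g\n', names{k}, max(S.(names{k})));
end
```

Note that the unanchored pattern 'treated_data*' is a regular expression, not a wildcard: it matches any name containing "treated_dat", which is broader than the who treated_data_* command form.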
@Stephen: "Luckily that is your problem, not mine." Ooops?! You care very thoroughly about the problems of programmers who fail at the limits of the dynamic creation of variables. Your contributions in this forum are much more useful and valuable than "this is your problem". What about: "Luckily your problem can be solved easily and I can show you how."?

1 Answer

Answer by OCDER on 16 Sep 2017 (edited by OCDER on 16 Sep 2017)

In case you're stuck with 1000's of treated_data_blahblah variables, here's an example showing how to work around this issue by converting the variables into a structure before trying to operate on them.
% Poorly labeled variables
A_1 = rand(1);
A_2 = rand(2);
A_3 = rand(3);
A_4 = rand(4);
A_10 = rand(10);
% Convert the poorly labeled variables to a scalar structure first (and to a cell array later if you can)
Prefix = 'A_'; % Variable name before the changing portion (in your case, 'treated_data_')
save('temp_BadVarNames.mat', '-regexp', ['^' Prefix]); % '^' anchors the match to the start of the name
T = load('temp_BadVarNames.mat'); % Loads the variables into a scalar structure: T.A_1, T.A_2, etc.
delete('temp_BadVarNames.mat');
% OPTION 1: Using a scalar structure
% To operate on every field, use structfun or a for loop. Ex:
A = structfun(@(x) x*10, T, 'UniformOutput', false); % Multiplies everything by 10
% OPTION 2: Using a cell array
% If your variable names change numerically (A_1, A_2 instead of A_a, A_b),
% then you can store the values in a cell array instead.
% Convert the loaded structure into a cell array indexed by the numeric suffix
CellIndex = cellfun(@(x) str2double(strrep(x, Prefix, '')), fieldnames(T));
A = cell(max(CellIndex), 1); % Indices without a matching variable (here 5 to 9) stay empty
A(CellIndex) = struct2cell(T);
% To operate on every cell element in A, use cellfun or a for loop. Ex:
A = cellfun(@(x) x*10, A, 'UniformOutput', false); % Multiplies everything by 10
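The for-loop alternative to structfun mentioned under the first option can be sketched as follows. The structure T is built by hand here so that the block runs on its own; in the workflow above it would come from load. Normalizing to the maximum is just an example operation, and N is a hypothetical name for the output structure:

```matlab
% Stand-in for the structure returned by load (field names follow the A_ pattern above).
T = struct('A_1', [2 4], 'A_2', [2 4 8]);

% Loop over the fields by name and store each result under the same field name.
names = fieldnames(T);
N = struct();
for k = 1:numel(names)
    x = T.(names{k});
    N.(names{k}) = x / max(x(:));   % normalize to the largest element
end
```

A loop like this is easier to step through in the debugger than an anonymous function, and it leaves room for per-field branching if some data sets need special handling.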

  1 Comment

Hmmmm, I thought this answer was great for your case...
"Only I haven't done it so far and I still need to handle my old data and sometimes, you need the result in one hour, so it's too late to reorganize everything..."
Which is why the above code automatically corrects your labeling scheme, in like, 0.0002 seconds. I already figured you had "1000's of treated_data_blahblah" according to the Mind Reading Toolbox. :)
For your case, stick to Option 1 (scalar structure) if the variable name is more informative than an index. Ex:
T.treated_data_monkey
T.treated_data_dog
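With descriptive field names like these, the informative part can be recovered for labels (in a legend or a printed summary, say) by stripping the prefix. A small sketch with made-up placeholder data:

```matlab
% Placeholder data under descriptive field names (values are made up for illustration).
T = struct('treated_data_monkey', [1 2 3], 'treated_data_dog', [4 5 6]);

Prefix = 'treated_data_';
names  = fieldnames(T);
labels = strrep(names, Prefix, '');   % {'monkey'; 'dog'}

% Report one summary value per data set, labeled by its description.
for k = 1:numel(names)
    fprintf('%s: mean = %g\n', labels{k}, mean(T.(names{k})));
end
```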
