Stephen23

TUTORIAL: Why Variables Should Not Be Named Dynamically (eval)

Stephen23 on 26 Sep 2016 (Edited on 2 Apr 2024 at 12:48)
Latest activity Reply by webdesign on 10 Apr 2024 at 20:40

Summary:
Dynamically accessing variable names can negatively impact the readability of your code and can cause it to run slower by preventing MATLAB from optimizing it as well as it could if you used alternate techniques. The most common alternative is to use simple and efficient indexing.
Explanation:
Sometimes beginners (and some self-taught professors) think it would be a good idea to dynamically create or access variable names. The variables are often named something like these:
  • matrix1, matrix2, matrix3, matrix4, ...
  • test_20kmh, test_50kmh, test_80kmh, ...
  • nameA, nameB, nameC, nameD,...
Good reasons why dynamic variable names should be avoided:
There are much better alternatives to accessing dynamic variable names:
Note that avoiding eval (and assignin, etc.) is not some esoteric MATLAB restriction: it applies to many other programming languages as well:
MATLAB Documentation:
If you are not interested in reading the answers below then at least read MATLAB's own documentation on this topic Alternatives to the eval Function, which states "A frequent use of the eval function is to create sets of variables such as A1, A2, ..., An, but this approach does not use the array processing power of MATLAB and is not recommended. The preferred method is to store related data in a single array." Data in a single array can be accessed very efficiently using indexing.
Note that all of these problems and disadvantages also apply to the functions load (without an output variable), assignin, evalin, and evalc, and the MATLAB documentation explicitly recommends that you "Avoid functions such as eval, evalc, evalin, and feval(fname)".
The official MATLAB blogs explain why eval should be avoided, the better alternatives to eval, and clearly recommend against magically creating variables. Using eval comes out at position number one on this list of Top 10 MATLAB Code Practices That Make Me Cry. Experienced MATLAB users recommend avoiding using eval for trivial code, and have written extensively on this topic.
webdesign
webdesign on 10 Apr 2024 at 20:40
Thank you, that helped me a lot.
Hassaan
Hassaan on 10 Apr 2024 at 7:37
Very insightful. Thank you.
Ed Wheatcroft
Ed Wheatcroft on 24 Nov 2022
@Stephen23, I just wanted to say thanks for putting this extensive page together. I was about to use eval() for something because I thought there was no other way, but after reading this page from top to bottom I feel like a) I now understand why not to use eval() and b) I know the correct way to solve the issue I was having. I found the 'better alternatives' section especially helpful, as a lot of the other posts about eval() seem to focus heavily on the reasons not to use it, but don't really teach you how to avoid it. Anyway, thanks for creating/maintaining the page :)
Steven Lord
Steven Lord on 30 Apr 2019
Alternative: Use a table or timetable Array
table (introduced in release R2013b) and timetable (introduced in release R2016b) arrays allow you to store data with row and/or column names with which you can access the data. For example, if you create a table with variables named Age, Gender, Height, Weight, and Smoker and rows named with the last names of the patients:
load patients
patients = table(Age,Gender,Height,Weight,Smoker,...
'RowNames',LastName);
you can ask for all the ages of the first five patients:
patients(1:5, 'Age')
or all the data for the patients with last names Smith or Jones:
patients({'Smith', 'Jones'}, :)
You can also add new variables to the table, either by hard-coding the name of the variable:
% Indicate if patients are greater than five and a half feet tall
patients.veryTall = patients.Height > 66
or using variable names stored in char or string variables. The code sample below creates new variables named over40 and under35 in the patients table using different indexing techniques.
newname1 = 'over40';
patients.(newname1) = patients.Age > 40;
newname2 = 'under35';
patients{:, newname2} = patients.Age < 35;
patients(1:10, :) % Show the first ten rows
The code sample below selects either Height or Weight and shows the selected variable for the fifth through tenth patients using dynamic names.
if rand > 0.5
    selectedVariable = 'Height';
else
    selectedVariable = 'Weight';
end
patients.(selectedVariable)(5:10)
See this documentation page for more information about techniques you can use to access and manipulate data in a table or timetable array. This documentation page contains information about accessing data in a timetable using the time information associated with the rows.
Stephen23
Stephen23 on 30 Apr 2019
Simpler and more robust way to generate a table from that .mat file:
S = load('patients.mat');
T = struct2table(S,'RowNames',S.LastName);
Stephen23
Stephen23 on 17 Apr 2019 (Edited on 2 Apr 2024 at 5:30)
Alternative: save the Fields of a Scalar Structure
The save command has an option for saving the fields of a scalar structure as separate variables in a .mat file. For example, given a scalar structure:
S.A = 1;
S.B = [2,3];
this will save variables A and B in the .mat file:
save('myfile.mat','-struct','S')
This is the inverse function of loading into a structure. Some threads showing how this can be used:
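The round trip can be sketched as follows. This is a minimal illustration, assuming a throwaway file created with tempname; the names fname and T are hypothetical and not part of the original post.

```matlab
% Hypothetical temporary file name (an assumption for this sketch).
fname = [tempname, '.mat'];

S = struct();
S.A = 1;
S.B = [2,3];
save(fname, '-struct', 'S')   % writes the variables A and B, not S itself

T = load(fname);              % T is a scalar structure with fields A and B
disp(fieldnames(T))

delete(fname)                 % tidy up the temporary file
```

Because the data goes out of a structure and comes back into a structure, no variable ever has to be created or accessed by a dynamically generated name.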
Stephen23
Stephen23 on 30 Nov 2017
PS: eval is Not Faulty:
Some users apparently think that eval (and friends) must be faulty and should be removed from MATLAB altogether. They ask "if eval is so broken, why has it not been removed?"... but it is important to understand that the problem is caused by magically accessing variable names, regardless of what tool or operation is used, and that eval (or assignin, or evalin, or load without an output argument, etc.) is simply being used inappropriately because there are much better methods available (better in the sense of faster, neater, simpler, more robust, etc.). Read these discussions for good examples of this confusion:
It is important to note that any feature of a language can be used inefficiently or in an inappropriate way, not just eval, and this is not something that can be controlled by the language itself. For example, it is common that someone might solve something with slow loops and without preallocating the output arrays: this does not mean that for loops are "faulty" and need to be removed from MATLAB!
It is up to the programmer to write efficient code.
Stephen23
Stephen23 on 19 Jul 2017
Magically Making Variables Appear in a Workspace is Risky
This leads to many subtle bugs that are extremely difficult to track down, if they are even noticed at all!
1) For a start, variables of the same name will be overwritten without warning. Even just a spelling mistake or adding extra variables to a MAT file can change the behavior of your code, and because the behavior depends on the data files that you are working with, the cause can be very difficult to track down.
2) Importing multiple files in a loop can ruin your data: consider what will happen if your code processes a sequence of MAT files, which you think all contain the same variables. But one of them contains different variables (yeah, I know, your data files are perfect... sure). Consider what happens in badly-written, fragile code that simply LOADs directly into the workspace: it will happily process the data from the previously loaded file, without giving you any warning or notification that your data are now from the wrong file. Processing continues using the wrong data.
3) There is another serious yet subtle problem, which is caused by the MATLAB parser finding alternative functions/objects/... and calling those instead of using the magically-created variable: basically if the variable does not exist then the parser does its best to find something that matches where the name is called/used later... and it might just find something! The documentation also explains this:
Some example threads discussing this topic:
Or in some cases the parser might not find anything:
The solution is simple: do not magically "poof" variables into existence: Always load into a structure, and never create variable names dynamically.
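One way to guard against the wrong-file problem described in point 2 is to load into a structure and check its fields before using them. The following is only a sketch: the file is a temporary one, and the variable names speed and distance are hypothetical.

```matlab
% Hypothetical file and variable names, used only for illustration.
fname = [tempname, '.mat'];
speed = [20, 50, 80];
save(fname, 'speed')

S = load(fname);                     % the file contents arrive as fields of S
required = {'speed', 'distance'};    % variables that the code expects
missing = required(~isfield(S, required));
if isempty(missing)
    disp('All expected variables are present.')
else
    fprintf('Missing from file: %s\n', strjoin(missing, ', '));
end

delete(fname)                        % tidy up the temporary file
```

A missing variable is reported explicitly here, rather than the code silently reusing a variable of the same name left over from a previously loaded file.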
Stephen23
Stephen23 on 26 Sep 2016
Alternative: Use more Efficient Ways to Pass Variables between Workspaces (applies to evalin, assignin, etc)
Use nested functions, or pass arguments, or use any of the other efficient ways to pass data between workspaces:
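A minimal sketch of the "pass arguments" option, written with an anonymous function so that it runs as a plain script. The names gain and scaleBy are hypothetical, and a real program would more likely use a function file or a nested function.

```matlab
% Hypothetical names throughout; a sketch of passing data explicitly.
gain = 2.5;                      % value that the helper needs
scaleBy = @(x, g) g .* x;        % every input arrives as an argument
y = scaleBy([1, 2, 3], gain);    % the caller passes the data in ...
disp(y)                          % ... and receives the result back
```

Everything the helper uses is visible in its argument list, so nothing needs to be fetched from or pushed into another workspace with evalin or assignin.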
Stephen23
Stephen23 on 26 Sep 2016 (Edited on 2 Apr 2024 at 17:15)
Other Languages: do not use eval!
In case you think that avoiding dynamic variable names is just some "weird MATLAB thing", here is the same discussion for some other programming languages, all advising "DO NOT create dynamic variable names":
Some (most likely interpreted) languages might use, require, or otherwise encourage dynamic variable names: if that is how they work efficiently, then so be it. But what is efficient in one language means nothing about the same approach in other languages... if you wish to use MATLAB efficiently, make your code easier to work with, and write in a way that other MATLAB users will appreciate, then that means learning how to use MATLAB features and tools:
Stephen23
Stephen23 on 26 Sep 2016 (Edited on 2 Apr 2024 at 12:49)
Alternative: load into a Structure, not into the Workspace
The MATLAB documentation explains this in detail:
In almost all cases where data is imported programmatically (i.e. not just playing around in the command window) it is advisable to load data into an output argument (which is a structure if the file is a .mat file):
S = load(...);
The fields of the structure can be accessed directly, e.g:
S.X
S.Y
or by using dynamic fieldnames. Note that this is the inverse of saving the fields of a scalar structure.
It is important to note that (contrary to what some users seem to think) it is actually easier and much more robust to save and load data within a loop when the variable names in the .mat files do not change, as having to process different variable names in each file makes saving/loading the file data more complex, inefficient, and fragile.
Summary: when using a loop, keep all variable names the same!
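The loop pattern might look like the sketch below. It assumes three temporary files that each contain one variable named data; the folder, the file names run1.mat to run3.mat, and the variable name are all hypothetical.

```matlab
% Hypothetical setup: three .mat files that each contain a variable named "data".
folder = tempname;
mkdir(folder);
for k = 1:3
    data = k * ones(2, 2);
    save(fullfile(folder, sprintf('run%d.mat', k)), 'data')
end

% Load each file into a structure and collect the contents by index.
F = dir(fullfile(folder, 'run*.mat'));
C = cell(1, numel(F));
for k = 1:numel(F)
    S = load(fullfile(folder, F(k).name));
    C{k} = S.data;                % same field name for every file
end

rmdir(folder, 's')                % remove the temporary folder and its files
```

Because every file uses the same variable name, the loop body is identical for each file and the imported data ends up in one indexable container.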
Here are real-world examples of loading into variables:
And finally Steven Lord's comment on load-ing straight into the workspace:
Stephen23
Stephen23 on 26 Sep 2016
Alternative: Non-Scalar Structure (with Indexing)
Using a non-scalar structure is much simpler than trying to access dynamic variable names. Here are some examples:
A very neat example is using the output from DIR to store imported file data:
S = dir(..);
for k = 1:numel(S)
    S(k).data = readmatrix(S(k).name);
end
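The same idea works for any data, not only for the output of DIR. Here is a small sketch with made-up field names and values (name, value, and doubled are hypothetical):

```matlab
% Hypothetical measurement data stored in a 1x3 structure array.
S = struct('name', {'A', 'B', 'C'}, 'value', {10, 20, 30});

% Add a field to every element using an index.
for k = 1:numel(S)
    S(k).doubled = 2 * S(k).value;
end

allValues = [S.value];    % comma-separated list collected into a vector
names = {S.name};         % likewise for the text data
second = S(2).doubled;    % plain indexing replaces nameB-style variables
```

The index k does the job that a suffix on a variable name would otherwise do, and all of the related data stays together in one array.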
Stephen23
Stephen23 on 26 Sep 2016
Alternative: Indexing into Cell Array or ND-Array
Oftentimes when a user wants to use eval they are trying to create numbered variables, which are effectively an index joined onto a name. It is usually better to turn that pseudo-index into a real index: MATLAB is fast and efficient when working with indices, and using indices will make code much much simpler than anything involving dynamic variable names:
Using ND-arrays is a particularly efficient way of handling data: many operations can be performed on complete arrays (known as code vectorization), it is easy to get data in and out of ND-arrays, and they reduce the chance of bugs:
Or simply put the data into the cells of a cell array:
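As a rough sketch of both approaches, assuming made-up data built from magic(3) (the names C, A, and meanMatrix are hypothetical):

```matlab
% Instead of matrix1, matrix2, matrix3 ... (hypothetical data below):
N = 3;
C = cell(1, N);              % preallocate the cell array
for k = 1:N
    C{k} = magic(3) + k;     % C{k} takes the place of "matrixk"
end

% Equal-sized matrices can also be stacked into a 3x3xN array ...
A = cat(3, C{:});
% ... so that one vectorized call handles every matrix at once.
meanMatrix = mean(A, 3);
```

A cell array suits data of differing sizes or types, while the ND-array is the natural choice when every matrix has the same size.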
And some real-world examples of where indexing is much simpler than eval:
Stephen23
Stephen23 on 26 Sep 2016
Confuses Data with Code
The inclusion of data and meta-data within variable names (e.g. naming a variable with the user's input, the name of a test subject, or (very commonly) adding an index onto a variable name) is a subtle (but closely related) problem, and it should definitely be avoided. This quote from Image Analyst explains the problem succinctly: "When you start writing code to generate variable names, you're no longer writing code to process your data, you're writing code to generate the code that will process your data, and the increased complexity of this metaprogramming is always an added risk (of bugs, security issues, etc.)"
Read these discussions for an explanation of why it is a poor practice to put data and meta-data in variable names:
In many cases that meta-data is just a de-facto index, i.e. a value that prescribes the order of the data. But in that case the de-facto index should be turned into a much more efficient real numeric index:
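For instance, the speeds encoded in names such as test_20kmh can be stored as ordinary data. The sketch below uses made-up numbers, and the names speed_kmh and brakingDist are hypothetical:

```matlab
% Instead of test_20kmh, test_50kmh, test_80kmh (hypothetical values below):
speed_kmh = [20, 50, 80];            % the meta-data, stored as data
brakingDist = [3.1, 19.4, 49.6];     % one result per speed (made-up numbers)

% Look up a result by value rather than by building a variable name.
wanted = 50;
result = brakingDist(speed_kmh == wanted);

% The meta-data is now numeric, so sorting and looping are trivial.
[~, order] = sort(speed_kmh, 'descend');
sortedDist = brakingDist(order);
```

Once the meta-data is numeric data, it can be compared, sorted, plotted, and used in calculations, none of which is practical while it is locked inside a variable name.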
Stephen23
Stephen23 on 26 Sep 2016
Code Helper Tools do not Work
The MATLAB editor contains many tools that advanced users continuously make use of, and that beginners should particularly appreciate when learning MATLAB. However, none of these tools work with code hidden inside eval:
Note that these do not work when using eval, evalc, etc. to magically create or access variable names. Would you want to disable the tools that help you to write functioning code? Here are examples of how eval hides code errors and makes it hard to debug code:
Tom Hawkins
Tom Hawkins on 7 Feb 2019
On this topic, it would be great if the Code Analyzer and checkcode would actually flag a warning when eval etc. are used. Perhaps that would cut down the number of questions about them on here?
Stephen23
Stephen23 on 26 Sep 2016 (Edited on 2 Apr 2024 at 12:42)
Obfuscated Code Intent
What does this code do?:
x1 = [119,101,98,40,39,104,116,116,112,58,47,47,119,119,119];
x2 = [46,121,111,117,116,117,98,101,46,99,111,109,47,119,97];
x3 = [116,99,104,63,118,61,100,81,119,52,119,57,87,103,88];
x4 = [99,81,39,44,39,45,98,114,111,119,115,101,114,39,41];
eval(char([x1,x2,x3,x4]))
Unfortunately eval makes it easy to write code which is hard to understand: it is not clear what it does, or why. If you ran that code without knowing what it does, you should know that it could have deleted all of your data, or sent emails to everyone on your contact list, or downloaded anything at all from the internet, or worse...
Because eval easily hides the intent of the code, many beginners end up writing code that is very hard to follow and understand. This makes the code buggy, and also harder to debug! See these examples:
Properly written code is clear and understandable. Clear and understandable code is easier to write, to bug-fix, and to maintain. Code is read more times than it is written, so never underestimate the importance of writing code that is clear and understandable: write code comments, write a help section, use consistent formatting and indentation, etc.
Stephen23
Stephen23 on 26 Sep 2016 (Edited on 2 Apr 2024 at 5:17)
Difficult to Work With
Many beginners come here with questions that are basically some version of "I have lots of numbered variables but I cannot figure out how to do this simple operation...", or "my code is very slow/complex/buggy... how can I make it better?":
Even advocates of eval get confused by it, fail to make it work properly, and can't even figure out why, as these two examples clearly show:
Why can't they figure out why it does not work?:
  • Totally obfuscated code due to indirect code evaluation.
  • More complex than it needs to be.
  • The code helper tools do not work.
  • Syntax highlighting does not work.
  • Static code checking does not work.
  • No useful error messages, etc. etc.
Writing code is hard. Don't make it even harder by turning off the tools that check and help improve your code.
Stephen23
Stephen23 on 26 Sep 2016
Security Risk
eval will evaluate any string at all, no matter what commands it contains. Does that sound secure to you? This string command might be malicious or simply a mistake, but it can do anything at all to your computer. Would you run code which could do anything at all to your computer, without knowing what it was about to do?
For some users the surprising answer is "yes please!".
For example, try running this (taken from Jos' answer here):
eval(char('fkur*)Ykvj"GXCN"{qw"pgxgt"mpqy"yjcv"jcrrgpu0"Kv"eqwnf"jcxg"hqtocvvgf"{qwt"jctfftkxg"000)+'-2))
Did you really run it on your computer even though you had no idea what it would do? Every time code gets a user input and evaluates it, it gives that user the ability to run anything at all. Does that sound secure to you?
Steven Lord
Steven Lord on 23 Oct 2019
Running the char command is safe: that will just create a char vector you can read. Remove the eval() around the char command and it won't execute the command stored in the char vector.
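In other words, something along these lines, which reuses the char vector from the post above and displays it instead of evaluating it (the variable names str and cmd are hypothetical):

```matlab
% Build the char vector but do not pass it to eval.
str = 'fkur*)Ykvj"GXCN"{qw"pgxgt"mpqy"yjcv"jcrrgpu0"Kv"eqwnf"jcxg"hqtocvvgf"{qwt"jctfftkxg"000)+';
cmd = char(str - 2);    % shift every character code down by two
disp(cmd)               % shows the hidden command as text; nothing is executed
```

Reading the decoded text first is the only way to know what eval would have done with it.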
Adam Danz
Adam Danz on 22 Oct 2019
Hint: the char(str-2) is a Caesar cipher that produces a command that is executed by eval().
Alexander Geldhof
Alexander Geldhof on 22 Oct 2019
I'm late to the party, but what does this do?
For the same reasons you mentioned, I'm rather wary of entering this in my MATLAB command line.
Stephen23
Stephen23 on 26 Sep 2016 (Edited on 2 Apr 2024 at 12:35)
Buggy
Using eval makes it really hard to track down bugs, because it obfuscates the code and disables lots of code helper tools. Why would you even want to use a tool that makes it harder to debug and fix your code?
Here are some examples to illustrate how what should have been simple operations become very difficult to debug because of the choice to use eval:
Code that generates variable names dynamically based on imported data or user inputs is also susceptible to the names reaching the name length limit:
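A short sketch of the limit, and of one way around it by keeping the label as data. The label text is made up, and using containers.Map here is just one possible choice, not something the original post prescribes:

```matlab
% Hypothetical long label, e.g. taken from a file name or user input.
label = repmat('very_long_measurement_label_', 1, 4);

tooLong = numel(label) > namelengthmax;   % longer than any valid variable name
valid = isvarname(label);                 % false, because the name is too long

% Stored as data, the label can be any length.
results = containers.Map();
results(label) = [1, 2, 3];
values = results(label);
```

A cell array of labels alongside an array of values would work just as well; the point is that text kept as data is not subject to the rules for variable names.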
This quote sums up debugging eval-based code: "I've never even attempted to use it myself, but it seems it would create unreadable, undebuggable code. If you can't read it and can't fix it what good is it?" Note that eval's equally evil siblings evalc, evalin and assignin also make code slow and buggy:
Stephen23
Stephen23 on 26 Sep 2016
Slow
The MATLAB documentation Alternatives to the eval Function explains that code that uses eval is slower because "MATLAB® compiles code the first time you run it to enhance performance for future runs. However, because code in an eval statement can change at run time, it is not compiled".
MATLAB uses JIT acceleration tools that analyze code as it is being executed and optimize it to run more efficiently. When eval is used the JIT optimizations are not effective, because every string has to be compiled and run again on every single iteration. This makes loops with eval very slow. It is also the reason why both creating and accessing variables with dynamic names are slow.
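Anyone who wants to measure the difference on their own machine could try a sketch like the one below. The variable names and the value of N are arbitrary choices, and the timings depend on the computer and MATLAB release, so no figures are claimed here.

```matlab
N = 500;   % number of values; an arbitrary choice for this sketch

% Pattern to avoid: create and read numbered variables with eval.
tic
for k = 1:N
    eval(sprintf('v%d = k^2;', k));
end
totalEval = 0;
for k = 1:N
    totalEval = totalEval + eval(sprintf('v%d', k));
end
tEval = toc;

% Preferred pattern: one preallocated vector and ordinary indexing.
tic
v = zeros(1, N);
for k = 1:N
    v(k) = k^2;
end
totalIndex = sum(v);
tIndex = toc;

fprintf('eval: %.4f s, indexing: %.4f s\n', tEval, tIndex);
```

Both halves compute the same total, so the comparison is like for like; only the way the values are stored and accessed differs.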
Even the eval hidden inside of str2num can slow down code:
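For converting text to numbers, str2double is the usual replacement, as it parses the text without evaluating it. A minimal sketch with made-up inputs (the names txt, vals, and isBad are hypothetical):

```matlab
txt = {'3.14', '42', '1e-3', 'abc'};   % hypothetical text inputs

% str2double parses the text as a number and never evaluates it as code.
vals = str2double(txt);                 % non-numeric text becomes NaN

isBad = isnan(vals);                    % flag inputs that could not be parsed
```

Text that is not a number simply gives NaN, which is easy to detect and cannot trigger any command.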
Stephen23
Stephen23 on 26 Sep 2016
Stephen23
Stephen23 on 15 Jul 2018
@Cris Lengo: you can still access that newsreader thread here:
The relevant text is:
MATLAB is not lying to you.
When you run your function, MATLAB needs to determine what each identifier
you use is as part of the process of parsing the function. At that time,
there's no indication in your code that debug should be a variable; however,
there is a function named debug. Therefore, MATLAB decides that the
instances of debug in the code should be calls to that function. When the
code is actually executed, a variable named debug is created, and WHICH
reflects that fact -- but at that point, it's too late for MATLAB to "change
its mind" and it tries to call the debug function on the last line. DEBUG
is a script file, though, and so you correctly receive an error.
This is why you SHOULD NOT "poof" variables into the workspace at runtime,
whether via EVALIN, ASSIGNIN, EVAL, or LOAD.
--
Steve Lord