MATLAB Answers

EVAL is evil? Using variables created dynamically - info retrieval

Asked by Tanil
on 9 Dec 2012
Latest activity Edited by per isakson
on 13 Oct 2016
Accepted Answer by Jan
Hi,
I am creating an information retrieval program that will implement the Rocchio method. The method itself isn't central to my question, but it may be relevant.
I basically have a text corpus of 40 documents: 20 truthful, 20 deceptive.
Each document is reduced to one word per line, and these are the documents to be read. They are named like d_hotelname_int or t_hotelname_int:
d - deceptive ... t - truth
hotelname refers to the different hotel names
int is the document number
I want the variable names to reflect the document names to avoid confusion, so when I compute the term frequency (TF) of a document, I want the variable to be called TF_documentName.
I went about it like this:
TFdocname = sprintf('TF_%s', documentName);
eval([TFdocname) '= termFrequencies']);
output: TF_d_hilton_1 TF_d_hilton_2 ... TF_t_affinia_20
Now I want to compute the weight from these new variables, which hold the TF.
Thought Experiment:
weight = 1 + log10(TFdocname)
In theory I thought this would work, but is there a way of referring to the variables created by eval?

  1 Comment

  • Eval is slow: it is slower than calling a function directly.
  • Eval obscures the code intent. Totally.
  • Eval code is not compiled for optimized execution. Every call has to evaluate the string all over again!
  • Eval makes debugging almost impossible.
  • Eval can produce different outputs in normal and debug modes.
  • Eval can create and overwrite variables in workspaces.
  • Eval is often associated with other practices that are not an efficient use of MATLAB... sequential variable names, for example.
These topics have been covered many times in MATLAB's official documentation, blogs and other discussions.

3 Answers

Answer by Jan
on 9 Dec 2012
 Accepted Answer

The EVAL approach stores information in the name of the variable. This is generally a bad approach, because further tricks are required to get this information back afterwards. You get more flexible programs when the actual data are kept consistently separate from the structure of the program. An exaggerated example:
Instead of storing the value of pi under the symbol "pi", you could create a variable called ValueOfPiIs_3p141592653589793 and assign it the empty matrix. You can still access the value by parsing the name dynamically, and you will get the same results; only the complexity of the program has increased. Now imagine you have to store the temperature of something at a specific pressure, color, weight and price. Decide between these solutions:
Temperature_81Hpa_red_71p3kg_76Euro = 15;
Or:
Temperature.Pressure = 81;
Temperature.Color = 'red';
Temperature.Weight = 71.3;
Temperature.Price = 76;
Temperature.Value = 15;
It is easy to create an array of the struct version to manage, e.g., 2000 probes, but with the first version this would be horrible. And when the program is almost complete, your professor decides that the zodiac sign is also important. Then expanding the struct approach is trivial, while the methods for extracting the information from the variable name will explode in complexity.
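A minimal sketch of the struct-array idea, assuming two made-up probes and a hypothetical Zodiac field. The variable name probes and all values are illustrative only and do not come from the thread:

```matlab
% Two hypothetical probes stored as elements of one struct array.
probes(1).Pressure = 81;  probes(1).Color = 'red';
probes(1).Weight = 71.3;  probes(1).Price = 76;  probes(1).Value = 15;
probes(2).Pressure = 95;  probes(2).Color = 'blue';
probes(2).Weight = 64.0;  probes(2).Price = 80;  probes(2).Value = 17;

% Select probes by their content instead of parsing variable names.
isRed = strcmp({probes.Color}, 'red');
redValues = [probes(isRed).Value];

% Adding a new property later is one statement for all elements.
[probes.Zodiac] = deal('unknown');
```

Selecting by field content stays the same however many fields are added, which is the flexibility the answer describes.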
Even, or especially, if this is a coursework project only, it is the right time to learn clean and efficient programming techniques.


Answer by per isakson
on 9 Dec 2012
Edited by per isakson
on 13 Oct 2016

The standard answer:
  • yes, EVAL is indeed evil
  • see the FAQ
  • try structures with dynamic field names. Syntax: struct_name.(string_var)
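A minimal sketch of dynamic field names applied to the question, assuming a hypothetical document name and made-up term counts. The struct name TF is an illustrative choice, not something from the original post:

```matlab
documentName = 'd_hilton_1';        % hypothetical document name
termFrequencies = [3 1 2 1];        % made-up counts for illustration

% Store the counts under a field whose name comes from a string.
TF = struct();
TF.(documentName) = termFrequencies;

% Read the data back through the same string; no eval needed.
weight = 1 + log10(TF.(documentName));

% List every stored document name.
names = fieldnames(TF);
```

Field names must be valid MATLAB identifiers, so this relies on the document names containing only letters, digits and underscores and starting with a letter.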
Next one might say: it depends on the context
  • are you making a tool that you will use and improve over some time?
  • are you making a small experiment and will throw away the code in a couple of days?
  • are you the only one who will use the tool?

  2 Comments

I'll have a read up about that. So you cannot refer to a variable dynamically when using eval?
It's a tool for coursework.
It's just to see whether a document is truthful or deceptive. Based on test data (the query), we use the training data to find out whether the query is truthful or not. We have to compute this using three methods: Rocchio, Naive Bayes and KNN. The problem doesn't lie with the implementation or the methodology for tackling this problem, but rather with the syntax and the many functions MATLAB has.
IMO:
  • the eval approach offers no(?) advantages over a structure with dynamic field names, only disadvantages
  • squeezing various pieces of information into the variable names often leads to complicated code
I guess I would have tried to use a structure array with some appropriate fields with descriptive names.
You could make a couple of small experiments and ask for comments here.
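One possible shape for such a structure array, sketched under the assumption that file names follow the label_hotel_number.oneline pattern described in the thread. The field names Label, Hotel, Index and TF are hypothetical choices:

```matlab
% Hypothetical subset of file names following the pattern in the thread.
fileNames = {'d_hilton_1.oneline', 't_affinia_20.oneline'};

% Empty struct array with descriptive fields.
docs = struct('Label', {}, 'Hotel', {}, 'Index', {}, 'TF', {});

for k = 1:numel(fileNames)
    [~, baseName] = fileparts(fileNames{k});   % strip the extension
    parts = strsplit(baseName, '_');           % {'d','hilton','1'}
    docs(k).Label = parts{1};                  % 'd' or 't'
    docs(k).Hotel = parts{2};
    docs(k).Index = str2double(parts{3});
    docs(k).TF = [];                           % term frequencies go here
end

% Pick out all deceptive documents by content.
isDeceptive = strcmp({docs.Label}, 'd');
```

The information that was packed into the variable name becomes ordinary data that can be filtered and looped over.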


Answer by Image Analyst
on 9 Dec 2012

I don't quite follow what your code is trying to do, but I'm sure it can be done without using eval(). Your eval statement isn't even valid syntax, because it has an unmatched right parenthesis. For example:
TFdocname = sprintf('TF_%s', documentName); % documentName is a pre-defined string.
TFdocname = 'termFrequencies'; % Overwrite TFdocname
TF_documentName = 'TF_documentName.'; % TF_documentName is the name you said you wanted.
Again, I'm not really sure what you want so I can't give the correct code.

  1 Comment

Okay, basically I have:
160 documents in each folder: 80 deceptive, 80 truthful;
of these 80 documents, there are 20 documents for each of 4 different hotels;
they are labelled as follows:
d_hilton_1.oneline
d_hilton_2.oneline
...
t_affinia_20.oneline
NOTE: .oneline is how the file is saved, representing the words listed on one line.
So, I extract the words from the files and count how many times they appear (term frequency).
I then find the unique words in each document using the unique function.
From all this I calculate how many times each word appears in that document.
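A minimal sketch of that counting step, assuming the words of one document are already in a cell array of strings. The sample words are made up, and accumarray is one possible way to do the counting, not necessarily what the poster used:

```matlab
% Made-up word list standing in for one document's contents.
words = {'the','room','was','clean','the','staff','was','nice','the'};

% unique returns the sorted distinct terms and, as third output,
% the index of each word's term in uniqueTerms.
[uniqueTerms, ~, idx] = unique(words);

% Count how often each index occurs: one count per unique term.
termFrequencies = accumarray(idx(:), 1);
```

Here termFrequencies(k) is the number of times uniqueTerms{k} appears, so no explicit double loop over documents and terms is needed.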
I then want to assign this array of term frequencies to a variable. The variable will be named TF_documentName, where documentName changes according to the document being scanned.
I am using textread at the moment and know it will be outdated soon, but it will be fine for the timescale of this project.
I.e.
%loop around the document
%loop around the unique terms
%when a term matches, increment the corresponding term related to the row.
termFrequencies = termFrequencies + 1;
TF_d_hilton_1 = termFrequencies; <-- this is my issue ---
I want TF_ to always be in the variable name, with the other part reflecting the document name.
In other words, in TF_d_hilton_1 the part d_hilton_1 derives from documentName.
Eval handles this, but I cannot refer back to the variable simply by saying
TFdocname = sprintf('TF_%s', documentName);
eval([TFdocname ' = termFrequencies;']);
Then assume I want to work out something else like
weight = 1 + log10(TFdocname);
it won't work. I know it seems complicated, but with this many files and documents, not naming the variables after the file names makes it difficult and taxing to keep track of them.
SIDENOTE: I am currently creating a temp variable that holds the term frequency of the current document inside the loop. But obviously I cannot refer to the temp variable again once the loop overwrites it. I know this is messy, but for the short term it works.
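A sketch of one way to keep every document's result without eval or an overwritten temp variable, using containers.Map keyed by document name. This is an alternative not mentioned in the thread, and the document names and counts below are made up:

```matlab
% Hypothetical subset of document names.
documentNames = {'d_hilton_1', 'd_hilton_2', 't_affinia_20'};

TF = containers.Map();                 % keys: char, values: any type
for k = 1:numel(documentNames)
    termFrequencies = k * [1 2 3];     % stand-in for the real counts
    TF(documentNames{k}) = termFrequencies;   % kept after the loop ends
end

% Later, compute the weights for every stored document.
weights = containers.Map();
docKeys = keys(TF);
for k = 1:numel(docKeys)
    weights(docKeys{k}) = 1 + log10(TF(docKeys{k}));
end
```

Each result stays addressable by its document name, for example weights('d_hilton_2'), while the name remains data rather than part of a variable name.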
