Dynamic comments on variables?

Xiaohan Du
Xiaohan Du on 17 Jan 2018
Edited: Adam on 18 Jan 2018
Hi all,
Is there a way to add comments to variables dynamically? For example, in my code, if I type this variable in the command window:
K>> obj.pmExpo.iter
ans =
2×1 cell array
[1]
[3]
Can I add a 'dynamic comment' when defining this variable, such that calling it gives the variable with some comments:
K>> obj.pmExpo.iter
ans =
2×1 cell array
[1]
[3]
this is a comment indicating the use of this variable, so I do not need to go back to the code to know what it does
Edit: thanks for all the replies, very helpful. I can only accept one answer, but I would really like to accept all of them. I'm now editing the help text of all my methods so that it covers two parts: what the method does and what the variables do.
  2 Comments
Rik
Rik on 17 Jan 2018
You can probably write a class that does this, but a better solution would be to use descriptive variable names.
Jan
Jan on 17 Jan 2018
A very interesting discussion. Thanks for this question. +1


Accepted Answer

Jan
Jan on 17 Jan 2018
This sounds like magic meta-programming. I recommend avoiding such smart tricks and keeping comments in the source code as usual. But of course it is possible:
obj.pmExpo.iter_comment = ['this is a comment indicating the use of this variable, ', ...
'so I do not need to go back to the code to know what it does'];
And now the function to display a field:
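function MagicDisp(S, Field)
% Display the field S.(Field), followed by its comment if S has a
% matching field named [Field, '_comment'].
fprintf('Field: %s\n', Field);  % fprintf prints; sprintf with ";" shows nothing
disp(S.(Field));
Magic = [Field, '_comment'];
if isfield(S, Magic)
    fprintf('%s\n', S.(Magic));
end
end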
Now call it as:
MagicDisp(obj.pmExpo, 'iter')
Well, the name "MagicDisp" implies that I would not use it in production code. The function could be enhanced, but I would not do it, because variables which actively explain their meaning are too tricky to be useful. What would happen if you run a larger program which uses 200 such variables and processes them in a loop?
If a variable should do anything actively, then it should be starting the daily backup. This is more important and forgotten too often, and the explanations can appear in the code instead.
  3 Comments
Jan
Jan on 18 Jan 2018
  • Some examples for naming schemes:
% "List" means an unordered list without a specific vector
% orientation:
DirList = dir(Pattern);
% "Name" contains --- guess: A name
FileName = {DirList.name};
PathName = {DirList.folder};
% Counters start with "i", sulkingCamelCase
for iFile = 1:numel(FileName)
    % An object from a list starts with "a":
    aFile = fullfile(PathName{iFile}, FileName{iFile});
    ...
end
Scalars get lowercase characters, arrays in uppercase. Variables which are used inside the current function have sulkingCamelCase, and if they belong to the input and output they have CamelCase. Temporary variables have a trailing underscore or an appended "_tmp":
Long.Nested.Struct.Data = 4711;
...
data_ = Long.Nested.Struct.Data;
for k = 1:10
    data_ = data_ + rand;
end
Long.Nested.Struct.Data = data_;
  • Example for caring about nested structs:
I'm supporting a large code base (several 100'000 lines of code) for clinical examinations. The main data objects are implemented as structs: "Person", "Trial", "Model", "Result". E.g. "Person.Name" is defined during the initialization and provided to all subfunctions. A method is then required to prevent the subfunctions from modifying the person's name during the processing. The scientists can implement their own plug-ins, and it is essential for reliability that they cannot destroy information stored in these structs. This can be managed by strictly defined interfaces, which forbid user-land functions from returning these structs in their outputs. But they could still modify the person's name and forward the struct to another user-land plug-in and cause serious confusion. Therefore the source code of all functions is checked automatically for a "Person." on the left-hand side of a "=".
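The automatic check described above could look roughly like the following. This is only a minimal sketch, not the actual checker: the sample lines and the pattern are assumptions, and the pattern is a simple heuristic that only catches plain assignments starting a line (it ignores comparisons such as == and ~=, and would miss multi-output assignments).

```matlab
% Hypothetical source lines to scan (in practice: the lines of each file)
codeLines = { ...
    'Person.Name = ''Smith'';', ...          % write access
    'name = Person.Name;', ...               % read access only
    'Person.Height == 180', ...              % comparison, not an assignment
    '  Person.Address.City = ''Bonn'';'};    % write access to a nested field

% "Person." at the start of a statement, followed by a single "=".
% [^=~<>]* stops before any operator character, (?!=) rejects "==".
pattern = '^\s*Person\.[^=~<>]*=(?!=)';

% regexp on a cell array returns one result per line; empty means no match
isWrite = ~cellfun(@isempty, regexp(codeLines, pattern, 'once'));

% Show the offending lines
disp(codeLines(isWrite)');
```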
Well, a clean object-oriented approach would be easier and more reliable. But the development started with Matlab 5.1 and it is working and reliable code - I won't touch it now.
Back to your problem: For each of these main structs a specific function exists for creating it with all nested fields. No other function is allowed to add fields to it. Then the corresponding comments can be found easily by opening this function in the editor. If I need the units of the body height, I can check this by: edit moCreatePerson, where "mo" specifies the software and "Create" is reserved for the functions which are allowed to create the structs.
This way I use static comments in the source code, but have a general system for finding them even in large code bases.
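A creator function of this kind might look like the sketch below. The function name createPersonSketch, the field set and the unit are hypothetical and only illustrate the idea that the comments for every field live in one place; this is not the actual moCreatePerson.

```matlab
function Person = createPersonSketch(name, heightCm)
%CREATEPERSONSKETCH Create the Person struct with all of its fields.
%   This is the only place where fields of Person are defined, so the
%   comments below are the documentation for every field.
%
%   Fields:
%     Name    - char vector, full name of the person
%     Height  - body height in centimetres (unit assumed for this sketch)
Person = struct( ...
    'Name',   name, ...
    'Height', heightCm);
end
```

Typing edit createPersonSketch then opens the one file that documents the fields.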
Adam
Adam on 18 Jan 2018
I use
doc validateattributes
regularly in almost all my public functions for input arguments. Sometimes I end up commenting it out so that it just acts as documentation, if the function is called in a massively iterative environment where validating inputs millions of times starts to become an unwanted overhead. But it allows me to see at a glance if something should be a scalar or vector and whether it is a double or string or custom class (one thing I really miss from C++ in a function signature!).
Of course Matlab allows flexibility in the sense that a function taking two arguments could, if programmed to behave that way, work by passing in a double and a string or by passing in two custom classes, since typing is not enforced. But I hate that, so I enforce types on all my inputs using validateattributes.
This doesn't help for intermediate variables, of course, but I use naming and a small number of comments for that.
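As an illustration of input checks that double as documentation, here is a minimal sketch. The function scaleSignalSketch and its arguments are hypothetical; only validateattributes itself is the real MATLAB function.

```matlab
function scaled = scaleSignalSketch(signal, gain)
%SCALESIGNALSKETCH Multiply a signal by a gain (hypothetical example).
% The checks state at a glance what each input must be: signal is a
% non-empty double vector, gain is a positive numeric scalar.
validateattributes(signal, {'double'},  {'vector', 'nonempty'}, ...
    'scaleSignalSketch', 'signal', 1);
validateattributes(gain,   {'numeric'}, {'scalar', 'positive'}, ...
    'scaleSignalSketch', 'gain', 2);
scaled = signal * gain;
end
```

Calling scaleSignalSketch('abc', 2) raises an error that names the offending argument, instead of failing somewhere deeper in the code.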


More Answers (4)

Guillaume
Guillaume on 17 Jan 2018
Short answer is no. Such a thing would have to be directly integrated into matlab, it's not something you can add. I'm not aware of any language that provides such a facility.
A better answer is that you already have a facility for documenting what the variable does: its name. So use a more descriptive variable name. iter is useless: it does not tell you anything about the variable other than that it may be an iterator (over what?). You have namelengthmax characters to describe the variable, so use them.
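To see how much room a name actually has, the limit can be queried directly. A small sketch (the variable names are placeholders for illustration):

```matlab
% Query the maximum length of a variable name in this MATLAB version
maxLen = namelengthmax;
fprintf('Longest valid variable name: %d characters\n', maxLen);

% A name of exactly maxLen characters is valid, one character more is not
longName = repmat('a', 1, maxLen);
tooLong  = [longName, 'a'];
disp(isvarname(longName));
disp(isvarname(tooLong));
```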
  5 Comments
Adam
Adam on 18 Jan 2018
Edited: Adam on 18 Jan 2018
An 80 character limit just forces the kind of abbreviated incomprehensible variable names that people often end up with.
It is true that when writing a line of maths I do sometimes find that my long variable names obfuscate what the actual maths is (as in trying to find the operators in amongst the variable names), but I always try to name my variables in a manner I can understand. Even then, though, it is still hard to come back to code a year later and just understand what it does by glancing at it - a variable name cannot be a sentence, after all.
If I see code like
xi = (xn+a*b(i)+c1-d4)*inc + 47
my brain shouts at me! Sure, I don't have to scan to see it all, but what on earth does it all mean?! (In this case nothing, obviously, but the short variable names would indicate nothing even if it did do something sensible).
Sure, scrolling can be annoying, but life is never perfect, it's always a balance of pros and cons and scrolling is something I don't mind.
When passing arguments, for example, I will often use a line per argument for readability e.g.
myFunc( ...
    longDescriptiveArgumentName1,...
    longDescriptiveArgumentName2,...
    longDescriptiveArgumentName3,...
    longDescriptiveArgumentName4 )
rather than having them all on one long line.
Steven Lord
Steven Lord on 18 Jan 2018
Re: line limit, there are two Preferences in the Editor/Debugger section of the preferences for MATLAB that may be relevant here.
The first is in the Display section, which allows you to determine whether or not to show the right-hand text limit line and where to display it. This doesn't actually affect the behavior of the Editor as far as I'm aware, it simply gives the coder an indication of how long their line of code is.
The second is in the Language section, where there are preferences related to comment formatting. You can specify that you want comments to wrap automatically if they reach a particular length, either from the beginning of the line or from the beginning of the comment.
Automatically wrapping code could change the behavior of that line of code, particularly if you're defining a long vector by explicitly listing its elements and the automatic line wrapping code splits that vector definition in such a way that it turns into a 2-by-N matrix (or an M-by-N matrix if the vector is VERY long.)
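The effect of a line break inside brackets can be seen in a small sketch (the variable names are made up for illustration):

```matlab
% The same six elements written three ways
oneLine = [1 2 3 4 5 6];          % 1-by-6 row vector

wrapped = [1 2 3
           4 5 6];                % bare line break acts like ";" -> 2-by-3

continued = [1 2 3 ...
             4 5 6];              % "..." continues the row -> 1-by-6

disp(size(oneLine));
disp(size(wrapped));
disp(size(continued));
```

Only the variant with the continuation operator keeps the original shape, which is why a naive automatic wrap would be unsafe.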



John D'Errico
John D'Errico on 17 Jan 2018
As the others have said, no you cannot do this. At least not without writing your own classes for no reason except to be friendly to your memory. Such a class may greatly impact the efficiency of your code, so friendliness can be costly. It also risks creating bugs for no good reason.
I'd suggest that if you are creating lots of such variables that have meanings too long to express in the name, then you should not be cluttering your workspace with them, and then hope to come back and remember what all those variables meant a month later.
Instead, you might write a script, one that defines all of your variables. Next to each variable definition, add a comment line. A whole paragraph if you like. There are no charges for comment length. The nice thing about such a script is you run it once. You can even publish the script, which will create nicely readable documentation of all your variables as created.
Of course, ALWAYS use descriptive variable names, ones that are memory aids to their interpretation. If you cannot say everything needed about a variable in 80 characters or so, then you are doing something wrong. And since MATLAB does offer tab completion for variables, how can a name be too long?
Learn to write and use function m-files. The help for each function is easy to provide, as just the block of contiguous comments at the beginning of each function. It should describe what each input does. And when a function returns, since most of the variables are just trash and can be deleted, that is what happens to them if you do not return them as outputs. This prevents your workspace from turning into a pile of deadwood.
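For example, a function file with a help block might look like this sketch. The function sampleStdSketch is hypothetical and only re-implements a standard deviation to have something to document.

```matlab
function stdDev = sampleStdSketch(values)
%SAMPLESTDSKETCH Sample standard deviation of a numeric vector.
%   stdDev = SAMPLESTDSKETCH(values) returns the standard deviation of
%   the elements of values, normalized by N-1.
%
%   Input:
%     values - numeric vector with at least two elements
%   Output:
%     stdDev - non-negative scalar

% deviations is a local variable: it vanishes when the function returns
deviations = values - mean(values);
stdDev = sqrt(sum(deviations.^2) / (numel(values) - 1));
end
```

Typing help sampleStdSketch prints the contiguous comment block at the top, and after the call only stdDev reaches the caller's workspace.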
Essentially, I'd suggest the real solution is to not build sloppy programming habits. Instead, practice good habits. Both you and your code will be happier in the end. And think of the poor sot who one day might inherit your code, just in case you get run down by the crosstown bus. Yes, I suppose variable names with the capability to attach tooltips might be interesting. But code that has no need for those tooltips is far better.
  4 Comments
Jan
Jan on 17 Jan 2018
I type names of variables in the command window only during debugging, but never during productive work. When debugging I rely on the exhaustive comments in the code. Automatic messages in the command window would hide more information than they reveal. The "dynamic comments" have the disadvantage that they might not be available while reading the source code, and this impedes the debugging. Of course you could write exhaustive comments and insert "dynamic comments" as well, but as soon as the information exists twice, you will get problems with keeping the contents up to date when the code is modified.
In a programming language, in which all data are dynamic objects, such comments will be useful. But for Matlab (procedural and object oriented) this is not the case.
The problem reminds me of https://www.mathworks.com/matlabcentral/fileexchange/38977-physical-units-toolbox: Here it is not a comment, but the physical units which are attached to variables. You cannot equip all variables with this feature, e.g. no double arrays, cells or structs, but only objects of this type. Therefore it is of limited use for keeping comments.
Another aspect: It is an advantage to keep the code modular. Inside e.g. the function std it does not matter where the variables are coming from. After std has been debugged exhaustively with test data, the function can be accepted as trustworthy and skipped when debugging production code. If you extrapolate this argument, the meaning of variables is actually only important inside the main code, and here meaningful comments in the source code are easy and a common standard. Inventing a completely new programming style is very interesting and might be useful. But it is more likely that the established methods are already sufficient to solve the actual problem efficiently.
John D'Errico
John D'Errico on 18 Jan 2018
Would it be nice to have that capability? Perhaps for some people. I would never use it. But I suppose that different people work differently. I think the problem is, it would tend to encourage sloppy programming styles. That is, having workspaces crammed with various variables that mean different things. The variables all have short non-mnemonic names now, but who cares, because you have attached a description to each variable.
And suppose I am working on several projects at once? Now it would be easier to keep all my data from all projects in one workspace. So right now, I'm working on project A. I have some variables for project A there, but also projects B, C and D. Hey, it's OK, because I've spent the time to label each variable with an attached comment.
The point is, something I would think is bad is for TMW to become an enabler for sloppy programming. Use good mnemonic variable names. Not something you need to mouse over or display at the command line to learn the meaning.



Walter Roberson
Walter Roberson on 17 Jan 2018
It would be possible to write a class that defined a display() method that emitted the comment, so it is not actually the case that it would have to be integrated into MATLAB.
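A minimal sketch of such a class follows. It assumes a hypothetical value class CommentedValue saved as CommentedValue.m on the MATLAB path, handles scalar objects only, and overrides disp rather than display, which is the usual place to customize how an object is shown.

```matlab
classdef CommentedValue
    %COMMENTEDVALUE Hypothetical wrapper pairing a value with a comment.
    properties
        Value      % the wrapped data, any type
        Comment    % char vector describing what the data means
    end
    methods
        function obj = CommentedValue(value, comment)
            % Allow a call without arguments, as MATLAB may require one
            if nargin > 0
                obj.Value = value;
                obj.Comment = comment;
            end
        end
        function disp(obj)
            % Show the wrapped value first, then the attached comment
            disp(obj.Value);
            fprintf('%s\n', obj.Comment);
        end
    end
end
```

With iter = CommentedValue({1; 3}, 'indices of the refined iterations'), typing iter at the prompt shows the cell array followed by the comment. The price is that every use of the data has to go through iter.Value.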
It would probably be easier to write a routine such as the one attached, whatis()
To use this code:
  • call whatis() with no input arguments to get a list of all known variables and their descriptions
  • call whatis() with one character vector input argument to get information about that one variable
  • call whatis() with two character vector input arguments to set the description of the variable named in the first argument to be the second argument
  • If you call whatis() with two inputs and no outputs then the routine just sets the description and exits
  • Otherwise if you call whatis() with one output argument then the routine returns a struct of names and descriptions;
  • otherwise when you call whatis() with no output arguments, the information is displayed in formatted form on the command window
The idea is that when you go to create a variable, you would also call whatis() in the two argument form, giving the variable name and some description. Later when you want to know about the variable, you would call whatis() with the name of that variable to find out what you had previously stored about it.
This code makes no attempt at all to make the information local to its current workspace, so if you re-use variable names for different meanings then the information will get overwritten.
Also, the information is cumulative, and I did not provide any explicit means to throw it all out or to remove the information for a specific name. To reset the stored information completely, use:
clear whatis
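Since the attachment is not reproduced here, the following is only a sketch of how a routine with the behaviour listed above could be written, not the attached code. The name whatisSketch and the use of a persistent containers.Map are assumptions.

```matlab
function out = whatisSketch(name, description)
%WHATISSKETCH Store and look up descriptions of variable names (sketch).
persistent notes
if ~isa(notes, 'containers.Map')
    % First call: create the store that survives between calls
    notes = containers.Map('KeyType', 'char', 'ValueType', 'char');
end

if nargin == 2
    notes(name) = description;      % set or overwrite the description
    if nargout == 0
        return                      % two inputs, no outputs: just store
    end
end

% Decide which names to report: all known names or only the requested one
if nargin == 0
    names = keys(notes);
else
    names = {name};
end

% Collect the known names and their descriptions in a struct array
info = struct('name', {}, 'description', {});
for k = 1:numel(names)
    if isKey(notes, names{k})
        info(end + 1).name = names{k}; %#ok<AGROW>
        info(end).description = notes(names{k});
    end
end

if nargout > 0
    out = info;                     % return the struct of names and texts
else
    for k = 1:numel(info)           % otherwise print to the command window
        fprintf('%s: %s\n', info(k).name, info(k).description);
    end
end
end
```

As with the original, clearing the function (here clear whatisSketch) discards the persistent store and with it all descriptions.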
By the way:
The solution that I adopted for this problem with a large program was a naming convention that encoded the expected number of dimensions and the data type. Doing the renaming over a fair number of files took more time than one might expect, because I was reading the code to check for consistency as I went. It turned out that I was correctly handling the data types in nearly every case, but there did turn out to be a variable that I was treating as 2D in one place and as a vector in a different place.

Steven Lord
Steven Lord on 18 Jan 2018
If you're storing tabular data in a table or timetable array, those types do have a property called VariableDescriptions that you can use to store some information about the purpose behind the variables in the table or timetable. The summary function displays those descriptions if they have been set.
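A short sketch of that property in use (the table contents and descriptions are made-up example data):

```matlab
% Made-up example data
bodyHeight = [172; 181];
bodyMass   = [68.5; 80.2];
T = table(bodyHeight, bodyMass, 'VariableNames', {'Height', 'Mass'});

% Attach one description per table variable, in column order
T.Properties.VariableDescriptions = {'Body height in cm', 'Body mass in kg'};

% summary prints the descriptions together with basic statistics
summary(T)
```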
