clear is slower than writing
broken_arrow on 20 Oct 2021
I have some code that generates a large number of large arrays, with a runtime of about 4 minutes (including a parfor loop). Memory consumption after completion is about 13 GB. However, when I run clear afterwards, it seems to take forever; I've never actually waited it out (the longest I've waited is twice the run time before exiting MATLAB). How can that be? After all, creating the variables requires writing both addresses and data, whereas clearing only requires discarding the addresses. Seems a bit weird to me. If I just close MATLAB, the workspace is "cleared" in an instant.
Edit: I can't help thinking it must be some kind of bug. If I reduce the number of arrays by a factor of 10, clear takes just a few seconds (and is faster than the writing operation, as one would expect).
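A minimal sketch for reproducing the scaling behaviour described above. The original code is not shown, so this assumes the arrays are held in a cell array; the counts, the array length, and all variable names are hypothetical placeholders rather than the real values.

```matlab
% Hypothetical reproduction: time creation versus clearing as the number
% of arrays grows. counts and len are assumed values, not the original ones.
counts = [1e3 1e4 1e5];          % number of arrays per trial (assumed)
len = 1000;                      % elements per array (assumed)
tCreate = zeros(size(counts));   % seconds spent creating the arrays
tClear = zeros(size(counts));    % seconds spent in clear

for k = 1:numel(counts)
    n = counts(k);

    tic;
    data = cell(n, 1);
    for i = 1:n
        data{i} = rand(len, 1);  % each cell holds its own allocation
    end
    tCreate(k) = toc;

    tic;
    clear data                   % release all n arrays at once
    tClear(k) = toc;
end

disp(table(counts(:), tCreate(:), tClear(:), ...
    'VariableNames', {'nArrays', 'createSeconds', 'clearSeconds'}));
```

If clear scaled linearly, clearSeconds would grow by roughly a factor of 10 per row; a much larger jump would match what I am seeing.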
Walter Roberson on 20 Oct 2021
You might be interested in the timing tests I did at https://www.mathworks.com/matlabcentral/answers/60240-what-is-the-difference-between-the-effect-of-clear-and-clearvars#answer_757137
My suspicion is that when you use clear on variables that are not "large", MATLAB takes the time to try to return the memory to the free list in some kind of "best" order. It is known that "large" variables are handled differently: there is a free pool for small variables, and beyond a certain size, MATLAB requests an allocation of free space and returns the memory to the operating system afterwards.
What I do not know at the moment is why returning memory due to reaching the end of a function could be that much faster than returning an individual variable.
I know that some implementations of some programming languages use the strategy of allocating a virtual memory heap for all local variables, so that returning from the function can consist of deallocating the entire heap in one go, whereas clearing an individual variable could require maintaining a local free list, which is more overhead. I do not know whether MATLAB does this; I have never heard that it does, and it has some implementation consequences that I have not seen evidence of. In particular, if you want to return a variable that might have portions allocated locally, there are three options. Either you do a deep copy out of the local space; or you track the variables that are marked as return variables and do a deep copy at the time one of them is assigned to; or you track those variables and do some crufty backtracking to find all the places where memory might be returned (including through layers of struct) and allocate those in a different way. That last one seems unlikely.
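One way to probe the function-exit question is to time the same allocations released by an explicit clear and released by returning from a function. This is only a sketch under assumptions: the helper names buildCells and buildAndDiscard, and the values of n and len, are made up for illustration, and it says nothing about how MATLAB manages memory internally.

```matlab
% Sketch: compare releasing many arrays via explicit clear versus via
% function return. n and len are assumed values.
n = 1e4;
len = 1000;

% Case 1: build in the base workspace, then clear explicitly.
tic;
data = buildCells(n, len);
tBuild = toc;
tic;
clear data
tClear = toc;

% Case 2: build inside a function; the locals are released on return,
% so this timing covers creation plus release.
tic;
buildAndDiscard(n, len);
tBuildAndExit = toc;

% Rough estimate of the release cost at function exit.
tExit = tBuildAndExit - tBuild;
fprintf('clear: %.3f s, function exit (estimated): %.3f s\n', tClear, tExit);

% Local functions go at the end of a script file (R2016b or later).
function buildAndDiscard(n, len)
    data = buildCells(n, len); %#ok<NASGU> % released when the function returns
end

function data = buildCells(n, len)
    data = cell(n, 1);
    for i = 1:n
        data{i} = rand(len, 1);  % one separate allocation per cell
    end
end
```

The exit estimate is a difference of two timings, so it is noisy and can even come out slightly negative; repeating the runs and averaging would give a steadier comparison.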