How to profile calls in a for loop?

I have a piece of code that does something like:
mf = matfile('newfile','Writable',true);
mf.var01 = zeros(100,100,100);
for ii = 1:100
    data = ncread('filename',inputs);
    mf.var01(:,:,ii) = myfunction(data);
end
Although, as you can see from the code above, I'm preallocating the matrices, the for loop slows down significantly over time. Something in myfunction is causing this, but I'm having trouble tracking down the problem.
If I use the profiler, I can see the total number of calls made to myfunction, but I don't think there's a way to see how much time each individual call took, or to profile each call in the for loop to understand why the 100th call was slower than the first.
Any idea how to debug this? (I'm fairly sure this isn't the classic 'you forgot to preallocate matrices' problem.)
Thanks.

8 Comments

Stephen23
Stephen23 on 28 Jan 2021
Edited: Stephen23 on 28 Jan 2021
File compression?
Just a guess, but what happens if you preallocate using random data (similar to your actual data)?
Hi @Stephen Cobeldick, would you mind expanding on your comment?
Stephen23
Stephen23 on 28 Jan 2021
Edited: Stephen23 on 28 Jan 2021
@Carlos: please tell us which version of MAT-file you are working with. You can get this information by viewing the MAT-file in a text editor, or by looking at your default preferences.
MAT-file versions 7 and later use compression by default. As that link explains, it is possible that the whole array (or significant parts of it) gets compressed/uncompressed as you write to it. Like most compression algorithms, the speed (and compression ratio) depends on the data itself: lots and lots of zeros will probably compress/uncompress very quickly (at the start of your loop), and it will get slower later on as you add what is basically random data.
Also note that partial loading of variables (e.g. using indexing, as your example shows) is only supported by version 7.3, and this version must be explicitly selected when creating the MAT-file.
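For example, explicitly creating a v7.3 file (a minimal sketch; the file and variable names are illustrative, and the -nocompression flag requires a reasonably recent release):

```matlab
% Create a v7.3 MAT-file explicitly so that matfile() can index into it:
var01 = zeros(100,100,100);
save('newfile.mat','var01','-v7.3');                   % default: compressed
% save('newfile.mat','var01','-v7.3','-nocompression') % uncompressed variant
mf = matfile('newfile.mat','Writable',true);           % partial writes now work
```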
Hi @Stephen Cobeldick, I make extensive use of partial loading/saving of variables as I deal with very large files. My .mat files are v7.3 by default, including the ones above.
Below is a working test to illustrate the problem:
mf = matfile('looptest.mat','Writable',true);
T = 100;
Lp = 1454;
Mp = 766;
mf.l02_temp = zeros(Lp, Mp, T);
for ii = 1:T
    tic
    mf.l02_temp(:,:,ii) = rand(Lp,Mp,1);
    Timing(ii) = toc; toc
    disp(ii)
end
Lp and Mp are the actual dimensions of my data variables, but in my real analysis T is >700. Here's a plot showing the results of plot(1:T,Timing):
Given your comments about compression, I tried this:
a = 1;
save looptest2 a -v7.3 -nocompression
mf2 = matfile('looptest2.mat','Writable',true);
T = 100;
Lp = 1454;
Mp = 766;
mf2.l02_temp = zeros(Lp, Mp, T);
for ii = 1:T
    tic
    mf2.l02_temp(:,:,ii) = rand(Lp,Mp,1);
    Timing(ii) = toc; toc
    disp(ii)
end
But "Timing" still increases more or less linearly. Granted, I don't know whether MATLAB is still compressing the l02_temp variable even though I used the -nocompression option.
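One way to check that directly: a v7.3 MAT-file is an HDF5 file under the hood, so the HDF5 tools can inspect it (a sketch, using the looptest2.mat file from the example above; it assumes matfile stored l02_temp as a top-level dataset, which is its usual layout):

```matlab
% Inspect the MAT-file as HDF5 and look at the dataset's filter list.
info = h5info('looptest2.mat');
idx  = strcmp({info.Datasets.Name},'l02_temp');
info.Datasets(idx).Filters   % empty => no compression; 'deflate' => compressed
```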
The plot below shows the problem for the analysis I'm conducting. You can ignore the peaks before iteration 35, as I think my computer decided to do something else at those times. The two lines just represent two parts of the loop with slightly different code blocks.
I appreciate any ideas you might have to make this better.
@Carlos: please try my earlier suggestion of preallocating the array using random numbers, e.g. replacing
zeros(Lp, Mp, T)
with
rand(Lp, Mp, T)
With the -nocompression option this is unlikely to make any difference, but it is easy to check.
You're correct: preallocating random data instead of zeros helps in the sense that the time is now stable at ~30 seconds, but of course writing a single 1454x766 matrix that slowly isn't great performance (that operation takes 1/3 second on my machine if I save it to its own file).
Here's a plot showing the results of tic/tocing the write operation to the matfile (code below):
For my purposes, this makes matfile fairly ineffective as a tool.
Thank you for your help; I wasn't aware that preallocating random data instead of zeros would make such a difference in this situation.
mf = matfile('looptest3.mat','Writable',true);
T = 100;
Lp = 1454;
Mp = 766;
mf.l02_temp = rand(Lp, Mp, T);
for ii = 1:50
    tic
    ktemp = rand(Lp,Mp,1);
    toc
    tic
    mf.l02_temp(:,:,ii) = ktemp;
    Timing(ii) = toc; toc
    disp(ii)
end
plot(1:50,Timing)
The random-data preallocation was to test whether the slowdown was related to replacing the 0s in mf.var01 with more complex values. Preallocating with random values was never meant to speed things up, as I'm sure you understand; it was a diagnostic test. The first plot you showed, with 0-preallocation, had durations that grew from ~20 s to ~30 s. The second plot, with rand-preallocation, had durations that were all above 29 s.
With 0-preallocation, we saw a speed decrease as the number of iterations increased.
With rand-preallocation, the duration varied but did not gradually grow with the iteration count. Moreover, it stayed close to (a little above) the maximum duration seen with 0-preallocation toward the end of that loop, when the array no longer contained many 0s.
This is evidence that the slowdown is due to the increased complexity of the values stored in your array.
If the speed of each iteration were primarily limited by partial saving with matfile itself, you'd still see the gradual change.
Indeed, that makes sense.
Now the question is, is 30 seconds for this operation reasonable for what we expect from matfile?
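One way to sanity-check whether the ~30 s is inherent to matfile or to the underlying storage layer is to write the same slices into a plain HDF5 file with slice-aligned chunks and no compression (a sketch; the file and dataset names are made up, and h5create applies no compression by default):

```matlab
% Write one 1454x766 slice per iteration into a chunked, uncompressed HDF5 file.
Lp = 1454; Mp = 766; T = 100;
h5create('looptest.h5','/l02_temp',[Lp Mp T],'ChunkSize',[Lp Mp 1]);
Timing = nan(1,T);
for ii = 1:T
    tic
    h5write('looptest.h5','/l02_temp',rand(Lp,Mp),[1 1 ii],[Lp Mp 1]);
    Timing(ii) = toc;
end
plot(1:T,Timing)   % compare against the matfile timings
```

If the direct HDF5 writes are much faster, the overhead is in matfile's indexing layer rather than in the disk or file format.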
Best.


Answers (1)

Adam Danz
Adam Danz on 28 Jan 2021
Edited: Adam Danz on 28 Jan 2021
I'd start with a simple tic/toc timing of each iteration and plot the results to see whether the increase happens at a specific iteration or grows gradually. You can use that data to decide when to pause the loop and determine whether the 'data' variable or your function is doing something different.
More info on timing loops: see this answer (this demo uses method 3).
mf = matfile('newfile','Writable',true);
mf.var01 = zeros(100,100,100);
tStart = tic(); % added
times = nan(1,100); % added
for ii = 1:100
    data = ncread('filename',inputs);
    mf.var01(:,:,ii) = myfunction(data);
    times(ii) = toc(tStart); % added
end
% added this section
iterationDurations = diff([0, times]);
plot(1:100, iterationDurations)
Keep in mind the first iteration or so may take longer than the next few due to the JIT compiler. If the computations done within the loop are nearly uniform in complexity, then it's expected that each iteration will get faster (or at least not slower) as the iteration count increases. A decrease in speed suggests that the size of "data" may be increasing, or that the operations performed on "data" in "myfunction" differ between iterations.
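If the plot shows a jump at a repeatable iteration, one way to pause right there is a conditional stop inside the loop (a sketch; the iteration number 40 is just a placeholder for wherever the slowdown begins):

```matlab
for ii = 1:100
    data = ncread('filename',inputs);
    mf.var01(:,:,ii) = myfunction(data);
    if ii == 40     % iteration where the slowdown starts (adjust as needed)
        keyboard    % drop into debug mode; inspect 'data', step into myfunction
    end
end
```

Alternatively, a conditional breakpoint set with dbstop achieves the same thing without editing the loop body.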

3 Comments

Carlos
Carlos on 28 Jan 2021
Edited: Carlos on 28 Jan 2021
Thanks, I've been trying tic/tocs as you suggest, but I just wanted to confirm that I'm correct in thinking that the profiler doesn't provide timings of individual calls in a for loop? So if I want to know where "myfunction" is slowing down over time, I will perhaps have to put tic/tocs in the function itself to manually profile the code.
I was exploring saving a profile per iteration in the loop, but it generates a ton of html files and the results seem challenging to navigate.
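Instead of saving a full HTML report per iteration, the profiler's numbers can be collected programmatically via profile('info') and only the totals kept (a sketch; note the profiler itself adds overhead, so use these numbers for relative comparison only):

```matlab
selfTime = nan(1,100);
for ii = 1:100
    data = ncread('filename',inputs);
    profile on                          % restarts and clears previous stats
    mf.var01(:,:,ii) = myfunction(data);
    p = profile('info');                % struct with a FunctionTable
    profile off
    k = strcmp({p.FunctionTable.FunctionName},'myfunction');
    selfTime(ii) = p.FunctionTable(k).TotalTime;
end
plot(selfTime)                          % per-iteration time inside myfunction
```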
Thanks for your time.
Adam Danz
Adam Danz on 28 Jan 2021
Edited: Adam Danz on 28 Jan 2021
The profiler does not return info on single iterations of loops to my knowledge. But before you produce a ton of profiler reports, I'd plot the tic/toc time of each iteration to understand if it's a gradual growth of time or a sudden step that reliably happens at the same iteration.
If there's a sudden jump in time you have a starting point in your investigation and can run the profiler at that iteration.
If it's a gradual change, you can also collect the size of the "data" variable in each iteration (assuming it changes, I have no idea) and plot that as a function of tic/toc duration to see if that's the culprit. If the size of that variable doesn't change, then perhaps its content adds to the complexity in your function.
Another possibility is the changing content of mf.var01. Preallocating that variable reduces overhead, but that doesn't mean changes to it won't affect processing time. As the variable changes from 0s to larger values, it consumes more memory, which can affect speed.
If all else fails, you can program the function to produce a profiler report every n iterations, so you get ~5 reports spanning the first and last iterations.
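The every-n-iterations idea might look like this (a sketch; the window size of 150 is arbitrary, and profsave writes each report to its own directory):

```matlab
profile on
for ii = 1:700
    data = ncread('filename',inputs);
    mf.var01(:,:,ii) = myfunction(data);
    if mod(ii,150) == 0                              % ~5 reports over 700 iterations
        profsave(profile('info'), sprintf('profile_%04d',ii));
        profile clear                                % fresh stats for next window
    end
end
profile off
```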
Carlos
Carlos on 28 Jan 2021
Edited: Carlos on 29 Jan 2021
Thanks, very helpful. The size of "data" doesn't change much (it's the output of a numerical model), and the change is gradual: the iteration time starts growing at about the 40th iteration (of 700+). I've tic/toc'ed the reading of the data, and that doesn't seem to be the problem.
The memory footprint doesn't seem to be growing much, but maybe the system is swapping more to disk over time, or something like that.
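If swapping is the suspicion, MATLAB's memory use could be logged alongside the timings (a sketch; note the memory function is available on Windows only):

```matlab
memUsed = nan(1,700);
for ii = 1:700
    data = ncread('filename',inputs);
    mf.var01(:,:,ii) = myfunction(data);
    m = memory;                      % Windows-only; process memory info
    memUsed(ii) = m.MemUsedMATLAB;   % bytes used by the MATLAB process
end
plot(memUsed/1e9)                    % GB across iterations
```

A flat memory curve alongside growing iteration times would point away from swapping and back toward the data or the file I/O.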
I'll keep trying. Thanks.

