Performance impact of using package folders

I have been trying to do a major refactoring of my code and thought I would use the package system (i.e. +folders) to organize things better.
After noticing a slowdown in my application, I did some benchmarking and found a large difference in run time between calling the same function inside a package and outside one. To give you a sense of the impact, using a package increases the run time of one of my functions by 50%. To put that in perspective, the function is 100 lines of code, uses 48 variables, and performs 72 variable assignments, 76 additions, 36 multiplications, and 9 divisions (it is an incredibly fast algorithm for calculating the basis matrix of a degree-4 spline, and I am very proud of it). So that is a lot of computation for the call overhead to add 50% to.
After doing some digging, it appears that the package system is implemented as a series of objects/classes (which makes sense), so I assume that is the cause of the slowdown. I have the following questions/observations:
1.) I am using R2012b. Are there any improvements to, or better integration of, the package system on the horizon, such that if I take the hit now the performance impact will be negligible in the future?
2.) The performance impact seems to be the same regardless of package depth; it has more to do with the number of items in the package (i.e. more files = more impact).
3.) Using a function handle to one of those packaged functions seems to incur the package reference penalty every time the handle is called, instead of just when the handle is created.
Please let me know if you have any thoughts about what I might be doing wrong.
per isakson on 12 Apr 2013
I fail to reproduce your results with R2012a 64-bit on Windows 7. With a package I see only a very small penalty. Could you provide example code?
Nicholas Dinsmore on 18 Apr 2013
I created dumbfunc1 through dumbfunc20, which are empty functions, and smartfunc1, which is a wrapper around nchoosek(100,10), which involves a lot of multiplications.
The test code I ran is:
Iterations=100000;
FuncHandles = { @dumbfunc1,@a.b.c.d.e.dumbfunc1,@z.dumbfunc1,@x.dumbfunc1 };
for nFunc=1:length(FuncHandles)
    TestHandle=FuncHandles{nFunc};
    tic
    for n=1:Iterations
        TestHandle();
    end
    TIME=toc;
    disp({TestHandle,TIME});
end
%This is so that the JIT takes care of nchoosek
for n=1:Iterations
    nchoosek(100,10);
end
FuncHandles = { @smartfunc1,@a.b.c.d.e.smartfunc1,@z.smartfunc1 };
for nFunc=1:length(FuncHandles)
    TestHandle=FuncHandles{nFunc};
    tic
    for n=1:Iterations
        TestHandle();
    end
    TIME=toc;
    disp({TestHandle,TIME});
end
My results are:
@dumbfunc1 [0.0232]
@a.b.c.d.e.dumbfunc1 [0.3063]
@z.dumbfunc1 [0.2991]
@x.dumbfunc1 [0.2993]
@smartfunc1 [3.3941]
@a.b.c.d.e.smartfunc1 [3.6498]
@z.smartfunc1 [3.9589]
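As a cross-check on the tic/toc loops above, the per-call cost of a plain handle and a packaged handle could also be measured with timeit, which handles warm-up and repetition itself. This is only a minimal sketch: it assumes R2013b or later (where timeit ships with MATLAB), that no user function named timeit shadows the built-in, and the names plainfunc, +demoPkg and pkgfunc are hypothetical stand-ins written to a temporary folder, not the dumbfunc files from this comment.

```matlab
% Sketch: compare handle call cost with timeit (R2013b or later).
% plainfunc, +demoPkg and pkgfunc are hypothetical names for this demo.
root = tempname;
mkdir(fullfile(root, '+demoPkg'));

% A trivial function on the path ...
fid = fopen(fullfile(root, 'plainfunc.m'), 'w');
fprintf(fid, 'function y = plainfunc()\ny = 1;\nend\n');
fclose(fid);

% ... and the same trivial function inside a package.
fid = fopen(fullfile(root, '+demoPkg', 'pkgfunc.m'), 'w');
fprintf(fid, 'function y = pkgfunc()\ny = 1;\nend\n');
fclose(fid);

addpath(root);

hPlain = @plainfunc;
hPkg   = @demoPkg.pkgfunc;

% timeit returns a typical time in seconds for one call of the handle.
tPlain = timeit(hPlain);
tPkg   = timeit(hPkg);

fprintf('plain handle:    %.3g s per call\n', tPlain);
fprintf('packaged handle: %.3g s per call\n', tPkg);

% Clean up the temporary folder.
rmpath(root);
rmdir(root, 's');
```

For functions this small, timeit may warn that the measurement is too fast to be accurate, so the numbers are best read as a rough comparison rather than a precise overhead.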


Answers (3)

Ray on 18 Jan 2018
A similar attempt using R2017a on Windows.
I created foo above and placed it in:
  • onPathDir/foo.m
  • onPathDir/+pack1/foo.m
  • onPathDir/+pack1/+pack2/foo.m
I then added onPathDir to my MATLAB path and changed directory away from it. The following test was then saved in a different directory:
function [] = testFoo()
a = rand(1);
b = rand(1);
nRun = 1000000;
y0 = zeros(1, nRun);
y1 = zeros(1, nRun);
y2 = zeros(1, nRun);
tic
for x = 1 : nRun
    y0(x) = foo(x, a, b);
end
t0 = toc;
tic
for x = 1 : nRun
    y1(x) = pack1.foo(x, a, b);
end
t1 = toc;
tic
for x = 1 : nRun
    y2(x) = pack1.pack2.foo(x, a, b);
end
t2 = toc;
[t0 t1 t2]
end
If run as a function, I saw no overhead from using packages on the search path. In fact, on some repeats the package times were actually lower. However, if I comment out the function definition line and the final end, turning testFoo.m into a script, there is an extreme overhead for using packages: ~800x for a single package and ~1000x for nested packages.
It looks like the function compiler resolves the path once, so code execution is fast, whereas in a script there may be path resolution overhead on each loop iteration!
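If the benchmark or the real code has to be driven from a script, one possible workaround based on the observation above is to move the hot loop into a function and call that function from the script. The sketch below assumes R2016b or later (local functions in scripts) and uses a hypothetical package +demoPkg with a function scale, written to a temporary folder so the example is self-contained. Whether a local function in a script gets the same fast treatment as a function file is an assumption worth timing on your own release.

```matlab
% Sketch: keep packaged calls inside a function, even when run from a script.
% Save this as a script file; requires R2016b or later.
% +demoPkg and scale are hypothetical names for this demo.
root = tempname;
mkdir(fullfile(root, '+demoPkg'));
fid = fopen(fullfile(root, '+demoPkg', 'scale.m'), 'w');
fprintf(fid, 'function y = scale(x, a, b)\ny = a.*x + b;\nend\n');
fclose(fid);
addpath(root);

nRun = 1e5;
tic
y = runLoop(nRun, 2, 3);   % the packaged calls happen inside runLoop
tLoop = toc;
fprintf('%d packaged calls inside a function: %.4f s\n', nRun, tLoop);

% Clean up the temporary folder.
rmpath(root);
rmdir(root, 's');

function y = runLoop(nRun, a, b)
% Same kind of loop as in the script version, but wrapped in a function.
y = zeros(1, nRun);
for x = 1:nRun
    y(x) = demoPkg.scale(x, a, b);
end
end
```

Comparing tLoop against the same loop pasted directly into the script body would show whether the script penalty applies on a given release.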

Sean de Wolski on 18 Apr 2013
Edited: Sean de Wolski on 18 Apr 2013
There is unfortunately a bit more overhead in the function call when calling functions in packages. Here is the timing I did:
With this function both in and not in a package (+foopack):
function y = foo(x,a,b)
% I create awesome lines!
%
y = a.*x+b;
end
And this timing function:
function timeit
%Time foo v. foopack.foo calls
%
%
%SCd - 735262
%
%Some values:
[t1, t2] = deal(0);
a = 1;
b = 2;
x = 3;
%Sum their times over 1000 function calls:
for ii = 1:1000
    tic
    y = foo(x,b,a);
    t1 = t1+toc;
    tic
    yp = foopack.foo(x,b,a);
    t2 = t2+toc;
end
%Display results:
fprintf(1,'\nfoo regular: %fs\nfoo package: %fs\n',t1,t2);
fprintf(2,'\nSlowdown of Package: %f\n\n', t2./t1);
end
It is my understanding that this is pretty much the worst-case scenario, since the call overhead dominates the computation time. I really like packages and use them a fair amount. But for speed-critical applications, where a function will be called many times, it might pay to pull those computations outside. It's also important to realize that even though it's slower, as far as total time is concerned it's still pretty quick.
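Since the cost discussed here is per call, another option that keeps the function in its package is to make fewer calls, for example by passing a whole vector at once when the function is vectorized (as foo is, via a.*x+b). The sketch below is only an illustration under that assumption: it writes a throwaway copy of +foopack/foo.m to a temporary folder so it runs on its own, and the timings it prints will vary by release and machine.

```matlab
% Sketch: amortize per-call overhead with one vectorized packaged call.
% A temporary copy of +foopack/foo.m is created just for this demo.
root = tempname;
mkdir(fullfile(root, '+foopack'));
fid = fopen(fullfile(root, '+foopack', 'foo.m'), 'w');
fprintf(fid, 'function y = foo(x,a,b)\ny = a.*x+b;\nend\n');
fclose(fid);
addpath(root);

x = 1:1000;
a = 1;
b = 2;

% One packaged call per element: pays the call overhead 1000 times.
tic
yLoop = zeros(size(x));
for ii = 1:numel(x)
    yLoop(ii) = foopack.foo(x(ii), a, b);
end
tLoop = toc;

% One packaged call for the whole vector: pays the overhead once.
tic
yVec = foopack.foo(x, a, b);
tVec = toc;

fprintf('element-wise calls: %fs\nsingle vector call: %fs\n', tLoop, tVec);

% Clean up the temporary folder.
rmpath(root);
rmdir(root, 's');
```

This only helps when the algorithm can be expressed on arrays; for inherently scalar inner loops, moving the function out of the package remains the more direct fix.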
Nicholas Dinsmore on 19 Apr 2013
Sean,
That raises the question: is there a commitment within MathWorks to reduce that overhead? I am trying to figure out whether the performance hit is a short-term thing or whether it should be reduced in future versions. I can think of many ways you could improve that performance, just from my own work making large OOP systems in MATLAB and then running them through an ODE solver (i.e. where speed is important).



Joel Fischer on 21 Feb 2022
Since I'm also refactoring a code base, I tried this again on R2021a and can confirm what @Ray found:
When calling package functions from a script, there is considerable overhead; when calling them from a function, however, there doesn't seem to be any difference in performance. Additional depth (calls to a function in a sub-package) seems to slightly increase the overhead as well.
In my case, the overhead when calling from a script was ~10 µs per call (run on an Intel i7-9750H @ 2.6 GHz, 64 GB DDR4 @ ~2600 MHz).
[Figure: histogram showing the time needed per 100k function calls from within a script.]
My test function was:
function [C] = foo(n)
A = rand(n);
B = rand(n);
C = A\B;
end
Three copies of which were located at (relative to the current working directory):
  • foo.m
  • +test/foo.m
  • +test/+test2/foo.m
As suggested by @Ray, the benchmark was run once as a function and once as a script (by just commenting out the first and the last line):
function [t,fig] = foo_test()
%% setup
n = 20;
t = zeros(3*n,3);
%% benchmark
for j = 1:n
    A = 0; a = tic(); for i=1:100000; A = A + foo(5); end; t((j-1)*3+1,1) = toc(a);
    A = 0; a = tic(); for i=1:100000; A = A + test.foo(5); end; t((j-1)*3+1,2) = toc(a);
    A = 0; a = tic(); for i=1:100000; A = A + test.test2.foo(5); end; t((j-1)*3+1,3) = toc(a);
    A = 0; a = tic(); for i=1:100000; A = A + test.foo(5); end; t((j-1)*3+2,2) = toc(a);
    A = 0; a = tic(); for i=1:100000; A = A + test.test2.foo(5); end; t((j-1)*3+2,3) = toc(a);
    A = 0; a = tic(); for i=1:100000; A = A + foo(5); end; t((j-1)*3+2,1) = toc(a);
    A = 0; a = tic(); for i=1:100000; A = A + test.test2.foo(5); end; t((j-1)*3+3,3) = toc(a);
    A = 0; a = tic(); for i=1:100000; A = A + foo(5); end; t((j-1)*3+3,1) = toc(a);
    A = 0; a = tic(); for i=1:100000; A = A + test.foo(5); end; t((j-1)*3+3,2) = toc(a);
end
%% plotting
fig = figure();
%title('script');
title('function');
hold on;
C = colororder();
for i=1:3
    histogram(t(:,i),'FaceColor',C(i,:),'FaceAlpha',0.5);
end
hold off;
legend({'foo(5)','test.foo(5)','test.test2.foo(5)'},'Location','North');
ylabel('counts [-]');
xlabel('t/100k calls [s]');
grid on;
box on;
end
