Nested spmd instructions

Marco on 25 Mar 2011
Commented: Muna Tageldin on 27 Sep 2020
Hello,
I am trying to achieve a high degree of parallelism in my MATLAB implementation of a support vector machine. Conceptually, parallelizing a multiclass support vector machine (SVM) is not extremely hard: a multiclass SVM is composed of a set of binary SVMs that can be trained in parallel. Moreover, if I want to validate my model using an n-fold cross-validation, I have another point in the code at which I can achieve parallelism.
I was thinking of parallelizing my code with nested spmd statements (outer spmd at the level of the n-fold cross-validation, inner spmd at the level of the training of the multiclass SVM). If the inner spmd statement is in a function that is called by the outer spmd, the inner spmd always sees 0 workers available and the code crashes.
Is this normal or am I doing something wrong? Is there any alternative way to achieve parallelism on multiple levels? (parfor doesn't do that.)
Thanks for the answers!
Marco

Answers (3)

Walter Roberson on 25 Mar 2011
What you see is normal: nested spmd blocks that are visible to the program only allocate workers at the outer level. One of the MathWorks people who works on the parallel programming facilities has posted that what you can do is have the outer spmd block call a function, and put the inner spmd blocks inside that function: the inner ones would then get their own workers.

Marco on 26 Mar 2011
Thank you for the prompt reply Walter!
The way you described is the way I do it. Unfortunately, when I call the function
s = matlabpool('size');
in the function called by the outer spmd block, s = 0 always. Therefore the inner spmd doesn't have any workers to be deployed on and everything crashes. This happens even if I open a pool of 8 workers at the beginning of the program and the first spmd uses only 2, for example (there should be 6 free, right?).
Thanks!
Marco

Jiro Doke on 26 Mar 2011
Nesting spmd or parfor does not work. At least for parfor, you can put it in a function as you are doing, but the inner parfor will simply behave like a regular for loop since the outer loop will use up all the workers you open. EDIT: spmd also works in an inner function, but it doesn't use any additional workers.
I believe the only way you can achieve nested parallelism is to use either matlabpooljob or batch. These allow you to create multiple jobs that can each have their own matlabpool workers. So your "outer loop" (k-fold cross-validation) can be the multiple matlabpooljobs or batch jobs, with the desired number of workers specified for each. Note that batch will ultimately require one extra worker in addition to the number of workers you request (because it requires a virtual client worker). Also, the units of the outer parallelism are independent jobs, so they don't talk to each other. If you need interactions between them, I think it will be extremely non-trivial. In your case, I would assume the k-fold cross-validation can be set up as k independent validation jobs, so that shouldn't be a problem.
Here's an example of how you would use matlabpooljob:
jm = findResource();

% first job: exactly 4 workers
job(1) = createMatlabPoolJob(jm);
job(1).MaximumNumberOfWorkers = 4;
job(1).MinimumNumberOfWorkers = 4;
createTask(job(1), @trainingFcn, 1, {in1a, in2a});

% second job: exactly 4 workers
job(2) = createMatlabPoolJob(jm);
job(2).MaximumNumberOfWorkers = 4;
job(2).MinimumNumberOfWorkers = 4;
createTask(job(2), @trainingFcn, 1, {in1b, in2b});

% submit the 2 jobs
for id = 1:2
    submit(job(id))
end

% wait for each job and collect its outputs
for id = 1:2
    waitForState(job(id), 'finished');
    results{id} = getAllOutputArguments(job(id));
end
destroy(job);
The above example illustrates how you might create two jobs where each job requests 4 workers (for a total of 8 workers). Inside your function "trainingFcn" above, you can have spmd and parfor.
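For readers on newer releases: as far as I know, findResource, createMatlabPoolJob and destroy are no longer available, and parcluster, createCommunicatingJob, fetchOutputs and delete take their place. Below is a rough sketch of the same two-job pattern under that assumption. The anonymous trainingFcn and the input arrays are made-up placeholders (not from this thread) so that the snippet runs on its own; a real training function could contain parfor or spmd. It also assumes the default cluster profile can supply 4 workers.

```matlab
% Sketch only: assumes Parallel Computing Toolbox and a default cluster
% profile that can supply at least 4 workers.
c = parcluster();                          % default cluster profile

% Hypothetical stand-in for the real training function; a real one could
% contain parfor or spmd, which would use the job's pool workers.
trainingFcn = @(x, y) sum(x .* y);
inputs = {{[1 2 3], [4 5 6]}, {[1 1 1], [2 2 2]}};   % made-up inputs

% Create and submit one pool-type communicating job per input set
jobs = cell(1, numel(inputs));
for id = 1:numel(inputs)
    jobs{id} = createCommunicatingJob(c, 'Type', 'pool');
    jobs{id}.NumWorkersRange = [4 4];      % request exactly 4 workers
    createTask(jobs{id}, trainingFcn, 1, inputs{id});
    submit(jobs{id});
end

% Wait for each job, collect its outputs, then clean up
results = cell(1, numel(jobs));
for id = 1:numel(jobs)
    wait(jobs{id});                        % block until the job finishes
    results{id} = fetchOutputs(jobs{id});  % cell array, one row per task
    delete(jobs{id});                      % remove the job from the cluster
end
```

If the cluster has fewer free workers than the two jobs request in total, the second job should simply stay queued until the first one finishes. The batch route looks similar: batch(c, trainingFcn, 1, inputs{1}, 'Pool', 3) asks for 3 pool workers plus the one worker that runs the function itself.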
  5 Comments
Walter Roberson on 25 Jun 2012
This is confusing in view of
http://www.mathworks.com/help/toolbox/distcomp/brukbnp-9.html#brukbnp-12
"The body of an spmd statement cannot contain another spmd. However, it can call a function that contains another spmd statement. Be sure that your MATLAB pool has enough workers to accommodate such expansion."
Muna Tageldin on 27 Sep 2020
So suppose I have this program, which uses spmd and drange over a distributed range (the size of array a):
spmd
    y = codistributed(rand(3,20), codistributor1d(2));
    a = codistributed(rand(1,20), codistributor1d(2));
    b = codistributed(rand(3,20), codistributor1d(2));
    for i = drange(1:20)
        y(1,i) = user_defined_function(a(i), b(1,i));
        y(2,i) = user_defined_function(a(i), b(1,i));
        y(3,i) = user_defined_function(a(i), b(1,i));
    end
end
The user_defined_function contains 3 nested for loops and takes 75 seconds per call. I want to speed up the function by using another spmd inside it.
function y=user_defined_function(a,b)
[ep,u,lam]=ndgrid(1e-3:1e-2:1,1e-3:1e-2:1,1e-3:1e-2:1);
for i=1:size(ep,3)
    for j=1:size(ep,2)
        for p=1:size(ep,1)
            l1(p,j,i)=ep(p,j,i)+u(p,j,i)*a+sum(lam(p,j,i)*exp(-b));
        end
    end
end
From the documentation, I know I can use spmd inside another function. My question is: if I call the function 3 times inside the for-drange loop, will that affect the execution time of the function (say, from 75 seconds to 4 seconds)?
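Separately from the spmd question, the triple loop in the function above might not need parallelism at all. A minimal sketch, assuming a and b are scalars as in the calls shown (so the sum over a single element changes nothing): the loop body can be written as plain array arithmetic. The values of a and b and the coarser grid are made up to keep the comparison quick, and since the posted function never assigns y, only l1 is computed here.

```matlab
a = 0.5; b = 0.2;      % made-up scalar inputs
g = 1e-3:1e-1:1;       % coarser grid than in the post, to keep the check quick
[ep, u, lam] = ndgrid(g, g, g);

% Loop form, as in the post, but with l1 preallocated
l1_loop = zeros(size(ep));
for i = 1:size(ep,3)
    for j = 1:size(ep,2)
        for p = 1:size(ep,1)
            l1_loop(p,j,i) = ep(p,j,i) + u(p,j,i)*a + sum(lam(p,j,i)*exp(-b));
        end
    end
end

% Vectorized form: for scalar b, sum() over a single element is a no-op
l1_vec = ep + u.*a + lam.*exp(-b);

% Largest elementwise difference between the two results
maxDiff = max(abs(l1_loop(:) - l1_vec(:)));
```

If the two forms agree for your real inputs, the vectorized line can replace the loops inside user_defined_function without any inner spmd.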
