Problem with spmd when data sizes increase

Hello,
I've run into a problem with a script I'm working on: once the matrices reach 130x130, the script doesn't run (it effectively hangs). With the "size" variable set to 100, the script runs fine. Change it to 130 and it doesn't run at all: the "disp" commands do not print and nothing happens.
Can anybody help with this issue?
Here's the code:
clear
close all
clc
%initialize input data
size = 100;
a = eye(size);
b = eye(size);
%check and make sure pool is closed (can't open multiple pools!)
if(matlabpool('size') > 0)
    matlabpool close
end
matlabpool open
tic;
%concurrent environment
spmd
    if(labindex == 1)
        disp('lab1 start');
        comp4 = labReceive('any',4);
        comp3 = labReceive('any',3);
        disp('lab1:comp6');
        comp6 = comp4;
        labSend(comp6,2,6);
        comp5 = labReceive('any',5);
        disp('lab1:comp9');
        comp9 = comp6;
        labSend(comp9,2,9);
        disp('lab1 done');
    elseif(labindex == 2)
        disp('lab2 start');
        disp('lab2:comp0');
        comp0 = a;
        disp('lab2:comp1');
        comp1 = comp0;
        disp('lab2:comp2');
        comp2 = comp1;
        disp('lab2:comp3');
        comp3 = comp2;
        labSend(comp3,1,3);
        labSend(comp3,3,3);
        comp4 = labReceive('any',4);
        disp('lab2:comp5');
        comp5 = comp3;
        labSend(comp5,1,5);
        comp6 = labReceive('any',6);
        disp('lab2:comp8');
        comp8 = comp5;
        disp('lab2:comp10');
        comp10 = comp8;
        disp('lab2:comp11');
        comp11 = comp0;
        comp9 = labReceive('any',9);
        disp('lab2:comp12');
        comp12 = comp11;
        disp('lab2:comp13');
        comp13 = comp12;
        disp('lab2:comp15');
        comp15 = comp13;
        comp7 = labReceive('any',7);
        disp('lab2:comp14');
        comp14 = comp12;
        disp('lab2 done');
    elseif(labindex == 3)
        disp('lab3 start');
        disp('lab3:comp4');
        comp4 = b;
        labSend(comp4,2,4);
        labSend(comp4,1,4);
        comp3 = labReceive('any',3);
        disp('lab3:comp7');
        comp7 = comp3;
        labSend(comp7,2,7);
        disp('lab3 done');
    end
end
time = toc;
fprintf('Execution time: %f ms\n',time*1e3);
matlabpool close
  2 Comments
Walter Roberson on 26 Jan 2014
Which MATLAB version are you using? How much memory do you have on your system?
Sam on 27 Jan 2014
Edited: Sam on 27 Jan 2014
I'm using MATLAB R2012b, installed on the local machine (no remote stuff).
My system specifications:
  • Intel i7 2600 3.4GHz (4 cores, 8 threads)
  • 16GB RAM (about 4GB used by the system, 12GB remaining free)
  • 500GB HDD (385GB free)

Accepted Answer

Edric Ellis on 27 Jan 2014
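The problem here is that labSend is permitted to block until the corresponding labReceive is posted if the message is "too large". In practice, the point at which labSend starts to block is defined by the underlying MPI implementation, MPICH2. This point is about 128kB, which corresponds to a 128x128 double matrix. Once the sends block, your script forms a cycle: lab 2's first labSend waits for lab 1 to post a receive for tag 3, lab 1 is waiting for a tag 4 message from lab 3, and lab 3's first labSend is waiting for lab 2.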
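To make the size threshold concrete, here is a small sketch of the arithmetic. It assumes 8 bytes per double element and takes "128kB" to mean 128 * 1024 bytes; the names bytesFor and thresholdBytes are made up for illustration, and the real cut-over point depends on the MPI build.

```matlab
% Bytes needed for an n-by-n double matrix (8 bytes per element).
bytesFor = @(n) n .* n * 8;

% Approximate blocking threshold quoted above.
% Assumption: 1 kB is taken as 1024 bytes.
thresholdBytes = 128 * 1024;

for n = [100 128 130]
    fprintf('%3d x %3d: %6d bytes, over threshold: %d\n', ...
        n, n, bytesFor(n), bytesFor(n) > thresholdBytes);
end
```

A 100x100 matrix is 80,000 bytes and a 130x130 matrix is 135,200 bytes, so the first sits under the roughly 131,072 byte limit and the second sits over it, which matches the sizes at which the script runs and hangs.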
To fix this, you can take one of two approaches:
  1. If you can rework your problem to use a completely deterministic communication pattern, then labSendReceive is the best way to avoid locking up.
  2. If your communication pattern cannot be predicted, then you might try having a deterministic round of communication so that everyone can agree on who they need to communicate with, and then use labSendReceive. The example below shows the sort of thing I mean:
% Open a pool if none exists (gcp and parpool need R2013b or later;
% on older releases use matlabpool instead).
if isempty(gcp('nocreate'))
    parpool('local', 4);
end
spmd
    for idx = 1:10
        if labindex == 1
            % Lab 1 is in control of the communication pattern. Pick a random
            % permutation, but ensure no-one is trying to send to themselves.
            while true
                sendTo = randperm(numlabs);
                if all(sendTo ~= 1:numlabs)
                    % Ok
                    break
                end
            end
            labBroadcast(1, sendTo);
        else
            sendTo = labBroadcast(1);
        end
        % Each lab now has 'sendTo' - a 1 x numlabs permutation where
        % sendTo(k) is the lab that lab k should send to. The lab to
        % receive from is the one whose entry in sendTo is this lab.
        myDestination = sendTo(labindex);
        mySource = find(sendTo == labindex);
        % Each lab makes a payload and exchanges it
        payload = rand(130);
        otherPayload = labSendReceive(myDestination, ...
                                      mySource, ...
                                      payload);
    end
end
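For the first approach (a completely deterministic pattern), a fixed ring shift is probably the simplest case. The sketch below is only an illustration: it assumes a pool with at least two workers is already open, and the payload and the names sendToLab and recvFromLab are made up for the example.

```matlab
spmd
    % Hypothetical payload: large enough that a plain labSend could block.
    payload = labindex * ones(130);

    % Fixed ring: every lab sends to the next lab and receives from the
    % previous one, wrapping around at the ends.
    sendToLab = mod(labindex, numlabs) + 1;
    recvFromLab = mod(labindex - 2, numlabs) + 1;

    % The send and the receive are posted together, so no lab ends up
    % stuck in a send while its partner is stuck in another send.
    received = labSendReceive(sendToLab, recvFromLab, payload);
    fprintf('Lab %d received data from lab %d\n', labindex, recvFromLab);
end
```

Because every lab knows its partners in advance, no agreement round is needed here. The broadcast in the example above is only required when the pattern changes from one step to the next.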
  2 Comments
Sam on 27 Jan 2014
Thanks for a great answer!
I have one final question. In order to completely replace the labSend and labReceive commands in my code, I need to be able to specify a tag for both the data being sent and the data being received. Is this possible? The labSendReceive documentation page does not explain how the tag is used.
Sam on 27 Jan 2014
I've been looking into how this works and found that both labs need to use the same tag value for labSendReceive to work correctly (unlike my implementation, which uses different tag values for sending and receiving). I tested this with the script below (for anyone's reference): switching the 2nd lab to a tag value of 2 does not work (it hangs), which confirms my conclusion. This is unfortunate, since MPI_Sendrecv does support separate tags for sending and receiving.
Is there any way to write a custom function that calls MPI_Sendrecv with two separate tags? Maybe through the MEX interface?
clear
close all
clc
system_dependent(7)
%check and make sure pool is closed (can't open multiple pools!)
if(matlabpool('size') > 0)
    matlabpool close
end
matlabpool open
%initialize input data
size = 10;
a = eye(size);
tic;
%concurrent environment
spmd
    if(labindex == 1)
        %separate send & receive
        fprintf('Lab1 sending to %d with tag %d\n',2,1);
        labSend(a,2,1);
        [data,src,tag] = labReceive;
        fprintf('Lab1 received from %d with tag %d\n',src,tag);
        %combined sendreceive
        fprintf('Lab1 sending to %d with tag %d\n',2,1);
        data = labSendReceive(2,2,a,1);
        fprintf('Lab1 received from %d with tag %d\n',2,1);
    elseif(labindex == 2)
        %separate send & receive
        fprintf('Lab2 sending to %d with tag %d\n',1,2);
        labSend(a,1,2);
        [data,src,tag] = labReceive;
        fprintf('Lab2 received from %d with tag %d\n',src,tag);
        %combined sendreceive
        %uncomment to use tag value of 2
        % fprintf('Lab2 sending to %d with tag %d\n',1,2);
        % data = labSendReceive(1,1,a,2);
        % fprintf('Lab2 received from %d with tag %d\n',1,2);
        %uses tag value of 1
        fprintf('Lab2 sending to %d with tag %d\n',1,1);
        data = labSendReceive(1,1,a,1);
        fprintf('Lab2 received from %d with tag %d\n',1,1);
    end
end
matlabpool close
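One possible workaround that stays within the documented labSendReceive interface is to keep a single shared tag for the exchange and carry the logical ID inside the payload. This is only a sketch and was not suggested in the thread: it assumes a pool with at least two workers is already open, and the struct fields id and data are hypothetical names chosen for the example.

```matlab
spmd
    if labindex <= 2
        % Labs 1 and 2 exchange with each other; any other labs sit out.
        partner = 3 - labindex;

        % Wrap the data together with a per-sender logical ID.
        msgOut = struct('id', labindex, 'data', eye(10));

        % Both sides use the same MPI tag (1), as labSendReceive requires.
        msgIn = labSendReceive(partner, partner, msgOut, 1);
        fprintf('Lab%d received logical id %d\n', labindex, msgIn.id);
    end
end
```

The receiver can then branch on msgIn.id in the same way the original code relied on distinct tag values, at the cost of sending a slightly larger message.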
