Parallelization and mex-compiled code

I want to optimize a mex-compiled function (Fortran 90 source) defined over a 1D interval by computing its values on a sufficiently fine sampling. It works fine with a for-loop, but when I try parfor (for speed) the mex-compiled code crashes and I get an error from one of the workers. Is this a documented problem, and does anyone have suggestions for how to localize what goes wrong?
I run MATLAB R2013a on Ubuntu 13.10 on a 16-core (32 virtual) machine, and I get 12 workers when I run matlabpool.

1 Comment

No, there is no general prohibition against using mex files with parfor. Show us the plain for-loop and the parfor version.


 Accepted Answer

Matt J on 6 Feb 2014
Edited: Matt J on 6 Feb 2014
You should try running a plain for-loop first, but with the iterations in random order, i.e., instead of
for i=1:n
...
end
run as
for i=randperm(n)
...
end
This is a good way to test whether your code is independent of the order of the iterations (a basic requirement of parfor) before the Parallel Computing Toolbox even gets involved.
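As a rough sketch of that check, the results of an in-order pass can be compared against a shuffled pass. The anonymous function f below is only a hypothetical stand-in so the snippet runs on its own; in practice it would be replaced by the actual mex call (localsearchstrul_valuespectral in the script further down).

```matlab
% Hypothetical stand-in for the mex function (assumption, not the real code).
f = @(k) cos(k*pi/180);
n = 360;

inOrder  = zeros(1,n);
shuffled = zeros(1,n);

% Pass 1: iterations in natural order.
for k = 1:n
    inOrder(k) = f(k);
end

% Pass 2: same iterations in random order; each result is stored by index.
for k = randperm(n)
    shuffled(k) = f(k);
end

% If the function has no hidden state, both passes give identical results.
orderIndependent = isequal(inOrder, shuffled);
disp(orderIndependent)
```

If orderIndependent comes out false, or the shuffled pass crashes, the function probably keeps state between calls (for example saved Fortran variables), which would need fixing before parfor can work.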

5 Comments

My script:
cat timeme_straightsearch.m
stupidvector=zeros(1,360);
disp('search with conventional for-loop:')
tic
for countindex = 1:360
stupidvector(countindex)=localsearchstrul_valuespectral(countindex);
end
[foundpsi,whichelement]=min(stupidvector)
toc
disp('search with parfor-loop:')
tic
parfor countindex = 1:360
stupidvector(countindex)=localsearchstrul_valuespectral(countindex);
end
[foundpsi,whichelement]=min(stupidvector)
toc
and in matlab:
>> matlabpool
Starting matlabpool using the 'local' profile ... connected to 12 workers.
>> timeme_straightsearch
search with conventional for-loop:
foundpsi =
-5.3948
whichelement =
121
Elapsed time is 186.328337 seconds.
search with parfor-loop:
Error using distcomp.remoteparfor/getCompleteIntervals (line 22)
The session that parfor is using has shut down.
Error in timeme_straightsearch (line 13)
parfor countindex = 1:360
Caused by:
Error using distcomp.remoteparfor/getCompleteIntervals (line 22)
The session that parfor is using has shut down.
The client lost connection to lab 1. This might be due to network
problems, or the interactive matlabpool job might have errored.
Are your parallel labs on remote machines? It does look to me like a network error, as the error message suggests.
No, it's one machine. My interpretation of the error was that the process generating the error was merely aware that the process on the worker had died.
Can you try it on a different machine to see if it's a hardware problem? I don't see anything wrong with the code.
Thanks for your input, I will try another machine asap. Just an additional observation: the program crashes on the Fortran 90 statement "call mxCopyPtrToReal8(inptr_xdim,realxdim,1)", i.e. a standard construction right out of the manual pages for mex.
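One possible way to narrow down which iterations are running when a worker dies, sketched under assumptions: each iteration writes a small marker file before the call and deletes it afterwards, so any markers left behind after a crash name the iterations that were in flight. The function handle f and the marker file naming are hypothetical; f stands in for the mex call so that the snippet runs on its own.

```matlab
% Hypothetical stand-in for the mex function (assumption, not the real code).
f = @(k) cos(k*pi/180);
n = 360;

% Fresh directory for the per-iteration marker files.
markerDir = tempname;
mkdir(markerDir);

vals = zeros(1,n);
parfor k = 1:n
    % Leave a marker on disk before the call that might crash.
    marker = fullfile(markerDir, sprintf('iter_%03d.started', k));
    fid = fopen(marker, 'w');
    fclose(fid);

    vals(k) = f(k);

    % Only reached if the call returned normally.
    delete(marker);
end

% After a crash, inspect this listing by hand; here it should be empty.
leftover = dir(fullfile(markerDir, '*.started'));
disp({leftover.name})
```

If the leftover indices differ from run to run, the crash is more likely tied to concurrent execution than to a particular input value.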


Asked: 6 Feb 2014

Commented: 9 Feb 2014