MATLAB Answers

Why am I unable to start a local MATLABPOOL from multiple MATLAB sessions that use a shared preference directory using Parallel Computing Toolbox 4.0 (R2008b)?

I am using Parallel Computing Toolbox 4.0 (R2008b) on multiple Linux computers, each running its own MATLAB session. These computers share the same MATLAB Preferences directory (PREFDIR), which resides on a shared file system.
After I start a local scheduler session and worker pool on one computer with the MATLABPOOL command, starting another, independent local scheduler on a second computer that shares the Preferences directory produces the following message:
Destroying 1 pre-existing parallel job(s) created by matlabpool that were in the finished or failed state.

Accepted Answer

MathWorks Support Team on 18 Oct 2013
The error message can be ignored. Both MATLABPOOLs are expected to function regardless of the message.
To work around this issue, a unique data location directory must be specified for each local worker pool session on each computer. This may be done according to the following steps:
1. On the MATLAB desktop, navigate to:
Parallel -> Manage Configurations
2. Select the "Local" configuration.
3. Right-click and select "Properties..."
4. Specify the DataLocation directory in the corresponding text box.
OR
Change the 'DataLocation' property programmatically to unique locations by executing the following MATLAB code before starting the Local Scheduler, i.e. before running MATLABPOOL:
sched = findResource('scheduler', 'type', 'local');
sched.DataLocation = '/home/users/JohnSmith/.matlab/local_scheduler_data/<directory name>';
If using Parallel Computing Toolbox on MATLAB R2012b or later, please consider using the following new API to set the unique scheduler data locations and to start the MATLAB pool. This is because the old API using 'findResource' will be deprecated in a future release.
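pc = parcluster('local');
pc.JobStorageLocation = 'C:\temp';   % example path; use a location unique to each machine
matlabpool(pc, 4);                   % start a pool of 4 workers on this cluster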
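One way to obtain a location that is unique per machine is to derive it from the host name. The following is only a minimal sketch under stated assumptions, not an official MathWorks recipe: the `local_scheduler_data` subdirectory under `prefdir` and the variable names `host` and `storageDir` are chosen for illustration, and the pool size of 4 simply mirrors the example above.

```matlab
% Sketch: give each machine its own job storage directory.
% Assumption: the directory layout below is illustrative, not mandated by the toolbox.
[status, host] = system('hostname');   % ask the operating system for the host name
if status ~= 0
    error('Could not determine the host name.');
end
host = strtrim(host);                  % remove the trailing newline

% One subdirectory per host inside the shared preferences directory.
storageDir = fullfile(prefdir, 'local_scheduler_data', host);
if ~exist(storageDir, 'dir')
    mkdir(storageDir);                 % also creates missing parent directories
end

pc = parcluster('local');
pc.JobStorageLocation = storageDir;
matlabpool(pc, 4);
```

Note that this separates machines only. Several MATLAB sessions on the same machine would still share one directory, so a per-session directory would be needed in that case.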

  1 Comment

Andreas on 25 Nov 2013
I strongly recommend using separate directories for multiple MATLAB parpools, since I have repeatedly experienced race conditions in which two processes try to write to the same temporary .mat file and corrupt it. I use the following to create a new JobStorageLocation for each process (tested with R2013a and R2013b):
c = parcluster();
t = tempname();
mkdir(t);
c.JobStorageLocation = t;
if exist('parpool')
    % >= 2013b
    parpool(c);
else
    % < 2013b
    matlabpool(c, c.NumWorkers);
end
Even when using separate JobStorageLocations, I noticed some problems when starting many parpools on the same machine at exactly the same time, where I would get this error occasionally (2013a):
Caused by:
    Error using parallel.internal.apishared.ConnMgrBuilder.buildForCJS (line 78)
    MatlabPoolPeerInstance{fUuid=..., fGroupUuid=..., fLabIndex=-1, fNumberOfLabs=-1} could not bind a ServerSocketChannel on compute-3-4 to port 27,377; it failed with a JVM Exception: Invalid argument
As a workaround, I call pause(1+60*rand()) before starting each parpool.
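The random delay could also be combined with a retry loop, so that a session that still hits the bind error gets another chance. This is a hedged sketch only: `maxAttempts` and the 10-second delay range are arbitrary illustrative values, and whether a retry actually recovers from this particular error has not been verified.

```matlab
% Sketch: retry pool startup with a random delay between attempts.
% Assumptions: maxAttempts and the delay range are illustrative values.
c = parcluster();
t = tempname();
mkdir(t);
c.JobStorageLocation = t;              % separate storage location per process

maxAttempts = 5;
started = false;
for attempt = 1:maxAttempts
    try
        if exist('parpool', 'file')
            parpool(c);                    % R2013b and later
        else
            matlabpool(c, c.NumWorkers);   % before R2013b
        end
        started = true;
        break;
    catch err
        fprintf('Attempt %d failed: %s\n', attempt, err.message);
        pause(1 + 10*rand());          % random back-off before the next attempt
    end
end
if ~started
    error('Could not start a pool after %d attempts.', maxAttempts);
end
```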
