error with parfor and big matrices

Kathrin
Kathrin on 19 Jan 2013
Commented: Aya Ibrahim on 22 Jul 2016
I have code which uses large multidimensional matrices inside a parfor loop. However, for the matrix sizes I need, I get the following error (with smaller matrices there is no problem):
Warning: Error caught during construction of remote parfor code.
The parfor construct will now be run locally rather than
on the remote matlabpool. The most likely cause of this is
an inability to send over input arguments because of a
serialization error. The error report from the caught error is:
Error using distcompserialize
Error during serialization
Error in distcomp.remoteparfor (line 36)
obj.SerializedInitData =
distcompMakeByteBufferHandle(distcompserialize(varargin));
Error in parallel_function (line 437)
P = distcomp.remoteparfor(W, @make_channel, parfor_C);
> In parallel_function at 450
If I run the same code with for instead of parfor, the program runs through without any error. My code looks like this (a simple example which throws this warning):
matlabpool open
BigMat = randn(30,30,30,30,30,30);
parfor i = 1:20
    CopyOfBigMat = BigMat;
end
matlabpool close
Also, I tried running it with only one worker, but I still get the same error, so I don't think the problem is caused by insufficient physical memory (I run 64-bit R2012b with 16GB RAM).
Since my problem is similar to the one posted here, I also tried the WorkerObjWrapper function suggested in that post. My code then looks like this:
matlabpool open
BigMat = randn(30,30,30,30,30,30);
w = WorkerObjWrapper(BigMat);
parfor i = 1:20
    data = w.Value; %#ok<*PFBNS>
    CopyOfBigMat = data;
end
matlabpool close
but this actually throws another error:
Error using distcompserialize
Error during serialization
Error in spmdlang.RemoteSpmdExecutor/initiateComputation (line 82)
fcns = distcompMakeByteBufferHandle( ...
Error in spmdlang.spmd_feval_impl (line 14)
blockExecutor.initiateComputation();
Error in spmd_feval (line 8)
spmdlang.spmd_feval_impl( varargin{:} );
Error in WorkerObjWrapper>(spmd) (line 155)
spmd
Error in WorkerObjWrapper/workerInit (line 155)
spmd
Error in WorkerObjWrapper (line 97)
WorkerObjWrapper.workerInit( tmpId, ctor, args, dtor );
Does anyone know what causes the error during parfor? And has anyone found a solution/workaround for it? Thanks a lot!

Accepted Answer

Matt J
Matt J on 19 Jan 2013
Edited: Matt J on 19 Jan 2013
Seems like it could be a memory error to me, in spite of your insistence that it isn't. BigMat is 5.4 GB, so a matlabpool of even 2 workers will result in 3 clones of BigMat, easily exceeding your 16GB of RAM. Though you say it works with 1 worker, that still results in a minimum of 11GB of memory consumption, and 5.4GB is a lot to transmit. Maybe there's some sort of time-out error? Or maybe you have other arrays of similar size to BigMat in the workspace that are causing memory limits to be exceeded?
In any case, comparing PARFOR to FOR isn't meaningful, because an ordinary for-loop doesn't do any data cloning.
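For reference, the 5.4 GB figure can be checked with a quick back-of-the-envelope sketch (this assumes BigMat is of the default double class, 8 bytes per element):

```matlab
% Rough size check for a 30x30x30x30x30x30 double array
nElems = 30^6;          % 729,000,000 elements
nBytes = nElems * 8;    % 5,832,000,000 bytes for doubles
fprintf('BigMat is about %.1f GiB\n', nBytes / 2^30);   % ~5.4 GiB
```

With two workers, three copies (client plus two workers) come to roughly 16 GiB before counting anything else in the workspace.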
  7 Comments
Matt J
Matt J on 25 Jan 2013
Edited: Matt J on 26 Jan 2013
Could you perhaps elaborate on the structure of the computations, explaining among other things why the entire data set is needed by each parallel job? Parallel processing would make more sense here if each job required only a piece of the data, so that you could then slice the data up into smaller pieces and distribute them to the workers. Or possibly you could use a GPU-based approach, which is better designed for sharing large amounts of data (assuming you have a graphics card with enough RAM).
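The slicing idea above can be sketched as follows. If BigMat is indexed with the loop variable in exactly one subscript position, parfor treats it as a sliced input variable, so each worker receives only the slices for its iterations rather than the whole array (here each slice is 30^5 doubles, roughly 185 MiB). The per-slice computation is a hypothetical placeholder; this assumes the real work can be split along the last dimension:

```matlab
matlabpool open
BigMat = randn(30,30,30,30,30,30);
result = zeros(1,30);
parfor i = 1:30
    slice = BigMat(:,:,:,:,:,i);   % sliced variable: only this slice is sent
    result(i) = sum(slice(:));     % placeholder computation on the slice
end
matlabpool close
```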
Tushar Anchan
Tushar Anchan on 19 Jun 2015
I am facing a similar problem. Can anyone suggest how to load data directly on the workers? Do you mean loading only the specific parts each worker will use, or the whole data set? If we load the whole data set on each worker, won't it still exceed the memory limit?


More Answers (1)

Edric Ellis
Edric Ellis on 1 May 2013
Please note that this problem is fixed in R2013a.
  4 Comments
Kathrin
Kathrin on 7 May 2013
That's great news, thank you!
Aya Ibrahim
Aya Ibrahim on 22 Jul 2016
Hi Edric, I am using MATLAB R2013a and I still have this problem! Is there something special we should do to make it work in R2013a?

