Error with parfor and big matrices

I have code that uses large multidimensional matrices inside a parfor loop. However, for the size of matrices I need, I get the following error (smaller matrices run without problems):
Warning: Error caught during construction of remote parfor code.
The parfor construct will now be run locally rather than
on the remote matlabpool. The most likely cause of this is
an inability to send over input arguments because of a
serialization error. The error report from the caught error is:
Error using distcompserialize
Error during serialization
Error in distcomp.remoteparfor (line 36)
obj.SerializedInitData =
distcompMakeByteBufferHandle(distcompserialize(varargin));
Error in parallel_function (line 437)
P = distcomp.remoteparfor(W, @make_channel, parfor_C);
> In parallel_function at 450
If I run the same code with for instead of parfor, it runs without any error. My code looks like this (a minimal example that triggers the warning):
matlabpool open
BigMat = randn(30,30,30,30,30,30);
parfor i = 1:20
    CopyOfBigMat = BigMat;
end
matlabpool close
Also, I tried running it with only one worker, but I still get the same error, so I don't think the problem is insufficient physical memory (I'm running 64-bit R2012b with 16 GB RAM).
Since my problem is similar to the one posted here
I also tried the WorkerObjWrapper class that was suggested in that post. My code then looks like this:
matlabpool open
BigMat = randn(30,30,30,30,30,30);
w = WorkerObjWrapper(BigMat);
parfor i = 1:20
    data = w.Value; %#ok<*PFBNS>
    CopyOfBigMat = data;
end
matlabpool close
but this actually throws another error:
Error using distcompserialize
Error during serialization
Error in spmdlang.RemoteSpmdExecutor/initiateComputation (line 82)
fcns = distcompMakeByteBufferHandle( ...
Error in spmdlang.spmd_feval_impl (line 14)
blockExecutor.initiateComputation();
Error in spmd_feval (line 8)
spmdlang.spmd_feval_impl( varargin{:} );
Error in WorkerObjWrapper>(spmd) (line 155)
spmd
Error in WorkerObjWrapper/workerInit (line 155)
spmd
Error in WorkerObjWrapper (line 97)
WorkerObjWrapper.workerInit( tmpId, ctor, args, dtor );
Does anyone know what causes the error during parfor? And has anyone found a solution/workaround for it? Thanks a lot!

 Accepted Answer

Matt J
Matt J on 19 Jan 2013
Edited: Matt J on 19 Jan 2013
Seems like it could be a memory error to me, in spite of your insistence that it isn't. BigMat is 5.4 GB, so a matlabpool of even 2 workers results in 3 copies of BigMat, easily exceeding your 16 GB of RAM. Though you say it works with 1 worker, that still means a minimum of 11 GB of memory consumption, and 5.4 GB seems like a lot to transmit. Maybe there's some sort of time-out error? Or maybe you have other arrays of similar size to BigMat in the workspace that push memory past its limits?
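(For reference, the 5.4 GB figure can be reproduced without allocating the array; the arithmetic below is an editorial check, not part of the original answer:)

```
% BigMat is 30^6 doubles at 8 bytes each
nBytes = 30^6 * 8;                      % 5,832,000,000 bytes
fprintf('%.2f GiB\n', nBytes / 2^30);   % about 5.43 GiB
```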
In any case, comparing PARFOR to FOR isn't meaningful, because an ordinary for-loop doesn't do any data cloning.

7 Comments

It also doesn't look like you're using WorkerObjWrapper as intended (or at least not to maximum effect). In Edric's examples, he doesn't pass the data itself to the wrapper, though I guess you could do so. He passes a function, so that data transfer is avoided and the data is instead generated locally on the workers. So your example should really have been:
s = rng;
w = WorkerObjWrapper(@randn,{30,30,30,30,30,30});
parfor i = 1:20
    rng(s); % seed random number gen identically on all workers
    data = w.Value; %#ok<*PFBNS>
    CopyOfBigMat = data;
end
Again though, given the RAM you have, it looks like only 1 worker could support the data size you are using.
Thank you for clarifying WorkerObjWrapper. With your code I don't get an error. However, in my actual application the matrices hold data that isn't generated by a function, so I don't know how to use the wrapper other than the way I tried above.
I further played around with matrix sizes:
matlabpool open 1
BigMat = randn(n,n,n,n,n,n);
parfor i = 1:20
    CopyOfBigMat = BigMat;
end
matlabpool close
The code runs without error for n=25 (BigMat is then 1.76 GB), even for 4 workers. But for n=26 (BigMat is then 2.23 GB) I get the error, even with 1 worker.
I don't have anything else in the workspace, the error occurs even if I execute the code right after starting matlab.
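(As an editorial aside, an unverified back-of-the-envelope check: the n = 25 / n = 26 boundary coincides with the point where the array crosses 2^31 bytes, which would be consistent with a 2 GiB serialization limit:)

```
% Hypothesis (not confirmed in this thread): serialization fails past 2^31 bytes
bytes25 = 25^6 * 8    % 1,953,125,000  -- just under 2^31 = 2,147,483,648
bytes26 = 26^6 * 8    % 2,471,326,208  -- over 2^31
```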
However, in my actual application I have matrices of data (not generated by a function).
I've never heard of arrays/matrices consuming GB of data that didn't come from a function. They certainly weren't hand-generated! Maybe if you elaborated on how it was generated...?
You could also try saving BigMat to a file and loading it within the parfor loop,
parfor i = 1:20
    S = load(filename,'BigMat');
    CopyOfBigMat = S.BigMat;
end
or, if you want it to persist across different parfor loops, using the wrapper object:
w = WorkerObjWrapper(@(f)getfield(load(f),'BigMat'), (unknown) );
parfor i = 1:20
    CopyOfBigMat = w.Value;
end
Aside from all this, though, I'm a bit skeptical that parallelization is the optimal approach here considering the amount of data you need to transmit. Are you sure that whatever parallel work you are doing is more costly than the time it takes to clone and transmit such a large amount of data?
I think Matt is exactly right here - you need to work out how to fabricate/load/whatever the data directly on the workers.
Thank you both for your answers. I tried to follow your advice but unfortunately didn't succeed in my setting. This is what I do:
The code solves an economic model where agents optimize their current behavior taking into account the effects on the future. Hence I solve the model backwards over time (using a for loop) and the optimization is done inside the parfor loop:
for time = T:-1:1
    parfor state = 1:S   % solve for all independent states
        % [optimize given results from time+1]
    end
end
Unfortunately, I don't see how I can generate the data directly on the workers in this case, and loading it within the parfor loop is even slower than running without parallelization, given the large number of states S.
So for now I am running the code without parallelization, because the matrices I need are bigger than the 2 GB that seems to be the limit on how much can be transferred to the workers. This is a shame: economic models could benefit a lot from parallelization, but this limit makes it impossible for sophisticated models that need big matrices.
Matt J
Matt J on 25 Jan 2013
Edited: Matt J on 26 Jan 2013
Could you perhaps elaborate on the structure of the computations, explaining among other things why the entire data set is needed by each parallel job? Parallel processing would make more sense here if each job required only a piece of the data, so that you could then slice the data up into smaller pieces and distribute them to the workers. Or possibly you could use a GPU-based approach, which is better designed for sharing large amounts of data (assuming you have a graphics card with enough RAM).
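(A minimal sketch of the slicing idea, as an editorial illustration: assuming the state dimension is the last one and each state only needs its own slice, indexing the array with the loop variable makes it a sliced variable, so parfor sends each worker only its pieces rather than broadcasting the whole array. The `sum` call stands in for the real per-state optimization:)

```
results = zeros(1, S);
parfor state = 1:S
    slice = BigMat(:,:,:,:,:,state);  % sliced variable: only this page is transferred
    results(state) = sum(slice(:));   % placeholder for the per-state optimization
end
```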
I am facing a similar problem. Can anyone suggest how to load data directly on the workers? Do you mean loading the specific parts that each worker will use, or the whole data set? If we load the whole data set on each worker, won't that still exceed the memory limit?


More Answers (1)

Edric Ellis
Edric Ellis on 1 May 2013
Please note that this problem is fixed in R2013a.

4 Comments

Edric, thank you very much for this update!
Please, could you elaborate on the details of the change? Does that mean there is no longer a built-in limit (other than physical RAM, of course) on how much data can be transferred to the workers in a parfor loop?
Thank you!
Hi Kathrin - for systems where the MATLAB client and matlabpool workers are all 64-bit, we can send data to and from PARFOR loops up to the limit of the amount of RAM on your system.
That's great news, thank you!
Hi Edric, I am using MATLAB R2013a and I still have this problem! Is there something special we should do to make this work in R2013a?


Asked: 19 Jan 2013
Commented: 22 Jul 2016
