How to rectify a serialisation error while using a parfor with large data sets?

I am working on a project that involves 2D projections of a specimen. I am required to transform them into cross-sectional images and then stack the cross sections to get a 3D image. I am using MATLAB for my simulations, but because the data sets are huge, the processing time is very long. Code:
projection_length = 4100;
images = cell(1,500);
for i = 1:500
    fname= sprintf('pre%03d.tif', i);
    images{i} = imread(fname);
end
parfor q = 1:projection_length
    tmpData = zeros(1600, 500);
    for i = 1:500
        fname= sprintf('pre%03d.tif', i);
        tmpData(:, i) = images{i}(1 : 1600, q, :);
        disp(['Analyzing projection ' num2str(q) ' of ' num2str(projection_length) ', Angle ' num2str(i) '...']);
    end
    idata=255-tmpData;
    H = iradon(idata, 0.72, 'Hann', 0.8, 1600 );
    postfname= sprintf('post%06d.tif', q);
    imwrite(H, postfname);
end
I load all images into a cell array at the beginning because I found that imread took most of the processing time when I read specific regions again and again inside the loop. But since the images are large, the images variable is also very large, and I run out of memory. For a projection length of around 1600, I get a serialization error that says:
Warning: Error caught during construction of remote parfor code. The parfor construct will now be run locally rather than on the remote matlabpool. The most likely cause of this is an inability to send over input arguments because of a serialization error. The error report from the caught error is:
Error using distcompserialize Out of Memory during serialization
Error in distcomp.remoteparfor (line 36) obj.SerializedInitData = distcompMakeByteBufferHandle(distcompserialize(varargin));
Error in parallel_function (line 437) P = distcomp.remoteparfor(W, @make_channel, parfor_C);
Error in Microfab_bead_reconstruction (line 15) parfor q = 1:projection_length
In parallel_function at 450
Is there a way to fix this? Is it due to the large size of the variable 'images', which results in communication overhead when it is used in parallel? Are there any other alternatives? I tried using distributed arrays, but then I cannot use parfor.

Accepted Answer

Edric Ellis on 22 Jun 2015
The problem here is that images is a large broadcast variable - you're going to be much better off with that as a sliced variable. I presume that each image is the same size n x m where n==1600 and m==projection_length. Here's how you might proceed:
images = zeros(n, m, 500, 'uint8');
for i = 1:500
images(:, :, i) = imread(sprintf(...));
end
Then, in your parfor loop, you can operate like this:
parfor q = 1:projection_length
% This next line replaces the inner for-loop:
tmpData = squeeze(images(:, q, :));
...
end
One slight confusion I have is that in your code, you appear to index into images{i} using 3 subscripts (with one scalar) which would normally result in a 2-d array result - but you're assigning that into a vector slice of tmpData - so is images{i} 2-d or 3-d? If it really is 3-d, the above code can still be made to work.
By slicing in this way, each worker doesn't need a whole copy of images, and only gets sent the portion it needs to operate on.
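As a minimal sketch of the slicing idea above, here is a toy version that uses small synthetic arrays in place of the real TIFF stack. The sizes n, m, and numAngles, the random data, and the colSums output are illustrative assumptions, not values from the question. Without Parallel Computing Toolbox, parfor simply runs as an ordinary loop.

```matlab
% Toy sizes (assumptions); the real data would be 1600 x projection_length x 500.
n = 16; m = 8; numAngles = 5;

% Build one 3-D uint8 stack instead of a cell array of images.
images = zeros(n, m, numAngles, 'uint8');
for i = 1:numAngles
    images(:, :, i) = uint8(randi([0 255], n, m));  % stands in for imread(...)
end

colSums = zeros(1, m);
parfor q = 1:m
    % q is the only loop-dependent subscript, so images is a sliced
    % variable: each worker receives only the columns it processes.
    sinogram = squeeze(images(:, q, :));            % n-by-numAngles
    colSums(q) = sum(double(sinogram(:)));          % placeholder for iradon + imwrite
end
```

The key point is that the loop variable q appears directly as one subscript of images, with plain colons in the other positions. Indexing through a cell array, as in images{i}(..., q, ...), does not meet that pattern, which is why the original version was sent whole to every worker.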
  4 Comments
Tushar Anchan on 24 Jun 2015
Edited: Tushar Anchan on 24 Jun 2015
Hey, I have sliced the 'images' variable as you suggested and it works fine, but I found that there is a delay after the first loop (reading the images) before the parfor loop iterations begin. When I use a larger data set, that delay increases further. What causes this delay before the parfor iterations begin?
Walter Roberson on 24 Jun 2015
Where are your measurement points?
There are four loops in the code. The outermost includes the other three, and is for working chunk by chunk. The first inner loop allocates memory. The second inner loop reads files and extracts the data and writes it to the proper arrays; I would expect that loop to be slow. The third inner loop does the radon transform and writes output. Which locations are you measuring at?
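One way to set up such measurement points is to wrap each phase in its own tic/toc pair. The sketch below does this on small synthetic data. The sizes, the random data, and the variable names readSeconds and loopSeconds are assumptions for illustration only. Note that the time reported around the parfor includes any pool start-up and the transfer of data to the workers, not just the iterations themselves.

```matlab
% Toy sizes (assumptions), standing in for the real image stack.
n = 16; m = 8; numAngles = 5;

% Phase 1: reading the data (imread in the real code).
tRead = tic;
images = zeros(n, m, numAngles, 'uint8');
for i = 1:numAngles
    images(:, :, i) = uint8(randi([0 255], n, m));
end
readSeconds = toc(tRead);

% Phase 2: the parfor loop, timed from just before it starts.
tLoop = tic;
result = zeros(1, m);
parfor q = 1:m
    slice = images(:, q, :);              % sliced access on q
    result(q) = mean(double(slice(:)));   % placeholder for the reconstruction
end
loopSeconds = toc(tLoop);

fprintf('read: %.3f s, parfor (incl. setup and transfer): %.3f s\n', ...
    readSeconds, loopSeconds);
```

Comparing these two numbers for a small and a large data set should show which phase grows with the data size.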


More Answers (1)

Walter Roberson on 18 Jun 2015
  3 Comments
Walter Roberson on 19 Jun 2015
projection_length = 4100;
numfiles = 500;
numrows = 1600;
tmpData = cell(1, projection_length); % 1-by-N cell array; cell(N) would create N-by-N
parfor q = 1 : projection_length
    tmpData{q} = zeros(numrows,numfiles,3);
end
%I am not positive this can be sliced on i. You might have to slice on q
parfor i = 1:numfiles
    fname = sprintf('pre%03d.tif', i);
    thisimage = imread(fname);
    for q = 1 : projection_length
        tmpData{q}(:,i,:) = thisimage(1:numrows, q, :); % MATLAB is case-sensitive: tmpData, not tmpdata
    end
end
parfor q = 1 : projection_length
    idata = 255 - tmpData{q};
    H = iradon(idata, 0.72, 'Hann', 0.8, numrows );
    postfname= sprintf('post%06d.tif', q);
    imwrite(H, postfname);
end
Walter Roberson on 22 Jun 2015
projection_length = 4100;
numfiles = 500;
numrows = 1600;
q_chunk_size = 75; %number of projections to handle in a row. Adjust so that 1600 x numfiles x 3 x q_chunk_size fits in memory
for chunk_base = 1 : q_chunk_size : projection_length
    %last chunk might be smaller
    if chunk_base + q_chunk_size > projection_length
        chunk_length = projection_length - chunk_base + 1;
    else
        chunk_length = q_chunk_size;
    end
    tmpData = cell(1, chunk_length); % 1-by-N cell array; cell(N) would create N-by-N
    parfor qrel = 1 : chunk_length
        tmpData{qrel} = zeros(numrows,numfiles,3);
    end
    %I am not positive this can be sliced on i. You might have to slice on q
    parfor i = 1:numfiles
        fname = sprintf('pre%03d.tif', i);
        thisimage = imread(fname);
        for qrel = 1 : chunk_length
            tmpData{qrel}(:,i,:) = thisimage(1:numrows, qrel + chunk_base - 1, :); % case fixed: tmpData, not tmpdata
        end
    end
    parfor qrel = 1 : chunk_length
        idata = 255 - tmpData{qrel};
        H = iradon(idata, 0.72, 'Hann', 0.8, numrows );
        postfname = sprintf('post%06d.tif', qrel + chunk_base - 1);
        imwrite(H, postfname);
    end
end
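The chunk bookkeeping in the code above can be checked on its own before any images are loaded. The following is a small sketch that computes every chunk's start and length with min() instead of the if/else. The names chunk_bases, chunk_lengths, and bytes_per_chunk are introduced here for illustration, and the byte estimate assumes the double-precision arrays created by zeros().

```matlab
projection_length = 4100;
numfiles = 500;
numrows = 1600;
q_chunk_size = 75;

% First projection index of each chunk.
chunk_bases = 1 : q_chunk_size : projection_length;

% Every chunk is q_chunk_size long except possibly the last one.
chunk_lengths = min(q_chunk_size, projection_length - chunk_bases + 1);

% Rough memory needed for one full chunk of double-precision data (8 bytes each).
bytes_per_chunk = numrows * numfiles * 3 * q_chunk_size * 8;

fprintf('%d chunks, last chunk has %d projections, about %.2f GB per full chunk\n', ...
    numel(chunk_bases), chunk_lengths(end), bytes_per_chunk / 1e9);
```

With these values the chunks cover all 4100 projections exactly once, so q_chunk_size can be tuned against available memory without changing the rest of the loop.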
