Parallel processing using parfor or SPMD

Hi,
I want to use MATLAB's parallel processing capabilities in my code, but I run into an error. I have trained a classifier on training data and would now like to predict values for test data.
Here is my code:
tic
interval=1000; % The max number of test data processed in each iteration.
n_all_test_data=20000000;
predict_all=[];
lim1=1;
parfor j=interval:interval:n_all_test_data
predict = classRF_predict(test_data(lim1:j,:),model);
predict_all=[predict_all;predict];
lim1=j+1;
end
toc
Here is the error I get:
Error using testing_RF_final_ver1 (line 134)
Error: The temporary variable lim1 in a parfor is uninitialized.
See Parallel for Loops in MATLAB, "Uninitialized Temporaries".
I also changed the code as follows, but it is much slower than a plain for loop:
tic
n_all_test_data=20000000;
predict_all=zeros(1,n_all_test_data);
parfor j=1:n_all_test_data
predict_all(j) = classRF_predict(test_data(j,:),model);
end
toc
My question is: Is there any way to speed up the processing using parfor or SPMD?
Thank you.
  1 Comment
Mehdi Ravanbakhsh on 22 Oct 2015
Thanks Walter. Useful comments to help me better understand how parfor works.


Accepted Answer

Walter Roberson on 21 Oct 2015
Your first set of code is trying to use lim1 as a variable shared between all of the iterations. Suppose that you start with lim1 = 1 and parfor happens to run j = 20000000 first. If parfor allowed this, then after that iteration lim1 would become 20000001 in time for the next value of j, which might be j = 4593148 (because parfor can run the iterations in any order). That next iteration would be asked to select test_data(20000001:4593148,:), which would be empty, so that iteration would fail.
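To see why that selection is empty, here is a small sketch using a made-up 5-by-3 matrix (demo_data is a hypothetical stand-in, not the real test_data): a colon range whose start is greater than its end contains no indices, so indexing with it returns zero rows.

```matlab
% A colon range with start > end has no elements.
idx = 20000001:4593148;
range_is_empty = isempty(idx);

% Same effect on a small hypothetical matrix: rows 4:2 selects nothing.
demo_data = rand(5, 3);
selected = demo_data(4:2, :);
n_selected_rows = size(selected, 1);
```

Whether classRF_predict errors out or just returns nothing for such an input is not something I have checked; either way the predictions for those rows would be missing.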
When you use parfor, the only information that you can carry over from iteration to iteration is the loop index itself (except for "reduction variables"). If your code would not work properly with the order of the iterations completely scrambled, then your code is not suitable for parfor.
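A minimal sketch of a block-wise loop that follows this rule: each iteration computes its own row range from the loop index alone, so the order of iterations does not matter. The names predict_fn, n_blocks, parts, first_row and last_row are hypothetical, and predict_fn is only a placeholder standing in for classRF_predict(X, model) so that the sketch runs without the Random Forest package.

```matlab
% Hypothetical stand-ins so the sketch is self-contained.
test_data  = rand(10500, 4);
predict_fn = @(X) double(X(:, 1) > 0.5);   % placeholder for classRF_predict(X, model)

interval = 1000;                            % max rows processed per iteration
n_all    = size(test_data, 1);
n_blocks = ceil(n_all / interval);
parts    = cell(n_blocks, 1);               % one output slot per block

parfor b = 1:n_blocks
    % Row range derived only from the loop index b; nothing is carried
    % over from any other iteration.
    first_row = (b - 1) * interval + 1;
    last_row  = min(b * interval, n_all);   % last block may be shorter
    parts{b}  = predict_fn(test_data(first_row:last_row, :));
end

% Assemble the per-block results in block order after the loop.
predict_all = vertcat(parts{:});
```

One caveat: because test_data is indexed with a computed range rather than directly with the loop index, I would expect parfor to treat it as a broadcast variable and send the whole matrix to every worker. With 20000000 rows that transfer could easily cost more than the parallelism gains, so this is a sketch of the structure and not a promise of a speedup.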
classRF_predict appears to be part of the third-party Random Forest package. It does a little bit of options processing and then calls into some C code. I did not see a copy of the C code when I looked, but it is probably around somewhere. At the moment I have no information about how optimized that C code is.
You could improve performance a bit by digging through classRF_predict.m, finding the call to the mex routine, and making that call yourself inside parfor, which avoids the options processing overhead. This will not necessarily be any faster than using a for loop.
Note: code.google.com is shutting down in its existing form in January 2016; you will want to ensure you have a copy of the source code saved.
