
# How to combine parfor and parallel optimization?

Asked by nah on 21 Jun 2012

I am new to MATLAB parallel computing.

I am working on an optimization problem using fminsearch. The objective function is complicated and takes a long time to process a long trajectory data set, so I have used parfor to reduce the processing time.

I cut the long trajectory data into many shorter fragments, distribute them via parfor to many workers for the expensive calculations, and then combine the results into a single scalar for fminsearch to work with. This parallelization is working well.

The objective function also has some fixed parameters, and I want to do a parameter sweep over them. How do I achieve this with parallelization?

As I understand:

1. No nested parfors, i.e., if I run the objective function under an outer parfor, the inner parfor is going to run serially.
2. fminsearch doesn't obey the 'UseParallel' option. Even if I use a parallel minimizer, the above problem applies.

So this seems like an insurmountable issue to me. Kindly help with your suggestions.

Pseudo-code example:

```
load trajectoryData                    % [30000 x 2] time-series data
fixParams = [fp1; fp2; fp3; fp4];      % but I want to do this over a vector of fp3 & fp4

matlabpool open 72
[optparams, fval] = fminsearch(@(p) objFunc(p, trajectoryData, fixParams), iniGuess, options);
matlabpool close

% objFunc.m
function objVal = objFunc(params, trajectoryData, fixParams)
[a, b] = size(trajectoryData);         % b = 2 always
% split into fragments of 1000 points
fragsMat = reshape(trajectoryData, 1000, a*2/1000);   % (or any which way)
numFragments = size(fragsMat, 2);
costVal = zeros(1, numFragments);
parfor ix = 1:numFragments
    % do heavy calculations on fragment ix, giving costValFrag
    costVal(ix) = costValFrag;
end
objVal = sum(costVal);                 % just an example
```
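A minimal runnable sketch of the fragment-and-sum pattern in the pseudo-code above. The function name `fragmentCostDemo` and the straight-line residual used as the per-fragment cost are hypothetical stand-ins for the real heavy calculation, and the sketch assumes the number of rows is a multiple of the fragment length:

```matlab
function objVal = fragmentCostDemo(params, trajectoryData)
% Toy objective: split the trajectory into fragments, cost each fragment
% in a parfor loop, and sum the fragment costs into one scalar.
fragLen = 1000;                               % points per fragment
[numPoints, numCols] = size(trajectoryData);
numFragments = numPoints / fragLen;           % assumes an exact multiple

% Rearrange to fragLen x numCols x numFragments so that frags(:, :, k)
% holds rows (k-1)*fragLen+1 : k*fragLen of the original data.
frags = permute(reshape(trajectoryData, fragLen, numFragments, numCols), [1 3 2]);

costVal = zeros(1, numFragments);
parfor ix = 1:numFragments
    frag = frags(:, :, ix);                   % sliced input
    % hypothetical cost: squared residual of column 2 against a line in column 1
    resid = frag(:, 2) - params(1) * frag(:, 1) - params(2);
    costVal(ix) = sum(resid .^ 2);            % sliced output
end
objVal = sum(costVal);
end
```

It could be handed to fminsearch through an anonymous function, for example `fminsearch(@(p) fragmentCostDemo(p, trajectoryData), [0 0])`. Unlike the plain `reshape` in the pseudo-code, the `permute` step keeps both columns of each fragment together, which may or may not matter for the real calculation.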

My configuration:

I have a cluster with 128 workers (with the Torque resource manager).

Things I am thinking of:

• Do I use dfeval?
• Do I create a batch job for each fixParam set and launch it?

(I don't know how to do these.)

Main problem:

• Do I need to run multiple matlabpools in that case?


Answer by Walter Roberson on 21 Jun 2012

The restriction applies only to directly nested parfor loops. You can have a parfor call a function that has a parfor in it.

I have been getting mixed messages from the Mathworks people about whether pools can be nested or not.

Darn, that's the only relevant posting I can come up with at the moment. I know we have had discussions about this in the past, but I cannot seem to locate them :(

Edric Ellis on 22 Jun 2012

I'm also going to say 'NO' - you definitely cannot open a MATLABPOOL from a worker. Both PARFOR and SPMD use all available parallelism immediately.

Walter Roberson on 22 Jun 2012

PARFOR and SPMD use all *allocated* parallelism immediately. For example if you have 8 cores and your matlabpool is 5 cores, then PARFOR and SPMD use those 5, but do not use the other 3 as well even though they are "available" in some sense. Some of the past discussions have suggested that if there were unallocated nodes then a worker within a MATLABPOOL could allocate more. I gather from this current discussion that Mathworks is now saying, No, that cannot be done.

I recall there have been past discussions about automatic parallelism (e.g., LAPACK called by MATLAB) in workers. The discussions mostly tended to NO but some circumstances were left unclear, and there was a circumstance involving DCS in which (if I recall) it was said that it could happen.

(I recall that about 8 months or so ago I sent one of the Mathworks people email pointing out some conflicting words in discussions, and asking for clarification, but unfortunately no answer was forthcoming. I do remember now whom I sent the email to, so I _might_ be able to dig it up from my mail client to review what it was I found unclear at that time.)

Walter Roberson on 25 Jun 2012

I seem to be finding conflicting information on this topic.
http://www.mathworks.com/help/toolbox/distcomp/brukbnp-9.html#brukbnp-12

"The body of an spmd statement cannot contain another spmd. However, it can call a function that contains another spmd statement. Be sure that your MATLAB pool has enough workers to accommodate such expansion."

You cannot directly nest a parfor loop inside of another as the MATLAB parser will catch it and throw an error before attempting to run it. If you nest a parfor loop inside of a function that is called by another parfor loop then it will run without errors.

However, the MATLAB workers are started as single-threaded processes, so they cannot parallelize anything. Note that this applies to both parfor and built-in multithreaded capabilities such as FFTs.

So if you nest a second parfor inside of a function then it will run because it doesn't get caught by the MATLAB parser, but it isn't actually running in parallel, it just runs serially as if you had called parfor on your client session without opening a matlabpool.
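The behaviour described above can be tried with a small sketch. The names `nestedParforDemo` and `innerSum` are hypothetical and the arithmetic is a toy stand-in for real work. Saved as `nestedParforDemo.m`, the file parses and runs because the inner parfor sits in a separate function; the inner loop still executes serially on whichever worker calls it:

```matlab
function total = nestedParforDemo()
% Outer parfor: iterations are distributed over the pool (if one is open).
total = 0;
parfor i = 1:4
    total = total + innerSum(i);   % reduction variable
end
end

function s = innerSum(i)
% Inner parfor: accepted by the parser, but runs serially on a worker.
s = 0;
parfor j = 1:10
    s = s + i * j;                 % reduction variable
end
end
```

Calling `nestedParforDemo()` returns 550, the sum of `i*j` over `i = 1..4` and `j = 1..10`.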

I believe there is a way to start the workers as multithreaded processes to enable the built-in parallel capabilities, but I can't find it at the moment. However even then workers still couldn't start their own matlabpool so they couldn't run the nested parfor in parallel.

So for your particular use case you are going to have to choose between running the inner portions of the objective function in parallel or running the outer parameter sweep in parallel, but not both. You would need to test it out to determine which helps you more.
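One of the two options, parallelizing the outer parameter sweep while each objective evaluation stays serial, might look like the sketch below. Everything in it is illustrative: `sweepDemo`, the `fp3Vect`/`fp4Vect` values, and the quadratic objective are assumptions, not the asker's actual function.

```matlab
function results = sweepDemo()
% Sweep two fixed parameters in an outer parfor; each iteration runs its
% own serial fminsearch.
fp3Vect = [0.5 1.0 1.5];                       % hypothetical sweep values
fp4Vect = [2 4];
[fp3Grid, fp4Grid] = ndgrid(fp3Vect, fp4Vect); % every (fp3, fp4) combination
numSets = numel(fp3Grid);

results = zeros(numSets, 4);                   % columns: fp3, fp4, xOpt, fval
opts = optimset('Display', 'off');
parfor k = 1:numSets
    fp3 = fp3Grid(k);
    fp4 = fp4Grid(k);
    % toy objective with its minimum at x = fp3 and minimum value fp4
    obj = @(x) (x - fp3)^2 + fp4;
    [xOpt, fval] = fminsearch(obj, 0, opts);
    results(k, :) = [fp3, fp4, xOpt, fval];
end
end
```

Whether this beats parallelizing the inner fragment loop depends on how many parameter sets there are relative to the number of workers, so timing both variants on the real problem is the safest guide.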

Adam Filion on 22 Jun 2012

Found the function I was looking for. You can open a pool of 2 MATLAB workers and set them each to use 2 threads with the following:

matlabpool open 2

For the best efficiency, you should have a dedicated core for each thread, so a total of 4 physical cores in this case. Note however that maxNumCompThreads is being removed and I don't know what plans there are to replace it.
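Only the pool-opening command survives in the comment above; the call that sets the thread count appears to have been lost. A sketch of what the full sequence might look like, assuming `pctRunOnAll` was used to run `maxNumCompThreads(2)` everywhere (an assumption, not confirmed by the post):

```matlab
matlabpool open 2                  % open a pool of 2 workers, as in the post
pctRunOnAll maxNumCompThreads(2)   % assumption: request 2 computational threads on the client and each worker
```

Note that `pctRunOnAll` also runs the command on the client session, so the client's thread setting changes as well.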

Walter Roberson on 22 Jun 2012

Adam, is that approach tested or hypothetical?

Adam Filion on 22 Jun 2012

Hi Walter, I don't have enough cores to really test this myself, but the answer we got from our developers a while back was that this should work.

Answer by nah on 29 Jun 2012

Since running parfor inside a function called under an outer parfor is no good (it only runs it serially), the solution I have adopted now to implement the parameter sweep is based on the comment Walter pointed to (Jiro's).

It basically creates multiple jobs, each with a given set of input arguments, and submits them.

```
% Configure the Torque scheduler
sched = findResource('scheduler', 'type', 'torque');
sched.ClusterSize = 144;
sched.HasSharedFilesystem = true;
sched.ClusterMatlabRoot = '/storage/shares/matlabr2011b/';
sched.ResourceTemplate = '-l nodes=1:ppn=12,mem=1gb';

jobStart = tic;
counter = 1;

% Create one parallel job per (df, dex) combination
for ix = 1:length(dfVect)
    for jx = 1:length(dexVect)
        dfin = dfVect(ix);
        dexin = dexVect(jx);
        iniGuessInputs = [1.8 6 dfin dexin];
        argsIn = {iniGuessInputs, timeseriesfrag, N, T, Roin, Cmin, mmin};
        job(counter) = createParallelJob(sched);
        set(job(counter), 'FileDependencies', ...
            {'runOptimizationForGivenIniGuesses.m', 'obj_func_with_parfor.m'});
        set(job(counter), 'MaximumNumberOfWorkers', 12, 'MinimumNumberOfWorkers', 8);
        % NOTE: argsIn is never used in the code as posted, so a createTask
        % call like this one was presumably lost. The single output
        % argument is an assumption.
        createTask(job(counter), @runOptimizationForGivenIniGuesses, 1, argsIn);
        counter = counter + 1;
    end
end

% Submit all jobs
for id = 1:counter-1
    submit(job(id));
end
timeSubmission = toc(jobStart)

% Wait for each job to finish and collect its outputs
for id = 1:counter-1
    waitForState(job(id), 'finished');
    results{id} = getAllOutputArguments(job(id));
end
timeCompletion = toc(jobStart)

% Clean up (the code as posted called destroy(jm), but jm is never defined)
for id = 1:counter-1
    destroy(job(id));
end

% runOptimizationForGivenIniGuesses.m
%   Sets up an fminsearch optimization that uses obj_func_with_parfor.m as the
%   cost/objective function and returns the results (for the given parameters).
%
% obj_func_with_parfor.m
%   Takes a given longer time-series data set and uses parfor to calculate
%   costs on the smaller fragments in parallel.
```

## 1 Comment

nah on 29 Jun 2012

MultiStart from Global Optimization Toolbox is supposed to do exactly this (optimize from different starting points), but my cost function also has some fixed parameters I want to sweep over.

Also, MultiStart is restricted to a few solvers (fmincon), whereas the above method will work for any trivially parallel optimization problem (or, for that matter, to parallelize any function that already uses parfor inside it).

So I think of this as a general solution for:

1) nested parfor

2) parallelization of functions that use parfor inside them

3) parallelizing optimization problems that also need parallelization inside their objective functions (parallel optimization with parfor)