Genetic Algorithm and Direct Search Toolbox 2.4.2
Using the Genetic Algorithm with the Parallel Computing Toolbox(TM)
Some optimization problems involve computationally expensive objective functions. At each generation of the genetic algorithm, the objective, or fitness, function must be evaluated at all the points in the current population. This can be very time consuming for expensive fitness functions. When the fitness function is expensive enough, it is possible to distribute these fitness function evaluations across processors and see speedup in the overall execution of the optimization algorithm. This is a demonstration of how to use the Parallel Computing Toolbox(TM) with the genetic algorithm to speed up the optimization of a computationally expensive fitness function.
Contents
A Simple but Expensive Fitness Function
We will minimize a function of two variables
min f(x) = 100 * (x(1)^2 - x(2)) ^2 + (1 - x(1))^2;
xBecause we want to optimize an expensive fitness function, we will simulate this expense by using a pause statement; specifically we will pause for 0.5 seconds each time the fitness function is evaluated on a single point.
type simple_expensive_fitness.m
function y = simple_expensive_fitness(x) %SIMPLE_EXPENSIVE_FITNESS fitness function for GA % Copyright 2005-2007 The MathWorks, Inc. % $Revision: $ $Date: $ y = zeros(size(x,1),1); %Pre-allocate y for i = 1:size(x,1) x1 = x(i,1); x2 = x(i,2); y(i) = 100 * (x1^2 - x2) ^2 + (1 - x1)^2; pause(0.5) % To simulate expensive function evaluation end
Minimizing Sequentially Using GA
If we want to minimize our fitness function using the GA function, we need create a function handle to simple_expensive_fitness as well as specify the number of variables in the problem.
FitnessFcn = @simple_expensive_fitness; numberOfVariables = 2;
If we wanted to run GA sequentially, that is, without the Parallel Computing Toolbox, the call to GA would be
[x,fval] = ga(FitnessFcn,numberOfVariables)
Setup for Distributing the Function Evaluations
Our approach will be to distribute the computation of the function evaluations rather than change the genetic algorithm. This can be done without changing the fitness function or the GA code. Instead we need only set up the distributed computing environment and then use a special fitness function that will distribute our fitness function across the worker machines.
To use the Parallel Computing Toolbox, the name of the job manager and the desired number of tasks must be specified. At this point, we will assume the job manager name is in the variable "jobManagerName" and the number of tasks is in the variable "nTasks".
Additionally, we must have a copy of our fitness function on each worker machine. We can do this easily by using the "FileDependencies" property of our distributed computing job.
FileDeps = {'simple_expensive_fitness.m'};
We create an anonymous function that takes our fitness function and distributes its function evaluations, and then collects the results into a vector. This anonymous function calls the DFEVAL function in the Parallel Computing Toolbox to evaluate our fitness function on the cluster of worker machines associated with the job manager with name jobManagerName. One of the arguments to the call of DFEVAL is the results of a call to the DISTRIBUTIONROWS function. The DISTRIBUTIONROWS function distributes the current population of the genetic algorithm as evenly as possible into a cell array with nTasks cells.
type distributionpopulation.m
function xdist = distributionpopulation(x, numTasks)
%DISTRIBUTIONPOPULATION Distributes the population of points into a cell
% array.
% C = DISTRIBUTIONPOPULATION(X, N) distributes the matrix X as evenly as
% possible across a N-by-1 cell array C.
%
% Helper function to distributed_fitness demo.
% Copyright 2005-2007 The MathWorks, Inc.
% $Revision: $ $Date: $
if nargin<2
error('gads:DISTRIBUTIONPOPULATIONS:numberOfInputs',...
'DISTRIBUTIONPOPULATIONS requires at least 2 inputs');
end
[nPts nVars] = size(x);
nPtsPerTask = ceil(nPts./numTasks); % Max number of points to use per cell
% Create vector of number of points for each cell
PtsPerTask = nPtsPerTask*ones(numTasks,1);
tI = 1:(nPtsPerTask*numTasks-nPts); % Cells to reduce by 1
PtsPerTask(tI) = PtsPerTask(tI)-1; % Reduce cell by 1
% Separate each subgroup into cell arrays
xdist = mat2cell(x,PtsPerTask,nVars);
When the call to DFEVAL completes, the call to the CELL2MAT function converts the cell array of function values returned by DFEVAL from the worker machines into a single vector, as is expected by the GA function.
distributedFitnessFcn = @(x)cell2mat(dfeval(FitnessFcn,distributionpopulatio n(x,nTasks),... 'jobmanager', jobManagerName, 'FileDependencies', FileDeps));
The anonymous function distributedFitnessFcn, when supplied as a "fitness function" to GA, will compute our fitness function in a distributed manner.
Comparison of Sequential and Distributed Fitness Function Evaluations
Our goal is to speed up the computation of our fitness function on a population of points. Before calling the GA, we can verify that the distributed function evaluations are indeed faster by creating a sample population and comparing the sequential and distributed run times. We create the sample population and then call our original fitness function.
population = rand(50,2);
tic
scores1 = FitnessFcn(population);
disp(['Elapsed time for sequential evaluation of function is ', num2str(toc)
]);
Elapsed time for sequential evaluation of function is 24.9949
Next we call the anonymous function we created which calls our fitness function in a distributed manner on this same population.
tic
scores2 = distributedFitnessFcn(population);
disp(['Elapsed time for distributed evaluation of function is ', num2str(toc
)]);
Elapsed time for distributed evaluation of function is 4.8706
Because this distributed run time is less than the sequential run time, the use of distributed computing will improve the performance of the genetic algorithm for this fitness function.
Performance of GA with the Distributed Fitness Function
We compare the GA without original fitness function and our distributed fitness function. In both cases, we stop after three generations of the algorithm and reset the rand states to ensure the same computation by the GA.
Sequential run:
rand('state',0);randn('state',0); options = gaoptimset('PopulationSize',50,'Vectorized','on', ... 'Generations',3,'Display','iter'); tic [x,fval,reason,output] = ga(FitnessFcn,numberOfVariables,options); disp(['Elapsed time in running GA using sequential function evaluation is ', num2str(toc)]);
Best Mean Stall
Generation f-count f(x) f(x) Generations
1 50 0.0006783 25.44 0
2 100 0.0006783 38.24 1
3 150 0.0006783 34 2
Optimization terminated: maximum number of generations exceeded.
Elapsed time in running GA using sequential function evaluation is 100.1128
Distributed run:
rand('state',0);randn('state',0); tic [x,fval,reason,output] = ga(distributedFitnessFcn,numberOfVariables,options) ; disp(['Elapsed time in running GA using distributed function evaluation is ' , num2str(toc)]);
Best Mean Stall
Generation f-count f(x) f(x) Generations
1 50 0.0006783 25.44 0
2 100 0.0006783 38.24 1
3 150 0.0006783 34 2
Optimization terminated: maximum number of generations exceeded.
Elapsed time in running GA using distributed function evaluation is 19.1685
We see similar performance improvements when calling GA with the distributed fitness function as we did when we compared the sequential and distributed function evaluations. Note that since we have not changed the genetic algorithm, the GA will perform the same steps and find the same best function value whether you use the sequential or distributed fitness function.
Store