How do I set up fmincon correctly for parallel computation? "Supplied objective function must return a scalar value"

I have an expensive objective function that takes multiple parameters, while the constraints are constant. It looks something like this:
[out1, out2] = obj_func(in1, in2)
in1 has 1000 different cases. For each value of in1, I want to minimize out1 by adjusting in2. At the same time, out2 has to be < 1 in all cases. Because obj_func takes a while to run, I want to parallelize the computation, either by calculating several gradients at the same time or by solving the problem for several values of in1 at once.
Below is my simplified code:
nPoints = length(INVARS(:,1));
odInputNames = {'DISA'; 'ALT'; 'MACH'};
% Number of batches depending on given batch size:
sizeBatch = 20;
nBatch = ceil( nPoints/sizeBatch ); % Last batch contains the remaining points
% Define constraints
CONSTR = struct(...
'param_A', '...',...
'param_B', '...',...
'param_C', '...');
CONSTR_REQ = [ 1800; ...
1.10; ...
1.15; ...
];
for i = 1:nBatch
% Create batchwise input vectors
for iODVar = 1:length(odInputNames)
if i < nBatch
OD.(odInputNames{iODVar}) = INVARS(sizeBatch*(i-1)+1:sizeBatch*i, iODVar);
else
OD.(odInputNames{iODVar}) = INVARS(sizeBatch*(i-1)+1:end, iODVar); % Last batch has fewer points
end
end
% Setup of FMINCON Optimizer
% ---------------------------------------------------------------------
A = [];
b = [];
Aeq = [];
beq = [];
lb = ones(size(OD.DISA)) * 80;
ub = ones(size(OD.DISA)) * 120;
options = optimoptions(@fmincon, ...
'Algorithm', 'active-set',...
'MaxIter', 200,...
'ConstraintTolerance', 1e-8,...
'OptimalityTolerance', 1e-8,...
'Display', 'off',...
'UseParallel', true);
x0 = ones(size(OD.DISA)) * 85;
% Declare constraint results to be shared between obj_fun and con_fun
constr_fields = fieldnames(CONSTR);
con_res = zeros(length(constr_fields), length(OD.DISA));
% Execute Optimizer
[PLA_max, obj_min,exitflag,~,~] = fmincon( @(x)obj_fun(x), x0, A, b, Aeq, beq, lb, ub, @(x)con_fun(), options );
% Objective:
function [obj] = obj_fun( PLA )
OD.PLA = PLA;
[OD.OUT, OD.IN] = expensive_function( DES, OD );
obj = - OD.OUT.param_obj;
for i=1:length(constr_fields)
con_res(i,:) = eval(['OD.' CONSTR.(constr_fields{i})]);
end
end
% Constraints:
function [c,ceq] = con_fun()
exp_CONSTR_REQ = repmat(CONSTR_REQ, 1, length(OD.DISA));
c = (con_res - exp_CONSTR_REQ); % Non-Equality constraint ( !<= 0 )
ceq = [];
end
It works fine when I solve one case at a time, but when I try to parallelize it I get this error. How can I fix it?
Error using fmincon (line 348)
Supplied objective function must return a scalar value.
Torsten
Torsten on 22 Jan 2025
Edited: Torsten on 22 Jan 2025
Why is lb > ub?
lb = ones(size(OD.DISA)) * 120;
ub = ones(size(OD.DISA)) * 80;
Why does the error message report something about fminunc? It is not referenced in your code.


Accepted Answer

Matt J
Matt J on 22 Jan 2025
Edited: Matt J on 22 Jan 2025
I want to parallelize the computation, either by calculating several gradients at the same time, or solve the problem for several in1.
The UseParallel option, which you are already invoking, does parallelize the gradient computation. If there is an analytical algorithm for computing the gradient, however, that will often be quicker.
The error message about the objective function needing to return a scalar should be easy to investigate: just call the objective function at the initial point x0. If obj_fun(x0) gives you a non-scalar value, then you have a bug, and you must trace it.
If you are deliberately having it return a non-scalar, then you have a misconception about how fmincon works. fmincon cannot perform multiple optimizations in a single call.
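A quick check along the lines suggested above (using the names from the question; a diagnostic sketch, not part of any fix) might look like:

```matlab
% Diagnostic sketch: fmincon requires the objective to return ONE number.
% obj_fun and x0 are the names from the question above.
f0 = obj_fun(x0);
if ~isscalar(f0)
    error('obj_fun returned size %s - fmincon needs a scalar.', mat2str(size(f0)));
end
```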
Matt J
Matt J on 24 Jan 2025
Edited: Matt J on 24 Jan 2025
Seems like the only options left are making better initial guesses for PLA and approximating the derivatives.
And also Torsten's suggestion of using a parfor loop. That might be the simplest.
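Torsten's parfor idea could be sketched roughly as follows. Note this is only an outline under assumptions: scalarObjective and scalarConstraint are hypothetical helpers that wrap your expensive_function for a single value of in1 and return one number (and c = out2 - 1, ceq = [] respectively); the bounds and x0 are the ones from your posted code.

```matlab
% Sketch of the parfor suggestion: solve each case independently,
% each fmincon call with a SCALAR objective. scalarObjective and
% scalarConstraint are hypothetical single-case wrappers.
nCases = size(INVARS, 1);
PLA_opt = zeros(nCases, 1);
parfor k = 1:nCases
    in1_k = INVARS(k, :);                          % fixed inputs for this case
    obj_k = @(in2) scalarObjective(in1_k, in2);    % returns one number
    con_k = @(in2) scalarConstraint(in1_k, in2);   % c = out2 - 1, ceq = []
    opts  = optimoptions(@fmincon, 'Algorithm', 'active-set', 'Display', 'off');
    PLA_opt(k) = fmincon(obj_k, 85, [], [], [], [], 80, 120, con_k, opts);
end
```

Inside a parfor loop you would leave UseParallel at its default of false; the parallelism comes from running the cases concurrently rather than from the gradient evaluation.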
So you're saying I should make another function to calculate E(x_i) and E(x_i + δ)
The gradient computations must be done inside your objective function, as described here.
Obviously, you are free to out-source parts of that computation with calls to other functions, if you wish.
Should I just set δ to be something very small, or is there a better way to do it?
You will need to experiment with δ. The CheckGradients fmincon option can help you see if the derivatives you've provided agree with fmincon's own numerical approximation.
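A minimal sketch of supplying the gradient yourself, so that the expensive evaluations E(x_i + δ) can run in parallel: E() here is a placeholder for the expensive scalar objective discussed in this thread, and delta is the step size you will need to tune.

```matlab
% Sketch: user-supplied forward-difference gradient, with the
% perturbed evaluations of the expensive function run in parallel.
% E() is a placeholder for the expensive scalar objective; delta
% must be tuned to the problem.
function [f, g] = objWithGrad(x, delta)
    f = E(x);                          % one evaluation at x
    g = zeros(numel(x), 1);
    parfor i = 1:numel(x)
        xp = x;
        xp(i) = xp(i) + delta;
        g(i) = (E(xp) - f) / delta;    % forward difference (E(x_i+delta)-E(x_i))/delta
    end
end
```

You would then pass @(x) objWithGrad(x, delta) to fmincon with optimoptions(@fmincon, 'SpecifyObjectiveGradient', true, 'CheckGradients', true), the latter to compare your gradient against fmincon's own finite-difference estimate as mentioned above.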


More Answers (1)

Catalytic
Catalytic on 24 Jan 2025
There are other acceleration strategies that may be superior to parallelization.
I notice in your examples that the in1, in2 pairs change by small amounts. You also appear to believe that the final solutions for PLA will be close to each other; otherwise, why initialize every optimization with the same x0? That being the case, it may be worthwhile to go back to the strategy of optimizing one PLA at a time, but using the optimized PLA from the N-th optimization as the x0 for the (N+1)-th. This might cut down on the number of iterations you need to get each solution and reduce the number of calls to expensive_function.
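The warm-start strategy described above could be sketched like this; solveOneCase is a hypothetical wrapper that runs fmincon for a single in1 value with a scalar objective.

```matlab
% Sketch of the warm-start strategy: reuse the previous case's optimum
% as the next initial guess. solveOneCase is a hypothetical single-case
% fmincon wrapper; x0 = 85 is the initial guess from the posted code.
nCases = size(INVARS, 1);
PLA_opt = zeros(nCases, 1);
x0 = 85;                          % initial guess for the first case only
for k = 1:nCases
    PLA_opt(k) = solveOneCase(INVARS(k, :), x0);
    x0 = PLA_opt(k);              % warm start for case k+1
end
```

This loop is serial, so it trades parallelism for fewer iterations per case; it pays off only if consecutive in1 values are ordered so that neighboring solutions really are close.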
Shao-Yu
Shao-Yu on 24 Jan 2025
The in1, in2 shown here are not the real values. In reality there are more variables, and some vary quite a lot. I set x0 to the same value only because the final optimized values are hard to predict (in general between 80 and 105), but you're right, I should try to give PLA a better initial guess.


Release

R2024a
