Optimization with fmincon using external software

Hello,
I have a problem and I do not know how to solve it; I'll do my best to explain.
I am using fmincon to fit a model to experimental values. I have bounds and equality constraints. My objective function (the sum of squared differences) calls a function that calculates the data to compare with the experimental points.
At first I wanted to learn how fmincon worked, so I coded a function "pts_model" that calculates the data. Using fmincon with this function fits the data to the experimental points perfectly. Now I am switching to external software that does the same thing as "pts_model" (which I will need for more complex problems that "pts_model" cannot solve).
Before attempting the optimization with the external software, I compared the results given by "pts_model" with those given by the external software when MATLAB calls it, and they agree (difference less than 1E-01 at each point).
However, when I launch fmincon with the external software, it does not give the same results as with "pts_model". fmincon + external software does perform an optimization, but it is nowhere near as good as the one from fmincon + "pts_model". The more variables there are, the less similar the results.
To try to understand the problem, I inspected the gradient and search direction during the optimization in both cases, but I do not know how to interpret them.
I have attached a file with the results from an optimization with 24 decision variables. I included only the first and last iterations, not all of them.
I am lost as to what to do; I do not understand why I get different results. Both cases start at the same point and give the same objective value there, yet somehow the search direction differs between them for some variables.
I do not know if the information I've given here is enough to understand my problem or get an idea of where it might come from, but I can provide further details.
If anyone would be kind enough to help I would really appreciate it.
Thank you

5 Comments

It is not surprising that the two results are different, nor that the results are less desirable with more variables.
The "real" model is undoubtedly much more complicated than the simpler model you coded, and the effects of that on fmincon's internal iterative process mean the solution found with the real model simply isn't the same as the one without the additional effects incorporated. In other words, the two models do not produce identical results on the finer scale, and that gets reflected in the iterative solver's results.
You can try some variations with the optional parameters. One is 'FiniteDifferenceType','central', which uses central differences at a higher computational cost than the default 'forward' option.
Another is 'HonorBounds',false, which lets the bounds be violated on occasion during the iteration process; that might lead to a better estimate in the end by better representing the underlying functional form on the way to the final solution.
Then again, some problems are just inherently hard...
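A minimal sketch of how the two option tweaks above would look in a fmincon call (the objective, starting point, and constraint arrays here are placeholders for your own problem):

```matlab
% Sketch: enable central differences and relax strict bound
% feasibility during iterations. Option names are the documented
% optimoptions spellings; objfun, x0, Aeq, beq, lb, ub are placeholders.
opts = optimoptions('fmincon', ...
    'Algorithm','interior-point', ...
    'FiniteDifferenceType','central', ...  % more accurate, ~2x the cost
    'HonorBounds',false);                  % iterates may leave the bounds

% No inequality constraints here ([]), only equalities and bounds
[x, fval] = fmincon(@objfun, x0, [], [], Aeq, beq, lb, ub, [], opts);
```

Note that 'HonorBounds' takes the logical value false, not the string 'false'.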
I had tried 'FiniteDifferenceType','central' and the results were better, but still not something I could consider good enough. I will try it with 'HonorBounds' set to false, thank you.
May I ask, though, why the first calculation of the gradient is not the same? With the external software it is 0 for some variables, whereas it is not with "pts_model". I thought the gradient was dobjfun/dxi at the point xi, where objfun is the objective function. So if the decision vector is the same at the start, shouldn't the gradient in theory be the same? (The expression of the objective function is the same, so an analytical gradient would have the same expression in both cases.) I know that in reality it is not calculated that way and depends on the results given by the external software and "pts_model", but shouldn't the two at least be of the same magnitude? Unless I am completely misunderstanding everything here.
Is it because, as you said, the two models don't produce identical results on the finer scale and that gets reflected in the iterative solver's results? I am just trying to check whether I have misunderstood what the gradient is; in theory, the two should be close enough. And what I really don't get is why, even though the gradients differ, the search direction is the same for some variables in both cases.
I did try to read about what these intermediate results mean, but I still do not completely understand them or how exactly they are determined.
The external model is a black box; we don't know how it operates internally.
Unless you can provide an analytic expression for the gradient, it is computed from the models by finite differences. Just because the two models give the same result at a nominal point doesn't mean the gradient at that point is the same between them if:
  1. there are effects in the external one that aren't incorporated in the MATLAB version, or
  2. the manner in which the two compute those results isn't the same.
In theory, yes, one would hope the two models would have the same gradient. But if the external model uses a numerical solution while yours is analytic, the iteration scheme probably will not return identically the same values; and if the detailed model has convergence issues, it may not be able to compute a reliable gradient for very small deltas around a given point.
You can do some experimenting with the model itself and see how it behaves; it may be that you would need to find another way to compute the derivatives rather than by just blindly calling the model.
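A toy illustration of the point above (this is not the actual external software, just a stand-in): even a small amount of numerical noise in the model output can dominate a forward finite difference at a typical small step size, so two models that agree at a point can yield very different estimated gradients.

```matlab
% Toy example: a smooth model vs. the same model plus small "solver
% noise", differentiated by forward finite differences.
f_exact = @(x) x.^2;               % idealized model
noise   = @(x) 1e-3*sin(1e4*x);    % stand-in for internal numerical noise
f_noisy = @(x) f_exact(x) + noise(x);

x0 = 1;
h  = sqrt(eps);                    % a typical forward-difference step
g_true  = 2*x0;                                % analytic derivative
g_exact = (f_exact(x0+h) - f_exact(x0)) / h;   % close to g_true
g_noisy = (f_noisy(x0+h) - f_noisy(x0)) / h;   % can be far off

fprintf('true %g, exact-model FD %g, noisy-model FD %g\n', ...
        g_true, g_exact, g_noisy);
```

The two models agree everywhere to within 1e-3, yet the finite-difference derivative of the noisy one can be off by far more than that, which is one way a gradient component can come out as 0 (or wildly wrong) for the external model while the coded model gives a sensible value.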
I tried central differences with 'HonorBounds',false and the interior-point algorithm, and this time I get satisfactory results, thank you!
The only problem is that the computational cost is really high (one call to my external software takes around 2-3 seconds), so something my code solves in seconds can take up to an hour with MATLAB + external software. (I know it could be sped up if I provided the gradient, but in my case finding an expression for it is rather difficult.) But anyway, at least now I get the desired results.
Thank you for your help!
Yes, that's an expected (albeit unfortunate) side effect.
You could try other permutations; the obvious one is to keep 'HonorBounds',false but revert to the default for the differences, to see whether it was indeed just the bound limits causing the difficulties.
Otherwise, you would likely have to find another way to estimate the derivatives besides the numerical calculation, and for many models that is admittedly very difficult to do in general.
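One further lever against the runtime, assuming the Parallel Computing Toolbox is available: with a 2-3 s model call, nearly all the time goes into the finite-difference evaluations, and fmincon can spread those across workers via the documented 'UseParallel' option. A sketch:

```matlab
% Sketch: parallelize the finite-difference gradient evaluations.
% With n decision variables, central differencing needs 2*n model
% calls per gradient; these are independent and can run concurrently.
opts = optimoptions('fmincon', ...
    'Algorithm','interior-point', ...
    'FiniteDifferenceType','central', ...
    'HonorBounds',false, ...
    'UseParallel',true);   % distribute difference points over a parpool

% parpool;  % start a worker pool first if one is not already running
```

This does not change the answer, only the wall-clock time, and the speedup is limited by the number of workers versus the 2*n difference points per iteration.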


Answers (0)

Release: R2017b

Asked: AM on 29 Nov 2018
Commented: dpb on 30 Nov 2018
