Best choice of fmincon algorithm for high-cost 1D convex problem with "stochastic" gradient information?

Question

Hayden Scheiber on 3 Dec 2021

0
Link

Direct link to this question

https://www.mathworks.com/matlabcentral/answers/1602495-best-choice-of-fmincon-algorithm-for-high-cost-1d-convex-problem-with-stochastic-gradient-informat

Edited: John D'Errico on 3 Dec 2021

I am working on finding the melting point of a molecular dynamics simulation model by recasting it as a minimization problem. I have developed a high-cost stochastic blackbox function that I want to minimize.

Behind the blackbox, I'm running a molecular dynamics simulation of a solid-liquid interface at a specific input temperature T and detecting how the fraction of liquid vs solid changes in time. The MD simulation runs until the system completely freezes, completely melts, or a time limit is reached.
If the simulation freezes or melts, the blackbox function returns a number where is the time it takes for the system to melt or freeze (pc = phase change).
The blackbox function also returns a "gradient", equal to where if the system freezes or if the system melts.
This isn't an quite an exact gradient for the function as there is some stochasticity in , but I want to provide fmincon with information about the direction to take for the next step.
The time it takes to melt or freeze also increases as you approach the melting point, so is a natural choice of an approximate gradient and is a natural choice for the blackbox function value. As goes to infinity, goes to 0.
If the simulation reaches without melting or freeze, I declare that the melting point is reached and return and . I have fmincon 'FunctionTolerance' set to 0 so that the algorithm stops at the melting point.
Running this simulation is quite costly and takes ~30 minutes on 32 high-performance compute cores for each iteration, so I want to reduce the number of function evaluations as much as possible.
The problem is only 1-dimensional (input variable is the temperature) and should be convex, although is "stochastically convex" near the melting point (the exact time to freeze or melt depends on the particular simulation trajectory, which is chosen randomly from an ensemble). Still, the direction of the gradient is always correct, so it is useful to keep this information.

My question is which of the five fmincon algorithms is most suitable for my case? Or perhaps there is another MATLAB function better suited than fmincon? I don't have a strong grasp on the details of the algorithms in terms of their efficiency to my problem, even after reading through the documentation. Again, I really want to reduce the number of function evaluations as much as possible but also use the gradient-direction information that I have.

0 Comments
Show -2 older commentsHide -2 older comments

Sign in to comment.

Sign in to answer this question.

Answer 1

John D'Errico on 3 Dec 2021

0
Link

Direct link to this answer

https://www.mathworks.com/matlabcentral/answers/1602495-best-choice-of-fmincon-algorithm-for-high-cost-1d-convex-problem-with-stochastic-gradient-informat#answer_847050

Edited: John D'Errico on 3 Dec 2021

A 1-d problem, so ONE unknown? Don't use fmincon at all!

And if you wish to supply the gradient, then you cannot supply only an approximate gradient. That will just cause problems. Fmincon does not have an option where you can give it hints.

You MIGHT consider fminbnd, a tool designed to solve 1-d problems.

And of course, since you have a stochastic problem, in the sense that it will not even be differnetiable near the solution, again, fmincon is a terrible choice of tool. fmincon will assume differentiability. It sill assume your objective will return a consistent value, so if you evaluate the function a second time at the same point, it should return the same value. So even fminbnd will have issues due to a stochastic objective.

I would probably just write my own code, where I use the last few objective evauations to form a local polynomial approximation to the problem. At the same time, you can use that modeling to determine an approximation for the level of noise to be expected. And that will tell the algorithm when to quit, that further searching will not yield a better approximation to the solution.

Finally, could you use such a tool to include prior information on the approximate gradient as a hint in the search? Well, yes, you could. Now the problem will start to look like a Kalman filter. Estimate the coefficients of a local quadratic polynomial, given some prior information about the derivative of said polynomial. Then find the minimum of the polynomial as your next iteration. Since you also have information about the uncertainty in your coefficients from the kalman filter, that will tell you how much you can trust that next step, and tell you when to stop looking. This is code you would write yourself of course.

4 Comments
Show 2 older commentsHide 2 older comments

Hayden Scheiber on 3 Dec 2021

Edited: Hayden Scheiber on 3 Dec 2021

Yes, one unknown and with simple boundaries on the search domain. I considered fminbnd, but that does not include derivative information so it would likely require more iterations than fmincon for this problem. I've done some preliminary calculations using the default 'interior-point'algorithm and it actually works fairly well, I can get melting points for many models, but sometimes the algorithm takes a lot of small steps. Thankfully, although my function is stochastic, the level of noise is quite small becuase I use a fairly large simulation under the hood and I only need melting points accurate to ~ 5 K.

Of course I could write my own code but... I really want to avoid getting into the business of writing optimization code as there's so many pre-built functions that should in principle work better than anything I can make (I'm a computational chemist not a mathematician or computer scientist), but if that's the only reasonable suggestion then I suppose I'll go that route. I'm just surprised that matlab optimization/global optimization toolboxes don't have anything to offer for a problem like this.

Hayden Scheiber on 3 Dec 2021

Edited: Hayden Scheiber on 3 Dec 2021

Thank you for your responses John, they have been helpful.

I would classify my problem as only "slightly stochastic". With repeated simulations, it seems to seem to reproduce the same evaluation for a given input temperature with only a small error. In the thermodynamic limit, my function would approach fully deterministic. The stochastic derivatives are close enough to deterministic such that, in preliminary tests so far, I can get a good answer using the deterministic fmincon default algorithm in most cases. I need to check just how noisy it actually is.

I realize that using non-exact derivatives will cost me compared to a case with exact derivatives. However, I think it remains to be seen whether my slightly stochastic derivatives speed up convergence compared to not using them (direct search). It all depends on how much error is really in these blackboxfunction evaluations, which I have not yet quantified. I think I'll test this with repeated calls at the same temperature to build up a distribution of outcomes.

Regarding your third paragraph that's simply not true. For example, stochastic bayesian optimization is very well suited for stochastic objectives with a high cost objective function, and apparently someone has recently developed a bayesian optimization algorithm for use with noisy deriatives, although this feature is not implemented in Matlab's bayesopt unfortunately. In any case, I think I'm just going to try a few solvers including bayesopt, fminbnd, fmincon with different algorithms, and patternsearch to explore the problem deeper.

My main question here was which of the fmincon algorithms is best suited for a 1D convex problem like mine where derivatives are available (ignoring the issue of stochastistity).

Thanks for the suggestion about response surface optimization, it sounds similar to surrogate optimization, which is basically how bayesian optimization functions.

John D'Errico on 3 Dec 2021

Edited: John D'Errico on 3 Dec 2021

While there will certainly be tools in existence, they are not provided in the optimization TB. I lack experience with tools like bayesopt to help you in that respect, sicne I lack that TB.

Sign in to comment.

Best choice of fmincon algorithm for high-cost 1D convex problem with "stochastic" gradient information?

0 Comments
Show -2 older commentsHide -2 older comments

Answers (1)

4 Comments
Show 2 older commentsHide 2 older comments

See Also

Categories

Tags

Products

Release

Community Treasure Hunt

Best choice of fmincon algorithm for high-cost 1D convex problem with "stochastic" gradient information?

0 Comments Show -2 older commentsHide -2 older comments

Answers (1)

4 Comments Show 2 older commentsHide 2 older comments

See Also

Categories

Tags

Products

Release

Community Treasure Hunt

0 Comments
Show -2 older commentsHide -2 older comments

4 Comments
Show 2 older commentsHide 2 older comments