# Duplicate points evaluated in Bayesian Optimization

Jin Yan on 23 Aug 2019
Answered: Sean de Wolski on 26 Aug 2019
I'm using Bayesian optimization (bayesopt) to solve a problem where each input variable can be either 0 or 1. I set 'IsObjectiveDeterministic' to true and 'UseParallel' to false. However, I found duplicate points in results.XTrace. Is this possible?

Don Mathis on 26 Aug 2019
It's true that re-evaluating a point when the function is deterministic adds no information. The duplication occurs because of an approximation in our implementation. A perfect implementation of the "Expected Improvement" acquisition function would not choose a duplicate point in a deterministic setting because the expected improvement at an observed point would be zero, and any unobserved point would have an expected improvement greater than zero (but possibly tiny).
However, our Gaussian Process modeling function (fitrgp) does not exactly support deterministic functions. Instead, for deterministic functions our implementation assumes a tiny positive noise level, which results in a tiny positive expected improvement, even for observed points.
For a duplicate point to be chosen, the estimated objective function at all unobserved points would need to be very poor, for such a tiny expected improvement at the observed point to win out. Still, that's a limitation of our current implementation. This is something that could in principle be fixed and we'll look into it.
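To make the argument above concrete, here is a minimal numeric sketch of the standard closed-form Expected Improvement for minimization, EI = (f_best - mu) * Phi(z) + sigma * phi(z) with z = (f_best - mu) / sigma. It is not the bayesopt or fitrgp implementation; the function names and the numbers (a posterior standard deviation of 1e-4 at the observed point, and an unobserved point predicted to be worse by 1.0 with standard deviation 0.1) are assumptions chosen only for illustration.

```python
import math

def normal_pdf(z):
    # Standard normal density.
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    # Standard normal CDF; erfc keeps precision in the far lower tail.
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def expected_improvement(mu, sigma, f_best):
    """Closed-form EI for minimization, given posterior mean and std."""
    if sigma == 0.0:
        # No uncertainty: the improvement is known exactly.
        return max(f_best - mu, 0.0)
    z = (f_best - mu) / sigma
    return (f_best - mu) * normal_cdf(z) + sigma * normal_pdf(z)

f_best = 0.0  # best objective value observed so far (illustrative)

# Observed point, exactly deterministic model: zero posterior uncertainty.
ei_observed_exact = expected_improvement(mu=f_best, sigma=0.0, f_best=f_best)

# Same observed point, but the model assumes a tiny noise level (assumed value).
ei_observed_noisy = expected_improvement(mu=f_best, sigma=1e-4, f_best=f_best)

# Unobserved point that the model predicts to be very poor (assumed values).
ei_unobserved_poor = expected_improvement(mu=f_best + 1.0, sigma=0.1, f_best=f_best)

print(ei_observed_exact, ei_observed_noisy, ei_unobserved_poor)
```

Under these assumed numbers the exact model gives zero EI at the observed point, the tiny-noise model gives a small positive EI there, and the poorly predicted unobserved point has an even smaller (but still positive) EI, so the duplicate would be selected.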
Jin Yan on 26 Aug 2019
Thank you very much! This explanation makes perfect sense.

Alan Weiss on 26 Aug 2019
As you can see from the algorithm description, there is nothing that prevents multiple evaluations of the same points. So it is not only possible, but expected behavior. I know that it seems wasteful. Sorry.
Alan Weiss
MATLAB mathematical toolbox documentation
Jin Yan on 26 Aug 2019
Thanks for the reply! I also noticed that the documentation does not explicitly say that the same point won't be evaluated multiple times, but I'm still not quite convinced that this is expected behavior for a deterministic problem.
If a point was already evaluated in a previous iteration, its value is already known, so how can it be selected as the next point to evaluate?
If a point is already evaluated in previous iterations and for some reason it is selected again as the next point to be evaluated, it will not provide any new information to the Gaussian Process model, right?

Sean de Wolski on 26 Aug 2019
You may want to look at the memoize function to cache the initial call so subsequent ones can just use the cached value.
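The general idea behind this suggestion can be sketched as follows. This is a language-agnostic illustration written in Python, not MATLAB's memoize API: the helper `make_cached`, the stand-in objective, and the call counter are all hypothetical names. It is only valid when the objective is deterministic, since a cached value is returned instead of a fresh evaluation.

```python
def make_cached(objective):
    """Wrap a deterministic objective so repeated inputs reuse the stored value."""
    cache = {}

    def cached(x):
        key = tuple(x)  # binary inputs such as [0, 1, 1] become a hashable key
        if key not in cache:
            cache[key] = objective(key)
        return cache[key]

    return cached, cache

calls = {"n": 0}  # counts real evaluations (illustration only)

def expensive_objective(x):
    # Hypothetical stand-in for a costly deterministic objective.
    calls["n"] += 1
    return sum(x)

cached_objective, cache = make_cached(expensive_objective)

first = cached_objective([0, 1, 1])   # evaluates the objective
second = cached_objective([0, 1, 1])  # duplicate point: served from the cache

print(first, second, calls["n"])
```

The optimizer may still propose the duplicate point, but the expensive evaluation runs only once per distinct input.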

R2019a
