bayesopt() fails to find any feasible solutions when feasible proportion of search space < ~ 10^-5 %
I have been conducting experiments using bayesopt() to optimise model parameters. A reasonably simple constraint function defines the feasible region of the search space. bayesopt has performed well across problems with varying numbers of dimensions and varying sizes of search space and feasible region. However, I have noticed that once the feasible proportion of the search space drops below a threshold (< ~10^-5 %), bayesopt stops working well. Closer examination of all the solutions evaluated reveals that, after evaluating InitialX, bayesopt exclusively evaluates solutions which violate the constraint function. This seems unusual, and I want to establish whether it is due to bayesopt's implementation or an inherent limitation of the Bayesian optimisation approach.
I have run into a related issue in the past, where bayesopt failed to start because it decided the constraint function was unsatisfiable (even though it was satisfiable): Previous bayesopt constraint function thread. This happened because, to assess whether a constraint function is satisfiable, bayesopt generates 10^5 random solutions and tests whether any of them satisfy the constraint. I was given a way to remove this check.
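To see why that pre-check fails for a tiny feasible region, the arithmetic can be sketched as below (an illustration of the sampling argument, not MathWorks' actual code): if the feasible region occupies a fraction p of the space and the check draws 10^5 independent uniform samples, the chance that at least one sample is feasible is 1 - (1 - p)^100000.

```python
import math

def prob_check_finds_feasible(p, n_samples=100_000):
    """Probability that at least one of n_samples uniform random draws
    lands in a feasible region occupying fraction p of the space."""
    return 1.0 - (1.0 - p) ** n_samples

# A feasible proportion of 10^-5 % of the space means p = 1e-7.
p = 1e-7
print(prob_check_finds_feasible(p))    # ~0.00995: the check almost always fails
print(1.0 - math.exp(-p * 100_000))    # Poisson approximation gives the same value
```

So at p = 1e-7 the satisfiability check concludes "unsatisfiable" roughly 99% of the time even though feasible points exist.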
I wonder if the problem I am encountering now is similar, i.e. at each iteration bayesopt creates a large array of candidate solutions, none of which is feasible because the likelihood of selecting a feasible solution at random is too low. However, this does not fit with my understanding of Bayesian optimisation: once some feasible solutions are in the Gaussian process model, bayesopt should not be choosing new candidates purely at random?
Any discussion appreciated. Very happy to clarify further if necessary.
Don Mathis on 26 Mar 2018
Gaussian Process-based constrained Bayesian optimization does not work well when the feasible region occupies such a small fraction of the search space. The GPs will have trouble modeling the feasible region. Also, our implementation will have numerical difficulties finding feasible points. In this situation we recommend that you try to redefine your space so that the feasible region is a larger fraction of the space.
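The effect of redefining the space can be made concrete with a hypothetical example (my own illustration, not from the original thread): suppose the constraint were |x1 - x2| <= eps on the unit square. In the original coordinates the feasible strip occupies only a fraction of about 2*eps of the space, but reparameterizing with s = x1 + x2 and d = x1 - x2, and bounding d in [-eps, eps] directly, makes essentially the whole new search box feasible.

```python
# Hypothetical constraint: |x1 - x2| <= eps on [0,1]^2.
eps = 1e-7

# Exact area of the strip |x1 - x2| <= eps inside the unit square:
frac_original = 2 * eps - eps**2   # ~2e-7 of the space is feasible

# After reparameterizing to (s, d) = (x1 + x2, x1 - x2) with d bounded
# in [-eps, eps], the constraint holds everywhere in the new box (the
# remaining restriction to the unit square becomes simple bound logic):
frac_reparam = 1.0

print(frac_original)   # ~2e-07
print(frac_reparam)    # 1.0
```

The feasible fraction goes from ~2e-7 to ~1, which is exactly the kind of redefinition the recommendation above is after.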
For optimization with "unknown constraints" (where constraints must be learned from function evaluations), bayesopt() uses the algorithm of Gelbart, M., J. Snoek, R. P. Adams. Bayesian Optimization with Unknown Constraints. http://arxiv.org/abs/1403.5607, 2014.
At every step, to choose the next point, it tries to maximize the constraint-weighted acquisition function, which is basically the acquisition function multiplied by the probability that the constraint is satisfied according to the current GP constraint model. This is a global optimization problem. Our implementation chooses 10,000 random points from the space, evaluates the constraint-weighted acquisition function at each, then takes the best 10 and runs a local optimization of the same function starting from each of them. If the proportion of the space in which the constraint is satisfied is 10^-5, then the probability that any of the 10,000 random points will be feasible is less than 0.1. That's one problem.
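A minimal sketch of that selection step (an illustration of the scheme described, not the actual bayesopt source; acq and prob_feasible are toy stand-ins for the real acquisition function and GP constraint model):

```python
import numpy as np

rng = np.random.default_rng(0)

def constraint_weighted_selection(acq, prob_feasible, lo, hi,
                                  n_random=10_000, n_top=10):
    """Draw random candidates, score each by acquisition * P(feasible),
    and return the best few as starting points for local optimization."""
    cands = rng.uniform(lo, hi, size=(n_random, len(lo)))
    scores = acq(cands) * prob_feasible(cands)
    return cands[np.argsort(scores)[-n_top:]]

# Toy 2-D example: a hypothetical tiny feasible disk. If no random candidate
# lands in it, every constraint-weighted score is ~0 and the local searches
# start from effectively arbitrary points.
lo, hi = np.zeros(2), np.ones(2)
acq = lambda X: np.ones(len(X))              # flat acquisition, for illustration
center, radius = np.array([0.5, 0.5]), 1e-4  # feasible fraction ~ pi * r^2
prob_feasible = lambda X: (np.linalg.norm(X - center, axis=1) < radius).astype(float)

starts = constraint_weighted_selection(acq, prob_feasible, lo, hi)
print(starts.shape)                          # (10, 2)

# The "< 0.1" figure from the answer above: chance that any of the
# 10,000 draws is feasible when the feasible fraction is 1e-5.
print(1.0 - (1.0 - 1e-5) ** 10_000)          # ~0.095
```

With a feasible fraction of 1e-5, most iterations draw no feasible candidate at all, so the constraint-weighted scores carry no signal about where the feasible region is.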
A second problem is that it may be difficult for a global GP model over the space to model the feasible region, since it occupies such a small fraction of the space. The GP kernel can only handle constraint-function changes up to a certain steepness; if your constraint function is very steep, it may not be learnable by the GP.