Bayesian Optimization with Heteroskedastic Noise GP Modeling¶
In this tutorial we demonstrate the use of Xopt to perform Bayesian optimization on a simple test problem. The problem exhibits non-uniform (heteroskedastic) noise, which we account for in the GP model. This requires explicit specification of the measurement variance.
Define the test problem¶
Here we define a simple optimization problem, where we attempt to minimize the sin function on the domain [0, 2π]. Note that the function used to evaluate the objective takes a dictionary as input and returns a dictionary as output.
from xopt.vocs import VOCS
from xopt.evaluator import Evaluator
from xopt.generators.bayesian import UpperConfidenceBoundGenerator
from xopt import Xopt
import math
import numpy as np
import torch
# define variables and function objectives
vocs = VOCS(
    variables={"x": [0, 2 * math.pi]},
    objectives={"f": "MINIMIZE"},
)
Specifying measurement variance¶
We specify the variance of an objective measurement by appending _var to the objective name. This information is collected by the model constructor to build a heteroskedastic noise model.
# define a test function to optimize
# the test function also returns an estimation of the variance, which is
# used to create a Heteroskedastic noise model for the gp
def sin_function(input_dict):
    return {"f": np.sin(input_dict["x"]), "f_var": 0.001 * input_dict["x"]}
Create Xopt objects¶
Create the evaluator to evaluate our test function and create a generator that uses the Upper Confidence Bound acquisition function to perform Bayesian Optimization.
evaluator = Evaluator(function=sin_function)
generator = UpperConfidenceBoundGenerator(vocs=vocs)
X = Xopt(evaluator=evaluator, generator=generator, vocs=vocs)
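The UCB acquisition function trades off exploration and exploitation through its beta parameter, which defaults to 2.0 (as shown in the generator options at the end of this tutorial). If desired, it can be set explicitly at construction:
# optionally specify the exploration parameter of the UCB acquisition function
generator = UpperConfidenceBoundGenerator(vocs=vocs, beta=2.0)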
Generate and evaluate initial points¶
To begin optimization, we must generate some random initial data points. The first call to X.step() will generate and evaluate a number of random points specified by the generator. Note that if we add data to Xopt before calling X.step() by assigning it to X.data, calls to X.step() will skip the random generation and proceed directly to generating points via Bayesian optimization (see the sketch below).
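For example, if measurements of our function already exist, they can be supplied before the first step. A minimal sketch, assuming Xopt's add_data method (which appends a DataFrame to X.data; column names must match the VOCS specification):
import pandas as pd

# previously collected measurements: variable "x", objective "f", and its variance "f_var"
initial_data = pd.DataFrame(
    {"x": [0.5, 1.5], "f": np.sin([0.5, 1.5]), "f_var": [0.0005, 0.0015]}
)
X.add_data(initial_data)  # X.step() will now skip random initialization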
# call X.random_evaluate() to generate + evaluate 4 initial points
X.random_evaluate(4)
# inspect the gathered data
X.data
|   | x | f | f_var | xopt_runtime | xopt_error |
|---|---|---|---|---|---|
| 0 | 1.488744 | 0.996636 | 0.001489 | 0.000009 | False |
| 1 | 3.022200 | 0.119109 | 0.003022 | 0.000003 | False |
| 2 | 2.658750 | 0.464299 | 0.002659 | 0.000002 | False |
| 3 | 4.547385 | -0.986418 | 0.004547 | 0.000001 | False |
Do Bayesian optimization steps¶
To perform optimization we simply call X.step() in a loop. This allows us to do intermediate tasks between optimization steps, such as examining the model and acquisition function at each step (as we demonstrate here).
n_steps = 5

# test points for plotting
test_x = torch.linspace(*X.vocs.bounds.flatten(), 50).double()

for i in range(n_steps):
    # get the Gaussian process model from the generator
    model = X.generator.train_model()

    # visualize model
    fig, ax = X.generator.visualize_model(n_grid=len(test_x))

    # plot true function
    true_f = sin_function({"x": test_x})["f"]
    ax[0, 0].plot(test_x, true_f, "C1--")

    # do the optimization step
    X.step()
# access the collected data
X.data
|   | x | f | f_var | xopt_runtime | xopt_error |
|---|---|---|---|---|---|
| 0 | 1.488744 | 9.966356e-01 | 0.001489 | 0.000009 | False |
| 1 | 3.022200 | 1.191092e-01 | 0.003022 | 0.000003 | False |
| 2 | 2.658750 | 4.642991e-01 | 0.002659 | 0.000002 | False |
| 3 | 4.547385 | -9.864178e-01 | 0.004547 | 0.000001 | False |
| 4 | 6.283185 | -2.449294e-16 | 0.006283 | 0.000009 | False |
| 5 | 4.723356 | -9.999399e-01 | 0.004723 | 0.000009 | False |
| 6 | 4.685555 | -9.996400e-01 | 0.004686 | 0.000009 | False |
| 7 | 4.654580 | -9.983295e-01 | 0.004655 | 0.000009 | False |
| 8 | 4.599864 | -9.936757e-01 | 0.004600 | 0.000008 | False |
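The progress of the optimization can be summarized by plotting the running best objective value from X.data. A minimal sketch using matplotlib (not imported above):
import matplotlib.pyplot as plt

# running minimum of the objective over all collected samples
best_f = X.data["f"].cummin()
plt.plot(best_f.values, marker="o")
plt.xlabel("evaluation index")
plt.ylabel("best observed f")
plt.show()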
Getting the optimization result¶
To get the best point (without evaluating it) we ask the generator to predict the optimum based on the posterior mean.
X.generator.get_optimum()
|   | x |
|---|---|
| 0 | 4.641035 |
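To measure the objective at the predicted optimum, the returned DataFrame can be passed back through the evaluator. A sketch assuming Xopt's evaluate_data method, which evaluates a set of inputs and appends the results to X.data:
# evaluate the predicted optimum with the true objective function
X.evaluate_data(X.generator.get_optimum())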
Customizing optimization¶
Each generator has a set of options that can be modified to affect optimization behavior.
X.generator.dict()
{'model': ModelListGP(
    (models): ModuleList(
      (0): XoptHeteroskedasticSingleTaskGP(
        (likelihood): _GaussianLikelihoodBase(
          (noise_covar): HeteroskedasticNoise(
            (noise_model): SingleTaskGP(
              (likelihood): GaussianLikelihood(
                (noise_covar): HomoskedasticNoise(
                  (noise_prior): SmoothedBoxPrior()
                  (raw_noise_constraint): GreaterThan(1.000E-04)
                )
              )
              (mean_module): ConstantMean()
              (covar_module): RBFKernel(
                (lengthscale_prior): LogNormalPrior()
                (raw_lengthscale_constraint): GreaterThan(2.500E-02)
              )
              (outcome_transform): Log()
            )
            (_noise_constraint): GreaterThan(1.000E-04)
          )
        )
        (mean_module): ConstantMean()
        (covar_module): RBFKernel(
          (lengthscale_prior): LogNormalPrior()
          (raw_lengthscale_constraint): GreaterThan(2.500E-02)
        )
        (input_transform): Normalize()
        (outcome_transform): Standardize()
      )
    )
    (likelihood): LikelihoodList(
      (likelihoods): ModuleList(
        (0): _GaussianLikelihoodBase(
          (noise_covar): HeteroskedasticNoise(
            (noise_model): SingleTaskGP(
              (likelihood): GaussianLikelihood(
                (noise_covar): HomoskedasticNoise(
                  (noise_prior): SmoothedBoxPrior()
                  (raw_noise_constraint): GreaterThan(1.000E-04)
                )
              )
              (mean_module): ConstantMean()
              (covar_module): RBFKernel(
                (lengthscale_prior): LogNormalPrior()
                (raw_lengthscale_constraint): GreaterThan(2.500E-02)
              )
              (outcome_transform): Log()
            )
            (_noise_constraint): GreaterThan(1.000E-04)
          )
        )
      )
    )
  ),
 'n_monte_carlo_samples': 128,
 'turbo_controller': None,
 'use_cuda': False,
 'gp_constructor': {'name': 'standard',
  'use_low_noise_prior': True,
  'covar_modules': {},
  'mean_modules': {},
  'trainable_mean_keys': [],
  'transform_inputs': True,
  'custom_noise_prior': None,
  'use_cached_hyperparameters': False},
 'numerical_optimizer': {'name': 'LBFGS',
  'n_restarts': 20,
  'max_iter': 2000,
  'max_time': None},
 'max_travel_distances': None,
 'fixed_features': None,
 'computation_time':    training  acquisition_optimization
 0  0.240819                  0.023323
 1  0.310560                  0.043018
 2  0.285550                  0.040264
 3  0.226251                  0.045517
 4  0.275017                  0.062930,
 'custom_objective': None,
 'n_interpolate_points': None,
 'n_candidates': 1,
 'beta': 2.0}
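For example, the exploration/exploitation balance and the acquisition function optimizer can be adjusted by assigning to the corresponding fields. A minimal sketch, with field names taken from the options dictionary above:
# increase exploration by raising the UCB beta parameter
X.generator.beta = 4.0

# use more L-BFGS restarts when optimizing the acquisition function
X.generator.numerical_optimizer.n_restarts = 40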