TuRBO Bayesian Optimization¶
In this tutorial we demonstrate the use of Xopt to perform Trust Region Bayesian
Optimization (TuRBO) on a simple test problem. When optimizing over
high-dimensional input spaces, off-the-shelf BO tends to over-emphasize
exploration, which severely degrades optimization performance. TuRBO attempts to
prevent this by maintaining a surrogate model over a local (trust) region centered
on the best observation so far and restricting optimization to that local region.
The trust region is expanded or contracted based on the number of consecutive
successful observations (observations that improve on the best observed point) or
unsuccessful observations (no improvement). See
https://botorch.org/tutorials/turbo_1 for details.
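As a rough illustration, the controller logic reduces to tracking consecutive successes and failures and rescaling the trust-region side length whenever either counter reaches its tolerance. The following is a simplified sketch with hypothetical names, not Xopt's actual implementation; defaults mirror the controller configuration shown later in this tutorial (success_tolerance = failure_tolerance = 2, scale_factor = 2.0):

# simplified sketch of the TuRBO length-update rule (hypothetical helper)
def update_length(length, improved, counters, tol=2, scale=2.0, length_max=2.0):
    successes, failures = counters
    if improved:
        # a success resets the failure counter, and vice versa
        successes, failures = successes + 1, 0
    else:
        successes, failures = 0, failures + 1
    if successes == tol:  # enough successes in a row: expand the trust region
        length, successes = min(scale * length, length_max), 0
    elif failures == tol:  # enough failures in a row: contract the trust region
        length, failures = length / scale, 0
    return length, (successes, failures)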
Define the test problem¶
Here we define a simple optimization problem, where we attempt to minimize a function on the domain [0, 2π]. Note that the function used to evaluate the objective takes a dictionary as input and returns a dictionary as output.
from xopt.evaluator import Evaluator
from xopt.generators.bayesian import UpperConfidenceBoundGenerator
from xopt import Xopt
from xopt.vocs import VOCS
import math
import numpy as np
import pandas as pd
import torch
import matplotlib.pyplot as plt
# define variables and function objectives
vocs = VOCS(
    variables={"x": [0, 2 * math.pi]},
    objectives={"f": "MINIMIZE"},
)
# define a test function to optimize
def sin_function(input_dict):
    x = input_dict["x"]
    return {"f": -10 * np.exp(-((x - np.pi) ** 2) / 0.01) + 0.5 * np.sin(5 * x)}
Create Xopt objects¶
Create the evaluator to evaluate our test function and create a generator that uses the Upper Confidence Bound acquisition function to perform Bayesian Optimization.
evaluator = Evaluator(function=sin_function)
generator = UpperConfidenceBoundGenerator(vocs=vocs, turbo_controller="optimize")
X = Xopt(evaluator=evaluator, generator=generator, vocs=vocs)
X
Xopt
________________________________
Version: 2.4.6.dev5+ga295b108.d20250107
Data size: 0
Config as YAML:
dump_file: null
evaluator:
  function: __main__.sin_function
  function_kwargs: {}
  max_workers: 1
  vectorized: false
generator:
  beta: 2.0
  computation_time: null
  custom_objective: null
  fixed_features: null
  gp_constructor:
    covar_modules: {}
    custom_noise_prior: null
    mean_modules: {}
    name: standard
    trainable_mean_keys: []
    transform_inputs: true
    use_cached_hyperparameters: false
    use_low_noise_prior: true
  log_transform_acquisition_function: false
  max_travel_distances: null
  memory_length: null
  model: null
  n_candidates: 1
  n_interpolate_points: null
  n_monte_carlo_samples: 128
  name: upper_confidence_bound
  numerical_optimizer:
    max_iter: 2000
    max_time: null
    n_restarts: 20
    name: LBFGS
  supports_batch_generation: true
  turbo_controller:
    batch_size: 1
    best_value: null
    center_x: null
    dim: 1
    failure_counter: 0
    failure_tolerance: 2
    length: 0.25
    length_max: 2.0
    length_min: 0.0078125
    name: optimize
    restrict_model_data: true
    scale_factor: 2.0
    success_counter: 0
    success_tolerance: 2
    use_cuda: false
max_evaluations: null
serialize_inline: false
serialize_torch: false
strict: true
vocs:
  constants: {}
  constraints: {}
  objectives:
    f: MINIMIZE
  observables: []
  variables:
    x:
    - 0.0
    - 6.283185307179586
Generate and evaluate initial points¶
To begin optimization, we must generate some random initial data points. The first call to X.step() will generate and evaluate a number of random points specified by the generator. Note that if we add data to Xopt before calling X.step() by assigning the data to X.data, calls to X.step() will skip the random generation and proceed directly to generating points via Bayesian optimization.
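For instance, previously collected measurements could be attached before stepping. This is only a sketch of the pre-seeding pattern (the values are hypothetical, and the column names must match the VOCS variable and objective names); in this tutorial we instead evaluate a few chosen points with X.evaluate_data below:

# sketch: pre-seed Xopt with existing measurements so that the first
# X.step() call proceeds directly to Bayesian optimization
X.data = pd.DataFrame({"x": [1.0, 4.0], "f": [0.1, -0.3]})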
X.evaluate_data(pd.DataFrame({"x": [3.0, 1.75, 2.0]}))
# inspect the gathered data
X.data
|   | x    | f         | xopt_runtime | xopt_error |
|---|------|-----------|--------------|------------|
| 0 | 3.00 | -1.021664 | 0.000015     | False      |
| 1 | 1.75 | 0.312362  | 0.000005     | False      |
| 2 | 2.00 | -0.272011 | 0.000003     | False      |
# determine trust region from gathered data
X.generator.train_model()
X.generator.turbo_controller.update_state(X.generator)
X.generator.turbo_controller.get_trust_region(X.generator)
tensor([[2.2146], [3.7854]], dtype=torch.float64)
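These bounds are consistent with a region centered on the best observed point (x = 3.0) with a total width of length × (domain width) = 0.25 × 2π ≈ 1.57 (assuming that, for a single variable, the lengthscale weighting reduces to the raw domain width):

# quick sanity check of the trust region bounds above
center = 3.0  # best observed point so far
half_width = 0.25 * 2 * math.pi / 2  # length fraction times domain width, halved
print(center - half_width, center + half_width)  # ~2.2146, ~3.7854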
Define plotting utility¶
def plot_turbo(X):
    # NOTE: uses the global `test_x` grid defined before the optimization loop below
    # get the Gaussian process model from the generator
    model = X.generator.train_model()

    # get trust region
    trust_region = X.generator.turbo_controller.get_trust_region(X.generator).squeeze()
    scale_factor = X.generator.turbo_controller.length
    region_width = trust_region[1] - trust_region[0]
    best_value = X.generator.turbo_controller.best_value

    # get number of successes and failures
    n_successes = X.generator.turbo_controller.success_counter
    n_failures = X.generator.turbo_controller.failure_counter

    # get acquisition function from generator
    acq = X.generator.get_acquisition(model)

    # calculate model posterior and acquisition function at each test point
    # NOTE: need to add a dimension to the input tensor for evaluating the
    # posterior and another for the acquisition function, see
    # https://botorch.org/docs/batching for details
    # NOTE: we use the `torch.no_grad()` environment to speed up computation by
    # skipping calculations for backpropagation
    with torch.no_grad():
        posterior = model.posterior(test_x.unsqueeze(1))
        acq_val = acq(test_x.reshape(-1, 1, 1))

    # get mean function and confidence regions
    mean = posterior.mean
    lower, upper = posterior.mvn.confidence_region()

    # plot model and acquisition function
    fig, ax = plt.subplots(2, 1, sharex="all")

    # add title for successes and failures
    ax[0].set_title(
        f"n_successes: {n_successes}, n_failures: {n_failures}, "
        f"scale_factor: {scale_factor}, region_width: {region_width:.2}, "
        f"best_value: {best_value:.4}"
    )

    # plot model posterior
    ax[0].plot(test_x, mean, label="Posterior mean")
    ax[0].fill_between(test_x, lower, upper, alpha=0.25, label="Confidence region")

    # add data to model plot
    ax[0].plot(X.data["x"], X.data["f"], "C1o", label="Training data")

    # plot true function
    true_f = sin_function({"x": test_x})["f"]
    ax[0].plot(test_x, true_f, "--", label="Ground truth")

    # plot acquisition function
    ax[1].plot(test_x, acq_val.flatten())

    ax[0].set_ylabel("f")
    ax[0].set_ylim(-12, 10)
    ax[1].set_ylabel(r"$\alpha(x)$")
    ax[1].set_xlabel("x")

    # plot trust region
    for a in ax:
        a.axvline(trust_region[0], c="r", label="Trust region boundary")
        a.axvline(trust_region[1], c="r")

    # add legend
    ax[0].legend(fontsize="x-small")
    fig.tight_layout()
    return fig, ax
Do Bayesian optimization steps¶
Notice that when the number of consecutive successes or failures reaches 2, the trust region expands or contracts and the corresponding counter is reset to zero. Counters are also reset to zero when successes and failures alternate. Finally, the model is most accurate inside the trust region, which supports our goal of local optimization.
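These tolerances can be tuned by constructing the controller explicitly instead of using the "optimize" shorthand. The following is a sketch, assuming OptimizeTurboController lives at this import path and accepts the field names shown in the configuration dump above as keyword arguments:

# sketch: pass a controller instance to tune trust region behavior
from xopt.generators.bayesian.turbo import OptimizeTurboController

turbo = OptimizeTurboController(
    vocs=vocs,
    success_tolerance=3,  # require 3 consecutive successes to expand
    failure_tolerance=3,  # require 3 consecutive failures to contract
)
generator = UpperConfidenceBoundGenerator(vocs=vocs, turbo_controller=turbo)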
# test points for plotting
test_x = torch.linspace(*X.vocs.bounds.flatten(), 500).double()
for i in range(15):
    # plot trust region analysis
    fig, ax = plot_turbo(X)

    # take an optimization step
    X.step()
# inspect the current state of the turbo controller
X.generator.turbo_controller
OptimizeTurboController(vocs=VOCS(variables={'x': [0.0, 6.283185307179586]}, constraints={}, objectives={'f': 'MINIMIZE'}, constants={}, observables=[]), dim=1, batch_size=1, length=0.25, length_min=0.0078125, length_max=2.0, failure_counter=1, failure_tolerance=2, success_counter=0, success_tolerance=2, center_x={'x': 3.1432475513431872}, scale_factor=2.0, restrict_model_data=True, name='optimize', best_value=-10.001398885584534)
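The same state is available programmatically through the controller's attributes (names as shown in the repr above):

# read a few controller attributes directly
tc = X.generator.turbo_controller
print(tc.center_x, tc.best_value, tc.length)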
X.data
|    | x        | f          | xopt_runtime | xopt_error |
|----|----------|------------|--------------|------------|
| 0  | 3.000000 | -1.021664  | 0.000015     | False      |
| 1  | 1.750000 | 0.312362   | 0.000005     | False      |
| 2  | 2.000000 | -0.272011  | 0.000003     | False      |
| 3  | 3.392699 | -0.493621  | 0.000012     | False      |
| 4  | 2.830559 | 0.499310   | 0.000013     | False      |
| 5  | 3.075151 | -6.267951  | 0.000013     | False      |
| 6  | 3.110863 | -9.022391  | 0.000013     | False      |
| 7  | 3.210576 | -6.382496  | 0.000013     | False      |
| 8  | 3.503562 | -0.485802  | 0.000012     | False      |
| 9  | 3.148044 | -9.974594  | 0.000011     | False      |
| 10 | 3.143248 | -10.001399 | 0.000013     | False      |
| 11 | 3.535947 | -0.460342  | 0.000012     | False      |
| 12 | 3.143810 | -10.000627 | 0.000012     | False      |
| 13 | 3.143774 | -10.000696 | 0.000012     | False      |
| 14 | 2.357849 | -0.350616  | 0.000013     | False      |
| 15 | 3.143760 | -10.000721 | 0.000030     | False      |
| 16 | 3.928646 | 0.356467   | 0.000012     | False      |
| 17 | 3.143760 | -10.000722 | 0.000012     | False      |
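The best observed point sits slightly above x = π because the 0.5·sin(5x) term is negative just past 5π, pulling the true minimum of the test function a little below -10. We can confirm the best sample against the true objective:

# verify the best observed value against the true objective
best = X.data.loc[X.data["f"].idxmin()]
print(best["x"], best["f"])  # x ~ 3.1432, f ~ -10.0014
print(sin_function({"x": best["x"]})["f"])  # re-evaluating reproduces the stored value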