TuRBO Bayesian Optimization¶
In this tutorial we demonstrate the use of Xopt to perform Trust Region Bayesian
Optimization (TuRBO) on a simple test problem. When optimizing over
high-dimensional input spaces, off-the-shelf BO tends to over-emphasize
exploration, which severely degrades optimization performance. TuRBO attempts to
prevent this by maintaining a surrogate model over a local (trust) region centered
on the best observation so far and restricting optimization to that local region.
The trust region is expanded or contracted based on the number of consecutive
successful observations (observations that improve on the best observed point) or
unsuccessful observations (no improvement). See
https://botorch.org/tutorials/turbo_1 for details.
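As a rough illustration, the controller logic reduces to tracking consecutive successes and failures and rescaling the trust-region side length whenever either counter reaches its tolerance. The following is a simplified sketch with hypothetical names, not Xopt's actual implementation; defaults mirror the controller configuration shown later in this tutorial (success_tolerance = failure_tolerance = 2, scale_factor = 2.0):

# simplified sketch of the TuRBO length-update rule (hypothetical helper)
def update_length(length, improved, counters, tol=2, scale=2.0, length_max=2.0):
    successes, failures = counters
    if improved:
        # a success resets the failure counter, and vice versa
        successes, failures = successes + 1, 0
    else:
        successes, failures = 0, failures + 1
    if successes == tol:  # enough successes in a row: expand the trust region
        length, successes = min(scale * length, length_max), 0
    elif failures == tol:  # enough failures in a row: contract the trust region
        length, failures = length / scale, 0
    return length, (successes, failures)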
Define the test problem¶
Here we define a simple optimization problem, where we attempt to minimize a function on the domain [0, 2π]. Note that the function used to evaluate the objective takes a dictionary as input and returns a dictionary as output.
from xopt.evaluator import Evaluator
from xopt.generators.bayesian import UpperConfidenceBoundGenerator
from xopt import Xopt
from xopt.vocs import VOCS
import math
import numpy as np
import pandas as pd
import torch
import matplotlib.pyplot as plt
# define variables and function objectives
vocs = VOCS(
    variables={"x": [0, 2 * math.pi]},
    objectives={"f": "MINIMIZE"},
)
# define a test function to optimize
def sin_function(input_dict):
    x = input_dict["x"]
    return {"f": -10 * np.exp(-((x - np.pi) ** 2) / 0.01) + 0.5 * np.sin(5 * x)}
Create Xopt objects¶
Create the evaluator to evaluate our test function and create a generator that uses the Upper Confidence Bound acquisition function to perform Bayesian Optimization.
evaluator = Evaluator(function=sin_function)
generator = UpperConfidenceBoundGenerator(vocs=vocs, turbo_controller="optimize")
X = Xopt(evaluator=evaluator, generator=generator, vocs=vocs)
X
Xopt
________________________________
Version: 2.4.6.dev5+ga295b108.d20250107
Data size: 0
Config as YAML:
dump_file: null
evaluator:
  function: __main__.sin_function
  function_kwargs: {}
  max_workers: 1
  vectorized: false
generator:
  beta: 2.0
  computation_time: null
  custom_objective: null
  fixed_features: null
  gp_constructor:
    covar_modules: {}
    custom_noise_prior: null
    mean_modules: {}
    name: standard
    trainable_mean_keys: []
    transform_inputs: true
    use_cached_hyperparameters: false
    use_low_noise_prior: true
  log_transform_acquisition_function: false
  max_travel_distances: null
  memory_length: null
  model: null
  n_candidates: 1
  n_interpolate_points: null
  n_monte_carlo_samples: 128
  name: upper_confidence_bound
  numerical_optimizer:
    max_iter: 2000
    max_time: null
    n_restarts: 20
    name: LBFGS
  supports_batch_generation: true
  turbo_controller:
    batch_size: 1
    best_value: null
    center_x: null
    dim: 1
    failure_counter: 0
    failure_tolerance: 2
    length: 0.25
    length_max: 2.0
    length_min: 0.0078125
    name: optimize
    restrict_model_data: true
    scale_factor: 2.0
    success_counter: 0
    success_tolerance: 2
    use_cuda: false
max_evaluations: null
serialize_inline: false
serialize_torch: false
strict: true
vocs:
  constants: {}
  constraints: {}
  objectives:
    f: MINIMIZE
  observables: []
  variables:
    x:
    - 0.0
    - 6.283185307179586
Generate and evaluate initial points¶
To begin optimization, we must generate some random initial data points. The first call to X.step() will generate and evaluate a number of random points specified by the generator. Note that if we add data to Xopt before calling X.step() by assigning the data to X.data, calls to X.step() will skip the random generation and proceed directly to generating points via Bayesian optimization.
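For instance, previously collected measurements could be attached before stepping. This is only a sketch of the pre-seeding pattern (the values are hypothetical, and the column names must match the VOCS variable and objective names); in this tutorial we instead evaluate a few chosen points with X.evaluate_data below:

# sketch: pre-seed Xopt with existing measurements so that the first
# X.step() call proceeds directly to Bayesian optimization
X.data = pd.DataFrame({"x": [1.0, 4.0], "f": [0.1, -0.3]})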
X.evaluate_data(pd.DataFrame({"x": [3.0, 1.75, 2.0]}))
# inspect the gathered data
X.data
|   | x    | f         | xopt_runtime | xopt_error |
|---|------|-----------|--------------|------------|
| 0 | 3.00 | -1.021664 | 0.000015     | False      |
| 1 | 1.75 | 0.312362  | 0.000005     | False      |
| 2 | 2.00 | -0.272011 | 0.000003     | False      |
# determine trust region from gathered data
X.generator.train_model()
X.generator.turbo_controller.update_state(X.generator)
X.generator.turbo_controller.get_trust_region(X.generator)
tensor([[2.2146], [3.7854]], dtype=torch.float64)
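These bounds are consistent with a region centered on the best observed point (x = 3.0) with a total width of length × (domain width) = 0.25 × 2π ≈ 1.57 (assuming that, for a single variable, the lengthscale weighting reduces to the raw domain width):

# quick sanity check of the trust region bounds above
center = 3.0  # best observed point so far
half_width = 0.25 * 2 * math.pi / 2  # length fraction times domain width, halved
print(center - half_width, center + half_width)  # ~2.2146, ~3.7854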
Define plotting utility¶
def plot_turbo(X):
    # NOTE: uses the global `test_x` grid defined before the optimization loop below
    # get the Gaussian process model from the generator
    model = X.generator.train_model()

    # get trust region
    trust_region = X.generator.turbo_controller.get_trust_region(X.generator).squeeze()
    scale_factor = X.generator.turbo_controller.length
    region_width = trust_region[1] - trust_region[0]
    best_value = X.generator.turbo_controller.best_value

    # get number of successes and failures
    n_successes = X.generator.turbo_controller.success_counter
    n_failures = X.generator.turbo_controller.failure_counter

    # get acquisition function from generator
    acq = X.generator.get_acquisition(model)

    # calculate model posterior and acquisition function at each test point
    # NOTE: need to add a dimension to the input tensor for evaluating the
    # posterior and another for the acquisition function, see
    # https://botorch.org/docs/batching for details
    # NOTE: we use the `torch.no_grad()` environment to speed up computation by
    # skipping calculations for backpropagation
    with torch.no_grad():
        posterior = model.posterior(test_x.unsqueeze(1))
        acq_val = acq(test_x.reshape(-1, 1, 1))

    # get mean function and confidence regions
    mean = posterior.mean
    lower, upper = posterior.mvn.confidence_region()

    # plot model and acquisition function
    fig, ax = plt.subplots(2, 1, sharex="all")

    # add title for successes and failures
    ax[0].set_title(
        f"n_successes: {n_successes}, n_failures: {n_failures}, "
        f"scale_factor: {scale_factor}, region_width: {region_width:.2}, "
        f"best_value: {best_value:.4}"
    )

    # plot model posterior
    ax[0].plot(test_x, mean, label="Posterior mean")
    ax[0].fill_between(test_x, lower, upper, alpha=0.25, label="Confidence region")

    # add data to model plot
    ax[0].plot(X.data["x"], X.data["f"], "C1o", label="Training data")

    # plot true function
    true_f = sin_function({"x": test_x})["f"]
    ax[0].plot(test_x, true_f, "--", label="Ground truth")

    # plot acquisition function
    ax[1].plot(test_x, acq_val.flatten())

    ax[0].set_ylabel("f")
    ax[0].set_ylim(-12, 10)
    ax[1].set_ylabel(r"$\alpha(x)$")
    ax[1].set_xlabel("x")

    # plot trust region
    for a in ax:
        a.axvline(trust_region[0], c="r", label="Trust region boundary")
        a.axvline(trust_region[1], c="r")

    # add legend
    ax[0].legend(fontsize="x-small")
    fig.tight_layout()
    return fig, ax
Do Bayesian optimization steps¶
Notice that when the number of consecutive successes or failures reaches 2, the trust region expands or contracts and the corresponding counter is reset to zero. Counters are also reset to zero when successes and failures alternate. Finally, the model is most accurate inside the trust region, which supports our goal of local optimization.
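These tolerances can be tuned by constructing the controller explicitly instead of using the "optimize" shorthand. The following is a sketch, assuming OptimizeTurboController lives at this import path and accepts the field names shown in the configuration dump above as keyword arguments:

# sketch: pass a controller instance to tune trust region behavior
from xopt.generators.bayesian.turbo import OptimizeTurboController

turbo = OptimizeTurboController(
    vocs=vocs,
    success_tolerance=3,  # require 3 consecutive successes to expand
    failure_tolerance=3,  # require 3 consecutive failures to contract
)
generator = UpperConfidenceBoundGenerator(vocs=vocs, turbo_controller=turbo)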
# test points for plotting
test_x = torch.linspace(*X.vocs.bounds.flatten(), 500).double()
for i in range(15):
    # plot trust region analysis
    fig, ax = plot_turbo(X)

    # take an optimization step
    X.step()
# inspect the current state of the turbo controller
X.generator.turbo_controller
OptimizeTurboController(vocs=VOCS(variables={'x': [0.0, 6.283185307179586]}, constraints={}, objectives={'f': 'MINIMIZE'}, constants={}, observables=[]), dim=1, batch_size=1, length=0.25, length_min=0.0078125, length_max=2.0, failure_counter=1, failure_tolerance=2, success_counter=0, success_tolerance=2, center_x={'x': 3.1432475513431872}, scale_factor=2.0, restrict_model_data=True, name='optimize', best_value=-10.001398885584534)
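The same state is available programmatically through the controller's attributes (names as shown in the repr above):

# read a few controller attributes directly
tc = X.generator.turbo_controller
print(tc.center_x, tc.best_value, tc.length)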
X.data
|    | x        | f          | xopt_runtime | xopt_error |
|----|----------|------------|--------------|------------|
| 0  | 3.000000 | -1.021664  | 0.000015     | False      |
| 1  | 1.750000 | 0.312362   | 0.000005     | False      |
| 2  | 2.000000 | -0.272011  | 0.000003     | False      |
| 3  | 3.392699 | -0.493621  | 0.000012     | False      |
| 4  | 2.830559 | 0.499310   | 0.000013     | False      |
| 5  | 3.075151 | -6.267951  | 0.000013     | False      |
| 6  | 3.110863 | -9.022391  | 0.000013     | False      |
| 7  | 3.210576 | -6.382496  | 0.000013     | False      |
| 8  | 3.503562 | -0.485802  | 0.000012     | False      |
| 9  | 3.148044 | -9.974594  | 0.000011     | False      |
| 10 | 3.143248 | -10.001399 | 0.000013     | False      |
| 11 | 3.535947 | -0.460342  | 0.000012     | False      |
| 12 | 3.143810 | -10.000627 | 0.000012     | False      |
| 13 | 3.143774 | -10.000696 | 0.000012     | False      |
| 14 | 2.357849 | -0.350616  | 0.000013     | False      |
| 15 | 3.143760 | -10.000721 | 0.000030     | False      |
| 16 | 3.928646 | 0.356467   | 0.000012     | False      |
| 17 | 3.143760 | -10.000722 | 0.000012     | False      |
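The best observed point sits slightly above x = π because the 0.5·sin(5x) term is negative just past 5π, pulling the true minimum of the test function a little below -10. We can confirm the best sample against the true objective:

# verify the best observed value against the true objective
best = X.data.loc[X.data["f"].idxmin()]
print(best["x"], best["f"])  # x ~ 3.1432, f ~ -10.0014
print(sin_function({"x": best["x"]})["f"])  # re-evaluating reproduces the stored value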