Basic Optimization using BAX¶

In this notebook we demonstrate the use of Xopt to perform Bayesian Algorithm Execution (BAX) as a means of minimizing the output of a simple test function. BAX is a generalization of Bayesian Optimization that seeks to acquire observations that provide our model with maximal information about our property of interest. In this simple example, our property of interest is the minimum function output and its location in input-space. See https://arxiv.org/pdf/2209.04587.pdf for details.

Imports and random seeding for reproducibility¶

In [1]:

Copied!





import torch

from xopt import Xopt
from xopt.vocs import VOCS
from xopt.generators.bayesian.bax_generator import BaxGenerator
from xopt.generators.bayesian.bax.algorithms import GridOptimize
from xopt.evaluator import Evaluator
from xopt.generators.bayesian.bax.visualize import visualize_virtual_objective

import numpy as np
import random

import os
import math
import matplotlib.pyplot as plt

# Ignore all warnings
import warnings

warnings.filterwarnings("ignore")


os.environ["KMP_DUPLICATE_LIB_OK"] = "True"

# random seeds for reproducibility
rand_seed = 2

torch.manual_seed(rand_seed)
np.random.seed(rand_seed)  # only affects initial random observations through Xopt
random.seed(rand_seed)
import torch

from xopt import Xopt
from xopt.vocs import VOCS
from xopt.generators.bayesian.bax_generator import BaxGenerator
from xopt.generators.bayesian.bax.algorithms import GridOptimize
from xopt.evaluator import Evaluator
from xopt.generators.bayesian.bax.visualize import visualize_virtual_objective

import numpy as np
import random

import os
import math
import matplotlib.pyplot as plt

# Ignore all warnings
import warnings

warnings.filterwarnings("ignore")


os.environ["KMP_DUPLICATE_LIB_OK"] = "True"

# random seeds for reproducibility
rand_seed = 2

torch.manual_seed(rand_seed)
np.random.seed(rand_seed)  # only affects initial random observations through Xopt
random.seed(rand_seed)

Define the test problem¶

Here we define a simple optimization problem, where we attempt to minimize the sin function in the domian [0,2*pi]. Note that the function used to evaluate the objective function takes a dictionary as input and returns a dictionary as the output.

In [2]:

Copied!





# define variables and function objectives
vocs = VOCS(
    variables={"x": [0, 2 * math.pi]},
    observables=["y1"],
)
# define variables and function objectives
vocs = VOCS(
    variables={"x": [0, 2 * math.pi]},
    observables=["y1"],
)

In [3]:

Copied!

# define a test function to optimize
def sin_function(input_dict):
    return {"y1": np.sin(input_dict["x"])}
# define a test function to optimize
def sin_function(input_dict):
    return {"y1": np.sin(input_dict["x"])}

Prepare BAX generator for Xopt¶

Create a generator that uses the ExpectedInformationGain (InfoBAX) acquisition function to perform Bayesian Optimization. Note that we use minimization on a grid, so specifying the number of mesh points can negatively impact decision making time (especially in higher dimensional feature spaces). Note that because we are optimizing a problem with no noise we set use_low_noise_prior=True in the GP model constructor.

In [4]:

Copied!





# Prepare BAX algorithm and generator options
algorithm = GridOptimize(n_mesh_points=50)  # NOTE: default is to minimize

# construct BAX generator
generator = BaxGenerator(vocs=vocs, algorithm=algorithm)
generator.gp_constructor.use_low_noise_prior = True
# Prepare BAX algorithm and generator options
algorithm = GridOptimize(n_mesh_points=50)  # NOTE: default is to minimize

# construct BAX generator
generator = BaxGenerator(vocs=vocs, algorithm=algorithm)
generator.gp_constructor.use_low_noise_prior = True

Create Evaluator and Xopt objects¶

Create the Evaluator (which allows Xopt to interface with our test function) and finish constructing our Xopt object.

In [5]:

Copied!

# construct evaluator
evaluator = Evaluator(function=sin_function)

# construct Xopt optimizer
X = Xopt(evaluator=evaluator, generator=generator, vocs=vocs)
# construct evaluator
evaluator = Evaluator(function=sin_function)

# construct Xopt optimizer
X = Xopt(evaluator=evaluator, generator=generator, vocs=vocs)

Generate and evaluate initial points¶

To begin optimization, we must generate some random initial data points. The first call to X.step() will generate and evaluate a number of randomly points specified by the generator. Note that if we add data to xopt before calling X.step() by assigning the data to X.data, calls to X.step() will ignore the random generation and proceed to generating points via Bayesian optimization.

In [6]:

Copied!

# evaluate initial points
X.random_evaluate(3)

# inspect the gathered data
X.data
# evaluate initial points
X.random_evaluate(3)

# inspect the gathered data
X.data

Out[6]:

	x	y1	xopt_runtime	xopt_error
0	3.543749	-0.391403	0.000008	False
1	6.120286	-0.162180	0.000003	False
2	2.829554	0.306999	0.000002	False

Define plotting utility¶

Define a plotting function that plots the GP model, samples from the GP model, and the execution paths (red crosses).

In [7]:

Copied!





def plot_bax(X):
    # get the Gaussian process model from the generator
    model = X.generator.train_model()

    # get acquisition function from generator
    acq = X.generator.get_acquisition(model)

    # calculate model posterior and acquisition function at each test point
    # NOTE: need to add a dimension to the input tensor for evaluating the
    # posterior and another for the acquisition function, see
    # https://botorch.org/docs/batching for details
    # NOTE: we use the `torch.no_grad()` environment to speed up computation by
    # skipping calculations for backpropagation
    with torch.no_grad():
        posterior = model.posterior(test_x.unsqueeze(1))
        acq_val = acq(test_x.reshape(-1, 1, 1))

    # get mean function and confidence regions
    mean = posterior.mean
    L, u = posterior.mvn.confidence_region()

    # plot model and acquisition function
    fig, ax = plt.subplots(3, 1, sharex="all")
    fig.set_size_inches(8, 6)

    # plot model posterior
    ax[0].plot(test_x, mean, label="Posterior mean")
    ax[0].fill_between(test_x, L, u, alpha=0.25, label="Posterior confidence region")

    # add data to model plot
    ax[0].plot(X.data["x"], X.data["y1"], "C1o", label="Training data")

    # plot true function
    true_f = sin_function({"x": test_x})["y1"]
    ax[0].plot(test_x, true_f, "--", label="Ground truth")

    # plot the function samples and their optima found by BAX
    test_points = X.generator.algorithm_results["test_points"]
    posterior_samples = X.generator.algorithm_results["posterior_samples"]
    execution_paths = X.generator.algorithm_results["execution_paths"]

    label1 = "Function Samples"
    label2 = "Sample Optima"
    for i in range(X.generator.algorithm.n_samples):
        (samples,) = ax[1].plot(
            test_points, posterior_samples[i], c="C0", alpha=0.3, label=label1
        )
        ax[1].scatter(
            *execution_paths[i], c="r", marker="x", s=80, label=label2, zorder=10
        )
        label1 = None
        label2 = None

    # plot acquisition function
    ax[2].plot(test_x, acq_val.flatten())

    ax[0].set_ylabel("y1")
    ax[1].set_ylabel("y1")
    ax[2].set_ylabel(r"$\alpha(x)$")
    ax[2].set_xlabel("x")

    return fig, ax
def plot_bax(X):
    # get the Gaussian process model from the generator
    model = X.generator.train_model()

    # get acquisition function from generator
    acq = X.generator.get_acquisition(model)

    # calculate model posterior and acquisition function at each test point
    # NOTE: need to add a dimension to the input tensor for evaluating the
    # posterior and another for the acquisition function, see
    # https://botorch.org/docs/batching for details
    # NOTE: we use the `torch.no_grad()` environment to speed up computation by
    # skipping calculations for backpropagation
    with torch.no_grad():
        posterior = model.posterior(test_x.unsqueeze(1))
        acq_val = acq(test_x.reshape(-1, 1, 1))

    # get mean function and confidence regions
    mean = posterior.mean
    L, u = posterior.mvn.confidence_region()

    # plot model and acquisition function
    fig, ax = plt.subplots(3, 1, sharex="all")
    fig.set_size_inches(8, 6)

    # plot model posterior
    ax[0].plot(test_x, mean, label="Posterior mean")
    ax[0].fill_between(test_x, L, u, alpha=0.25, label="Posterior confidence region")

    # add data to model plot
    ax[0].plot(X.data["x"], X.data["y1"], "C1o", label="Training data")

    # plot true function
    true_f = sin_function({"x": test_x})["y1"]
    ax[0].plot(test_x, true_f, "--", label="Ground truth")

    # plot the function samples and their optima found by BAX
    test_points = X.generator.algorithm_results["test_points"]
    posterior_samples = X.generator.algorithm_results["posterior_samples"]
    execution_paths = X.generator.algorithm_results["execution_paths"]

    label1 = "Function Samples"
    label2 = "Sample Optima"
    for i in range(X.generator.algorithm.n_samples):
        (samples,) = ax[1].plot(
            test_points, posterior_samples[i], c="C0", alpha=0.3, label=label1
        )
        ax[1].scatter(
            *execution_paths[i], c="r", marker="x", s=80, label=label2, zorder=10
        )
        label1 = None
        label2 = None

    # plot acquisition function
    ax[2].plot(test_x, acq_val.flatten())

    ax[0].set_ylabel("y1")
    ax[1].set_ylabel("y1")
    ax[2].set_ylabel(r"$\alpha(x)$")
    ax[2].set_xlabel("x")

    return fig, ax

Do bayesian optimization steps¶

To perform optimization we simply call X.step() in a loop. This allows us to do intermediate tasks in between optimization steps, such as examining the model and acquisition function at each step (as we demonstrate here).

In [8]:

Copied!





n_steps = 3

# test points for plotting
test_x = torch.linspace(*X.vocs.bounds.flatten(), 50).double()

for i in range(5):
    # plot model and bax information
    fig, ax = plot_bax(X)

    if i == 0:
        ax[0].legend(ncols=2)
        ax[1].legend()

    # do the optimization step
    X.step()
n_steps = 3

# test points for plotting
test_x = torch.linspace(*X.vocs.bounds.flatten(), 50).double()

for i in range(5):
    # plot model and bax information
    fig, ax = plot_bax(X)

    if i == 0:
        ax[0].legend(ncols=2)
        ax[1].legend()

    # do the optimization step
    X.step()

No description has been provided for this image

In [9]:

Copied!

# access the collected data
X.data
# access the collected data
X.data

Out[9]:

	x	y1	xopt_runtime	xopt_error
0	3.543749	-0.391403	0.000008	False
1	6.120286	-0.162180	0.000003	False
2	2.829554	0.306999	0.000002	False
3	3.974919	-0.740172	0.000008	False
4	4.482643	-0.973724	0.000007	False
5	4.685216	-0.999631	0.000008	False
6	4.733818	-0.999770	0.000007	False
7	4.737061	-0.999696	0.000007	False

In [10]:

Copied!

# plot the virtual objective (which in this case is simply the observable model for y1) via posterior sampling

visualize_virtual_objective(X.generator, n_samples=1000)
# plot the virtual objective (which in this case is simply the observable model for y1) via posterior sampling

visualize_virtual_objective(X.generator, n_samples=1000)

Out[10]:

(<Figure size 400x370 with 1 Axes>,
 <Axes: xlabel='x', ylabel='Virtual Objective'>)