Multi-fidelity Multi-objective Bayesian Optimization¶
Here we attempt to solve for the constrained Pareto front of the TNK multi-objective optimization problem using multi-fidelity multi-objective Bayesian optimization. For simplicity we assume that the objective and constraint functions at lower fidelities are exactly equal to the functions at the highest fidelity (this is not a requirement, although for best results lower-fidelity calculations should correlate with higher-fidelity ones). The algorithm should learn this relationship and use information gathered at lower fidelities to choose samples that improve the hypervolume of the Pareto front at the maximum fidelity.
TNK function: $n = 2$ variables, $x_i \in [0, \pi],\ i = 1, 2$
Objectives:
- $f_i(x) = x_i$
Constraints:
- $g_1(x) = -x_1^2 -x_2^2 + 1 + 0.1 \cos\left(16 \arctan \frac{x_1}{x_2}\right) \le 0$
- $g_2(x) = (x_1 - 1/2)^2 + (x_2-1/2)^2 \le 0.5$
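The tutorial imports `evaluate_TNK` from `xopt.resources.test_functions.tnk`; the sketch below is an illustrative standalone version of the same computation (the function name and signature here are assumptions, not the library's API). Its constraint values match those that appear in the data table later on.
# sketch: standalone version of the TNK evaluation; the tutorial itself uses
# xopt.resources.test_functions.tnk.evaluate_TNK (exact signature assumed)
import numpy as np


def evaluate_tnk_sketch(inputs: dict) -> dict:
    x1, x2 = inputs["x1"], inputs["x2"]
    return {
        "y1": x1,  # the objectives are simply the inputs
        "y2": x2,
        # c1 >= 0 is feasible (the negative of g1 above); arctan2 handles x2 = 0
        "c1": x1**2 + x2**2 - 1.0 - 0.1 * np.cos(16 * np.arctan2(x1, x2)),
        # c2 <= 0.5 is feasible
        "c2": (x1 - 0.5) ** 2 + (x2 - 0.5) ** 2,
    }


print(evaluate_tnk_sketch({"x1": 1.0, "x2": 0.75}))  # c1 ≈ 0.627, c2 = 0.3125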
# set values if testing
import os
from copy import deepcopy
import pandas as pd
import numpy as np
import torch
from xopt import Xopt, Evaluator
from xopt.generators.bayesian import MultiFidelityGenerator
from xopt.resources.test_functions.tnk import evaluate_TNK, tnk_vocs
import matplotlib.pyplot as plt
# Ignore all warnings
import warnings
warnings.filterwarnings("ignore")
SMOKE_TEST = os.environ.get("SMOKE_TEST")
N_MC_SAMPLES = 1 if SMOKE_TEST else 128
NUM_RESTARTS = 1 if SMOKE_TEST else 20
BUDGET = 0.02 if SMOKE_TEST else 10
evaluator = Evaluator(function=evaluate_TNK)
print(tnk_vocs.dict())
{'variables': {'x1': [0.0, 3.14159], 'x2': [0.0, 3.14159]}, 'constraints': {'c1': ['GREATER_THAN', 0.0], 'c2': ['LESS_THAN', 0.5]}, 'objectives': {'y1': 'MINIMIZE', 'y2': 'MINIMIZE'}, 'constants': {'a': 'dummy_constant'}, 'observables': []}
Set up the Multi-Fidelity Multi-objective optimization algorithm¶
Here we create the multi-fidelity generator object, which can solve both single- and multi-objective optimization problems depending on the number of objectives in VOCS. For this example we drop the TNK constraints and optimize the two objectives only; the generator automatically adds the fidelity parameter $s$ as both a variable and an objective to be maximized (visible in the YAML dump below). We specify the cost of evaluating a sample as a function of the fidelity parameter $s \in [0, 1]$, here $C(s) = s^{3.5}$, taken from a real-life multi-fidelity simulation problem.
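For intuition, under this cost function a maximum-fidelity sample is roughly an order of magnitude more expensive than a half-fidelity one:
# relative cost of a few fidelities under C(s) = s**3.5
for s_val in (0.1, 0.5, 1.0):
    print(f"C({s_val}) = {s_val**3.5:.4f}")
# C(0.1) = 0.0003, C(0.5) = 0.0884, C(1.0) = 1.0000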
my_vocs = deepcopy(tnk_vocs)

# drop the TNK constraints for this example; the generator then treats the
# problem as a pure two-objective optimization
my_vocs.constraints = {}

generator = MultiFidelityGenerator(vocs=my_vocs, reference_point={"y1": 1.5, "y2": 1.5})
# set cost function according to approximate scaling of laser plasma accelerator
# problem, see https://journals.aps.org/prresearch/abstract/10.1103/PhysRevResearch.5.013063
generator.cost_function = lambda s: s**3.5
generator.numerical_optimizer.n_restarts = NUM_RESTARTS
generator.n_monte_carlo_samples = N_MC_SAMPLES
generator.gp_constructor.use_low_noise_prior = True
X = Xopt(generator=generator, evaluator=evaluator, vocs=my_vocs)
# evaluate at some explicit initial points
X.evaluate_data(pd.DataFrame({"x1": [1.0, 0.75], "x2": [0.75, 1.0], "s": [0.0, 0.1]}))
X
Xopt
________________________________
Version: 0.1.dev1947+g7831d28.d20250426
Data size: 2
Config as YAML:
dump_file: null
evaluator:
  function: xopt.resources.test_functions.tnk.evaluate_TNK
  function_kwargs:
    raise_probability: 0
    random_sleep: 0
    sleep: 0
  max_workers: 1
  vectorized: false
generator:
  computation_time: null
  custom_objective: null
  fixed_features: null
  gp_constructor:
    covar_modules: {}
    custom_noise_prior: null
    mean_modules: {}
    name: standard
    trainable_mean_keys: []
    transform_inputs: true
    use_cached_hyperparameters: false
    use_low_noise_prior: true
  max_travel_distances: null
  model: null
  n_candidates: 1
  n_interpolate_points: null
  n_monte_carlo_samples: 128
  name: multi_fidelity
  numerical_optimizer:
    max_iter: 2000
    max_time: 5.0
    n_restarts: 20
    name: LBFGS
  reference_point:
    s: 0.0
    y1: 1.5
    y2: 1.5
  supports_batch_generation: true
  supports_constraints: true
  supports_multi_objective: true
  turbo_controller: null
  use_cuda: false
  use_pf_as_initial_points: false
max_evaluations: null
serialize_inline: false
serialize_torch: false
strict: true
vocs:
  constants:
    a: dummy_constant
  constraints: {}
  objectives:
    s: MAXIMIZE
    y1: MINIMIZE
    y2: MINIMIZE
  observables: []
  variables:
    s:
    - 0
    - 1
    x1:
    - 0.0
    - 3.14159
    x2:
    - 0.0
    - 3.14159
Run optimization routine¶
Instead of ending the optimization routine after an explicit number of samples, we end it once a given optimization budget has been exceeded. WARNING: this will slightly exceed the given budget.
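The overshoot is bounded by the cost of the final step, which is at most $C(1) = 1$ here. A stricter variant (a sketch, not used below) would reserve that worst case before stepping, at the price of possibly stopping early:
# sketch (alternative stopping rule, not used in this tutorial): reserve the
# worst-case cost of one maximum-fidelity sample, C(1) = 1, so the budget is
# never exceeded
def within_strict_budget(xopt_obj, total_budget, max_step_cost=1.0):
    return xopt_obj.generator.calculate_total_cost() < total_budget - max_step_cost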
budget = BUDGET
while X.generator.calculate_total_cost() < budget:
    X.step()
    print(
        f"n_samples: {len(X.data)} "
        f"budget used: {X.generator.calculate_total_cost():.4} "
        f"hypervolume: {X.generator.calculate_hypervolume():.4}"
    )
n_samples: 3 budget used: 0.1923 hypervolume: 0.0375
n_samples: 4 budget used: 0.1923 hypervolume: 0.0375
n_samples: 5 budget used: 0.2062 hypervolume: 0.6636
n_samples: 6 budget used: 0.7216 hypervolume: 0.6636
n_samples: 7 budget used: 0.7903 hypervolume: 0.6636
n_samples: 8 budget used: 1.3 hypervolume: 0.6636
n_samples: 9 budget used: 2.013 hypervolume: 0.6638
n_samples: 10 budget used: 2.809 hypervolume: 0.9135
n_samples: 11 budget used: 3.05 hypervolume: 0.9135
n_samples: 12 budget used: 3.05 hypervolume: 0.9135
n_samples: 13 budget used: 3.067 hypervolume: 0.9135
n_samples: 14 budget used: 3.81 hypervolume: 1.101
n_samples: 15 budget used: 3.81 hypervolume: 1.101
n_samples: 16 budget used: 4.413 hypervolume: 1.101
n_samples: 17 budget used: 4.418 hypervolume: 1.101
n_samples: 18 budget used: 4.446 hypervolume: 1.102
n_samples: 19 budget used: 4.493 hypervolume: 1.102
n_samples: 20 budget used: 4.973 hypervolume: 1.102
n_samples: 21 budget used: 5.35 hypervolume: 1.102
n_samples: 22 budget used: 5.475 hypervolume: 1.191
n_samples: 23 budget used: 5.614 hypervolume: 1.192
n_samples: 24 budget used: 5.761 hypervolume: 1.192
n_samples: 25 budget used: 5.99 hypervolume: 1.192
n_samples: 26 budget used: 6.695 hypervolume: 1.192
n_samples: 27 budget used: 6.727 hypervolume: 1.192
n_samples: 28 budget used: 7.276 hypervolume: 1.192
n_samples: 29 budget used: 7.276 hypervolume: 1.192
n_samples: 30 budget used: 7.276 hypervolume: 1.192
n_samples: 31 budget used: 7.639 hypervolume: 1.192
n_samples: 32 budget used: 7.911 hypervolume: 1.717
n_samples: 33 budget used: 7.95 hypervolume: 1.717
n_samples: 34 budget used: 7.951 hypervolume: 1.717
n_samples: 35 budget used: 7.983 hypervolume: 1.717
n_samples: 36 budget used: 8.141 hypervolume: 1.717
n_samples: 37 budget used: 8.146 hypervolume: 1.717
n_samples: 38 budget used: 9.146 hypervolume: 2.25
n_samples: 39 budget used: 9.655 hypervolume: 2.25
n_samples: 40 budget used: 10.21 hypervolume: 2.25
Show results¶
X.data
 | x1 | x2 | s | a | y1 | y2 | c1 | c2 | xopt_runtime | xopt_error |
---|---|---|---|---|---|---|---|---|---|---|
0 | 1.000000 | 0.750000 | 0.000000 | dummy_constant | 1.000000 | 0.750000 | 0.626888 | 0.312500 | 0.000222 | False |
1 | 0.750000 | 1.000000 | 0.100000 | dummy_constant | 0.750000 | 1.000000 | 0.626888 | 0.312500 | 0.000139 | False |
2 | 2.373815 | 1.781375 | 0.624015 | dummy_constant | 2.373815 | 1.781375 | 7.872348 | 5.153105 | 0.000150 | False |
3 | 0.000000 | 0.215730 | 0.000000 | dummy_constant | 0.000000 | 0.215730 | -1.053460 | 0.330809 | 0.000148 | False |
4 | 0.000000 | 0.000000 | 0.294917 | dummy_constant | 0.000000 | 0.000000 | -1.100000 | 0.500000 | 0.000149 | False |
5 | 0.148514 | 2.628637 | 0.827490 | dummy_constant | 0.148514 | 2.628637 | 5.869865 | 4.654637 | 0.000148 | False |
6 | 1.399967 | 1.659361 | 0.465155 | dummy_constant | 1.399967 | 1.659361 | 3.691817 | 2.154060 | 0.000150 | False |
7 | 2.832904 | 2.698287 | 0.824824 | dummy_constant | 2.832904 | 2.698287 | 14.213583 | 10.274908 | 0.000146 | False |
8 | 1.400857 | 1.496717 | 0.907947 | dummy_constant | 1.400857 | 1.496717 | 3.116236 | 1.804987 | 0.000147 | False |
9 | 0.950916 | 0.790889 | 0.936842 | dummy_constant | 0.950916 | 0.790889 | 0.519275 | 0.287942 | 0.000148 | False |
10 | 1.809270 | 1.186896 | 0.666058 | dummy_constant | 1.809270 | 1.186896 | 3.781264 | 2.186013 | 0.000148 | False |
11 | 2.260225 | 0.779853 | 0.013143 | dummy_constant | 2.260225 | 0.779853 | 4.660033 | 3.176709 | 0.000149 | False |
12 | 1.386312 | 3.065447 | 0.313170 | dummy_constant | 1.386312 | 3.065447 | 10.231663 | 7.367069 | 0.000146 | False |
13 | 0.421113 | 0.932039 | 0.918348 | dummy_constant | 0.421113 | 0.932039 | -0.041404 | 0.192881 | 0.000148 | False |
14 | 0.000000 | 0.883376 | 0.000000 | dummy_constant | 0.000000 | 0.883376 | -0.319647 | 0.396977 | 0.000149 | False |
15 | 0.654645 | 2.704289 | 0.865739 | dummy_constant | 0.654645 | 2.704289 | 6.820829 | 4.882806 | 0.000148 | False |
16 | 0.922045 | 2.720251 | 0.213205 | dummy_constant | 0.922045 | 2.720251 | 7.200551 | 5.107635 | 0.000149 | False |
17 | 1.436833 | 0.546492 | 0.361357 | dummy_constant | 1.436833 | 0.546492 | 1.273897 | 0.879818 | 0.000148 | False |
18 | 0.929176 | 1.027489 | 0.418189 | dummy_constant | 0.929176 | 1.027489 | 0.849663 | 0.462436 | 0.000148 | False |
19 | 0.290223 | 2.019236 | 0.810709 | dummy_constant | 0.290223 | 2.019236 | 3.226970 | 2.352084 | 0.000150 | False |
20 | 2.745490 | 1.433730 | 0.756417 | dummy_constant | 2.745490 | 1.433730 | 8.577973 | 5.914076 | 0.000148 | False |
21 | 0.983510 | 0.116758 | 0.552227 | dummy_constant | 0.983510 | 0.116758 | 0.012363 | 0.380657 | 0.000148 | False |
22 | 0.361938 | 1.399296 | 0.569726 | dummy_constant | 0.361938 | 1.399296 | 1.150547 | 0.827794 | 0.000147 | False |
23 | 1.811137 | 2.513496 | 0.577614 | dummy_constant | 1.811137 | 2.513496 | 8.682311 | 5.773248 | 0.000148 | False |
24 | 2.368455 | 3.085216 | 0.656148 | dummy_constant | 2.368455 | 3.085216 | 14.177829 | 10.174466 | 0.000148 | False |
25 | 2.381037 | 1.378933 | 0.905072 | dummy_constant | 2.381037 | 1.378933 | 6.622631 | 4.310823 | 0.000150 | False |
26 | 0.093459 | 1.710741 | 0.375693 | dummy_constant | 0.093459 | 1.710741 | 1.871135 | 1.631170 | 0.000146 | False |
27 | 1.317410 | 0.858324 | 0.842188 | dummy_constant | 1.317410 | 0.858324 | 1.570572 | 0.796556 | 0.000147 | False |
28 | 0.298516 | 0.140687 | 0.099449 | dummy_constant | 0.298516 | 0.140687 | -0.963338 | 0.169701 | 0.000149 | False |
29 | 1.886804 | 2.834586 | 0.065971 | dummy_constant | 1.886804 | 2.834586 | 10.694868 | 7.373517 | 0.000149 | False |
30 | 0.886634 | 2.032756 | 0.748338 | dummy_constant | 0.886634 | 2.032756 | 3.822612 | 2.498827 | 0.000149 | False |
31 | 0.000000 | 0.000000 | 0.689715 | dummy_constant | 0.000000 | 0.000000 | -1.100000 | 0.500000 | 0.000148 | False |
32 | 2.792588 | 2.389947 | 0.397201 | dummy_constant | 2.792588 | 2.389947 | 12.477969 | 8.827859 | 0.000149 | False |
33 | 1.425150 | 1.295618 | 0.057613 | dummy_constant | 1.425150 | 1.295618 | 2.637275 | 1.488911 | 0.000148 | False |
34 | 1.216166 | 1.010418 | 0.374660 | dummy_constant | 1.216166 | 1.010418 | 1.490368 | 0.773420 | 0.000149 | False |
35 | 0.287722 | 1.476969 | 0.590295 | dummy_constant | 0.287722 | 1.476969 | 1.364022 | 0.999531 | 0.000148 | False |
36 | 3.051779 | 2.626877 | 0.222363 | dummy_constant | 3.051779 | 2.626877 | 15.177135 | 11.035183 | 0.000149 | False |
37 | 0.000000 | 0.000000 | 1.000000 | dummy_constant | 0.000000 | 0.000000 | -1.100000 | 0.500000 | 0.000149 | False |
38 | 0.192783 | 2.607771 | 0.824442 | dummy_constant | 0.192783 | 2.607771 | 5.799606 | 4.537082 | 0.000149 | False |
39 | 0.376958 | 1.556597 | 0.843966 | dummy_constant | 0.376958 | 1.556597 | 1.644098 | 1.131538 | 0.000146 | False |
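Since the optimization itself ran without constraints, we can check after the fact how many samples satisfy the original TNK constraints, using the same feasibility utility that the plotting code below relies on:
# count feasible vs. infeasible samples under the original (constrained) VOCS
feas = tnk_vocs.feasibility_data(X.data)
print(feas["feasible"].value_counts())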
Plot results¶
Here we plot the resulting observations in input space, colored by feasibility (neglecting the fact that these data points are at varying fidelities).
fig, ax = plt.subplots()

# boundary of the first constraint (g1 = 0) in polar coordinates, with
# x1 = r sin(theta), x2 = r cos(theta) so that arctan(x1 / x2) = theta
theta = np.linspace(0, np.pi / 2)
r = np.sqrt(1 + 0.1 * np.cos(16 * theta))
x_1 = r * np.sin(theta)
x_2_lower = r * np.cos(theta)
x_2_upper = (0.5 - (x_1 - 0.5) ** 2) ** 0.5 + 0.5
z = np.zeros_like(x_1)

# white out the infeasible region inside the g1 boundary
ax.fill_between(x_1, z, x_2_lower, fc="white")

# disk corresponding to the second constraint g2 <= 0.5
circle = plt.Circle(
    (0.5, 0.5), 0.5**0.5, color="r", alpha=0.25, zorder=0, label="Valid Region"
)
ax.add_patch(circle)
history = pd.concat(
    [X.data, tnk_vocs.feasibility_data(X.data)], axis=1, ignore_index=False
)

# feasible samples in orange (C1), infeasible samples in green (C2)
ax.plot(*history[["x1", "x2"]][history["feasible"]].to_numpy().T, ".C1")
ax.plot(*history[["x1", "x2"]][~history["feasible"]].to_numpy().T, ".C2")
ax.set_xlim(0, 3.14)
ax.set_ylim(0, 3.14)
ax.set_xlabel("x1")
ax.set_ylabel("x2")
ax.set_aspect("equal")
Plot path through input space¶
ax = history.hist(["x1", "x2", "s"], bins=20)
history.plot(y=["x1", "x2", "s"])
<Axes: >
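To inspect the result the algorithm is actually after, namely the Pareto front at maximum fidelity, one can filter the data to samples gathered near $s = 1$ and keep only the non-dominated points. A minimal sketch (the fidelity cutoff of 0.8 is an arbitrary choice for illustration):
# sketch: extract the non-dominated (Pareto-optimal) objective values among
# samples evaluated near the maximum fidelity; the 0.8 cutoff is arbitrary
high_fid = X.data[X.data["s"] > 0.8]
pts = high_fid[["y1", "y2"]].to_numpy()


def non_dominated_mask(points):
    # for minimization, keep a point if no other point is <= in all
    # objectives and strictly < in at least one
    keep = np.ones(len(points), dtype=bool)
    for i, p in enumerate(points):
        dominated = (points <= p).all(axis=1) & (points < p).any(axis=1)
        keep[i] = not dominated.any()
    return keep


print(pts[non_dominated_mask(pts)])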
Plot the acquisition function¶
Here we plot the acquisition function at a small set of fidelities, $s \in \{0, 0.5, 1.0\}$.
# plot the acquisition function
bounds = X.generator.vocs.bounds
model = X.generator.model

# create mesh over non-fidelity parameters (variables are ordered s, x1, x2)
n = 50
x = torch.linspace(*bounds.T[1], n)
y = torch.linspace(*bounds.T[2], n)
xx, yy = torch.meshgrid(x, y, indexing="ij")

# plot the acquisition function at each fixed fidelity parameter
fidelities = [0.0, 0.5, 1.0]
for fidelity in fidelities:
    # prepend the fidelity value to the (x1, x2) mesh points
    pts = torch.hstack([ele.reshape(-1, 1) for ele in (xx, yy)]).double()
    pts = torch.cat((torch.ones(pts.shape[0], 1) * fidelity, pts), dim=-1)

    acq_func = X.generator.get_acquisition(model)
    with torch.no_grad():
        acq_pts = pts.unsqueeze(1)
        acq = acq_func(acq_pts)

    fig, ax = plt.subplots()
    xxn, yyn = xx.numpy(), yy.numpy()
    c = ax.pcolor(xxn, yyn, acq.reshape(n, n), cmap="Blues")
    fig.colorbar(c)
    ax.set_title(f"Acquisition function - s: {fidelity}")
    ax.plot(*history[["x1", "x2"]][history["feasible"]].to_numpy().T, ".C1")
    ax.plot(*history[["x1", "x2"]][~history["feasible"]].to_numpy().T, ".C2")
    ax.plot(*history[["x1", "x2"]].to_numpy()[-1].T, "+")

# generate and plot the next candidate point on the last figure
candidate = pd.DataFrame(X.generator.generate(1), index=[0])
print(candidate[["x1", "x2"]].to_numpy())
ax.plot(*candidate[["x1", "x2"]].to_numpy()[0], "o")
[[3.04401602 2.39628251]]
[<matplotlib.lines.Line2D at 0x7fdfac6ba490>]
# examine lengthscale of the first objective
list(model.models[0].named_parameters())
[('likelihood.noise_covar.raw_noise',
  Parameter containing:
  tensor([-41.5299], dtype=torch.float64, requires_grad=True)),
 ('mean_module.raw_constant',
  Parameter containing:
  tensor(-0.0591, dtype=torch.float64, requires_grad=True)),
 ('covar_module.raw_lengthscale',
  Parameter containing:
  tensor([[ 0.7277, 46.9467, 41.1820]], dtype=torch.float64, requires_grad=True))]
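The raw values above are unconstrained parameters; GPyTorch maps them through the kernel's positivity constraint (typically a softplus) to obtain the actual lengthscales. Assuming the kernel exposes the standard `lengthscale` property, the transformed values (presumably ordered as $s$, $x_1$, $x_2$, following the variable order) can be read off directly:
# read the constrained (actual) lengthscales; raw parameters are mapped
# through the kernel's positivity constraint
print(model.models[0].covar_module.lengthscale)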