Multi-fidelity Multi-objective Bayesian Optimization
Here we attempt to solve for the constrained Pareto front of the TNK multi-objective optimization problem using multi-fidelity multi-objective Bayesian optimization. For simplicity we assume that the objective and constraint functions at lower fidelities are exactly equal to the functions at higher fidelities (this is not a requirement, although for the best results lower-fidelity calculations should correlate with higher-fidelity ones). The algorithm should learn this relationship and use information gathered at lower fidelities to choose samples that improve the hypervolume of the Pareto front at the maximum fidelity.
TNK function, $n=2$ variables: $x_i \in [0, \pi], i=1,2$
Objectives:
- $f_i(x) = x_i$
Constraints:
- $g_1(x) = -x_1^2 -x_2^2 + 1 + 0.1 \cos\left(16 \arctan \frac{x_1}{x_2}\right) \le 0$
- $g_2(x) = (x_1 - 1/2)^2 + (x_2-1/2)^2 \le 0.5$
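The definition above can be written out as a short standalone sketch. This is not Xopt's `evaluate_TNK`; the helper names `tnk` and `is_feasible` are hypothetical. The sketch follows the sign convention of the VOCS used in this notebook, which stores the first constraint as `c1 >= 0` (that is, $c_1 = -g_1$), and it assumes `atan2` in place of $\arctan(x_1/x_2)$ so that the expression stays defined at $x_2 = 0$.

```python
import math


def tnk(x1, x2):
    """Evaluate the TNK objectives and constraints at a single point."""
    # atan2(x1, x2) equals arctan(x1 / x2) for x2 > 0 and stays defined at x2 = 0
    # (assumption made for this sketch)
    angle = math.atan2(x1, x2)
    return {
        "y1": x1,
        "y2": x2,
        # c1 = -g_1, so feasibility requires c1 >= 0
        "c1": x1**2 + x2**2 - 1.0 - 0.1 * math.cos(16.0 * angle),
        # c2 = g_2, so feasibility requires c2 <= 0.5
        "c2": (x1 - 0.5) ** 2 + (x2 - 0.5) ** 2,
    }


def is_feasible(result):
    """A point is feasible when both constraints hold."""
    return result["c1"] >= 0.0 and result["c2"] <= 0.5


point = tnk(1.0, 0.75)
print(point, is_feasible(point))
```

The values at `(1.0, 0.75)` can be compared with the first row of the results table further down.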
# set values if testing
import os
from copy import deepcopy
import pandas as pd
import numpy as np
from xopt import Xopt, Evaluator
from xopt.generators.bayesian import MultiFidelityGenerator
from xopt.resources.test_functions.tnk import evaluate_TNK, tnk_vocs
from xopt.vocs import get_feasibility_data
import matplotlib.pyplot as plt
# Ignore all warnings
import warnings
warnings.filterwarnings("ignore")
SMOKE_TEST = os.environ.get("SMOKE_TEST")
N_MC_SAMPLES = 1 if SMOKE_TEST else 128
NUM_RESTARTS = 1 if SMOKE_TEST else 20
BUDGET = 0.02 if SMOKE_TEST else 10
evaluator = Evaluator(function=evaluate_TNK)
print(tnk_vocs.dict())
{'variables': {'x1': {'dtype': None, 'default_value': None, 'domain': [0.0, 3.14159], 'type': 'ContinuousVariable'}, 'x2': {'dtype': None, 'default_value': None, 'domain': [0.0, 3.14159], 'type': 'ContinuousVariable'}}, 'objectives': {'y1': {'dtype': None, 'type': 'MinimizeObjective'}, 'y2': {'dtype': None, 'type': 'MinimizeObjective'}}, 'constraints': {'c1': {'dtype': None, 'value': 0.0, 'type': 'GreaterThanConstraint'}, 'c2': {'dtype': None, 'value': 0.5, 'type': 'LessThanConstraint'}}, 'constants': {'a': {'dtype': None, 'value': 'dummy_constant', 'type': 'Constant'}}, 'observables': {}}
Set up the Multi-Fidelity Multi-objective optimization algorithm
Here we create the multi-fidelity generator object, which can solve both single- and multi-objective optimization problems depending on the number of objectives in VOCS. We specify the cost as a function of the fidelity parameter $s \in [0,1]$, $C(s) = s^{3.5} + 0.001$, an example taken from a real-life multi-fidelity simulation problem.
my_vocs = deepcopy(tnk_vocs)
generator = MultiFidelityGenerator(vocs=my_vocs, reference_point={"y1": 1.5, "y2": 1.5})
# set cost function according to approximate scaling of laser plasma accelerator
# problem, see https://journals.aps.org/prresearch/abstract/10.1103/PhysRevResearch.5.013063
generator.cost_function = lambda s: s**3.5 + 0.001
generator.numerical_optimizer.n_restarts = NUM_RESTARTS
generator.n_monte_carlo_samples = N_MC_SAMPLES
generator.gp_constructor.use_low_noise_prior = True
X = Xopt(generator=generator, evaluator=evaluator)
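To get a feel for the cost model assigned to `generator.cost_function`, the sketch below evaluates the same expression at a few fidelities in plain Python. The function name `cost` and the chosen fidelities are only illustrative.

```python
def cost(s):
    """Cost of one evaluation at fidelity s in [0, 1]: C(s) = s**3.5 + 0.001."""
    return s**3.5 + 0.001


for s in (0.0, 0.1, 0.5, 1.0):
    print(f"s = {s:.1f}  cost = {cost(s):.6f}")

# how many evaluations at s = 0.5 cost about as much as one at s = 1
ratio = cost(1.0) / cost(0.5)
print(f"cost(1.0) / cost(0.5) = {ratio:.2f}")
```

Because of the steep exponent, low-fidelity evaluations are almost free compared with full-fidelity ones, while the constant offset keeps the cost at $s = 0$ from being exactly zero.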
# evaluate at some explicit initial points
X.evaluate_data(pd.DataFrame({"x1": [1.0, 0.75], "x2": [0.75, 1.0], "s": [0.0, 0.1]}))
X
Xopt
________________________________
Version: 0.1.dev1+gb9c9a914f
Data size: 2
Config as YAML:
dump_file: null
evaluator:
  function: xopt.resources.test_functions.tnk.evaluate_TNK
  function_kwargs:
    raise_probability: 0
    random_sleep: 0
    sleep: 0
  max_workers: 1
  vectorized: false
generator:
  computation_time: null
  custom_objective: null
  fixed_features: null
  gp_constructor:
    covar_modules: {}
    custom_noise_prior: null
    mean_modules: {}
    name: standard
    train_config: null
    train_kwargs: null
    train_method: lbfgs
    train_model: true
    trainable_mean_keys: []
    transform_inputs: true
    use_cached_hyperparameters: false
    use_low_noise_prior: true
  max_travel_distances: null
  model: null
  n_candidates: 1
  n_interpolate_points: null
  n_monte_carlo_samples: 128
  name: multi_fidelity
  numerical_optimizer:
    discrete_max_batch_size: 2048
    discrete_max_choices: 4096
    max_iter: 1000
    max_time: 5.0
    mixed_max_discrete_configurations: 512
    n_restarts: 20
    name: LBFGS
  reference_point:
    s: 0.0
    y1: 1.5
    y2: 1.5
  returns_id: false
  supports_batch_generation: true
  supports_constraints: true
  supports_discrete_variables: true
  supports_multi_objective: true
  turbo_controller: null
  use_cuda: false
  use_pf_as_initial_points: false
  vocs:
    constants:
      a:
        dtype: null
        type: Constant
        value: dummy_constant
    constraints:
      c1:
        dtype: null
        type: GreaterThanConstraint
        value: 0.0
      c2:
        dtype: null
        type: LessThanConstraint
        value: 0.5
    objectives:
      s:
        dtype: null
        type: MaximizeObjective
      y1:
        dtype: null
        type: MinimizeObjective
      y2:
        dtype: null
        type: MinimizeObjective
    observables: {}
    variables:
      s:
        default_value: null
        domain:
        - 0.0
        - 1.0
        dtype: null
        type: ContinuousVariable
      x1:
        default_value: null
        domain:
        - 0.0
        - 3.14159
        dtype: null
        type: ContinuousVariable
      x2:
        default_value: null
        domain:
        - 0.0
        - 3.14159
        dtype: null
        type: ContinuousVariable
serialize_inline: false
serialize_torch: false
stopping_condition: null
strict: true
Run optimization routine
Instead of ending the optimization routine after an explicit number of samples, we end it once a given optimization budget has been exceeded. WARNING: the budget is checked before each step, so the total cost will slightly exceed the given budget.
budget = BUDGET
while X.generator.calculate_total_cost() < budget:
    X.step()
    print(
        f"n_samples: {len(X.data)} "
        f"budget used: {X.generator.calculate_total_cost():.4} "
        f"hypervolume: {X.generator.get_pareto_front_and_hypervolume()[-1]:.4}"
    )
n_samples: 3 budget used: 0.003316 hypervolume: 0.03881
n_samples: 4 budget used: 0.004424 hypervolume: 0.03881
n_samples: 5 budget used: 0.005918 hypervolume: 0.08459
n_samples: 6 budget used: 0.009127 hypervolume: 0.08459
n_samples: 7 budget used: 0.01549 hypervolume: 0.08459
n_samples: 8 budget used: 0.03001 hypervolume: 0.08459
n_samples: 9 budget used: 0.04314 hypervolume: 0.1813
n_samples: 10 budget used: 0.04999 hypervolume: 0.1813
n_samples: 11 budget used: 0.07678 hypervolume: 0.2785
n_samples: 12 budget used: 0.08404 hypervolume: 0.3316
n_samples: 13 budget used: 0.1349 hypervolume: 0.4114
n_samples: 14 budget used: 1.136 hypervolume: 0.4114
n_samples: 15 budget used: 2.025 hypervolume: 0.4114
n_samples: 16 budget used: 2.115 hypervolume: 0.499
n_samples: 17 budget used: 2.117 hypervolume: 0.499
n_samples: 18 budget used: 2.341 hypervolume: 0.499
n_samples: 19 budget used: 2.748 hypervolume: 0.6855
n_samples: 20 budget used: 3.749 hypervolume: 0.8236
n_samples: 21 budget used: 3.851 hypervolume: 0.8236
n_samples: 22 budget used: 3.874 hypervolume: 0.8729
n_samples: 23 budget used: 4.023 hypervolume: 0.8729
n_samples: 24 budget used: 4.124 hypervolume: 0.9434
n_samples: 25 budget used: 4.479 hypervolume: 1.037
n_samples: 26 budget used: 5.48 hypervolume: 1.158
n_samples: 27 budget used: 6.481 hypervolume: 1.215
n_samples: 28 budget used: 7.219 hypervolume: 1.215
n_samples: 29 budget used: 7.377 hypervolume: 1.215
n_samples: 30 budget used: 8.378 hypervolume: 1.253
n_samples: 31 budget used: 8.736 hypervolume: 1.253
n_samples: 32 budget used: 8.739 hypervolume: 1.253
n_samples: 33 budget used: 9.74 hypervolume: 1.253
n_samples: 34 budget used: 9.829 hypervolume: 1.253
n_samples: 35 budget used: 9.839 hypervolume: 1.253
n_samples: 36 budget used: 9.862 hypervolume: 1.253
n_samples: 37 budget used: 10.86 hypervolume: 1.266
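The overshoot in the last line of the log (10.86 against a budget of 10) follows from checking the budget before each step rather than after it. The sketch below reproduces that pattern with a fixed, made-up list of fidelities in place of the generator; `run_until_budget` is a hypothetical helper, and the cost expression is the one set on the generator earlier.

```python
def cost(s):
    """Same cost model as generator.cost_function in this notebook."""
    return s**3.5 + 0.001


def run_until_budget(fidelities, budget):
    """Consume fidelities in order until the accumulated cost reaches the budget."""
    total, n_steps = 0.0, 0
    for s in fidelities:
        if total >= budget:  # same test as the while-condition, made before the step
            break
        total += cost(s)
        n_steps += 1
    return n_steps, total


# illustrative fidelity sequence, not taken from the run above
n_steps, total = run_until_budget([0.5, 0.5, 1.0, 1.0, 1.0], budget=1.0)
print(n_steps, total)
```

Since the last accepted step can be a full-fidelity evaluation, the overshoot is bounded by $C(1) = 1.001$ under this cost model.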
Show results
X.data
| | x1 | x2 | s | a | y1 | y2 | c1 | c2 | xopt_runtime | xopt_error |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.000000 | 0.750000 | 0.000000 | dummy_constant | 1.000000 | 0.750000 | 0.626888 | 0.312500 | 0.002409 | False |
| 1 | 0.750000 | 1.000000 | 0.100000 | dummy_constant | 0.750000 | 1.000000 | 0.626888 | 0.312500 | 0.000275 | False |
| 2 | 0.516428 | 1.031562 | 0.011991 | dummy_constant | 0.516428 | 1.031562 | 0.289350 | 0.282828 | 0.002939 | False |
| 3 | 0.307670 | 0.254672 | 0.073507 | dummy_constant | 0.307670 | 0.254672 | -0.847208 | 0.097176 | 0.008541 | False |
| 4 | 0.971816 | 0.296655 | 0.113573 | dummy_constant | 0.971816 | 0.296655 | 0.029631 | 0.263960 | 0.005130 | False |
| 5 | 3.018309 | 0.111447 | 0.174270 | dummy_constant | 3.018309 | 0.111447 | 8.039543 | 6.492853 | 0.005774 | False |
| 6 | 0.000000 | 0.498703 | 0.224551 | dummy_constant | 0.000000 | 0.498703 | -0.851295 | 0.250002 | 0.000276 | False |
| 7 | 0.401659 | 0.939949 | 0.292403 | dummy_constant | 0.401659 | 0.939949 | -0.053582 | 0.203226 | 0.005573 | False |
| 8 | 0.896601 | 0.594887 | 0.283466 | dummy_constant | 0.896601 | 0.594887 | 0.257650 | 0.166296 | 0.002838 | False |
| 9 | 0.122126 | 2.095126 | 0.230211 | dummy_constant | 0.122126 | 2.095126 | 3.344811 | 2.687215 | 0.000270 | False |
| 10 | 0.970004 | 0.152833 | 0.351638 | dummy_constant | 0.970004 | 0.152833 | 0.044403 | 0.341428 | 0.000274 | False |
| 11 | 0.286767 | 1.077574 | 0.234691 | dummy_constant | 0.286767 | 1.077574 | 0.295743 | 0.379059 | 0.000280 | False |
| 12 | 0.616374 | 0.852573 | 0.424597 | dummy_constant | 0.616374 | 0.852573 | 0.189853 | 0.137851 | 0.000292 | False |
| 13 | 0.000000 | 0.000000 | 1.000000 | dummy_constant | 0.000000 | 0.000000 | -1.100000 | 0.500000 | 0.000300 | False |
| 14 | 1.327996 | 0.910125 | 0.966776 | dummy_constant | 1.327996 | 0.910125 | 1.690132 | 0.853779 | 0.000315 | False |
| 15 | 1.024391 | 0.090779 | 0.500312 | dummy_constant | 1.024391 | 0.090779 | 0.042021 | 0.442447 | 0.000279 | False |
| 16 | 0.160052 | 1.309814 | 0.152721 | dummy_constant | 0.160052 | 1.309814 | 0.777826 | 0.771363 | 0.000265 | False |
| 17 | 0.762878 | 0.699104 | 0.651227 | dummy_constant | 0.762878 | 0.699104 | -0.005915 | 0.108747 | 0.000262 | False |
| 18 | 1.030827 | 0.073515 | 0.773036 | dummy_constant | 1.030827 | 0.073515 | 0.026170 | 0.463667 | 0.000261 | False |
| 19 | 1.053107 | 0.138623 | 1.000000 | dummy_constant | 1.053107 | 0.138623 | 0.178223 | 0.436520 | 0.000265 | False |
| 20 | 0.827874 | 0.072607 | 0.518400 | dummy_constant | 0.827874 | 0.072607 | -0.326383 | 0.290166 | 0.000266 | False |
| 21 | 0.074167 | 1.051628 | 0.337071 | dummy_constant | 0.074167 | 1.051628 | 0.068443 | 0.485627 | 0.000270 | False |
| 22 | 0.157619 | 0.826109 | 0.579630 | dummy_constant | 0.157619 | 0.826109 | -0.193482 | 0.223572 | 0.000271 | False |
| 23 | 0.060220 | 1.032564 | 0.517887 | dummy_constant | 0.060220 | 1.032564 | 0.010198 | 0.477031 | 0.000301 | False |
| 24 | 0.103778 | 1.050874 | 0.743058 | dummy_constant | 0.103778 | 1.050874 | 0.115522 | 0.460454 | 0.000278 | False |
| 25 | 0.088757 | 1.033547 | 1.000000 | dummy_constant | 0.088757 | 1.033547 | 0.056216 | 0.453793 | 0.000274 | False |
| 26 | 0.859538 | 0.565372 | 1.000000 | dummy_constant | 0.859538 | 0.565372 | 0.157783 | 0.133541 | 0.000273 | False |
| 27 | 0.768922 | 0.236373 | 0.916558 | dummy_constant | 0.768922 | 0.236373 | -0.358828 | 0.141818 | 0.000277 | False |
| 28 | 0.451911 | 0.143020 | 0.589285 | dummy_constant | 0.451911 | 0.143020 | -0.794372 | 0.129748 | 0.000272 | False |
| 29 | 0.507321 | 0.879028 | 1.000000 | dummy_constant | 0.507321 | 0.879028 | 0.079845 | 0.143716 | 0.000264 | False |
| 30 | 0.386463 | 0.706519 | 0.745232 | dummy_constant | 0.386463 | 0.706519 | -0.336084 | 0.055541 | 0.000265 | False |
| 31 | 0.819144 | 0.202407 | 0.155618 | dummy_constant | 0.819144 | 0.202407 | -0.213804 | 0.190415 | 0.000266 | False |
| 32 | 0.665331 | 0.756139 | 1.000000 | dummy_constant | 0.665331 | 0.756139 | -0.037862 | 0.092941 | 0.000294 | False |
| 33 | 0.401750 | 0.392726 | 0.499300 | dummy_constant | 0.401750 | 0.392726 | -0.782717 | 0.021161 | 0.000263 | False |
| 34 | 0.047730 | 1.306389 | 0.265669 | dummy_constant | 0.047730 | 1.306389 | 0.625520 | 0.854811 | 0.000266 | False |
| 35 | 1.397299 | 0.010774 | 0.333132 | dummy_constant | 1.397299 | 0.010774 | 0.853320 | 1.044487 | 0.000281 | False |
| 36 | 0.307247 | 0.971863 | 1.000000 | dummy_constant | 0.307247 | 0.971863 | 0.020346 | 0.259808 | 0.000263 | False |
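The hypervolume printed during the run measures the region dominated by the Pareto front and bounded by the reference point. As a rough illustration of the idea, the sketch below computes it for two minimized objectives only. The helper names are hypothetical and the input points are made up. Note that the generator configuration above also lists the fidelity `s` as a maximized objective with its own reference value, so the reported numbers appear to include a fidelity dimension and will not be reproduced by this 2D calculation.

```python
def pareto_front_2d(points):
    """Return the non-dominated points (both objectives minimized), sorted by f1."""
    front = []
    best_f2 = float("inf")
    for f1, f2 in sorted(points):
        if f2 < best_f2:  # strictly better in f2 than every point with smaller f1
            front.append((f1, f2))
            best_f2 = f2
    return front


def hypervolume_2d(points, reference):
    """Area dominated by the points and bounded by the reference point."""
    r1, r2 = reference
    # points that do not improve on the reference contribute nothing
    inside = [(f1, f2) for f1, f2 in points if f1 < r1 and f2 < r2]
    area, prev_f2 = 0.0, r2
    for f1, f2 in pareto_front_2d(inside):
        # add the horizontal strip between the previous and current f2 levels
        area += (r1 - f1) * (prev_f2 - f2)
        prev_f2 = f2
    return area


# made-up points; (1.2, 1.2) is dominated and does not add area
hv = hypervolume_2d([(0.0, 1.0), (1.0, 0.0), (1.2, 1.2)], reference=(1.5, 1.5))
print(hv)
```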
Plot results
Here we plot the resulting observations in input space, colored by feasibility (neglecting the fact that these data points are at varying fidelities).
fig, ax = plt.subplots()
theta = np.linspace(0, np.pi / 2)
r = np.sqrt(1 + 0.1 * np.cos(16 * theta))
x_1 = r * np.sin(theta)
x_2_lower = r * np.cos(theta)
x_2_upper = (0.5 - (x_1 - 0.5) ** 2) ** 0.5 + 0.5
z = np.zeros_like(x_1)
# ax2.plot(x_1, x_2_lower,'r')
ax.fill_between(x_1, z, x_2_lower, fc="white")
circle = plt.Circle(
    (0.5, 0.5), 0.5**0.5, color="r", alpha=0.25, zorder=0, label="Valid Region"
)
ax.add_patch(circle)
history = pd.concat(
    [X.data, get_feasibility_data(tnk_vocs, X.data)], axis=1, ignore_index=False
)
ax.plot(*history[["x1", "x2"]][history["feasible"]].to_numpy().T, ".C1")
ax.plot(*history[["x1", "x2"]][~history["feasible"]].to_numpy().T, ".C2")
ax.set_xlim(0, 3.14)
ax.set_ylim(0, 3.14)
ax.set_xlabel("x1")
ax.set_ylabel("x2")
ax.set_aspect("equal")
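The curve drawn from `theta` and `r` in the plotting code is the boundary of the first constraint. With $x_1 = r \sin\theta$ and $x_2 = r \cos\theta$ we have $\arctan(x_1/x_2) = \theta$, so $x_1^2 + x_2^2 = r^2 = 1 + 0.1\cos(16\theta)$ is exactly the condition $g_1(x) = 0$. A small numerical check of this, written as a sketch in plain Python (the function name `c1` and the use of `atan2` are assumptions of the sketch):

```python
import math


def c1(x1, x2):
    """First TNK constraint in the c1 >= 0 convention used by the VOCS."""
    return x1**2 + x2**2 - 1.0 - 0.1 * math.cos(16.0 * math.atan2(x1, x2))


n = 50  # same number of points as np.linspace's default
max_abs_residual = 0.0
for k in range(n):
    theta = (math.pi / 2) * k / (n - 1)
    r = math.sqrt(1.0 + 0.1 * math.cos(16.0 * theta))
    x1, x2 = r * math.sin(theta), r * math.cos(theta)
    # on the boundary the constraint value should vanish up to round-off
    max_abs_residual = max(max_abs_residual, abs(c1(x1, x2)))

print(max_abs_residual)
```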
Plot path through input space
ax = history.hist(["x1", "x2", "s"], bins=20)
history.plot(y=["x1", "x2", "s"])
<Axes: >
Plot the acquisition function
Here we plot the acquisition function at a small set of fidelities $[0, 0.5, 1.0]$.
fidelities = [0.0, 0.5, 1.0]
for fidelity in fidelities:
    X.generator.visualize_model(
        variable_names=["x1", "x2"],
        reference_point={"s": fidelity},
    )
# examine lengthscale of the first objective
list(X.generator.model.models[0].named_parameters())
[('likelihood.noise_covar.raw_noise',
Parameter containing:
tensor([-94.2066], requires_grad=True)),
('mean_module.raw_constant',
Parameter containing:
tensor(1.0965, requires_grad=True)),
('covar_module.raw_lengthscale',
Parameter containing:
tensor([[ 0.4354, 21.7899, 41.4095]], requires_grad=True))]