Multi-Fidelity Generator¶

`MultiFidelityGenerator` ¶

Bases: MOBOGenerator

Implements Multi-fidelity Bayesian optimization.

Attributes:

Name	Type	Description
`name`	`str`	The name of the generator.
`fidelity_parameter`	`Literal['s']`	The fidelity parameter name.
`cost_function`	`Callable`	Callable function that describes the cost of evaluating the objective function.
`reference_point`	`Optional[Dict[str, float]]`	The reference point for multi-objective optimization.
`supports_multi_objective`	`bool`	Indicates if the generator supports multi-objective optimization.
`supports_batch_generation`	`bool`	Indicates if the generator supports batch candidate generation.

Methods:

Name	Description
`validate_vocs`	Validate the VOCS for the generator.
`calculate_total_cost`	Calculate the total cost of data samples using the fidelity parameter.
`get_acquisition`	Get the acquisition function for Bayesian Optimization.
`_get_acquisition`	Create the Multi-Fidelity Knowledge Gradient acquisition function.
`add_data`	Add new data to the generator.
`fidelity_variable_index`	Get the index of the fidelity variable.
`fidelity_objective_index`	Get the index of the fidelity objective.
`get_optimum`	Select the best point at the maximum fidelity.

Source code in xopt/generators/bayesian/multi_fidelity.py

class MultiFidelityGenerator(MOBOGenerator):
    """
    Implements Multi-fidelity Bayesian optimization.

    Attributes
    ----------
    name : str
        The name of the generator.
    fidelity_parameter : Literal["s"]
        The fidelity parameter name.
    cost_function : Callable
        Callable function that describes the cost of evaluating the objective function.
    reference_point : Optional[Dict[str, float]]
        The reference point for multi-objective optimization.
    supports_multi_objective : bool
        Indicates if the generator supports multi-objective optimization.
    supports_batch_generation : bool
        Indicates if the generator supports batch candidate generation.

    Methods
    -------
    validate_vocs(cls, v: VOCS) -> VOCS
        Validate the VOCS for the generator.
    calculate_total_cost(self, data: pd.DataFrame = None) -> float
        Calculate the total cost of data samples using the fidelity parameter.
    get_acquisition(self, model: torch.nn.Module) -> NMOMF
        Get the acquisition function for Bayesian Optimization.
    _get_acquisition(self, model: torch.nn.Module) -> NMOMF
        Create the Multi-Fidelity Knowledge Gradient acquisition function.
    add_data(self, new_data: pd.DataFrame)
        Add new data to the generator.
    fidelity_variable_index(self) -> int
        Get the index of the fidelity variable.
    fidelity_objective_index(self) -> int
        Get the index of the fidelity objective.
    get_optimum(self) -> pd.DataFrame
        Select the best point at the maximum fidelity.
    """

    name = "multi_fidelity"
    fidelity_parameter: Literal["s"] = Field(
        "s", description="fidelity parameter name", exclude=True
    )
    cost_function: Callable = Field(
        lambda x: x + 1.0,
        description="callable function that describes the cost "
        "of evaluating the objective function",
        exclude=True,
    )
    reference_point: Optional[Dict[str, float]] = None
    supports_multi_objective: bool = True
    supports_batch_generation: bool = True
    supports_constraints: bool = True

    __doc__ = """Implements Multi-fidelity Bayesian optimization
        Assumes a fidelity parameter in the range [0, 1]
        """

    @field_validator("vocs", mode="before")
    def validate_vocs(cls, v: VOCS) -> VOCS:
        """
        Validate the VOCS for the generator.

        Parameters
        ----------
        v : VOCS
            The VOCS to be validated.

        Returns
        -------
        VOCS
            The validated VOCS.

        Notes
        -----
        No validation error is raised here. The fidelity parameter "s" is added
        to the VOCS as a continuous variable on [0, 1] and as a maximization
        objective.
        """
        v.variables["s"] = ContinuousVariable(domain=[0, 1])
        v.objectives["s"] = MaximizeObjective()

        return v

    def __init__(self, **kwargs):
        reference_point = kwargs.pop("reference_point", None)
        vocs = kwargs.get("vocs")
        # set reference point
        if reference_point is None:
            reference_point = {}
            for name, val in vocs.objectives.items():
                if name != "s":
                    if isinstance(val, MaximizeObjective):
                        reference_point.update({name: -100.0})
                    elif isinstance(val, MinimizeObjective):
                        reference_point.update({name: 100.0})
                    else:
                        raise ValueError(
                            f"objective {name} must be MaximizeObjective or MinimizeObjective"
                        )

        reference_point.update({"s": 0.0})

        super(MultiFidelityGenerator, self).__init__(
            **kwargs, reference_point=reference_point
        )

    def calculate_total_cost(self, data: pd.DataFrame = None) -> float:
        """
        Calculate the total cost of data samples using the fidelity parameter.

        Parameters
        ----------
        data : pd.DataFrame, optional
            The data samples. If None, the generator's stored data (`self.data`) is used.

        Returns
        -------
        float
            The total cost of the data samples.
        """
        if data is None:
            data = self.data

        f_data = self.get_input_data(data)

        # apply callable function to get costs
        return self.cost_function(f_data[..., self.fidelity_variable_index]).sum()

    def get_acquisition(self, model: torch.nn.Module) -> NMOMF:
        """
        Get the acquisition function for Bayesian Optimization.

        Parameters
        ----------
        model : torch.nn.Module
            The model used for Bayesian Optimization.

        Returns
        -------
        NMOMF
            The acquisition function.
        """
        if model is None:
            raise ValueError("model cannot be None")

        # get base acquisition function
        acq = self._get_acquisition(model)
        return acq

    def _get_acquisition(self, model: torch.nn.Module) -> NMOMF:
        """
        Create the Multi-Fidelity Knowledge Gradient acquisition function.

        In order for MFKG to evaluate the information gain, it uses the model to
        predict the function value at the highest fidelity after conditioning
        on the observation. This is handled by the project argument, which specifies
        how to transform a tensor X to its target fidelity. We use a default helper
        function called project_to_target_fidelity to achieve this.

        An important point to keep in mind: in the case of standard KG, one can ignore
        the current value and simply optimize the expected maximum posterior mean of the
        next stage. However, for MFKG, since the goal is to optimize information gain per
        cost, it is important to first compute the current value (i.e., maximum of the
        posterior mean at the target fidelity). To accomplish this, we use a
        FixedFeatureAcquisitionFunction on top of a PosteriorMean.

        Parameters
        ----------
        model : torch.nn.Module
            The model used for Bayesian Optimization.

        Returns
        -------
        NMOMF
            The Multi-Fidelity Knowledge Gradient acquisition function.
        """
        X_baseline = self.get_input_data(self.data)

        # wrap the cost function such that it only has to accept the fidelity parameter
        def true_cost_function(X: torch.Tensor) -> torch.Tensor:
            return self.cost_function(X[..., self.fidelity_variable_index])

        acq_func = NMOMF(
            model=model,
            X_baseline=X_baseline,
            ref_point=self.torch_reference_point,
            cost_call=true_cost_function,
            objective=self._get_objective(),
            constraints=self._get_constraint_callables(),
            cache_root=False,
            prune_baseline=True,
        )

        return acq_func

    def add_data(self, new_data: pd.DataFrame):
        """
        Add new data to the generator.

        Parameters
        ----------
        new_data : pd.DataFrame
            The new data to be added.

        Raises
        ------
        ValueError
            If the fidelity parameter is not in the new data or if the fidelity
            values are outside the range [0,1].
        """
        if self.fidelity_parameter not in new_data:
            raise ValueError(
                f"fidelity parameter {self.fidelity_parameter} must be in added data"
            )

        # override add_data to check for valid fidelity values
        if (new_data[self.fidelity_parameter] > 1.0).any() or (
            new_data[self.fidelity_parameter] < 0.0
        ).any():
            raise ValueError("cannot add fidelity data that is outside the range [0,1]")
        super().add_data(new_data)

    @property
    def fidelity_variable_index(self) -> int:
        """
        Get the index of the fidelity variable.

        Returns
        -------
        int
            The index of the fidelity variable.
        """
        return self.vocs.variable_names.index(self.fidelity_parameter)

    @property
    def fidelity_objective_index(self) -> int:
        """
        Get the index of the fidelity objective.

        Returns
        -------
        int
            The index of the fidelity objective.
        """
        return self.vocs.objective_names.index(self.fidelity_parameter)

    def get_optimum(self) -> pd.DataFrame:
        """
        Select the best point at the maximum fidelity.

        Returns
        -------
        pd.DataFrame
            The best point at the maximum fidelity.
        """
        # define single objective based on vocs
        weights = torch.zeros(self.vocs.n_outputs, **self.tkwargs)
        for idx, ele in enumerate(self.vocs.objective_names):
            if isinstance(self.vocs.objectives[ele], MinimizeObjective):
                weights[idx] = -1.0
            elif isinstance(self.vocs.objectives[ele], MaximizeObjective):
                weights[idx] = 1.0

        def obj_callable(
            Z: torch.Tensor, X: Optional[torch.Tensor] = None
        ) -> torch.Tensor:
            return torch.matmul(Z, weights.reshape(-1, 1)).squeeze(-1)

        c_posterior_mean = ConstrainedMCAcquisitionFunction(
            self.model,
            qUpperConfidenceBound(
                model=self.model, beta=0.0, objective=GenericMCObjective(obj_callable)
            ),
            self._get_constraint_callables(),
        )

        max_fidelity_c_posterior_mean = FixedFeatureAcquisitionFunction(
            c_posterior_mean,
            self.vocs.n_variables,
            [self.fidelity_variable_index],
            [1.0],
        )

        boundst = self._get_bounds().T
        fixed_bounds = torch.cat(
            (
                boundst[: self.fidelity_variable_index],
                boundst[self.fidelity_variable_index + 1 :],
            )
        ).T

        result = self.numerical_optimizer.optimize(
            max_fidelity_c_posterior_mean, fixed_bounds, 1
        )

        vnames = deepcopy(self.vocs.variable_names)
        del vnames[self.fidelity_variable_index]
        df = pd.DataFrame(result.detach().cpu().numpy(), columns=vnames)
        df[self.fidelity_parameter] = 1.0

        return convert_dataframe_to_inputs(self.vocs, df)
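The default reference point built in `__init__` can be illustrated without Xopt. The sketch below is a simplified stand-in, not the library code: objectives are represented as plain strings (`"MAXIMIZE"` / `"MINIMIZE"`) rather than `MaximizeObjective` / `MinimizeObjective` instances, and `default_reference_point` is a hypothetical helper name introduced only for this example.

```python
def default_reference_point(objectives, fidelity_name="s"):
    """Mimic the default reference point logic with plain strings.

    `objectives` maps objective names to "MAXIMIZE" or "MINIMIZE"
    (an assumption for this sketch; Xopt uses objective classes).
    """
    reference_point = {}
    for name, direction in objectives.items():
        if name == fidelity_name:
            continue  # the fidelity objective is set separately below
        if direction == "MAXIMIZE":
            reference_point[name] = -100.0
        elif direction == "MINIMIZE":
            reference_point[name] = 100.0
        else:
            raise ValueError(f"objective {name} must be MAXIMIZE or MINIMIZE")
    # the fidelity objective always gets a reference value of 0.0
    reference_point[fidelity_name] = 0.0
    return reference_point


ref = default_reference_point({"f1": "MINIMIZE", "f2": "MAXIMIZE", "s": "MAXIMIZE"})
print(ref)
```

Because these defaults are generic (plus or minus 100), passing an explicit `reference_point` that matches the scale of the objectives is likely the safer choice in practice.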

`fidelity_objective_index` `property` ¶

Get the index of the fidelity objective.

Returns:

Type	Description
`int`	The index of the fidelity objective.

`fidelity_variable_index` `property` ¶

Get the index of the fidelity variable.

Returns:

Type	Description
`int`	The index of the fidelity variable.

`model_input_names` `property` ¶

Variable names corresponding to the trained model.

`add_data(new_data)` ¶

Add new data to the generator.

Parameters:

Name	Type	Description	Default
`new_data`	`DataFrame`	The new data to be added.	required

Raises:

Type	Description
`ValueError`	If the fidelity parameter is not in the new data or if the fidelity values are outside the range [0,1].

Source code in xopt/generators/bayesian/multi_fidelity.py

def add_data(self, new_data: pd.DataFrame):
    """
    Add new data to the generator.

    Parameters
    ----------
    new_data : pd.DataFrame
        The new data to be added.

    Raises
    ------
    ValueError
        If the fidelity parameter is not in the new data or if the fidelity
        values are outside the range [0,1].
    """
    if self.fidelity_parameter not in new_data:
        raise ValueError(
            f"fidelity parameter {self.fidelity_parameter} must be in added data"
        )

    # override add_data to check for valid fidelity values
    if (new_data[self.fidelity_parameter] > 1.0).any() or (
        new_data[self.fidelity_parameter] < 0.0
    ).any():
        raise ValueError("cannot add fidelity data that is outside the range [0,1]")
    super().add_data(new_data)
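The two checks performed by `add_data` can be sketched in plain Python. This is an illustration under simplifying assumptions: rows are dictionaries instead of a pandas DataFrame, and `check_fidelity` is a hypothetical helper, not part of Xopt.

```python
def check_fidelity(rows, fidelity_parameter="s"):
    """Return `rows` unchanged if every row has a fidelity value in [0, 1]."""
    # the fidelity parameter must be present in the added data
    if any(fidelity_parameter not in row for row in rows):
        raise ValueError(
            f"fidelity parameter {fidelity_parameter} must be in added data"
        )
    # fidelity values must lie inside the closed interval [0, 1]
    if any(
        row[fidelity_parameter] > 1.0 or row[fidelity_parameter] < 0.0
        for row in rows
    ):
        raise ValueError("cannot add fidelity data that is outside the range [0,1]")
    return rows


valid = [{"x": 0.2, "s": 0.0}, {"x": 0.7, "s": 1.0}]
accepted = check_fidelity(valid)

try:
    check_fidelity([{"x": 0.2, "s": 1.5}])
    rejected = False
except ValueError:
    rejected = True
```

Both endpoints, 0.0 and 1.0, are accepted; only values strictly outside the interval are rejected.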

`calculate_total_cost(data=None)` ¶

Calculate the total cost of data samples using the fidelity parameter.

Parameters:

Name	Type	Description	Default
`data`	`DataFrame`	The data samples. If `None`, the generator's stored data (`self.data`) is used.	`None`

Returns:

Type	Description
`float`	The total cost of the data samples.

Source code in xopt/generators/bayesian/multi_fidelity.py

def calculate_total_cost(self, data: pd.DataFrame = None) -> float:
    """
    Calculate the total cost of data samples using the fidelity parameter.

    Parameters
    ----------
    data : pd.DataFrame, optional
        The data samples. If None, the generator's stored data (`self.data`) is used.

    Returns
    -------
    float
        The total cost of the data samples.
    """
    if data is None:
        data = self.data

    f_data = self.get_input_data(data)

    # apply callable function to get costs
    return self.cost_function(f_data[..., self.fidelity_variable_index]).sum()
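As a worked example of the cost calculation, the sketch below applies the default cost function `lambda x: x + 1.0` to a list of fidelity values and sums the result. It uses plain Python floats instead of torch tensors, and `total_cost` is a hypothetical helper name.

```python
def default_cost(fidelity):
    """Default cost from the class definition: cost = fidelity + 1."""
    return fidelity + 1.0


def total_cost(fidelities, cost_function=default_cost):
    """Sum the per-sample cost over all fidelity values."""
    return sum(cost_function(s) for s in fidelities)


# three samples at fidelities 0.0, 0.5 and 1.0 cost 1.0 + 1.5 + 2.0
total = total_cost([0.0, 0.5, 1.0])

# a custom cost function (an illustrative choice) that scales linearly
linear_total = total_cost([0.5, 1.0], cost_function=lambda s: 2.0 * s)
```

With the default cost function, even a sample at fidelity 0 costs 1.0, so the total cost is never smaller than the number of samples.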

`generate(n_candidates)` ¶

Generate candidates using Bayesian Optimization.

Parameters:

Name	Type	Description	Default
`n_candidates`	`int`	The number of candidates to generate in each optimization step.	required

Returns:

Type	Description
`List[Dict]`	A list of dictionaries containing the generated candidates.

Raises:

Type	Description
`NotImplementedError`	If the number of candidates is greater than 1, and the generator does not support batch candidate generation.
`RuntimeError`	If the generator contains no data. Call `add_data` to add data before generating candidates.

Notes

This method generates candidates for Bayesian Optimization based on the provided number of candidates. It updates the internal model with the current data and calculates the candidates by optimizing the acquisition function. The method returns the generated candidates in the form of a list of dictionaries.

Source code in xopt/generators/bayesian/bayesian_generator.py

def generate(self, n_candidates: int):
    """
    Generate candidates using Bayesian Optimization.

    Parameters
    ----------
    n_candidates : int
        The number of candidates to generate in each optimization step.

    Returns
    -------
    List[Dict]
        A list of dictionaries containing the generated candidates.

    Raises
    ------
    NotImplementedError
        If the number of candidates is greater than 1, and the generator does not
        support batch candidate generation.

    RuntimeError
        If the generator contains no data. Call `add_data` to add data before
        generating candidates.

    Notes
    -----
    This method generates candidates for Bayesian Optimization based on the
    provided number of candidates. It updates the internal model with the current
    data and calculates the candidates by optimizing the acquisition function.
    The method returns the generated candidates in the form of a list of dictionaries.
    """

    self.n_candidates = n_candidates
    if n_candidates > 1 and not self.supports_batch_generation:
        raise NotImplementedError(
            "This Bayesian algorithm does not currently support parallel candidate "
            "generation"
        )

    # if no data exists raise error
    if self.data is None:
        raise RuntimeError(
            "no data contained in generator, call `add_data` "
            "method to add data, see also `Xopt.random_evaluate()`"
        )

    else:
        # dict to track runtimes
        timing_results = {}

        # update internal model with internal data
        start_time = time.perf_counter()
        model = self.train_model(self.get_training_data(self.data))
        timing_results["training"] = time.perf_counter() - start_time

        # propose candidates given model
        start_time = time.perf_counter()
        candidates = self.propose_candidates(model, n_candidates=n_candidates)
        timing_results["acquisition_optimization"] = (
            time.perf_counter() - start_time
        )

        # post process candidates
        result = self._process_candidates(candidates)

        # append timing results to dataframe (if it exists)
        if self.computation_time is not None:
            self.computation_time = pd.concat(
                (
                    self.computation_time,
                    pd.DataFrame(timing_results, index=[0]),
                ),
                ignore_index=True,
            )
        else:
            self.computation_time = pd.DataFrame(timing_results, index=[0])

        if self.n_interpolate_points is not None:
            if self.n_candidates > 1:
                raise RuntimeError(
                    "cannot generate interpolated points for "
                    "multiple candidate generation"
                )
            else:
                assert len(result) == 1
                result = interpolate_points(
                    pd.concat(
                        (self.data.iloc[-1:][self.vocs.variable_names], result),
                        axis=0,
                        ignore_index=True,
                    ),
                    num_points=self.n_interpolate_points,
                )

        return result.to_dict("records")
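The timing pattern used in `generate` (wrap each stage with `time.perf_counter()` and store the duration in a dictionary) can be sketched in isolation. The helper `timed_step` and the stand-in functions `fake_train` and `fake_propose` are hypothetical and only mimic the order of the training and proposal stages.

```python
import time


def timed_step(timings, label, func, *args):
    """Run func(*args) and store its wall-clock duration under `label`."""
    start_time = time.perf_counter()
    result = func(*args)
    timings[label] = time.perf_counter() - start_time
    return result


def fake_train(data):
    # stand-in for train_model: returns a trivial "model"
    return {"n_points": len(data)}


def fake_propose(model, n_candidates):
    # stand-in for propose_candidates: returns fixed candidates
    return [{"x": 0.5} for _ in range(n_candidates)]


timing_results = {}
model = timed_step(timing_results, "training", fake_train, [1.0, 2.0, 3.0])
candidates = timed_step(
    timing_results, "acquisition_optimization", fake_propose, model, 2
)
print(sorted(timing_results))
```

In the listing above, each call to `generate` appends one such dictionary as a new row of `computation_time`.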

`get_acquisition(model)` ¶

Get the acquisition function for Bayesian Optimization.

Parameters:

Name	Type	Description	Default
`model`	`Module`	The model used for Bayesian Optimization.	required

Returns:

Type	Description
`NMOMF`	The acquisition function.

Source code in xopt/generators/bayesian/multi_fidelity.py

def get_acquisition(self, model: torch.nn.Module) -> NMOMF:
    """
    Get the acquisition function for Bayesian Optimization.

    Parameters
    ----------
    model : torch.nn.Module
        The model used for Bayesian Optimization.

    Returns
    -------
    NMOMF
        The acquisition function.
    """
    if model is None:
        raise ValueError("model cannot be None")

    # get base acquisition function
    acq = self._get_acquisition(model)
    return acq

`get_input_data(data)` ¶

Convert input data to a torch tensor.

Parameters:

Name	Type	Description	Default
`data`	`DataFrame`	The input data in the form of a pandas DataFrame.	required

Returns:

Type	Description
`Tensor`	A torch tensor containing the input data.

Notes

This method takes a pandas DataFrame as input data and converts it into a torch tensor. It specifically selects columns corresponding to the model's input names (variables), and the resulting tensor is configured with the data type and device settings from the generator.

Source code in xopt/generators/bayesian/bayesian_generator.py

def get_input_data(self, data: pd.DataFrame) -> torch.Tensor:
    """
    Convert input data to a torch tensor.

    Parameters
    ----------
    data : pd.DataFrame
        The input data in the form of a pandas DataFrame.

    Returns
    -------
    torch.Tensor
        A torch tensor containing the input data.

    Notes
    -----
    This method takes a pandas DataFrame as input data and converts it into a
    torch tensor. It specifically selects columns corresponding to the model's
    input names (variables), and the resulting tensor is configured with the data
    type and device settings from the generator.
    """
    return torch.tensor(
        data[self.model_input_names].to_numpy().copy(), **self.tkwargs
    )

`get_optimum()` ¶

Select the best point at the maximum fidelity.

Returns:

Type	Description
`DataFrame`	The best point at the maximum fidelity.

Source code in xopt/generators/bayesian/multi_fidelity.py

def get_optimum(self) -> pd.DataFrame:
    """
    Select the best point at the maximum fidelity.

    Returns
    -------
    pd.DataFrame
        The best point at the maximum fidelity.
    """
    # define single objective based on vocs
    weights = torch.zeros(self.vocs.n_outputs, **self.tkwargs)
    for idx, ele in enumerate(self.vocs.objective_names):
        if isinstance(self.vocs.objectives[ele], MinimizeObjective):
            weights[idx] = -1.0
        elif isinstance(self.vocs.objectives[ele], MaximizeObjective):
            weights[idx] = 1.0

    def obj_callable(
        Z: torch.Tensor, X: Optional[torch.Tensor] = None
    ) -> torch.Tensor:
        return torch.matmul(Z, weights.reshape(-1, 1)).squeeze(-1)

    c_posterior_mean = ConstrainedMCAcquisitionFunction(
        self.model,
        qUpperConfidenceBound(
            model=self.model, beta=0.0, objective=GenericMCObjective(obj_callable)
        ),
        self._get_constraint_callables(),
    )

    max_fidelity_c_posterior_mean = FixedFeatureAcquisitionFunction(
        c_posterior_mean,
        self.vocs.n_variables,
        [self.fidelity_variable_index],
        [1.0],
    )

    boundst = self._get_bounds().T
    fixed_bounds = torch.cat(
        (
            boundst[: self.fidelity_variable_index],
            boundst[self.fidelity_variable_index + 1 :],
        )
    ).T

    result = self.numerical_optimizer.optimize(
        max_fidelity_c_posterior_mean, fixed_bounds, 1
    )

    vnames = deepcopy(self.vocs.variable_names)
    del vnames[self.fidelity_variable_index]
    df = pd.DataFrame(result.detach().cpu().numpy(), columns=vnames)
    df[self.fidelity_parameter] = 1.0

    return convert_dataframe_to_inputs(self.vocs, df)
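Two steps of `get_optimum` are easy to check by hand: building the weight vector that turns the model outputs into a single score, and removing the fidelity column from the bounds. The sketch below uses plain lists instead of tensors and string labels instead of objective classes; all three helper names are hypothetical.

```python
def objective_weights(objectives, n_outputs):
    """Weight -1 for minimized objectives, +1 for maximized, 0 otherwise."""
    weights = [0.0] * n_outputs
    for idx, direction in enumerate(objectives.values()):
        if direction == "MINIMIZE":
            weights[idx] = -1.0
        elif direction == "MAXIMIZE":
            weights[idx] = 1.0
    return weights


def scalarize(outputs, weights):
    """Weighted sum of model outputs (the role of obj_callable)."""
    return sum(z * w for z, w in zip(outputs, weights))


def drop_fidelity_bounds(bounds, fidelity_index):
    """Remove the fidelity variable's (low, high) pair from the bounds."""
    return bounds[:fidelity_index] + bounds[fidelity_index + 1 :]


# two objectives plus one extra output (e.g. a constraint) with weight 0
weights = objective_weights({"f": "MINIMIZE", "s": "MAXIMIZE"}, n_outputs=3)
score = scalarize([2.0, 1.0, 5.0], weights)  # -2.0 + 1.0 + 0.0

# variables x1, x2 and the fidelity variable at index 2
fixed_bounds = drop_fidelity_bounds([(0.0, 1.0), (-5.0, 5.0), (0.0, 1.0)], 2)
```

The optimizer then searches only over the remaining variables, and the fidelity value 1.0 is written back into the result afterwards.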

`get_pareto_front_and_hypervolume()` ¶

Get the pareto front and hypervolume of the current data.

Returns:

Name	Type	Description
`pareto_front_variables`	`Tensor`	The pareto front variable data.
`pareto_front_objectives`	`Tensor`	The pareto front objective data.
`pareto_mask`	`Tensor`	A mask indicating which points are part of the pareto front.
`hv`	`float`	The hypervolume of the pareto front.

Source code in xopt/generators/bayesian/bayesian_generator.py

def get_pareto_front_and_hypervolume(
    self,
) -> tuple[torch.Tensor | None, torch.Tensor | None, torch.Tensor | None, float]:
    """
    Get the pareto front and hypervolume of the current data.

    Returns
    -------
    pareto_front_variables : torch.Tensor
        The pareto front variable data.
    pareto_front_objectives : torch.Tensor
        The pareto front objective data.
    pareto_mask : torch.Tensor
        A mask indicating which points are part of the pareto front.
    hv : float
        The hypervolume of the pareto front.
    """

    # get scaled data
    # note that the objective data is scaled by +/- 1
    # based on maximization / minimization
    variable_data, objective_data, weights = self._get_scaled_data(data=self.data)

    # if there are no valid points skip PF calculation and return None
    if len(variable_data) == 0:
        return None, None, None, 0.0

    pareto_front_variables, pareto_front_objectives, pareto_mask, hv = (
        compute_hypervolume_and_pf(
            variable_data,
            objective_data,
            self.torch_reference_point,
        )
    )

    # scale the pareto front objectives back to original space
    if pareto_front_objectives is not None:
        pareto_front_objectives = pareto_front_objectives / weights

    return (
        pareto_front_variables,
        pareto_front_objectives,
        pareto_mask,
        hv,
    )
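To make the returned quantities concrete, here is a minimal two-objective sketch of a Pareto mask and a hypervolume calculation. It assumes every objective is maximized (the form the data takes after scaling by +1 or -1) and is not the implementation of `compute_hypervolume_and_pf`; the function names are hypothetical.

```python
def pareto_mask(points):
    """True for each point that no other point dominates (maximization)."""
    mask = []
    for i, p in enumerate(points):
        dominated = any(
            all(qk >= pk for qk, pk in zip(q, p))
            and any(qk > pk for qk, pk in zip(q, p))
            for j, q in enumerate(points)
            if j != i
        )
        mask.append(not dominated)
    return mask


def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded below by `ref` (2D only)."""
    # keep only points that improve on the reference in both objectives
    pts = sorted(
        (p for p in points if p[0] > ref[0] and p[1] > ref[1]), reverse=True
    )
    hv, prev_y = 0.0, ref[1]
    # sweep from the largest first objective to the smallest
    for x, y in pts:
        if y > prev_y:
            hv += (x - ref[0]) * (y - prev_y)
            prev_y = y
    return hv


points = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0), (1.0, 1.0)]
mask = pareto_mask(points)
front = [p for p, keep in zip(points, mask) if keep]
hv = hypervolume_2d(front, ref=(0.0, 0.0))
```

Here `(1.0, 1.0)` is dominated by `(2.0, 2.0)`, so the front has three points, and the staircase they form above the reference point has area 3 + 2 + 1 = 6.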

`get_training_data(data)` ¶

Get training data used to train the GP model.

If a turbo controller is specified with the flag restrict_model_data this will return a subset of data that is inside the trust region.

Parameters:

Name	Type	Description	Default
`data`	`DataFrame`	The data in the form of a pandas DataFrame.	required

Returns:

Name	Type	Description
`data`	`DataFrame`	A subset of the data used to train the model, in the form of a pandas DataFrame.

Source code in xopt/generators/bayesian/bayesian_generator.py

def get_training_data(self, data: pd.DataFrame) -> pd.DataFrame:
    """
    Get training data used to train the GP model.

    If a turbo controller is specified with the flag `restrict_model_data` this
    will return a subset of data that is inside the trust region.

    Parameters
    ----------
    data : pd.DataFrame
        The data in the form of a pandas DataFrame.

    Returns
    -------
    data : pd.DataFrame
        A subset of the data used to train the model, in the form of a pandas DataFrame.

    """
    if self.turbo_controller is not None:
        if self.turbo_controller.restrict_model_data:
            data = self.turbo_controller.get_data_in_trust_region(data, self)
            if data.empty:
                raise FeasibilityError(
                    "No training data available to build model, because "
                    "no points in the dataset are within the TuRBO trust region."
                )
    return data

`model_dump(*args, **kwargs)` ¶

Override `model_dump` to remove faux class attributes.

Source code in xopt/generator.py

def model_dump(self, *args: Any, **kwargs: Any) -> dict[str, Any]:
    """Override model_dump to remove faux class attributes."""

    res = super().model_dump(*args, **kwargs)

    res.pop("supports_batch_generation", None)
    res.pop("supports_multi_objective", None)

    return res

`propose_candidates(model, n_candidates=1)` ¶

Propose candidates using Bayesian Optimization.

Parameters:

Name	Type	Description	Default
`model`	`Module`	The trained Bayesian model.	required
`n_candidates`	`int`	The number of candidates to propose (default is 1).	`1`

Returns:

Type	Description
`Tensor`	A tensor containing the proposed candidates.

Notes

This method proposes candidates for Bayesian Optimization by numerically optimizing the acquisition function using the trained model. It updates the state of the Turbo controller if used and calculates the optimization bounds.

Source code in xopt/generators/bayesian/bayesian_generator.py

def propose_candidates(self, model: Module, n_candidates: int = 1) -> Tensor:
    """
    Propose candidates using Bayesian Optimization.

    Parameters
    ----------
    model : Module
        The trained Bayesian model.
    n_candidates : int, optional
        The number of candidates to propose (default is 1).

    Returns
    -------
    Tensor
        A tensor containing the proposed candidates.

    Notes
    -----
    This method proposes candidates for Bayesian Optimization by numerically
    optimizing the acquisition function using the trained model. It updates the
    state of the Turbo controller if used and calculates the optimization bounds.
    """
    # update TuRBO state if used with the last `n_candidates` points
    if self.turbo_controller is not None:
        self.turbo_controller.update_state(self, n_candidates)

    # calculate optimization bounds
    bounds = self._get_optimization_bounds()

    # get acquisition function
    acq_funct = self.get_acquisition(model)

    # get initial candidates to start acquisition function optimization
    initial_points = self._get_initial_conditions(n_candidates)

    # get candidates -- grid optimizer does not support batch_initial_conditions
    if isinstance(self.numerical_optimizer, GridOptimizer):
        candidates = self.numerical_optimizer.optimize(
            acq_funct, bounds, n_candidates
        )
    else:
        candidates = self.numerical_optimizer.optimize(
            acq_funct, bounds, n_candidates, batch_initial_conditions=initial_points
        )
    return candidates

`train_model(data=None, update_internal=True)` ¶

Train a Bayesian model for Bayesian Optimization.

Parameters:

Name	Type	Description	Default
`data`	`DataFrame`	The data to be used for training the model. If not provided, the internal data of the generator is used.	`None`
`update_internal`	`bool`	Flag to indicate whether to update the internal model of the generator with the trained model (default is True).	`True`

Returns:

Type	Description
`Module`	The trained Bayesian model.

Raises:

Type	Description
`ValueError`	If no data is available to build the model.

Notes

This method trains a Bayesian model using the provided data or the internal data of the generator. It updates the internal model with the trained model if the 'update_internal' flag is set to True.

Source code in xopt/generators/bayesian/bayesian_generator.py

def train_model(
    self, data: pd.DataFrame | None = None, update_internal: bool = True
) -> Module:
    """
    Train a Bayesian model for Bayesian Optimization.

    Parameters
    ----------
    data : pd.DataFrame, optional
        The data to be used for training the model. If not provided, the internal
        data of the generator is used.
    update_internal : bool, optional
        Flag to indicate whether to update the internal model of the generator
        with the trained model (default is True).

    Returns
    -------
    Module
        The trained Bayesian model.

    Raises
    ------
    ValueError
        If no data is available to build the model.

    Notes
    -----
    This method trains a Bayesian model using the provided data or the internal
    data of the generator. It updates the internal model with the trained model
    if the 'update_internal' flag is set to True.
    """
    if data is None:
        data = self.get_training_data(self.data)
        if data is None:
            raise ValueError("no data available to build model")

    if data.empty:
        raise ValueError("no data available to build model")

    # get input bounds
    variable_bounds = {
        name: ele.domain for name, ele in self.vocs.variables.items()
    }

    # if turbo restrict points is true then set the bounds to the trust region
    # bounds
    if self.turbo_controller is not None:
        if self.turbo_controller.restrict_model_data:
            variable_bounds = dict(
                zip(
                    self.vocs.variable_names,
                    self.turbo_controller.get_trust_region(self).numpy().T,
                )
            )

    # add fixed feature bounds if requested
    if self.fixed_features is not None:
        # get bounds for each fixed_feature (vocs bounds take precedence)
        for key in self.fixed_features:
            if key not in variable_bounds:
                if key not in data:
                    raise KeyError(
                        "generator data needs to contain fixed feature "
                        f"column name `{key}`"
                    )
                f_data = data[key]
                bounds = [f_data.min(), f_data.max()]
                if bounds[1] - bounds[0] < 1e-8:
                    bounds[1] = bounds[0] + 1e-8
                variable_bounds[key] = bounds

    _model = self.gp_constructor.build_model(
        self.model_input_names,
        self.vocs.output_names,
        data,
        {name: variable_bounds[name] for name in self.model_input_names},
        **self.tkwargs,
    )

    if update_internal:
        self.model = _model

    return _model
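The fixed-feature branch of `train_model` derives bounds from the observed data and widens them when all values coincide, so the interval never has zero width. A minimal sketch of that rule, using a plain list in place of a DataFrame column and a hypothetical helper name:

```python
def fixed_feature_bounds(values, min_width=1e-8):
    """Return [min, max] of `values`, widened to at least `min_width`."""
    bounds = [min(values), max(values)]
    if bounds[1] - bounds[0] < min_width:
        # degenerate interval: push the upper bound up slightly
        bounds[1] = bounds[0] + min_width
    return bounds


varying = fixed_feature_bounds([1.0, 3.0, 2.0])
constant = fixed_feature_bounds([2.0, 2.0, 2.0])
```

For a constant column the result is a very narrow but valid interval, which presumably keeps any input normalization that relies on these bounds from dividing by zero.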

`update_pareto_front_history()` ¶

Update the historical Pareto front statistics in the generator.

For each row of data in self.data, compute the Pareto front statistics (hypervolume, number of non-dominated points) if no corresponding entry exists in the self.pareto_front_history DataFrame.

Source code in xopt/generators/bayesian/bayesian_generator.py

def update_pareto_front_history(self):
    """
    Update the historical Pareto front statistics in the generator.

    For each row of data in self.data, compute the Pareto front statistics
    (hypervolume, number of non-dominated points) if no corresponding
    entry exists in the `self.pareto_front_history` DataFrame.
    """
    # TODO: make sure this works when manually changing the data frame
    if self.pareto_front_history is None:
        self.pareto_front_history = pd.DataFrame()

    # for each row of data, compute the cumulative pareto front stats
    for i in self.data.index:
        # check if the pareto front stats already exist
        if i in self.pareto_front_history.index:
            continue

        # get scaled data
        variable_data, objective_data, _ = self._get_scaled_data(
            data=self.data.loc[:i]
        )

        # compute the pareto front stats
        _, pareto_front_variables, _, hv = compute_hypervolume_and_pf(
            variable_data,
            objective_data,
            self.torch_reference_point,
        )

        # get the number of non-dominated points
        n_non_dominated = (
            len(pareto_front_variables) if pareto_front_variables is not None else 0
        )

        # create a new row for the pareto front stats
        new_row: dict[str, Any] = {
            "iteration": i,
            "hypervolume": hv,
            "n_non_dominated": n_non_dominated,
        }
        # add the new row to the pareto front history
        self.pareto_front_history = pd.concat(
            [
                self.pareto_front_history,
                pd.DataFrame(new_row, index=[i]),
            ],
            ignore_index=False,
        )
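
The caching pattern used above (skip rows that already have an entry, otherwise compute cumulative statistics over all rows up to and including the current one) can be sketched without any dependencies. This is a simplified illustration under stated assumptions: `is_dominated` and `update_history` are hypothetical names, all objectives are maximized, the history is a plain dict, and only the number of non-dominated points is tracked (the hypervolume is omitted).

```python
def is_dominated(point, others):
    """True if another point is at least as good in every objective (maximization)."""
    return any(
        q != point and all(qv >= pv for qv, pv in zip(q, point))
        for q in others
    )


def update_history(points, history):
    """Fill in cumulative Pareto statistics for rows missing from `history`.

    points  : list of objective tuples, one per evaluated row
    history : dict mapping row index -> statistics dict (updated in place)
    """
    for i in range(len(points)):
        if i in history:
            continue  # statistics for this row were computed on an earlier call
        seen = points[: i + 1]  # rows up to and including row i
        n_non_dominated = sum(not is_dominated(p, seen) for p in seen)
        history[i] = {"iteration": i, "n_non_dominated": n_non_dominated}
    return history


points = [(1.0, 1.0), (2.0, 0.5), (2.0, 2.0)]
history = update_history(points[:2], {})  # first call: rows 0 and 1
history = update_history(points, history)  # second call: only row 2 is computed
print(history)
```

The slice `points[: i + 1]` mirrors `self.data.loc[:i]` in the listing, since label-based slicing in pandas includes the end label.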

`validate_turbo_controller(value, info)` `classmethod` ¶

Validate the turbo controller. Note: by default, no turbo controller is used.

Source code in xopt/generators/bayesian/bayesian_generator.py

@field_validator("turbo_controller", mode="before")
@classmethod
def validate_turbo_controller(cls, value: Any, info: ValidationInfo):
    """Validate the turbo controller. Note: by default, no turbo controller is used."""
    if value is None:
        return value

    compatible_turbo_controllers = [
        turbo_controller
        for turbo_controller in cls.get_compatible_turbo_controllers()
        if turbo_controller is not None
    ]

    if len(compatible_turbo_controllers) == 0:
        raise ValueError("no turbo controllers are compatible with this generator")
    else:
        return validate_turbo_controller_base(
            value, compatible_turbo_controllers, info
        )

`validate_vocs(v)` ¶

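Validate the VOCS for the generator. The fidelity parameter `s` is added to the VOCS as a continuous variable with domain [0, 1] and as an objective to maximize.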

Parameters:

Name	Type	Description	Default
`v`	`VOCS`	The VOCS to be validated.	required

Returns:

Type	Description
`VOCS`	The validated VOCS.

Raises:

Type	Description
`ValueError`	If constraints are present in the VOCS.

Source code in xopt/generators/bayesian/multi_fidelity.py

@field_validator("vocs", mode="before")
def validate_vocs(cls, v: VOCS) -> VOCS:
    """
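    Validate the VOCS for the generator. The fidelity parameter `s` is added
    to the VOCS as a continuous variable with domain [0, 1] and as an
    objective to maximize.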

    Parameters
    ----------
    v : VOCS
        The VOCS to be validated.

    Returns
    -------
    VOCS
        The validated VOCS.

    Raises
    ------
    ValueError
        If constraints are present in the VOCS.
    """
    v.variables["s"] = ContinuousVariable(domain=[0, 1])
    v.objectives["s"] = MaximizeObjective()

    return v
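
The effect of this validator on a VOCS can be illustrated with plain dictionaries. This is a sketch only: `add_fidelity_parameter` is a hypothetical helper, and the dict layout (lists for domains, strings for objective directions) is an assumption made for illustration, not the Xopt `VOCS` API. Unlike the validator above, the sketch returns copies instead of modifying its inputs.

```python
def add_fidelity_parameter(variables, objectives, name="s"):
    """Return copies of the inputs with the fidelity parameter added.

    variables  : dict mapping variable name -> [low, high]
    objectives : dict mapping objective name -> "MINIMIZE" or "MAXIMIZE"
    """
    new_variables = dict(variables)
    new_objectives = dict(objectives)
    # the fidelity is a continuous input on [0, 1] ...
    new_variables[name] = [0.0, 1.0]
    # ... and is also treated as an objective to maximize
    new_objectives[name] = "MAXIMIZE"
    return new_variables, new_objectives


variables, objectives = add_fidelity_parameter(
    {"x1": [0.0, 6.28]},
    {"y1": "MINIMIZE"},
)
print(variables)
print(objectives)
```

As in the validator, an existing entry named `s` would be overwritten, so user-defined variables or objectives should avoid that name.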

`visualize_model(**kwargs)` ¶

Display GP model predictions for the selected output(s).

The GP models are displayed with respect to the named variables. If none are given, the list of variables in vocs is used. Feasible samples are indicated with a filled orange "o", infeasible samples with a hollow red "o". Feasibility is calculated with respect to all constraints unless the selected output is a constraint itself, in which case only that one is considered.

Parameters:

Name	Type	Description	Default
`**kwargs`		Supported keyword arguments: - output_names : List[str] Outputs for which the GP models are displayed. Defaults to all outputs in vocs. - variable_names : List[str] The variables with respect to which the GP models are displayed (maximum of 2). Defaults to vocs.variable_names. - idx : int Index of the last sample to use. This also selects the point of reference in higher dimensions unless an explicit reference_point is given. - reference_point : dict Reference point determining the value of variables in vocs.variable_names, but not in variable_names (slice plots in higher dimensions). Defaults to last used sample. - show_samples : bool, optional Whether samples are shown. - show_prior_mean : bool, optional Whether the prior mean is shown. - show_feasibility : bool, optional Whether the feasibility region is shown. - show_acquisition : bool, optional Whether the acquisition function is computed and shown (only if acquisition function is not None). - n_grid : int, optional Number of grid points per dimension used to display the model predictions. - axes : Axes, optional Axes object used for plotting. - exponentiate : bool, optional Flag to exponentiate acquisition function before plotting.	`{}`

Returns:

Name	Type	Description
`result`	`tuple`	The matplotlib figure and axes objects.

Source code in xopt/generators/bayesian/bayesian_generator.py

def visualize_model(self, **kwargs):
    """Display GP model predictions for the selected output(s).

    The GP models are displayed with respect to the named variables. If none are given, the list of variables in
    vocs is used. Feasible samples are indicated with a filled orange "o", infeasible samples with a hollow
    red "o". Feasibility is calculated with respect to all constraints unless the selected output is a
    constraint itself, in which case only that one is considered.

    Parameters
    ----------
    **kwargs: dict, optional
        Supported keyword arguments:
        - output_names : List[str]
            Outputs for which the GP models are displayed. Defaults to all outputs in vocs.
        - variable_names : List[str]
            The variables with respect to which the GP models are displayed (maximum of 2).
            Defaults to vocs.variable_names.
        - idx : int
            Index of the last sample to use. This also selects the point of reference in
            higher dimensions unless an explicit reference_point is given.
        - reference_point : dict
            Reference point determining the value of variables in vocs.variable_names, but not in variable_names
            (slice plots in higher dimensions). Defaults to last used sample.
        - show_samples : bool, optional
            Whether samples are shown.
        - show_prior_mean : bool, optional
            Whether the prior mean is shown.
        - show_feasibility : bool, optional
            Whether the feasibility region is shown.
        - show_acquisition : bool, optional
            Whether the acquisition function is computed and shown (only if acquisition function is not None).
        - n_grid : int, optional
            Number of grid points per dimension used to display the model predictions.
        - axes : Axes, optional
            Axes object used for plotting.
        - exponentiate : bool, optional
            Flag to exponentiate acquisition function before plotting.

    Returns
    -------
    result : tuple
        The matplotlib figure and axes objects.
    """
    return visualize_generator_model(self, **kwargs)

`yaml(**kwargs)` ¶

Serialize to JSON first, then dump to a YAML string.

Source code in xopt/pydantic.py

def yaml(self, **kwargs):
    """Serialize to JSON first, then dump to a YAML string."""
    output = json.loads(
        self.to_json(
            **kwargs,
        )
    )
    return yaml.dump(output)
