The SequentialDesign Class
Base class representing a sequential experimental design
This class provides the base implementation of a class for designing experiments sequentially. This means that rather than picking all simulation points in a single step, the points are selected one by one, taking into account the information obtained by determining the true parameter value at each design point when selecting the next one. Sequential designs can be very useful when running expensive, high-dimensional simulations to ensure that a limited computational budget is used effectively.
Instead of choosing all points at once, which is the case in a one-shot design, a sequential design does some additional computational work at each step to choose the next point more carefully. This means that sequential designs are better suited for very expensive simulations, where the additional cost of choosing the next point is small compared to the overall computational cost of running the simulations.
A sequential design is built on top of a base design (which must be a subclass of the ExperimentalDesign class). In addition to the base design, the class must contain information on how many points are used in the initial design (i.e. the number of starting points used before starting the sequential steps in the design) and the number of candidate points that are considered during each iteration. Optionally, a function for evaluating the actual simulation can be bound to the class instance, which allows the entire design process to be automated. If such a function is not provided, then the steps to run the design must be carried out manually, with the evaluated simulation values provided to the class at the end of each simulation in order to determine the next point.
To use the base class to create an experimental design, a new subclass must be created that provides a method _eval_metric, which considers all candidate points and returns the index of the best candidate. All other code provided here allows a generic sequential design to be easily run and managed.
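The subclassing pattern described above can be illustrated with a small, self-contained sketch. It does not use mogp_emulator: ToySequentialDesign and MaximinDesign are hypothetical stand-ins, the uniform sampler plays the role of the base design, and passing the candidates to _eval_metric as an argument is an assumption made for the illustration (the real method may obtain them differently).

```python
import math
import random


class ToySequentialDesign:
    """Hypothetical stand-in for a sequential design (not the mogp_emulator class)."""

    def __init__(self, n_parameters, n_init=10, n_cand=50, seed=0):
        self.n_parameters = n_parameters
        self.n_init = n_init
        self.n_cand = n_cand
        self.rng = random.Random(seed)
        self.inputs = []

    def _draw(self, n):
        # Plays the role of the base one-shot design: uniform points in [0, 1)^d
        return [[self.rng.random() for _ in range(self.n_parameters)]
                for _ in range(n)]

    def generate_initial_design(self):
        self.inputs = self._draw(self.n_init)
        return self.inputs

    def _eval_metric(self, candidates):
        # Subclasses return the index of the best candidate
        raise NotImplementedError("subclasses choose the selection criterion")

    def get_next_point(self):
        candidates = self._draw(self.n_cand)  # fresh candidates at every step
        best = self._eval_metric(candidates)
        self.inputs.append(candidates[best])
        return candidates[best]


class MaximinDesign(ToySequentialDesign):
    """Picks the candidate that is farthest from its nearest design point."""

    def _eval_metric(self, candidates):
        scores = [min(math.dist(c, x) for x in self.inputs) for c in candidates]
        return scores.index(max(scores))


design = MaximinDesign(n_parameters=2, n_init=5, n_cand=20)
design.generate_initial_design()
new_points = [design.get_next_point() for _ in range(3)]
```

The maximin criterion is used here only because it needs no simulator output; a criterion such as the one used by MICEDesign would also use the target values.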
-
class mogp_emulator.SequentialDesign.SequentialDesign(base_design, f=None, n_samples=None, n_init=10, n_cand=50)
Base class representing a sequential experimental design
-
__init__(base_design, f=None, n_samples=None, n_init=10, n_cand=50)
Create a new instance of a sequential experimental design
Creates a new instance of a sequential experimental design, which sequentially chooses points to be evaluated from a complex simulation function. It is often used for expensive computational models, where the cost of running a single evaluation is large and evaluations must be done in series due to computational limitations, so that the additional computation done at each step to select new points is small compared to the overall cost of running a single simulation.
Sequential designs require specifying a base design using a subclass of ExperimentalDesign as well as information on the number of points to use in each step in the design process. Additionally, the function to be evaluated can be bound to the class to allow automatic evaluation of the function at each step.
Parameters:
- base_design (ExperimentalDesign) – Base one-shot experimental design (must be a subclass of ExperimentalDesign). This contains the information on the parameter space to be sampled.
- f (function or other callable) – Function to be evaluated for the design. Must take all parameter values as a single input array and return a single float or an array of length 1.
- n_samples (int or None) – Number of sequential design points to be drawn. If specified, this must be a non-negative integer. Note that this is in addition to the number of initial points, meaning that the total design size will be n_samples + n_init. This can also be specified when running the full design. This parameter is optional, and defaults to None (meaning the number of samples is set when running the design, or that samples will be added manually).
- n_init (int) – Number of points in the initial design before the sequential steps begin. Must be a positive integer. Optional, default value is 10.
- n_cand (int) – Number of candidates to consider at each sequential design step. Must be a positive integer. Optional, default value is 50.
-
generate_initial_design()
Create initial design
Method to set the initial design inputs. Generates the desired number of points for the initial design by drawing from the base design. The method sets the inputs attribute of the SequentialDesign instance, and also returns the initial design as a numpy array in case the simulations are to be run manually. This method can be run repeatedly to draw different initial designs if the initial target values have not been set, but once the targets have been set the method will not overwrite them, to prevent corruption of the design.
Returns: Initial design points, a 2D numpy array with shape (n_init, n_parameters)
Return type: ndarray
-
get_base_design()
Get type of base design
Returns the type of the base design. The base design must be a subclass of ExperimentalDesign, but any one-shot design method can be used to generate the initial design and the candidates.
Returns: Base design type as a string
Return type: str
-
get_batch_points(n_points)
Batch version of get_next_point for a Sequential Design
This method returns a batch of design points to run from a Sequential Design. This is useful if simulations can be run in parallel, which speeds up generation of the design. The method simply calls get_next_point the required number of times, but rather than using the true value of the simulation it substitutes a method-specific predicted value. This can be implemented in a subclass by defining the method _estimate_next_target.
Parameters: n_points (int) – Size of batch to generate for the next set of simulation points. This parameter determines the shape of the output array. Must be a positive integer.
Returns: Set of batch points chosen using the batch version of the design as a numpy array with shape (n_points, n_parameters)
Return type: ndarray
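The batch idea described above, selecting several points in a row while substituting a predicted target for each simulation that has not yet been run, can be sketched in a few lines. This is a one-dimensional toy and not the library's implementation: get_batch, steepest_interval_midpoint and nearest_value are hypothetical names, and the selection rule and the nearest-neighbour estimate are placeholders for the method-specific pieces.

```python
def steepest_interval_midpoint(inputs, targets):
    """Toy selection rule: midpoint of the interval that is wide and steep."""
    pairs = sorted(zip(inputs, targets))
    scored = [((b - a) * (1 + abs(tb - ta)), (a + b) / 2)
              for (a, ta), (b, tb) in zip(pairs, pairs[1:])]
    return max(scored)[1]


def nearest_value(x, inputs, targets):
    """Toy estimate: reuse the target of the closest existing input."""
    i = min(range(len(inputs)), key=lambda j: abs(inputs[j] - x))
    return targets[i]


def get_batch(select_next, estimate_target, inputs, targets, n_points):
    """Choose n_points in sequence, filling in estimated targets as we go."""
    work_inputs, work_targets = list(inputs), list(targets)
    batch = []
    for _ in range(n_points):
        x = select_next(work_inputs, work_targets)
        # Estimate from the points known so far, since x has not been simulated
        estimate = estimate_target(x, work_inputs, work_targets)
        work_inputs.append(x)
        work_targets.append(estimate)
        batch.append(x)
    return batch


batch = get_batch(steepest_interval_midpoint, nearest_value,
                  [0.0, 1.0], [0.0, 1.0], 3)
# batch == [0.5, 0.75, 0.625]
```

Once the real simulations for the batch have finished, the estimated values would be replaced by the true ones, which is the role set_batch_targets plays in the class.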
-
get_candidates()
Get current candidate design input points
Returns a numpy array holding the current candidate design points. The array is 2D and has shape (n_cand, n_parameters). It always has the same size once it is initialized, but the values will change across iterations as new candidate points are considered at each iteration.
Returns: Current value of the candidate design inputs
Return type: ndarray
-
get_current_iteration()
Get number of current iteration in the experimental design
Returns the current iteration during the sequential design process. This is mostly useful for keeping track of progress when the sequential design is being updated manually.
Returns: Current iteration number
Return type: int
-
get_inputs()
Get current design input points
Returns a numpy array holding the current design points. The array is 2D and has shape (current_iteration, n_parameters) (i.e. it is resized after each iteration when a new design point is chosen).
Returns: Current value of the design inputs
Return type: ndarray
-
get_n_cand()
Get number of candidate design points
Returns the number of candidate design points used in each sequential design step. Candidates are re-drawn at each step, so this number of points will be drawn each time and all points will be considered at each iteration.
Returns: Number of candidate design points
Return type: int
-
get_n_init()
Get number of initial design points
Returns the number of initial design points used before beginning the sequential design steps. Note that this means that the total number of samples to be drawn for the design is n_init + n_samples.
Returns: Number of initial design points
Return type: int
-
get_n_parameters()
Get number of parameters in design
Returns the number of parameters in the design (note that this is specified in the base design that must be provided when initializing the class instance).
Returns: Number of parameters in the design
Return type: int
-
get_n_samples()
Get number of sequential design points
Returns the number of sequential design points used in the sequential design steps. This parameter can be None to indicate that the number of samples will be specified when running the design, or that the samples will be updated manually. Note that the total number of samples to be drawn for the design is n_init + n_samples.
Returns: Number of sequential design points
Return type: int or None
-
get_next_point()
Evaluate candidates to determine next point
Public method for determining the next point in the design. Internally, it checks that the inputs and target arrays are as expected for correctly drawing a new point, generates prospective candidates, and then evaluates them using the desired metric in order to select the best one. It updates the inputs array and returns the next point to be evaluated as a 1D numpy array of length n_parameters.
Returns: Next design point, a 1D numpy array of length n_parameters
Return type: ndarray
-
get_targets()
Get current design target points
Returns a numpy array holding the current target points. The array is 1D and has shape (current_iteration,) (i.e. it is resized after each iteration when a new target point is added). Note that simulation outputs must be a single number, so if a simulation has multiple outputs, the user must decide how to combine them to form the relevant target value for deciding which point to simulate next.
Returns: Current value of the design targets
Return type: ndarray
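One simple way to meet the single-number requirement is to wrap a multi-output simulator so that it returns a scalar. The sketch below uses a weighted sum; make_scalar_target, the weights, and the toy simulator are all hypothetical, and whether a weighted sum is a sensible summary depends entirely on the application.

```python
def make_scalar_target(simulator, weights):
    """Wrap a multi-output simulator so it returns a single float."""
    def wrapped(x):
        outputs = simulator(x)
        # Combine the outputs into one number (here: a weighted sum)
        return float(sum(w * o for w, o in zip(weights, outputs)))
    return wrapped


def two_output_simulator(x):
    # Toy simulator with two outputs, for illustration only
    return [x[0], 2.0 * x[0]]


f = make_scalar_target(two_output_simulator, weights=[0.5, 0.5])
value = f([2.0])
```

A wrapped function of this kind has the form expected for the f argument: it takes all parameter values as one input and returns a single float.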
-
has_function()
Determines if class contains a function for running the simulator
This method checks to see if a function has been provided for running the simulation.
Returns: Whether or not the design has a bound function for evaluating the simulation.
Return type: bool
-
load_design(filename)
Load previously saved sequential design
Loads a previously saved sequential design from file. Loads the arrays for inputs, targets, and candidates from file and sets other internal data to be consistent. It performs a few checks to ensure that the loaded design is compatible with the selected parameters. However, it does not check everything for consistency (in particular, it does not make any attempt to ensure that the exact base design or function are identical to what was previously used). It is up to the user to ensure that these are consistent with the previous instance of the design.
Parameters: filename (str or file) – Filename or file object from which the design will be loaded
Returns: None
-
run_initial_design()
Run initial design
Method to run the initial design by generating the initial design, evaluating the function on all design points, and setting the target values. Note that this requires a function bound to the class in order to evaluate the design points internally. It is a shortcut to running generate_initial_design, evaluating the initial design points, and then using set_initial_targets to set the target values, with some additional checks along the way.
If the initial design has already been fully run, this method will raise an error, as the method to generate the initial design checks this prior to overwriting the initial targets. Note also that this method checks that the outputs of the bound function match up with the expected array sizes and that all outputs are finite before updating the initial targets.
Returns: None
Return type: None
-
run_next_point()
Perform one iteration of the sequential design process
Method for performing an iteration of the sequential design process. This is a shortcut for generating and evaluating the candidates to find the best next design point, evaluating the function on the next point, and then updating the targets array with the value. This requires a function be bound to the class instance to automatically run the simulation. This will also automatically update the current_iteration attribute, which can be used to determine the number of sequential design steps that have been run.
Returns: None
Return type: None
-
run_sequential_design(n_samples=None)
Run the entire sequential design
Method to run all steps of the sequential design process. Note that the class instance must have a bound function for evaluating the design points to run all steps automatically. If such a function is not provided, the design steps must be run manually.
The desired number of samples to be drawn can either be specified when initializing the class instance or when calling this method. If a number of samples is provided on both occasions, then the number provided when calling run_sequential_design is used.
Internally, this method is a wrapper that calls run_initial_design and then calls run_next_point a total of n_samples times. Note that this means that the total number of design points is n_init + n_samples.
Parameters: n_samples (int or None) – Number of sequential design steps to be run. Optional if the number was specified upon initialization. Default is None (default to number set when initializing). If numbers are provided on both occasions, the number set here is used. If a number is provided, it must be non-negative.
Returns: None
Return type: None
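The control flow described for run_sequential_design can be written out as a short sketch. It only calls method names documented on this page (run_initial_design, run_next_point, get_n_samples, get_n_init); run_all and CountingDesign are hypothetical, the stub merely counts calls, and the error messages are assumptions rather than the library's own.

```python
def run_all(design, n_samples=None):
    """Run the initial design, then n_samples sequential steps."""
    # A value passed at call time takes precedence over the stored one
    if n_samples is None:
        n_samples = design.get_n_samples()
    if n_samples is None:
        raise ValueError("n_samples must be set at construction or call time")
    if n_samples < 0:
        raise ValueError("n_samples must be non-negative")
    design.run_initial_design()
    for _ in range(n_samples):
        design.run_next_point()
    # Total number of design points
    return design.get_n_init() + n_samples


class CountingDesign:
    """Hypothetical stub that only counts calls; not the library class."""

    def __init__(self, n_init=10, n_samples=None):
        self.n_init, self.n_samples = n_init, n_samples
        self.initial_runs = 0
        self.steps = 0

    def get_n_init(self):
        return self.n_init

    def get_n_samples(self):
        return self.n_samples

    def run_initial_design(self):
        self.initial_runs += 1

    def run_next_point(self):
        self.steps += 1


stub = CountingDesign(n_init=10, n_samples=5)
total = run_all(stub, n_samples=3)  # call-time value (3) overrides 5
```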
-
save_design(filename)
Save current state of the sequential design
Saves the current state of the sequential design by writing the current values of inputs, targets, and candidates to file as a .npz file. To re-load a saved design, use the load_design method.
Note that this method only dumps the arrays holding the inputs, targets, and candidates to a .npz file. It does not ensure that the function or base design are consistent, so it is up to the user to ensure that the new design parameters are the same as the parameters for the old one.
Parameters: filename (str or file) – Filename or file object where design will be saved
Returns: None
-
set_batch_targets(new_targets)
Batch version of set_next_target for a Sequential Design
This method updates the targets array for a batch set of simulations. The input array must have shape (n_points,), where n_points is the number of points selected when calling get_batch_points. Disagreement between these two values will result in an error.
Parameters: new_targets (ndarray) – Array holding results from the simulations. Must be an array of shape (n_points,), where n_points is set when calling get_batch_points
Returns: None
-
set_initial_targets(targets)
Set initial design target values
Method to set the initial design targets. Target values must be an array with shape (n_init,), with values obtained by running the initial design through the simulation. Note that this means the initial design must be created prior to running this method; if this method is called prior to generate_initial_design, the code will raise an error.
Parameters: targets (ndarray) – Initial value of targets, must be a 1D numpy array with shape (n_init,)
Returns: None
Return type: None
-
set_next_target(target)
Set value of next target
Updates the target array with the correct value (from running the actual simulation) of the latest design point determined using get_next_point. The target input must be a float or an array of length 1. The code internally checks the inputs and targets for any problems that may have occurred in updating them correctly, and if all is well then updates the target array and increments the number of iterations. If the design has not been correctly initialized, or get_next_point has not been previously run, this method will raise an error.
Parameters: target (float or length 1 array) – New target value found from evaluating the simulation on the latest design point found from the get_next_point method.
Returns: None
Return type: None
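When no function is bound to the design, the manual workflow amounts to alternating get_next_point and set_next_target after the initial design has been set. The sketch below drives that loop using only method names documented on this page. run_manually and StubDesign are hypothetical, and the stub uses plain lists where the real class works with numpy arrays.

```python
def run_manually(design, simulator, n_steps):
    """Drive a design by hand: initial design first, then one point at a time."""
    initial = design.generate_initial_design()
    design.set_initial_targets([simulator(x) for x in initial])
    for _ in range(n_steps):
        x = design.get_next_point()            # ask for the next point
        design.set_next_target(simulator(x))   # report the simulation result
    return design.get_inputs(), design.get_targets()


class StubDesign:
    """Hypothetical stand-in exposing the documented method names."""

    def __init__(self, n_init=3):
        self.n_init = n_init
        self.inputs, self.targets = [], []

    def generate_initial_design(self):
        self.inputs = [[i / self.n_init] for i in range(self.n_init)]
        return list(self.inputs)

    def set_initial_targets(self, targets):
        if len(targets) != len(self.inputs):
            raise ValueError("expected one target per initial design point")
        self.targets = list(targets)

    def get_next_point(self):
        if len(self.targets) != len(self.inputs):
            raise RuntimeError("target for the previous point has not been set")
        point = [0.1 * len(self.inputs)]  # placeholder choice, not a criterion
        self.inputs.append(point)
        return point

    def set_next_target(self, target):
        if len(self.inputs) != len(self.targets) + 1:
            raise RuntimeError("get_next_point must be called first")
        self.targets.append(target)

    def get_inputs(self):
        return self.inputs

    def get_targets(self):
        return self.targets


inputs, targets = run_manually(StubDesign(), lambda x: x[0] ** 2, n_steps=4)
```

The ordering checks in the stub mirror the errors described above: a target cannot be set before a point has been requested, and a new point cannot be requested while a target is outstanding.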
-
The MICEDesign Class
Class representing a Mutual Information for Computer Experiments (MICE) sequential experimental design
This class provides an implementation of the MICE algorithm, which uses Mutual Information as the criterion for selecting new points in a sequential design. The idea in MICE is to select, at each step, the design point that provides the most information on the function values in the entire design space. This is a straightforward application of a sequential design procedure, though the class requires a few additional parameters in order to compute the MICE criterion.
These additional parameters are nugget parameters provided to the Gaussian Process fit to smooth the predictions when evaluating the Mutual Information criterion. Essentially, since experimental design often requires sampling from a high dimensional space, this cannot be done in a way that guarantees that all candidate points are equally spaced. The Mutual Information criterion is sensitive to how these candidate points are distributed in space, so the nugget parameter provides some smoothing that makes the criterion less dependent on the distribution of the candidate points. A typical value of the smoothing nugget parameter (nugget_s in this implementation) is 1, though this may depend on the application.
Other than the smoothing parameters, the implementation follows the base procedure for a sequential design. The implementation adds methods for querying the nugget parameters and an additional helper function for computing the Mutual Information criterion, but other methods are identical.
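As a rough illustration of the selection step, MICE is usually written as picking the candidate with the largest ratio of two predictive variances: the variance from the GP fitted to the current design, divided by the variance from a GP fitted to the remaining candidates with the smoothing nugget applied. The sketch below assumes both variances have already been computed; mice_select and the numbers are made up for the example, and this is not the library's implementation.

```python
def mice_select(var_design, var_candidates):
    """Return the index of the candidate with the largest variance ratio.

    var_design[i]:     predictive variance at candidate i from the GP fitted
                       to the current design points.
    var_candidates[i]: predictive variance at candidate i from a GP fitted to
                       the other candidates, with the smoothing nugget applied
                       (assumed to be strictly positive).
    """
    ratios = [vd / vc for vd, vc in zip(var_design, var_candidates)]
    return ratios.index(max(ratios))


# Made-up variances for three candidates
best = mice_select([0.2, 0.9, 0.5], [0.4, 1.5, 0.5])
```

In this example the second candidate has the largest design variance, but the third is selected because it is also well explained by the other candidates, which is the behaviour the ratio is meant to reward.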
-
class mogp_emulator.SequentialDesign.MICEDesign(base_design, f=None, n_samples=None, n_init=10, n_cand=50, nugget='adaptive', nugget_s=1.0)
Class representing a Mutual Information for Computer Experiments (MICE) sequential experimental design
-
__init__(base_design, f=None, n_samples=None, n_init=10, n_cand=50, nugget='adaptive', nugget_s=1.0)
Create new instance of a MICE sequential design
Method to initialize a new MICE design. Parameters are largely the same as for the base SequentialDesign class, with a few additional nugget parameters for computing the Mutual Information criterion. A base design must be provided (must be a subclass of the ExperimentalDesign class), plus optionally a function to be evaluated in the design. Additional parameters include the number of samples, the number of initial design points, the number of candidate points, the nugget parameter for the base GP, and the smoothing nugget parameter for smoothing the uncertainty predictions on the candidate design points. Note that the total number of design points is n_init + n_samples.
Parameters:
- base_design (ExperimentalDesign) – Base one-shot experimental design (must be a subclass of ExperimentalDesign). This contains the information on the parameter space to be sampled.
- f (function or other callable) – Function to be evaluated for the design. Must take all parameter values as a single input array and return a single float or an array of length 1.
- n_samples (int or None) – Number of sequential design points to be drawn. If specified, this must be a positive integer. Note that this is in addition to the number of initial points, meaning that the total design size will be n_samples + n_init. This can also be specified when running the full design. This parameter is optional, and defaults to None (meaning the number of samples is set when running the design, or that samples will be added manually).
- n_init (int) – Number of points in the initial design before the sequential steps begin. Must be a positive integer. Optional, default value is 10.
- n_cand (int) – Number of candidates to consider at each sequential design step. Must be a positive integer. Optional, default value is 50.
- nugget (float or None) – Nugget parameter for base GP predictions. Must be a non-negative float or None, where None indicates that the nugget parameter is selected adaptively. Optional, default value is None.
- nugget_s (float) – Smoothing nugget parameter for smoothing the predictions on the candidate space. Must be a non-negative float. Default value is 1.
-
generate_initial_design()
Create initial design
Method to set the initial design inputs. Generates the desired number of points for the initial design by drawing from the base design. The method sets the inputs attribute of the SequentialDesign instance, and also returns the initial design as a numpy array in case the simulations are to be run manually. This method can be run repeatedly to draw different initial designs if the initial target values have not been set, but once the targets have been set the method will not overwrite them, to prevent corruption of the design.
Returns: Initial design points, a 2D numpy array with shape (n_init, n_parameters)
Return type: ndarray
-
get_base_design()
Get type of base design
Returns the type of the base design. The base design must be a subclass of ExperimentalDesign, but any one-shot design method can be used to generate the initial design and the candidates.
Returns: Base design type as a string
Return type: str
-
get_batch_points(n_points)
Batch version of get_next_point for a Sequential Design
This method returns a batch of design points to run from a Sequential Design. This is useful if simulations can be run in parallel, which speeds up generation of the design. The method simply calls get_next_point the required number of times, but rather than using the true value of the simulation it substitutes a method-specific predicted value. This can be implemented in a subclass by defining the method _estimate_next_target.
Parameters: n_points (int) – Size of batch to generate for the next set of simulation points. This parameter determines the shape of the output array. Must be a positive integer.
Returns: Set of batch points chosen using the batch version of the design as a numpy array with shape (n_points, n_parameters)
Return type: ndarray
-
get_candidates()
Get current candidate design input points
Returns a numpy array holding the current candidate design points. The array is 2D and has shape (n_cand, n_parameters). It always has the same size once it is initialized, but the values will change across iterations as new candidate points are considered at each iteration.
Returns: Current value of the candidate design inputs
Return type: ndarray
-
get_current_iteration()
Get number of current iteration in the experimental design
Returns the current iteration during the sequential design process. This is mostly useful for keeping track of progress when the sequential design is being updated manually.
Returns: Current iteration number
Return type: int
-
get_inputs()
Get current design input points
Returns a numpy array holding the current design points. The array is 2D and has shape (current_iteration, n_parameters) (i.e. it is resized after each iteration when a new design point is chosen).
Returns: Current value of the design inputs
Return type: ndarray
-
get_n_cand()
Get number of candidate design points
Returns the number of candidate design points used in each sequential design step. Candidates are re-drawn at each step, so this number of points will be drawn each time and all points will be considered at each iteration.
Returns: Number of candidate design points
Return type: int
-
get_n_init()
Get number of initial design points
Returns the number of initial design points used before beginning the sequential design steps. Note that this means that the total number of samples to be drawn for the design is n_init + n_samples.
Returns: Number of initial design points
Return type: int
-
get_n_parameters()
Get number of parameters in design
Returns the number of parameters in the design (note that this is specified in the base design that must be provided when initializing the class instance).
Returns: Number of parameters in the design
Return type: int
-
get_n_samples()
Get number of sequential design points
Returns the number of sequential design points used in the sequential design steps. This parameter can be None to indicate that the number of samples will be specified when running the design, or that the samples will be updated manually. Note that the total number of samples to be drawn for the design is n_init + n_samples.
Returns: Number of sequential design points
Return type: int or None
-
get_next_point()
Evaluate candidates to determine next point
Public method for determining the next point in the design. Internally, it checks that the inputs and target arrays are as expected for correctly drawing a new point, generates prospective candidates, and then evaluates them using the desired metric in order to select the best one. It updates the inputs array and returns the next point to be evaluated as a 1D numpy array of length n_parameters.
Returns: Next design point, a 1D numpy array of length n_parameters
Return type: ndarray
-
get_nugget()
Get value of nugget parameter for base GP
Returns the nugget value for the base GP (used to actually fit the inputs to targets). Can be a float or None (meaning fitting will adaptively add noise to stabilize matrix inversion as needed).
Returns: Nugget parameter, can be a float or None for adaptive noise addition.
Return type: float or None
-
get_nugget_s()
Get value of smoothing nugget parameter
Returns the value of the smoothing nugget parameter for the GP used to evaluate the mutual information criterion. This GP examines the correlation between a candidate design point and the other candidate points, which requires smoothing to ensure that the correlation measure is not biased by the distribution of the candidate points in space. This parameter must be a non-negative float (a typical value is 1, though this may depend on the application).
Returns: Nugget parameter for smoothing predictions from candidate points made on a candidate point. Typical values are 1.
Return type: float
-
get_targets()
Get current design target points
Returns a numpy array holding the current target points. The array is 1D and has shape (current_iteration,) (i.e. it is resized after each iteration when a new target point is added). Note that simulation outputs must be a single number, so if a simulation has multiple outputs, the user must decide how to combine them to form the relevant target value for deciding which point to simulate next.
Returns: Current value of the design targets
Return type: ndarray
-
has_function()
Determines if class contains a function for running the simulator
This method checks to see if a function has been provided for running the simulation.
Returns: Whether or not the design has a bound function for evaluating the simulation.
Return type: bool
-
load_design(filename)
Load previously saved sequential design
Loads a previously saved sequential design from file. Loads the arrays for inputs, targets, and candidates from file and sets other internal data to be consistent. It performs a few checks to ensure that the loaded design is compatible with the selected parameters. However, it does not check everything for consistency (in particular, it does not make any attempt to ensure that the exact base design or function are identical to what was previously used). It is up to the user to ensure that these are consistent with the previous instance of the design.
Parameters: filename (str or file) – Filename or file object from which the design will be loaded
Returns: None
-
run_initial_design()
Run initial design
Method to run the initial design by generating the initial design, evaluating the function on all design points, and setting the target values. Note that this requires a function bound to the class in order to evaluate the design points internally. It is a shortcut to running generate_initial_design, evaluating the initial design points, and then using set_initial_targets to set the target values, with some additional checks along the way.
If the initial design has already been fully run, this method will raise an error, as the method to generate the initial design checks this prior to overwriting the initial targets. Note also that this method checks that the outputs of the bound function match up with the expected array sizes and that all outputs are finite before updating the initial targets.
Returns: None
Return type: None
-
run_next_point
()¶ Perform one iteration of the sequential design process
Method for performing an iteration of the sequential design process. This is a shortcut for generating and evaluating the candidates to find the best next design point, evaluating the function on the next point, and then updating the targets array with the value. This requires a function to be bound to the class instance to automatically run the simulation. This will also automatically update the current_iteration attribute, which can be used to determine the number of sequential design steps that have been run.
Returns: None Return type: None
-
run_sequential_design
(n_samples=None)¶ Run the entire sequential design
Method to run all steps of the sequential design process. Note that the class instance must have a bound function for evaluating the design points to run all steps automatically. If such a method is not provided, the design steps must be run manually.
The desired number of samples to be drawn can either be specified when initializing the class instance or when calling this method. If a number of samples is provided on both occasions, then the number provided when calling run_sequential_design is used.
Internally, this method is a wrapper that calls run_initial_design and then calls run_next_point a total of n_samples times. Note that this means that the total number of design points is n_init + n_samples.
Parameters: n_samples (int or None) – Number of sequential design steps to be run. Optional if the number was specified upon initialization. Default is None (use the number set when initializing). If numbers are provided on both occasions, the number set here is used. If a number is provided, it must be non-negative.
Returns: None Return type: None
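The control flow described for run_sequential_design can be sketched with a toy stand-in. The class ToySequentialDesign below is hypothetical and is not the library implementation: it draws candidates uniformly on [0, 1] and picks the candidate farthest from the existing points, purely to show how the initial design, the per-step loop, and the n_samples precedence rule fit together.

```python
import random


class ToySequentialDesign:
    """Hypothetical stand-in illustrating the control flow only."""

    def __init__(self, f, n_init, n_cand, n_samples=None, seed=0):
        self.f = f                      # bound simulator function
        self.n_init = n_init            # points in the initial design
        self.n_cand = n_cand            # candidates considered per step
        self.n_samples = n_samples      # optional default number of steps
        self.rng = random.Random(seed)
        self.inputs = []
        self.targets = []
        self.current_iteration = 0

    def run_initial_design(self):
        # Draw the initial points and evaluate the bound function on each
        self.inputs = [self.rng.random() for _ in range(self.n_init)]
        self.targets = [self.f(x) for x in self.inputs]

    def run_next_point(self):
        # Generate candidates, pick the best one, evaluate it, store result
        candidates = [self.rng.random() for _ in range(self.n_cand)]
        best = max(candidates,
                   key=lambda c: min(abs(c - x) for x in self.inputs))
        self.inputs.append(best)
        self.targets.append(self.f(best))
        self.current_iteration += 1

    def run_sequential_design(self, n_samples=None):
        # A value passed here takes precedence over the one set at init
        if n_samples is None:
            n_samples = self.n_samples
        if n_samples is None or n_samples < 0:
            raise ValueError("n_samples must be a non-negative integer")
        self.run_initial_design()
        for _ in range(n_samples):
            self.run_next_point()


design = ToySequentialDesign(f=lambda x: x ** 2, n_init=4, n_cand=20,
                             n_samples=3)
design.run_sequential_design(n_samples=6)  # 6 overrides the 3 set at init
```

After this call the toy design holds n_init + n_samples = 10 points, matching the count stated above.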
-
save_design
(filename)¶ Save current state of the sequential design
Saves the current state of the sequential design by writing the current values of inputs, targets, and candidates to file as a .npz file. To re-load a saved design, use the load_design method.
Note that this method only dumps the arrays holding the inputs, targets, and candidates to a .npz file. It does not ensure that the function or base design are consistent, so it is up to the user to ensure that the new design parameters are the same as the parameters for the old one.
Parameters: filename (str or file) – Filename or file object where design will be saved
Returns: None
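To illustrate why only the arrays survive a save and load cycle, the following sketch writes three toy arrays to an in-memory .npz archive with NumPy and reads them back. The key names inputs, targets, and candidates are assumed here for illustration; the names the library uses inside the archive are not stated in this documentation.

```python
import io

import numpy as np

# Toy arrays standing in for the design state (shapes are illustrative)
inputs = np.array([[0.1, 0.9], [0.4, 0.2], [0.8, 0.5]])
targets = np.array([1.0, 2.0, 3.0])
candidates = np.array([[0.3, 0.3], [0.7, 0.1]])

# Write the three arrays to an in-memory .npz archive
buffer = io.BytesIO()
np.savez(buffer, inputs=inputs, targets=targets, candidates=candidates)

# Read them back; the archive holds the arrays and nothing else
buffer.seek(0)
with np.load(buffer) as archive:
    stored_names = sorted(archive.files)
    loaded_inputs = archive["inputs"]
    loaded_targets = archive["targets"]
    loaded_candidates = archive["candidates"]
```

Nothing about the simulator function or the base design is stored in such an archive, which is why both must be supplied consistently when the design is re-created.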
-
set_batch_targets
(new_targets)¶ Batch version of set_next_target for a Sequential Design
This method updates the targets array for a batch set of simulations. The input array must have shape (n_points,), where n_points is the number of points selected when calling get_batch_points. A mismatch between the array length and n_points will result in an error.
Parameters: new_targets (ndarray) – Array holding results from the simulations. Must be an array of shape (n_points,), where n_points is set when calling get_batch_points
Returns: None
-
set_initial_targets
(targets)¶ Set initial design target values
Method to set the initial design targets. Generates the desired number of points for the initial design by drawing from the base design. Method sets the inputs attribute of the SequentialDesign instance, but also returns the initial design as a numpy array if the simulations are to be run manually. This method can be run repeatedly to draw different initial designs if the initial target values have not been set, but once the targets have been set the method will not overwrite them to prevent corruption of the design.
Target values must be an array with shape (n_init,), with values obtained by running the initial design through the simulation. Note that this means the initial design must be created prior to running this method; if this method is called prior to generate_initial_design, the code will raise an error.
Parameters: targets (ndarray) – Initial value of targets, must be a 1D numpy array with shape (n_init,)
Returns: None Return type: None
-
set_next_target
(target)¶ Set value of next target
Updates the target array with the correct value (from running the actual simulation) of the latest design point determined using get_next_point. The target input must be a float or an array of length 1. The code internally checks the inputs and targets for any problems that may have occurred in updating them correctly, and if all is well then updates the target array and increments the number of iterations. If the design has not been correctly initialized, or get_next_point has not been previously run, this method will raise an error.
Parameters: target (float or length 1 array) – New target value found from evaluating the simulation on the latest design point found from the get_next_point method.
Returns: None Return type: None
-
The MICEFastGP
Class¶
Derived GaussianProcess class implementing the Woodbury matrix identity for fast predictions
This class implements a Gaussian Process that is used in the MICE Sequential Design. The GP is fit using all candidate points from the sequential design, and then uses the Woodbury matrix identity to correct that fit to exclude the candidate point in question. This reduces the cost of fitting the GP from O(n^3) to O(n^2), which can dramatically speed up this process for large numbers of candidate points. This is mostly used for the particular application to the MICE sequential design, but could potentially have other applications where many candidate points are to be considered one at a time.
-
class
mogp_emulator.SequentialDesign.
MICEFastGP
(inputs, targets, mean=None, kernel='SquaredExponential', priors=None, nugget='adaptive', inputdict={}, use_patsy=True)¶ Derived GaussianProcess class implementing the Woodbury matrix identity for fast predictions
-
fast_predict
(index)¶ Make a fast prediction using one input point to a fit GP
This method is used to correct a Gaussian Process fit to a set of candidate points to evaluate the uncertainty at the candidate point. It is used in the MICE sequential design procedure to examine the mutual information between candidate points by determining how well correlated the design point in question is with the remainder of the candidates. It uses the Woodbury matrix identity to correct the existing GP fit (which requires O(n^3) operations) using O(n^2) operations, speeding up the process significantly for large candidate design sizes.
The method requires a fit GP and the index of the input point that is to be excluded. The method then corrects the GP fit and computes the uncertainty of the prediction on the excluded point, returning the uncertainty as a float.
Parameters: index (int) – Index of input point to be excluded in the fit and to which the prediction will be applied. Must be an integer with 0 <= index < n (where n is the number of target points in the fit GP, or the number of candidate points when applied to the MICE procedure).
Returns: Uncertainty in the corrected fit applied to the given index point Return type: float
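The idea of reusing one full fit to obtain the uncertainty at an excluded point can be sketched with NumPy. This is an illustration under stated assumptions, not the library's code: it uses the block-matrix inverse (Schur complement) identity, a close relative of the Woodbury identity, and takes "uncertainty" to mean the predictive variance. The kernel settings, the nugget value, and the function names fast_loo_variance and direct_loo_variance are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
x = rng.uniform(size=(n, 1))

# Squared-exponential covariance plus a nugget for numerical stability
sq_dist = (x - x.T) ** 2
K = np.exp(-0.5 * sq_dist / 0.2 ** 2) + 1e-3 * np.eye(n)

# One O(n^3) inversion of the full covariance matrix, reused for every index
K_inv = np.linalg.inv(K)


def fast_loo_variance(index):
    """Variance at point `index` when predicted from all other points.

    Uses the identity var_i = 1 / (K^{-1})_{ii}, so no refit is needed.
    """
    return 1.0 / K_inv[index, index]


def direct_loo_variance(index):
    """Same quantity computed by refitting without the point (O(n^3))."""
    keep = np.arange(n) != index
    K_rest = K[np.ix_(keep, keep)]
    k_cross = K[keep, index]
    return K[index, index] - k_cross @ np.linalg.solve(K_rest, k_cross)


fast = np.array([fast_loo_variance(i) for i in range(n)])
direct = np.array([direct_loo_variance(i) for i in range(n)])
```

The two arrays agree to numerical precision, while the fast version avoids a fresh O(n^3) solve for each excluded point.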
-