The MultiOutputGP Class
Implementation of a multiple-output Gaussian Process Emulator.
This class provides an interface to fit a Gaussian Process Emulator to multiple targets using the same input data. The class creates all of the necessary sub-emulators from the input data and provides interfaces to the learn_hyperparameters and predict methods of the sub-emulators. Because the emulators are all fit independently, the class provides the option to use multiple processes to fit the emulators and make predictions in parallel.
The emulators are stored internally in a list. The class also stores the number of emulators n_emulators, the number of training examples n, and the number of input parameters D. These values are available externally through the get_n_emulators, get_n, and get_D methods.
Example:
>>> import numpy as np
>>> from mogp_emulator import MultiOutputGP
>>> x = np.array([[1., 2., 3.], [4., 5., 6.]])
>>> y = np.array([[4., 6.], [5., 7.]])
>>> mogp = MultiOutputGP(x, y)
>>> print(mogp)
Multi-Output Gaussian Process with:
2 emulators
2 training examples
3 input variables
>>> mogp.get_n_emulators()
2
>>> mogp.get_n()
2
>>> mogp.get_D()
3
>>> np.random.seed(47)
>>> mogp.learn_hyperparameters()
[(5.140462159403397, array([-13.02460687, -4.02939647, -39.2203646 , 3.25809653])),
(5.322783716197557, array([-18.448741 , -5.46557813, -4.81355357, 3.61091708]))]
>>> x_predict = np.array([[2., 3., 4.], [7., 8., 9.]])
>>> mogp.predict(x_predict)
(array([[4.74687618, 6.84934016],
[5.7350324 , 8.07267051]]),
array([[0.01639298, 1.05374973],
[0.01125792, 0.77568672]]),
array([[[8.91363045e-05, 7.18827798e-01, 3.74439445e-16],
[4.64005897e-06, 3.74191346e-02, 1.94917337e-17]],
[[5.58461022e-07, 2.42945502e-01, 4.66315152e-01],
[1.24593861e-07, 5.42016666e-02, 1.04035918e-01]]]))
- class mogp_emulator.MultiOutputGP.MultiOutputGP(*args)
  Implementation of a multiple-output Gaussian Process Emulator.
- __init__(*args)
  Create a new multi-output GP Emulator
  Creates a new multi-output GP Emulator from either the input data and targets to be fit or a file holding the input/targets and (optionally) learned parameter values.
  The __init__ method accepts either two or three arguments (the numpy arrays inputs and targets and, optionally, nugget, described below) or a single argument giving the filename (string or file handle) of a previously saved emulator.
  inputs is a 2D array-like object holding the input data, whose shape is n by D, where n is the number of training examples to be fit and D is the number of input variables to each simulation. Because the model assumes all outputs are drawn from the same set of simulations (i.e. the normal use case is to fit a series of computer simulations with multiple outputs from the same input), the input to each emulator is identical.
  targets is the target data to be fit by the emulator, also held in an array-like object. This can be either a 1D or 2D array, where the last dimension must have length n. If the targets array has shape (n_emulators, n), then n_emulators emulators are fit, one to each target array, while if targets has shape (n,), a single emulator is fit.
  nugget is a list or other iterable of nugget parameters for each emulator. Its length must match the number of targets to be fit. Each value must be None (adaptive noise addition) or a non-negative float, and the emulators can have different noise behaviors.
  If two or three input arguments inputs, targets, and optionally nugget are given:
  Parameters:
  - inputs (ndarray) – Numpy array holding emulator input parameters. Must be 2D with shape n by D, where n is the number of training examples and D is the number of input parameters for each output.
  - targets (ndarray) – Numpy array holding emulator targets. Must be 2D or 1D with length n in the final dimension. The first dimension is of length n_emulators (defaults to a single emulator if the input is 1D).
  - nugget – None or list or other iterable holding values for the nugget parameter for each emulator. Length must be n_emulators. Individual values can be None (adaptive noise addition) or a non-negative float. This parameter is optional and defaults to None.
  If one input argument emulator_file is given:
  Parameters:
  - emulator_file (str or file) – Filename or file object for saved emulator parameters (written by the save_emulators method).
  Returns: New MultiOutputGP instance
  Return type: MultiOutputGP
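The shape rules for inputs and targets can be illustrated with a small NumPy-only sketch. The helper name check_shapes is hypothetical and is not part of mogp_emulator. It only mirrors the conventions stated above, in particular that a 1D targets array means a single emulator.

```python
import numpy as np

def check_shapes(inputs, targets):
    """Hypothetical helper (not part of mogp_emulator) mirroring the documented shape rules."""
    inputs = np.asarray(inputs, dtype=float)
    targets = np.asarray(targets, dtype=float)
    if inputs.ndim != 2:
        raise ValueError("inputs must be 2D with shape (n, D)")
    n, D = inputs.shape
    # A 1D targets array of shape (n,) is treated as a single emulator
    if targets.ndim == 1:
        targets = targets.reshape(1, -1)
    if targets.ndim != 2 or targets.shape[1] != n:
        raise ValueError("targets must have length n in the final dimension")
    n_emulators = targets.shape[0]
    return n_emulators, n, D

x = np.array([[1., 2., 3.], [4., 5., 6.]])
print(check_shapes(x, np.array([[4., 6.], [5., 7.]])))  # (2, 2, 3): two emulators
print(check_shapes(x, np.array([4., 6.])))              # (1, 2, 3): single emulator
```

The returned tuple corresponds to the values reported by get_n_emulators, get_n, and get_D for the same data.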
- get_D()
  Returns number of inputs for each emulator
  Returns: Number of inputs for each emulator in the object
  Return type: int
- get_n()
  Returns number of training examples in each emulator
  Returns: Number of training examples in each emulator in the object
  Return type: int
- get_n_emulators()
  Returns the number of emulators
  Returns: Number of emulators in the object
  Return type: int
- get_nugget()
  Returns value of nugget for all emulators
  Returns the nugget values for all emulators as a list. Each value is either None or a non-negative float.
  Returns: Nugget values for all emulators (list of length n_emulators containing floats or None; nugget type and values can vary across emulators if desired).
  Return type: list
- learn_hyperparameters(n_tries=15, theta0=None, processes=None, method='L-BFGS-B', **kwargs)
  Fit hyperparameters for each model
  Fit the hyperparameters for each emulator. Options that can be specified include the number of different initial conditions to try during the optimization step, the initial values of the hyperparameters to use when starting the optimization step, the number of processes to use when fitting the models, and the minimization method. Since each model can be fit independently of the others, parallelization can significantly improve the speed at which the models are fit.
  Returns a list holding n_emulators tuples, each of which contains the minimum negative log-likelihood and a numpy array holding the optimal parameters found for each model.
  If the method encounters an overflow (which can occur because the stored parameter values are the logarithms of the actual hyperparameters, to enforce positivity) or a linear algebra error (which occurs when the covariance matrix cannot be inverted, even after adding a “nugget” or noise term along the diagonal), the iteration is skipped. If all attempts to find optimal hyperparameters result in an error, then the method raises an exception.
  Parameters:
  - n_tries (int) – (optional) The number of different initial conditions to try when optimizing over the hyperparameters (must be a positive integer, default = 15)
  - theta0 (ndarray or None) – (optional) Initial value of the hyperparameters to use in the optimization routine (must be array-like with a length of D + 1, where D is the number of input parameters to each model). Default is None.
  - processes (int or None) – (optional) Number of processes to use when fitting the model. Must be a positive integer or None to use the number of processors on the computer (default is None)
  - method (str) – (optional) Minimization method to be used. Can be any gradient-based optimization method available in scipy.optimize.minimize. (Default is 'L-BFGS-B')
  - **kwargs – Additional keyword arguments to be passed to the minimization routine. See the available parameters in scipy.optimize.minimize for details.
  Returns: List holding n_emulators tuples of length 2. Each tuple contains the minimum negative log-likelihood for that particular emulator and a numpy array of length D + 1 holding the corresponding hyperparameters.
  Return type: list
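The multi-start strategy described for learn_hyperparameters (several initial conditions, failed tries skipped, an exception if every try fails) can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's implementation: toy_negative_loglike is a stand-in objective, multi_start is a hypothetical helper, and using theta0 only for the first try is an assumption.

```python
import numpy as np
from scipy.optimize import minimize

def toy_negative_loglike(theta):
    """Stand-in objective (assumption): convex in theta, minimum at theta == 0."""
    with np.errstate(over="raise"):
        scale = np.exp(theta)   # log-parameterization keeps the actual values positive
    return float(np.sum(scale - theta))

def multi_start(objective, n_params, n_tries=15, theta0=None, method="L-BFGS-B", seed=0):
    """Hypothetical helper: minimize from several starting points and keep the best result."""
    rng = np.random.default_rng(seed)
    best = None
    for attempt in range(n_tries):
        if theta0 is not None and attempt == 0:
            start = np.asarray(theta0, dtype=float)   # assumption: theta0 seeds the first try
        else:
            start = rng.normal(0.0, 1.0, n_params)
        try:
            result = minimize(objective, start, method=method)
        except (FloatingPointError, np.linalg.LinAlgError):
            continue   # overflow or linear algebra error: skip this try
        if best is None or result.fun < best[0]:
            best = (float(result.fun), result.x)
    if best is None:
        raise RuntimeError("all optimization attempts failed")
    return best

# D = 3 inputs gives D + 1 = 4 hyperparameters
min_value, theta = multi_start(toy_negative_loglike, n_params=4, n_tries=5)
```

The returned pair has the same layout as one tuple in the list returned by learn_hyperparameters: the minimum objective value followed by the parameter array.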
- predict(testing, do_deriv=True, do_unc=True, processes=None)
  Make a prediction for a set of input vectors
  Makes predictions for each of the emulators on a given set of input vectors. The input vectors must be passed as a (n_predict, D) or (D,) shaped array-like object, where n_predict is the number of different prediction points under consideration and D is the number of inputs to the emulator. If the prediction inputs array has shape (D,), then the method assumes n_predict == 1. The prediction points are passed to each emulator and the predictions are collected into an (n_emulators, n_predict) shaped numpy array as the first return value from the method.
  Optionally, the emulator can also calculate the uncertainties in the predictions and the derivatives with respect to each input parameter. If the uncertainties are computed, they are returned as the second output from the method as an (n_emulators, n_predict) shaped numpy array. If the derivatives are computed, they are returned as the third output from the method as an (n_emulators, n_predict, D) shaped numpy array.
  As with the fitting, this computation can be done independently for each emulator and thus can be done in parallel.
  Parameters:
  - testing (ndarray) – Array-like object holding the points where predictions will be made. Must have shape (n_predict, D) or (D,) (for a single prediction)
  - do_deriv (bool) – (optional) Flag indicating if the derivatives are to be computed. If False the method returns None in place of the derivative array. Default value is True.
  - do_unc (bool) – (optional) Flag indicating if the uncertainties are to be computed. If False the method returns None in place of the uncertainty array. Default value is True.
  - processes (int or None) – (optional) Number of processes to use when making the predictions. Must be a positive integer or None to use the number of processors on the computer (default is None)
  Returns: Tuple of numpy arrays holding the predictions, uncertainties, and derivatives, respectively. Predictions and uncertainties have shape (n_emulators, n_predict) while the derivatives have shape (n_emulators, n_predict, D). If the do_unc or do_deriv flags are set to False, then those arrays are replaced by None.
  Return type: tuple
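The shape conventions for predict can be illustrated with a NumPy-only sketch. The two predictors below are stand-in functions chosen purely for illustration, and collect_predictions is a hypothetical helper, not part of mogp_emulator. The sketch shows only how a (D,) input is promoted to a single prediction point and how per-emulator results are stacked.

```python
import numpy as np

def collect_predictions(predictors, testing):
    """Hypothetical helper: apply each stand-in predictor and stack the results."""
    testing = np.asarray(testing, dtype=float)
    if testing.ndim == 1:
        testing = testing.reshape(1, -1)   # shape (D,) means n_predict == 1
    # Each predictor maps an (n_predict, D) array to an (n_predict,) array
    return np.array([predictor(testing) for predictor in predictors])

# Two stand-in "emulators" (assumptions for illustration only)
predictors = [lambda x: x.sum(axis=1), lambda x: x.mean(axis=1)]

batch = collect_predictions(predictors, [[2., 3., 4.], [7., 8., 9.]])
single = collect_predictions(predictors, [2., 3., 4.])
print(batch.shape)    # (2, 2), i.e. (n_emulators, n_predict)
print(single.shape)   # (2, 1)
```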
- save_emulators(filename)
  Write emulators to disk
  Method saves emulators to disk using the given filename or file handle. The (common) inputs to all emulators are saved, and all targets are collected into a single numpy array (this saves the data in the same format used in the two-argument __init__ method). If the model has been assigned parameters, either manually or by fitting, those parameters are saved as well. Once saved, the emulator can be read by passing the file name or handle to the one-argument __init__ method.
  Parameters:
  - filename (str or file) – Name of file (or file handle) to which the emulators will be saved.
  Returns: None
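The save-and-reload round trip can be sketched with NumPy alone. The on-disk format used by save_emulators is not specified here, so the use of np.savez, the array names, and the helpers save_sketch and load_sketch are all assumptions made for illustration.

```python
import io
import numpy as np

def save_sketch(file, inputs, targets, theta=None):
    """Hypothetical stand-in for the saving step (file format is an assumption)."""
    arrays = {"inputs": np.asarray(inputs), "targets": np.asarray(targets)}
    if theta is not None:
        arrays["theta"] = np.asarray(theta)   # parameters are stored only if they exist
    np.savez(file, **arrays)

def load_sketch(file):
    """Hypothetical counterpart that reads the arrays back."""
    with np.load(file) as data:
        theta = data["theta"] if "theta" in data.files else None
        return data["inputs"], data["targets"], theta

buffer = io.BytesIO()   # an in-memory file handle; a filename would also work
x = np.array([[1., 2., 3.], [4., 5., 6.]])
y = np.array([[4., 6.], [5., 7.]])
save_sketch(buffer, x, y)
buffer.seek(0)
x_loaded, y_loaded, theta_loaded = load_sketch(buffer)
```

Because no parameters were supplied, theta_loaded is None, which mirrors an emulator that was saved before fitting.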
- set_nugget(nugget)
  Sets value of nugget for all emulators
  Sets the nugget values for all emulators from a list or other iterable. Each value can be None or a non-negative float. The input list must have length n_emulators.
  Parameters:
  - nugget – List of nugget values for all emulators (must be of length n_emulators and contain floats or None; nugget type and values can vary across emulators if desired).
-