The GaussianProcess Class

Implementation of a Gaussian Process Emulator.

This class provides an interface to fit a Gaussian Process Emulator to a set of training data. The class can be initialized from either a pair of inputs/targets arrays, or a file holding data saved from a previous emulator instance (saved via the save_emulator method). Once the emulator has been created, the class provides methods for fitting optimal hyperparameters, changing hyperparameter values, making predictions, and other calculations associated with fitting and making predictions.

The internal emulator structure involves arrays for the inputs, targets, and hyperparameters. Other useful information includes the number of training examples n and the number of input parameters D; these are available externally through the get_n and get_D methods.

Example:

>>> import numpy as np
>>> from mogp_emulator import GaussianProcess
>>> x = np.array([[1., 2., 3.], [4., 5., 6.]])
>>> y = np.array([4., 6.])
>>> gp = GaussianProcess(x, y)
>>> print(gp)
Gaussian Process with 2 training examples and 3 input variables
>>> gp.get_n()
2
>>> gp.get_D()
3
>>> np.random.seed(47)
>>> gp.learn_hyperparameters()
(5.140462159403397, array([-13.02460687,  -4.02939647, -39.2203646 ,   3.25809653]))
>>> x_predict = np.array([[2., 3., 4.], [7., 8., 9.]])
>>> gp.predict(x_predict)
(array([4.74687618, 6.84934016]), array([0.01639298, 1.05374973]),
array([[8.91363045e-05, 7.18827798e-01, 3.74439445e-16],
       [4.64005897e-06, 3.74191346e-02, 1.94917337e-17]]))
class mogp_emulator.GaussianProcess.GaussianProcess(*args)

__init__(*args)

Create a new GP Emulator

Creates a new GP Emulator from either the input data and targets to be fit or a file holding the input/targets and (optionally) learned parameter values.

Arguments passed to the __init__ method must take one of three forms:

  • two arguments: numpy arrays inputs and targets, described below;
  • three arguments: the same inputs and targets arrays plus a float representing the nugget parameter; or
  • a single argument: the filename (string or file handle) of a previously saved emulator.

inputs is a 2D array-like object holding the input data, whose shape is n by D, where n is the number of training examples to be fit and D is the number of input variables to each simulation.

targets is the target data to be fit by the emulator, also held in an array-like object. This must be a 1D array of length n.

nugget is the additional noise added to the emulator targets when fitting. This can take on values None (in which case, noise will be added adaptively to stabilize fitting), or a non-negative float (in which case, a fixed noise level will be used). If no value is specified for the nugget parameter, None is the default.

If two or three input arguments inputs, targets, and optionally nugget are given:

Parameters:
  • inputs (ndarray) – Numpy array holding emulator input parameters. Must be 2D with shape n by D, where n is the number of training examples and D is the number of input parameters for each output.
  • targets (ndarray) – Numpy array holding emulator targets. Must be 1D with length n
  • nugget – Noise to be added to the diagonal or None. A float specifies the noise level explicitly, while if None is given, the noise will be set to be as small as possible to ensure stable inversion of the covariance matrix. Optional, default is None.

If one input argument emulator_file is given:

Parameters:emulator_file (str or file) – Filename or file object for emulator parameters saved using the save_emulator method
Returns:New GaussianProcess instance
Return type:GaussianProcess
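
A brief sketch of the three construction modes (the filename is hypothetical; the one-argument form requires a file previously written by save_emulator):

>>> import numpy as np
>>> from mogp_emulator import GaussianProcess
>>> x = np.array([[1., 2., 3.], [4., 5., 6.]])
>>> y = np.array([4., 6.])
>>> gp = GaussianProcess(x, y)                   # two arguments: adaptive nugget (None)
>>> gp_fixed = GaussianProcess(x, y, 1.e-12)     # three arguments: fixed nugget
>>> gp_saved = GaussianProcess("emulator.npz")   # one argument: previously saved emulator
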
compute_local_covariance()

Estimate local covariance matrix around the MLE parameters

This method inverts the Hessian matrix to get the local covariance matrix around the MLE parameters. Note that if the MLE parameters have not been estimated, they will be found first prior to inverting the Hessian. The local Hessian should be positive definite if the MLE parameters are at a local minimum of the negative log-likelihood, so if the routine encounters a non-positive definite matrix it will raise an error. Returns the inverse of the Hessian matrix evaluated at the MLE parameter values.

Returns:Inverse of the Hessian matrix evaluated at the MLE parameter values. This is a 2D array with shape (D + 1, D + 1).
Return type:ndarray
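
For example, continuing from a fitted emulator (the covariance entries depend on the data, so only the shape is shown):

>>> min_nll, theta = gp.learn_hyperparameters()
>>> cov = gp.compute_local_covariance()
>>> cov.shape     # (D + 1, D + 1); here D = 3
(4, 4)
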
get_D()

Returns number of inputs for the emulator

Returns:Number of inputs for the emulator object
Return type:int
get_n()

Returns number of training examples for the emulator

Returns:Number of training examples for the emulator object
Return type:int
get_nugget()

Returns emulator nugget parameter

Returns current value of the nugget parameter. If the nugget is selected adaptively, returns None.

Returns:Current nugget value, either a float or None
Return type:float or None
get_params()

Returns emulator parameters

Returns current parameters for the emulator as a numpy array if they have been fit. If no parameters have been fit, returns None.

Returns:Current parameter values (numpy array of length D + 1), or None if the parameters have not been fit.
Return type:ndarray or None
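
For a newly created emulator, both this accessor and get_nugget reflect the initial state, since the default nugget is None and no hyperparameters have been fit:

>>> gp = GaussianProcess(x, y)
>>> print(gp.get_nugget())
None
>>> print(gp.get_params())
None
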
hessian(theta)

Calculate the Hessian of the negative log-likelihood

Calculate the Hessian of the negative log-likelihood with respect to the hyperparameters. Note that this function is normally used only when fitting the hyperparameters, and it is not needed to make predictions. It is also used to estimate an appropriate step size when fitting hyperparameters using the lognormal approximation or MCMC sampling.

When used in an optimization routine, the hessian method is called after evaluating the loglikelihood method. The implementation takes advantage of this by storing the inverse of the covariance matrix, which is expensive to compute and is used by the loglikelihood and partial_devs methods as well. If the function is evaluated with a different set of parameters than was previously used to set the log-likelihood, the method calls _set_params to compute the needed information. However, calling hessian does not evaluate the log-likelihood, so it does not change the cached values of the parameters or log-likelihood.

Parameters:theta (ndarray) – Value of the hyperparameters. Must be array-like with shape (D + 1,)
Returns:Hessian of the negative log-likelihood (array with shape (D + 1, D + 1))
Return type:ndarray
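
A short sketch evaluating the Hessian at the fitted hyperparameters (the entries depend on the data, so only the shape is shown):

>>> min_nll, theta = gp.learn_hyperparameters()
>>> H = gp.hessian(theta)
>>> H.shape       # (D + 1, D + 1); here D = 3
(4, 4)
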
learn_hyperparameters(n_tries=15, theta0=None, method='L-BFGS-B', **kwargs)

Fit hyperparameters by attempting to minimize the negative log-likelihood

Fits the hyperparameters by attempting to minimize the negative log-likelihood multiple times from a given starting location and using a particular minimization method. The best result found among all of the attempts is returned, unless all attempts to fit the parameters result in an error (see below).

If the method encounters an overflow (which can occur because the stored parameter values are the logarithm of the actual hyperparameters, to enforce positivity) or a linear algebra error (which occurs when the covariance matrix cannot be inverted, even with additional noise added along the diagonal if adaptive noise was selected by setting the nugget parameter to None), the iteration is skipped. If all attempts to find optimal hyperparameters result in an error, then the method raises an exception.

The theta0 parameter is the point at which the first iteration will start. If more than one attempt is made, subsequent attempts will use random starting points.

The user can specify the details of the minimization method, using any of the gradient-based optimizers available in scipy.optimize.minimize. Any additional parameters beyond the method specification can be passed as keyword arguments.

The method returns the minimum negative log-likelihood found and the parameter values at which that minimum was obtained. The method also sets the current values of the hyperparameters to these optimal values and pre-computes the matrices needed to make predictions.

Parameters:
  • n_tries (int) – Number of attempts to minimize the negative log-likelihood function. Must be a positive integer (optional, default is 15)
  • theta0 (None or ndarray) – Initial starting point for the first iteration. If present, must be array-like with shape (D + 1,). If None is given, then a random value is chosen. (Default is None)
  • method (str) – Minimization method to be used. Can be any gradient-based optimization method available in scipy.optimize.minimize. (Default is 'L-BFGS-B')
  • **kwargs – Additional keyword arguments to be passed to the minimization routine. see available parameters in scipy.optimize.minimize for details.
Returns:

Minimum negative log-likelihood values and hyperparameters (numpy array with shape (D + 1,)) used to obtain those values. The method also sets the current values of the hyperparameters to these optimal values and pre-computes the matrices needed to make predictions.

Return type:

tuple containing a float and an ndarray
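
For example, controlling the optimizer through keyword arguments forwarded to scipy.optimize.minimize (the option values here are purely illustrative):

>>> min_nll, theta = gp.learn_hyperparameters(n_tries=5, method='L-BFGS-B',
...                                           options={'maxiter': 200})
>>> theta.shape   # (D + 1,); here D = 3
(4,)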

learn_hyperparameters_MCMC(n_samples=1000, thin=0)

Sample hyperparameters via MCMC estimation

Sample hyperparameters via MCMC estimation. Parameters are found by doing a random walk in parameter space, choosing new points via the Metropolis-Hastings algorithm. Steps are drawn from a multivariate normal distribution around the current parameters, and the steps are accepted and rejected based on the marginal log-likelihood function. The chain is started from the MLE parameter values, and the step sizes are estimated by inverting the local Hessian at the MLE solution. Because of this, the MCMC chain does not require a “burn-in” phase. Optional parameters specify the number of MCMC steps to take (must be a positive integer, default is 1000) and information about how to thin the MCMC chain to obtain uncorrelated samples.

Thinning may be specified with a non-negative integer. If a positive integer is given, the chain will be thinned by keeping only every thin-th step. Note that thin = 1 means that the chain will not be thinned. If thin = 0 is given (the default value), the chain will automatically be thinned by computing the autocorrelation of the chain for each parameter separately and estimating the value needed to eliminate correlations in the chain. If the autothinning method fails (which usually occurs if the posterior is multimodal), the chain will not be thinned and a warning will be given. More details on the autothinning procedure are described in the corresponding function.

Does not return a value, but sets the samples class attribute to a 2D array with shape (n_chain, D + 1), where the first dimension indicates the different samples and the second dimension specifies the different hyperparameters. Note that n_chain will only be the same as n_samples if thin = 1 is specified or if autothinning fails. If you wish to obtain a specific number of samples in the thinned chain, you will need to modify n_samples and thin appropriately.

Note that at present, the return information from the MCMC sampler is not returned or cached. The code does give a warning if a problem arises, in particular if the acceptance rate is not within the target range of 20% to 60% or if the final MCMC chain has a first-lag autocorrelation that indicates samples may not be independent. If either of these warnings occurs, the MCMC chain may require further inspection. At the moment, this can only be done by manually re-running the MCMC sampling using the function sample_MCMC in the MCMC submodule.

Parameters:
  • n_samples (int) – Number of MCMC steps to be taken. Must be a positive integer.
  • thin (int) – Specifies how the chain is thinned to remove correlations. Must be a non-negative integer. If a positive integer k is used, it will keep every k samples. Note that thin = 1 indicates that the chain will not be thinned. thin = 0 will attempt to autothin the chain using the autocorrelation of the MCMC chain. Default is 0.
Returns:

None
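
A minimal sketch (the length of the thinned chain, n_chain, depends on the estimated autocorrelation, so only the number of hyperparameters is shown):

>>> gp.learn_hyperparameters_MCMC(n_samples=2000, thin=0)
>>> gp.samples.shape[1]   # second dimension is always D + 1; here D = 3
4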

learn_hyperparameters_MLE(n_tries=15, theta0=None, method='L-BFGS-B', **kwargs)

Fit hyperparameters by attempting to minimize the negative log-likelihood

This method is an alias for learn_hyperparameters to distinguish it from other methods for estimating hyperparameters.

See learn_hyperparameters for a full description of the fitting procedure, error handling, and the roles of the theta0 and method arguments; the behavior of the two methods is identical.

Parameters:
  • n_tries (int) – Number of attempts to minimize the negative log-likelihood function. Must be a positive integer (optional, default is 15)
  • theta0 (None or ndarray) – Initial starting point for the first iteration. If present, must be array-like with shape (D + 1,). If None is given, then a random value is chosen. (Default is None)
  • method (str) – Minimization method to be used. Can be any gradient-based optimization method available in scipy.optimize.minimize. (Default is 'L-BFGS-B')
  • **kwargs – Additional keyword arguments to be passed to the minimization routine. see available parameters in scipy.optimize.minimize for details.
Returns:

Minimum negative log-likelihood values and hyperparameters (numpy array with shape (D + 1,)) used to obtain those values. The method also sets the current values of the hyperparameters to these optimal values and pre-computes the matrices needed to make predictions.

Return type:

tuple containing a float and an ndarray

learn_hyperparameters_normalapprox(n_samples=1000)

Sample hyperparameters via a normal approximation around MLE solution

Sample hyperparameters via a multivariate normal approximation around the MLE parameters. This method first obtains an MLE estimate of the hyperparameters, and then draws samples assuming the posterior follows an approximate normal distribution around the MLE value. This is a reasonable approximation for most unimodal posterior distributions, and it is computationally much cheaper than using full MCMC estimation. The local covariance matrix is found by inverting the Hessian around the MLE parameters, and then samples are generated using a multivariate normal distribution.

Does not return a value, but sets the samples class attribute to a 2D array with shape (n_samples, D + 1), where the first dimension indicates the different samples and the second dimension specifies the different hyperparameters.

Parameters:n_samples (int) – Number of samples to be drawn. Must be a positive integer.
Returns:None
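
For example (the sampled values depend on the data, so only the shape is shown):

>>> gp.learn_hyperparameters_normalapprox(n_samples=500)
>>> gp.samples.shape      # (n_samples, D + 1); here D = 3
(500, 4)
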
loglikelihood(theta)

Calculate the negative log-likelihood at a particular value of the hyperparameters

Calculate the negative log-likelihood for the given set of parameters. Calling this method sets the parameter values and computes the needed inverse matrices in order to evaluate the log-likelihood and its derivatives. In addition to returning the log-likelihood value, it stores the current value of the hyperparameters and log-likelihood in attributes of the object.

Parameters:theta (ndarray) – Value of the hyperparameters. Must be array-like with shape (D + 1,)
Returns:negative log-likelihood
Return type:float
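
A short sketch (the zero vector is an arbitrary illustrative choice of hyperparameters; the returned value depends on the data):

>>> theta = np.zeros(gp.get_D() + 1)
>>> nll = gp.loglikelihood(theta)   # float; also caches theta and the covariance inverse
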
partial_devs(theta)

Calculate the partial derivatives of the negative log-likelihood

Calculate the partial derivatives of the negative log-likelihood with respect to the hyperparameters. Note that this function is normally used only when fitting the hyperparameters, and it is not needed to make predictions.

During normal use, the partial_devs method is called after evaluating the loglikelihood method. The implementation takes advantage of this by storing the inverse of the covariance matrix, which is expensive to compute and is used by the loglikelihood, partial_devs, and hessian methods. If the function is evaluated with a different set of parameters than was previously used to set the log-likelihood, the method calls _set_params to compute the needed information. However, calling partial_devs does not evaluate the log-likelihood, so it does not change the cached values of the parameters or log-likelihood.

Parameters:theta (ndarray) – Value of the hyperparameters. Must be array-like with shape (D + 1,)
Returns:partial derivatives of the negative log-likelihood (array with shape (D + 1,))
Return type:ndarray
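
For example, evaluating the gradient at the same point used for the log-likelihood so that the cached covariance inverse is reused:

>>> theta = np.zeros(gp.get_D() + 1)   # arbitrary illustrative hyperparameters
>>> nll = gp.loglikelihood(theta)      # sets the parameters and caches the inverse
>>> grad = gp.partial_devs(theta)
>>> grad.shape    # (D + 1,); here D = 3
(4,)
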
predict(testing, do_deriv=True, do_unc=True, predict_from_samples=False)

Make a prediction for a set of input vectors

Makes predictions for the emulator on a given set of input vectors. The input vectors must be passed as a (n_predict, D) or (D,) shaped array-like object, where n_predict is the number of different prediction points under consideration and D is the number of inputs to the emulator. If the prediction inputs array has shape (D,), then the method assumes n_predict == 1. The prediction is returned as an (n_predict,) shaped numpy array as the first return value from the method.

Optionally, the emulator can also calculate the variances in the predictions and the derivatives with respect to each input parameter. If the uncertainties are computed, they are returned as the second output from the method as an (n_predict,) shaped numpy array. If the derivatives are computed, they are returned as the third output from the method as an (n_predict, D) shaped numpy array.

If predictions based on samples of the hyperparameters (drawn by either assuming a normal posterior or using MCMC sampling) are desired, hyperparameter samples must have been drawn via the learn_hyperparameters_normalapprox or learn_hyperparameters_MCMC methods. This is controlled by setting predict_from_samples = True. If samples have not been drawn, predictions fall back on using the MLE parameters as a single set of parameters. Default behavior is to use a single set of parameters. Note that the code does not save the inverted covariance matrix for each set of hyperparameter samples, so these predictions can be expensive for large numbers of samples or inputs, as the matrix inverse must be computed for each hyperparameter sample. Predictions from a single set of parameters use the cached matrix inverse, so they are much more efficient.

If predictions from a single set of parameters are desired, and the GP does not have a current set of parameters, the code raises an error. If the code does have a current set of parameters but the MLE parameters have not been estimated, it gives a warning but continues with the predictions using the current parameters. If it has current parameters as well as MLE parameters but they differ, the code issues a warning but continues with the predictions using the current parameters.

Parameters:
  • testing (ndarray) – Array-like object holding the points where predictions will be made. Must have shape (n_predict, D) or (D,) (for a single prediction)
  • do_deriv (bool) – (optional) Flag indicating if the derivatives are to be computed. If False the method returns None in place of the derivative array. Default value is True.
  • do_unc (bool) – (optional) Flag indicating if the uncertainties are to be computed. If False the method returns None in place of the uncertainty array. Default value is True.
  • predict_from_samples (bool) – (optional) Flag indicating if predictions are to be made from samples. Default is False
Returns:

Tuple of numpy arrays holding the predictions, uncertainties, and derivatives, respectively. Predictions and uncertainties have shape (n_predict,) while the derivatives have shape (n_predict, D). If the do_unc or do_deriv flags are set to False, then those arrays are replaced by None.

Return type:

tuple
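
A sketch of the different calling patterns (the predicted values depend on the fitted emulator):

>>> x_new = np.array([[2., 3., 4.], [7., 8., 9.]])
>>> mean, unc, deriv = gp.predict(x_new)              # mean (2,), unc (2,), deriv (2, 3)
>>> mean_only, _, _ = gp.predict(x_new, do_unc=False, do_deriv=False)
>>> m1, u1, d1 = gp.predict(np.array([2., 3., 4.]))   # (D,) input treated as n_predict == 1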

save_emulator(filename)

Write emulators to disk

Method saves the emulator to disk using the given filename or file handle. The method writes the inputs and targets arrays to file. If the model has been assigned parameters, either manually or by fitting, those parameters are saved as well. Once saved, the emulator can be read by passing the file name or handle to the one-argument __init__ method.

Parameters:filename (str or file) – Name of file (or file handle) to which the emulator will be saved.
Returns:None
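
For example, saving after fitting so that the learned parameters are stored alongside the training data (the filename is hypothetical):

>>> min_nll, theta = gp.learn_hyperparameters()
>>> gp.save_emulator("fitted_emulator.npz")
>>> gp_restored = GaussianProcess("fitted_emulator.npz")
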
set_nugget(nugget)

Set the nugget parameter for the emulator

Method for changing the nugget parameter for the emulator. When a new emulator is initialized, this is set to None.

The nugget parameter controls how noise is added to the covariance matrix in order to stabilize the inversion or smooth the emulator predictions. If nugget is a non-negative float, then that particular value is used for the nugget. Note that setting this parameter to be zero enforces that the emulator strictly interpolates between points. Alternatively, if nugget is set to be None, the fitting routine will adaptively make the noise parameter as large as is needed to ensure that the emulator can be fit.

Parameters:nugget (None or float) – Controls how noise is added to the emulator. If nugget is a non-negative float, then this manually sets the noise parameter (if negative, this will lead to an error), with nugget = 0 resulting in interpolation with no smoothing noise added. nugget = None will adaptively select the smallest value of the noise term that still leads to a stable inversion of the matrix. Default behavior is nugget = None.
Returns:None
Return type:None
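
For example (the noise level is illustrative):

>>> gp.set_nugget(1.e-8)    # fixed noise added to the diagonal
>>> gp.get_nugget()
1e-08
>>> gp.set_nugget(None)     # revert to adaptive noise selection
>>> print(gp.get_nugget())
None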