The fitting Module
mogp_emulator.fitting.fit_GP_MAP(*args, n_tries=15, theta0=None, method='L-BFGS-B', skip_failures=True, refit=False, **kwargs)

Fit one or more Gaussian Processes by attempting to minimize the negative log-posterior
Fits the hyperparameters of one or more Gaussian Processes by attempting to minimize the negative log-posterior multiple times from a given starting location and using a particular minimization method. The best result found among all of the attempts is returned, unless all attempts to fit the parameters result in an error (see below).
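The multi-start strategy described above can be sketched with scipy.optimize.minimize directly. This is a minimal illustration, not the library's implementation: the objective tilted_double_well, the helper multi_start_minimize, and the normal distribution used for random starts are all assumptions made for the example (the real routine minimizes a GP's negative log-posterior and draws random starts from the hyperparameter priors).

```python
import numpy as np
from scipy.optimize import minimize


def tilted_double_well(theta):
    """Toy objective with several local minima (stand-in for a negative log-posterior)."""
    theta = np.asarray(theta, dtype=float)
    return float(np.sum((theta ** 2 - 1.0) ** 2) + 0.3 * np.sum(theta))


def multi_start_minimize(objective, n_params, n_tries=15, theta0=None,
                         method="L-BFGS-B", seed=0):
    """Minimize from several starting points and keep the best result.

    Hypothetical helper: the first attempt starts at theta0 (if given),
    later attempts start at random points. Attempts that raise a numerical
    error are skipped; if every attempt fails, a RuntimeError is raised.
    """
    rng = np.random.default_rng(seed)
    best = None
    n_failed = 0
    for attempt in range(n_tries):
        if attempt == 0 and theta0 is not None:
            start = np.asarray(theta0, dtype=float)
        else:
            # Assumption: random starts from a normal distribution.
            start = rng.normal(scale=2.0, size=n_params)
        try:
            result = minimize(objective, start, method=method)
        except (FloatingPointError, np.linalg.LinAlgError):
            n_failed += 1  # skip this attempt and move on
            continue
        if best is None or result.fun < best.fun:
            best = result
    if best is None:
        raise RuntimeError("all minimization attempts failed")
    return best, n_failed


best, n_failed = multi_start_minimize(tilted_double_well, n_params=2,
                                      theta0=[1.0, 1.0])
print(best.x, best.fun, n_failed)
```

Starting only from theta0 = [1.0, 1.0] would typically end in the nearest local minimum; the additional random starts give the search a chance to reach a lower one, which is the motivation for n_tries being greater than one.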
The arguments to the method can either be an existing GaussianProcess or MultiOutputGP instance, or a list of arguments to be passed to the __init__ method of GaussianProcess, or of MultiOutputGP if more than one output is detected. Keyword arguments for creating a new GaussianProcess or MultiOutputGP object can be passed either as part of the *args list or as keywords (if present in **kwargs, they will be extracted and passed separately to the __init__ method).

If the method encounters an overflow (this can occur because the stored parameter values are the logarithms of the actual hyperparameters, to enforce positivity) or a linear algebra error (this occurs when the covariance matrix cannot be inverted, even with additional noise added along the diagonal if adaptive noise was selected), the iteration is skipped. If all attempts to find optimal hyperparameters result in an error, then the emulator is skipped and the parameters are reset to None. By default, a warning will be printed to the console if this occurs for MultiOutputGP fitting, while an error will be raised for GaussianProcess fitting. This behavior can be changed to raise an error for MultiOutputGP fitting by passing the kwarg skip_failures=False. This default behavior is chosen because MultiOutputGP fitting is often done in a situation where human review of all fit emulators is not possible, so the fitting routine skips over failures and then flags those that failed for further review.

For MultiOutputGP fitting, by default the routine assumes that only GPs that do not currently have fit hyperparameters need to be fit. This behavior is controlled by the refit keyword, which is False by default. To fit all emulators regardless of their current fitting status, pass refit=True. The refit argument has no effect on fitting of single GaussianProcess objects: standard GaussianProcess objects will be fit regardless of the current value of the hyperparameters.

The theta0 parameter is the point at which the first iteration will start. If more than one attempt is made, subsequent attempts will use random starting points. If you are fitting multiple outputs, then this argument can take any of the following forms: (1) None (random start points for all emulators, which are drawn from the prior distribution for each fit parameter), (2) a list of numpy arrays or NoneTypes with length n_emulators, (3) a numpy array of shape (n_params,) or (n_emulators, n_params), which will use either the same start point for all emulators or the specified start point for each emulator, respectively. Note that if you use a numpy array, all emulators must have the same number of parameters, while using a list allows more flexibility.

The user can specify the details of the minimization method, using any of the gradient-based optimizers available in scipy.optimize.minimize. Any additional parameters beyond the method specification can be passed as keyword arguments.

The function returns a fit GaussianProcess or MultiOutputGP instance, either the original one passed to the function, or the new one created from the included arguments.

Parameters:
- *args – Either a single GaussianProcess or MultiOutputGP instance, or arguments to be passed to the __init__ method when creating a new GaussianProcess or MultiOutputGP instance.
- n_tries (int) – Number of attempts to minimize the negative log-posterior function. Must be a positive integer (optional, default is 15)
- theta0 (None or ndarray) – Initial starting point for the first iteration. If present, must be array-like with shape (n_params,) based on the specific GaussianProcess being fit. If a MultiOutputGP is being fit, it must be a list of length n_emulators with each entry as either None or a numpy array of shape (n_params,), or a numpy array with shape (n_emulators, n_params) (note that if the various emulators have different numbers of parameters, the numpy array option will not work). If None is given, then a random value is chosen. (Default is None)
- method (str) – Minimization method to be used. Can be any gradient-based optimization method available in scipy.optimize.minimize. (Default is 'L-BFGS-B')
- skip_failures (bool) – Boolean controlling how to handle failures in MultiOutputGP fitting. If set to True, emulator fits will fail silently without raising an error and provide information on the emulators that failed at the end of fitting. If False, any failed fit will raise a RuntimeError. Has no effect on fitting a single GaussianProcess, which will always raise an error. Optional, default is True.
- refit (bool) – Boolean indicating if previously fit emulators for MultiOutputGP objects should be fit again. Optional, default is False. Has no effect on GaussianProcess fitting, which will be fit irrespective of the current hyperparameter values.
- **kwargs – Additional keyword arguments to be passed to GaussianProcess.__init__, MultiOutputGP.__init__, or the minimization routine. Relevant parameters for the GP classes are automatically split out from those used in the minimization function. See available parameters in the corresponding functions for details.
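The accepted forms of theta0 for multi-output fitting can be made concrete with a small sketch that expands each form into one starting point per emulator. The helper expand_theta0 is hypothetical and written only for illustration; it is not part of mogp_emulator, and it assumes that any Python list is meant as a per-emulator list.

```python
import numpy as np


def expand_theta0(theta0, n_emulators):
    """Return a list with one start point per emulator (None means random start).

    Hypothetical helper illustrating the documented theta0 forms.
    """
    if theta0 is None:
        # Form (1): random start for every emulator.
        return [None] * n_emulators
    if isinstance(theta0, (list, tuple)):
        # Form (2): one entry per emulator, each None or an array.
        if len(theta0) != n_emulators:
            raise ValueError("list theta0 must have length n_emulators")
        return [None if t is None else np.asarray(t, dtype=float) for t in theta0]
    theta0 = np.asarray(theta0, dtype=float)
    if theta0.ndim == 1:
        # Form (3a): shape (n_params,), same start point for all emulators.
        return [theta0.copy() for _ in range(n_emulators)]
    if theta0.ndim == 2 and theta0.shape[0] == n_emulators:
        # Form (3b): shape (n_emulators, n_params), one row per emulator.
        return [row.copy() for row in theta0]
    raise ValueError("theta0 has an unsupported shape")


shared = expand_theta0(np.array([0.0, 1.0]), n_emulators=2)
per_emulator = expand_theta0(np.arange(6.0).reshape(3, 2), n_emulators=3)
mixed = expand_theta0([None, np.zeros(3)], n_emulators=2)
print(shared, per_emulator, mixed)
```

The list form is the only one in this sketch that allows emulators with different numbers of parameters, which matches the note that a numpy array requires every emulator to share the same n_params.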
Returns: Fit GP or Multi-Output GP instance
Return type: GaussianProcess or MultiOutputGP or GaussianProcessGPU or MultiOutputGP_GPU
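The **kwargs entry above says that parameters relevant to the GP classes are split out from those meant for the minimizer. One way such a split could be done is by inspecting the constructor signature, as sketched below. This is an assumption about a possible approach, not a description of how the library does it; ToyGP and split_kwargs are hypothetical names, and the ToyGP constructor arguments are placeholders.

```python
import inspect


class ToyGP:
    """Hypothetical stand-in for a GP class (not mogp_emulator.GaussianProcess)."""

    def __init__(self, inputs, targets, kernel="SquaredExponential", nugget="adaptive"):
        self.inputs = inputs
        self.targets = targets
        self.kernel = kernel
        self.nugget = nugget


def split_kwargs(cls, kwargs):
    """Split kwargs into those accepted by cls.__init__ and everything else."""
    keyword_kinds = (inspect.Parameter.POSITIONAL_OR_KEYWORD,
                     inspect.Parameter.KEYWORD_ONLY)
    init_names = {
        name
        for name, param in inspect.signature(cls.__init__).parameters.items()
        if name != "self" and param.kind in keyword_kinds
    }
    init_kwargs = {k: v for k, v in kwargs.items() if k in init_names}
    other_kwargs = {k: v for k, v in kwargs.items() if k not in init_names}
    return init_kwargs, other_kwargs


# "kernel" belongs to the toy constructor; "options" is left for the minimizer.
init_kwargs, minimizer_kwargs = split_kwargs(
    ToyGP, {"kernel": "Matern52", "options": {"maxiter": 200}}
)
print(init_kwargs, minimizer_kwargs)
```

Here options is a keyword accepted by scipy.optimize.minimize, so in this sketch it would be forwarded to the minimization call while kernel would go to the constructor.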