Alternatives: Emulator prior mean function

Overview

Building an emulator of a simulator involves first specifying prior beliefs about the simulator and then updating these beliefs using a training sample of simulator runs. The prior may be specified either through the fully Bayesian approach, in the form of a Gaussian process, or through the Bayes linear approach, in the form of first- and second-order moments. The basics of building an emulator using these two approaches are set out in the two core threads: the thread for the analysis of the core model using Gaussian process methods (ThreadCoreGP) and the thread for the Bayes linear emulation for the core model (ThreadCoreBL).

In either approach it is necessary to specify a mean function and a covariance function. We consider here the alternative forms of mean function that are dealt with in the MUCM toolkit. The extension to a vector mean function, as required by the thread for the analysis of a simulator with multiple outputs using Gaussian process methods (ThreadVariantMultipleOutputs), can be found in a companion page dealing with alternatives for multi-output mean functions (AltMeanFunctionMultivariate).

Choosing the Alternatives

The mean function gives the prior expectation for the simulator output at any given set of input values. We assume here that only one output is of interest, as in the core problem.

In general, the mean function will be specified in a form that depends on a number of hyperparameters. We denote the vector of hyperparameters for the mean function by \(\beta\) and the mean function itself by \(m(\cdot)\), so that \(m(x)\) is the prior expectation of the simulator output for vector \(x\) of input values. The dependence of \(m(\cdot)\) on \(\beta\) is left implicit in this notation.

In principle, this should entail the analyst thinking about what simulator output would be expected for every separate possible input vector \(x\). In practice, of course, this is not possible. Instead, \(m(\cdot)\) represents the general shape of how the analyst expects the simulator output to respond to changes in the inputs. The use of the unknown hyperparameters allows the emulator to learn their values from the training sample data. So the key task in specifying the mean function is to think generally about how the output will respond to the inputs.

Having specified \(m(\cdot)\), the subsequent steps involved in building and using the emulator are described in ThreadCoreGP / ThreadCoreBL.

The Nature of the Alternatives

The linear form

It is usual, and convenient in terms of subsequent building and use of the emulator, to specify a mean function of the form:

\[m(x) = \beta^T h(x)\]

where \(h(\cdot)\) is a vector of (known) functions of \(x\), known as basis functions. This is called the linear form of mean function because it is linear in the hyperparameters \(\beta\) and so corresponds to the general linear regression model in statistical analysis. When the mean function is specified to have the linear form, it becomes possible to carry out subsequent analyses more simply. The number of elements of the vector \(h(\cdot)\) will be denoted by \(q\).

There remains the choice of \(h(\cdot)\). We illustrate the flexibility of the linear form first through some simple cases.

The simplest case is when \(q=1\) and \(h(x)=1\) for all \(x\). Then the mean function is \(m(x) = \beta\), where now \(\beta\) is a scalar hyperparameter representing an unknown overall mean for the simulator output. This choice expresses no prior knowledge about how the output will respond to variation in the inputs.
Another simple instance is when \(h(x)^T=(1,x)\), so that \(q=1+p\), where \(p\) is the number of inputs. Then \(m(x)=\beta_1 + \beta_2 x_1 + \ldots + \beta_{1+p}x_p\), which expresses a prior expectation that the simulator output will show a trend in response to each of the inputs, but there is no prior information to suggest any specific nonlinearity in those trends.
Where there is prior belief in nonlinearity of response, then quadratic or higher polynomial terms might be introduced into \(h(\cdot)\).
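
The simple cases above can be sketched in Python with NumPy. This is a minimal illustration, not toolkit code: the function names (`h_constant`, `h_linear`, `h_quadratic`, `mean_linear_form`), the choice of pure squared terms for the quadratic case, and the numerical values of \(\beta\) are all assumptions made for the example.

```python
import numpy as np

def h_constant(x):
    """q = 1: h(x) = 1 for all x (the input is ignored)."""
    return np.ones(1)

def h_linear(x):
    """q = 1 + p: h(x)^T = (1, x_1, ..., x_p)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    return np.concatenate(([1.0], x))

def h_quadratic(x):
    """q = 1 + 2p: linear terms plus pure squares (an assumed choice, no cross terms)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    return np.concatenate(([1.0], x, x**2))

def mean_linear_form(x, beta, h):
    """Evaluate m(x) = beta^T h(x) for a given basis function vector h."""
    basis = h(x)
    beta = np.atleast_1d(np.asarray(beta, dtype=float))
    if beta.shape != basis.shape:
        raise ValueError("beta must have q = len(h(x)) elements")
    return float(beta @ basis)

# Hypothetical input with p = 2 and hypothetical hyperparameter values.
x = [0.5, 2.0]
m_const = mean_linear_form(x, [3.0], h_constant)
m_lin = mean_linear_form(x, [1.0, 2.0, -1.0], h_linear)
m_quad = mean_linear_form(x, [1.0, 2.0, -1.0, 4.0, 0.5], h_quadratic)
```

In each case only \(h(\cdot)\) changes; the evaluation \(\beta^T h(x)\) is the same, which is what makes the linear form convenient.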

In principle, all of the kinds of linear regression models that are used by statisticians are available for expressing prior expectations about the simulator. Some further discussion of the choice of basis functions is given in the alternatives page for basis functions for the emulator mean (AltBasisFunctions) and the discussion page on the use of a structured mean function (DiscStructuredMeanFunction).

Other forms of mean function

Where prior information suggests that the simulator will respond to variation in its inputs in ways that are not captured by a regression form, then it is possible to specify any other mean function.

For example,

\[m(x) = \beta_1 / (1+\beta_2 x_1) + \exp\left(\beta_3 x_2\right)\]

expresses a belief that as the first input, \(x_1\), increases the simulator output will flatten out in the way specified in the first term, while as \(x_2\) increases the output will increase (or decrease if \(\beta_3 < 0\)) exponentially. Such a mean function might be used where the prior information about the simulator is suitably strong but cannot be cast as a regression form. The cost is that the analysis (as required for building the emulator and using it for tasks such as uncertainty analysis) becomes more complex.
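
To make the contrast with the linear form concrete, the example mean function can be coded directly. The sketch below is illustrative only: the function name `mean_nonlinear` and the values of \(\beta\) are hypothetical. It also checks numerically that doubling \(\beta\) does not double \(m(x)\), which is one way of seeing that this function cannot be written as \(\beta^T h(x)\).

```python
import numpy as np

def mean_nonlinear(x, beta):
    """m(x) = beta_1 / (1 + beta_2 * x_1) + exp(beta_3 * x_2)."""
    x1, x2 = x
    b1, b2, b3 = beta
    return b1 / (1.0 + b2 * x1) + np.exp(b3 * x2)

beta = np.array([2.0, 1.0, 0.5])  # hypothetical hyperparameter values

m_origin = mean_nonlinear((0.0, 0.0), beta)  # 2/1 + exp(0)
m_point = mean_nonlinear((1.0, 0.0), beta)   # 2/2 + exp(0)

# A mean function of the linear form would satisfy m(x; 2*beta) = 2*m(x; beta).
m_doubled = mean_nonlinear((1.0, 0.0), 2.0 * beta)
is_linear_in_beta = np.isclose(m_doubled, 2.0 * m_point)
```

Here `is_linear_in_beta` evaluates to false, since \(\beta_2\) and \(\beta_3\) enter through a reciprocal and an exponential rather than as regression coefficients.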

Mean functions appropriate for the multivariate output setting are discussed in AltMeanFunctionMultivariate.

Additional Comments, References, and Links

It is important to recognise that the emulator specification does not say that the simulator will respond to its inputs in exactly the way expressed in the mean function. The Gaussian process, or its Bayes linear analogue, will allow the actual simulator output to take any form at all, and given enough training data will adapt to the true form regardless of what is specified in the prior mean. However, the emulator will perform better the more accurately the mean function reflects the actual behaviour of the simulator.
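
The following sketch illustrates this point with standard Gaussian process conditioning in one input dimension. It is not the procedure of the core threads: it assumes a Gaussian (squared exponential) correlation function with a fixed correlation length, treats the prior mean as fully known rather than depending on unknown \(\beta\), and uses a toy "simulator" \(\sin(2\pi x)\). All names and values are hypothetical.

```python
import numpy as np

def corr(x1, x2, length=0.3):
    """Gaussian correlation exp(-((x - x') / length)^2) between 1-D point sets."""
    d2 = (x1[:, None] - x2[None, :]) ** 2
    return np.exp(-d2 / length**2)

def posterior_mean(x_new, X, y, prior_mean, length=0.3, nugget=1e-10):
    """Prior mean plus a correction driven by the training residuals."""
    A = corr(X, X, length) + nugget * np.eye(len(X))  # nugget for numerical stability
    t = corr(x_new, X, length)
    return prior_mean(x_new) + t @ np.linalg.solve(A, y - prior_mean(X))

# Toy training sample (assumed for illustration).
X = np.linspace(0.0, 1.0, 6)
y = np.sin(2.0 * np.pi * X)

flat = lambda x: np.zeros_like(x)   # prior mean with no trend
trend = lambda x: 1.0 - 2.0 * x     # prior mean with a (wrong) linear trend

pm_flat = posterior_mean(X, X, y, flat)
pm_trend = posterior_mean(X, X, y, trend)
```

Both `pm_flat` and `pm_trend` reproduce the training outputs `y` at the training inputs, whatever prior mean was chosen. Away from the training points the two emulators differ, and that is where a well-chosen mean function helps.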

As already discussed, the form of the mean function specifies the shape that we expect the output to follow as the inputs are varied, with the hyperparameters \(\beta\) being estimated from the training data to identify the mean function fully. A fully Bayesian analysis will require a prior distribution to be specified for \(\beta\), while a Bayes linear analysis will require a slightly different form of prior information. This step is addressed in the appropriate core thread, ThreadCoreGP or ThreadCoreBL.
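
As one illustration of how \(\beta\) might be learned when the mean function has the linear form, the sketch below computes a generalised least squares estimate \(\hat\beta = (H^T A^{-1} H)^{-1} H^T A^{-1} y\), where \(H\) stacks the vectors \(h(x)^T\) for the training inputs and \(A\) is their correlation matrix. This is an assumption made for the example, not a statement of the procedure in ThreadCoreGP or ThreadCoreBL: the Gaussian correlation function, the fixed correlation length, the function names, and the toy data are all hypothetical.

```python
import numpy as np

def gaussian_correlation(X1, X2, length=0.5):
    """exp(-sum_i ((x_i - x'_i) / length)^2) for rows of X1 and X2."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / length**2)

def gls_beta(X, y, h, length=0.5, nugget=1e-10):
    """Generalised least squares estimate of beta for m(x) = beta^T h(x)."""
    H = np.array([h(x) for x in X])                      # n x q design matrix
    A = gaussian_correlation(X, X, length) + nugget * np.eye(len(X))
    A_inv_H = np.linalg.solve(A, H)
    A_inv_y = np.linalg.solve(A, y)
    return np.linalg.solve(H.T @ A_inv_H, H.T @ A_inv_y)

def h_linear(x):
    """h(x)^T = (1, x_1, ..., x_p)."""
    return np.concatenate(([1.0], x))

# Toy training sample: a "simulator" that happens to be exactly linear.
X = np.array([[0.0], [0.25], [0.5], [0.75], [1.0]])
y = 1.0 + 2.0 * X[:, 0]

beta_hat = gls_beta(X, y, h_linear)
```

Because the toy outputs lie exactly on the line \(1 + 2x_1\), `beta_hat` recovers the intercept and slope up to numerical error. With a real simulator the estimate would instead capture the best-fitting trend, leaving the remainder to the covariance structure.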
