Procedure: Uncertainty Analysis for a Bayes linear emulator

Description and Background

One of the tasks that is required by users of simulators is uncertainty analysis (UA), which studies the uncertainty in model outputs that is induced by uncertainty in the inputs. Although relatively simple in concept, UA is both important and demanding. It is important because it is the primary way to quantify the uncertainty in predictions made by simulators. It is demanding because in principle it requires us to evaluate the output at all possible values of the uncertain inputs. The MUCM approach of first building an emulator is a powerful way of making UA feasible for complex and computer-intensive simulators.

This procedure considers the evaluation of the expectation and variance of the computer model when the input at which it is evaluated is uncertain. The expressions for these quantities when the input \(x\) takes the unknown value \(x_0\) are:

\[\mu = \text{E}[f(x_0)] = \text{E}^*[ \mu(x_0) ]\]

and

\[\Sigma = \text{Var}[f(x_0)] = \text{Var}^*[ \mu(x_0) ] + \text{E}^*[ \Sigma(x_0) ]\]

where expectations and variances marked with a * are over the unknown input \(x_0\), and where we use the shorthand expressions \(\mu(x)=\text{E}_F[f(x)]\) and \(\Sigma(x)=\text{Var}_F[f(x)]\).
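This decomposition is simply the law of total variance applied to \(f(x_0)\). As a quick numerical illustration (the mean function, variance function, and input distribution below are arbitrary choices made purely for the check, not part of the procedure), it can be verified by Monte Carlo:

```python
import numpy as np

# Check Var[f(x0)] = Var[mu(x0)] + E[Sigma(x0)] by simulation, using
# arbitrary illustrative choices of mu(.), Sigma(.) and the law of x0.
rng = np.random.default_rng(0)

mu_fn = lambda x: np.sin(x)              # hypothetical emulator mean
var_fn = lambda x: 0.1 + 0.05 * x**2     # hypothetical emulator variance

x0 = rng.normal(0.0, 1.0, size=200_000)  # draws of the uncertain input
f = mu_fn(x0) + np.sqrt(var_fn(x0)) * rng.normal(size=x0.size)

total_var = f.var()                      # direct Monte Carlo estimate
decomposed = mu_fn(x0).var() + var_fn(x0).mean()
print(total_var, decomposed)             # agree up to Monte Carlo error
```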

Inputs

  • An emulator
  • The prior beliefs for the emulator, in the form of \(\text{Var}[\beta]\) and \(\text{Var}[w(x)]\)
  • The training sample design \(X\)
  • Second-order beliefs about \(x_0\)

Outputs

  • The expectation and variance of \(f(x)\) when \(x\) is the unknown \(x_0\)

Procedure

Definitions

Define the following quantities used in the procedure of predicting simulator output at a known input (ProcBLPredict):

  • \(\hat{\beta}=\text{E}_F[\beta]\)
  • \(B=\text{Var}[\beta]\)
  • \(\sigma^2=\text{Var}[w(x)]\)
  • \(V=\text{Var}[f(X)]\)
  • \(e=f(X)-\textrm{E}[f(X)]\)
  • \(B_F =\text{Var}_F[\beta]=B-BHV^{-1}H^TB^T\)

where \(f(x)\), \(\beta\) and \(w(x)\) are as defined in the thread for Bayes linear emulation of the core model (ThreadCoreBL).

Using these definitions, we can write the general adjusted emulator expectation and variance at a known input \(x\) (as given in ProcBLPredict) in the form:

\[\begin{split}\mu(x) &= \hat{\beta}^T h(x) + c(x)^T V^{-1} e \\ \Sigma(x) &= h(x)^T B_F h(x) + \sigma^2 - c(x)^T V^{-1} c(x) - h(x)^T BHV^{-1} c(x) - c(x)^T V^{-1} H^T B h(x)\end{split}\]

where the vector \(h(x)\) and the matrix \(H\) are as defined in ProcBLPredict, \(c(x)\) is the \(n\times 1\) vector such that \(c(x)^T=\text{Cov}[w(x),w(X)]\), and \(B_F\) is the adjusted variance of \(\beta\).
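As a concrete, entirely hypothetical illustration of these expressions, the sketch below builds \(\mu(x)\) and \(\Sigma(x)\) for a one-dimensional toy simulator with trend basis \(h(x)=(1,x)^T\) and an assumed Gaussian correlation function; \(H\) is taken to be the \(q\times n\) matrix whose columns are the \(h(x_i)\), matching the expressions above. None of the specific choices here come from the procedure itself.

```python
import numpy as np

# Hypothetical one-dimensional sketch of the adjusted emulator quantities
# mu(x) and Sigma(x). The simulator, basis, priors and correlation function
# are all illustrative assumptions.
def h(x):
    """Trend basis h(x) = (1, x)^T; for an array this gives a q x n matrix."""
    x = np.atleast_1d(x)
    return np.vstack([np.ones_like(x), x])

def cov_w(a, b, sigma2=1.0, delta=0.5):
    """Cov[w(a), w(b)] under an assumed Gaussian correlation function."""
    a, b = np.atleast_1d(a), np.atleast_1d(b)
    return sigma2 * np.exp(-(((a[:, None] - b[None, :]) / delta) ** 2))

X = np.linspace(0.0, 1.0, 6)            # training design
fX = np.sin(2.0 * np.pi * X)            # simulator runs f(X)

beta0 = np.zeros(2)                     # prior E[beta]
B = np.eye(2)                           # prior Var[beta]
sigma2 = 1.0                            # Var[w(x)]

H = h(X)                                # q x n, columns h(x_i)
V = H.T @ B @ H + cov_w(X, X, sigma2)   # Var[f(X)]
Vinv = np.linalg.inv(V)
e = fX - H.T @ beta0                    # e = f(X) - E[f(X)]

beta_hat = beta0 + B @ H @ Vinv @ e     # adjusted expectation E_F[beta]
B_F = B - B @ H @ Vinv @ H.T @ B        # adjusted variance Var_F[beta]

def mu_Sigma(x):
    """Adjusted emulator expectation mu(x) and variance Sigma(x)."""
    hx = h(x)[:, 0]
    c = cov_w(x, X, sigma2)[0]          # c(x)^T = Cov[w(x), w(X)]
    m = beta_hat @ hx + c @ Vinv @ e
    # The two cross terms in Sigma(x) are transposes of each other,
    # hence the factor of 2 below.
    s = hx @ B_F @ hx + sigma2 - c @ Vinv @ c - 2.0 * hx @ B @ H @ Vinv @ c
    return m, s
```

At a training point the adjusted emulator reproduces the stored run, so for instance `mu_Sigma(X[0])` recovers the first training output with (numerically) zero adjusted variance.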

To obtain the expectation, \(\mu\), and variance, \(\Sigma\), for the simulator at the unknown input \(x_0\) we take expectations and variances of these quantities over the unknown input \(x_0\) as described below.

Calculating \(\mu\)

To calculate \(\mu\), the expectation of the simulator at the unknown input \(x_0\), we calculate the following expectation:

\[\mu=\text{E}[\mu(x_0)]=\hat{\beta}^T\text{E}[h_0]+\text{E}[c_0]^TV^{-1}e,\]

where we define \(h_0=h(x_0)\) and \(c_0^T=\text{Cov}[w(x_0),w(X)]\).

Specification of beliefs for \(h_0\) and \(c_0\) is discussed at the end of this page.

Calculating \(\Sigma\)

\(\Sigma\) is defined to be the sum of the two components \(\text{Var}[\mu(x_0)]\) and \(\text{E}[\Sigma(x_0)]\), where these expectations and variances are taken over the unknown input \(x_0\). Using \(h_0\) and \(c_0\) as defined above, we can write these expressions as:

\[\begin{split}\text{Var}[\mu(x_0)] &= \hat{\beta}^T\text{Var}[h_0] \hat{\beta}+e^TV^{-1}\text{Var}[c_0]V^{-1}e + 2\hat{\beta}^T\text{Cov}[h_0,c_0] V^{-1}e \\ \text{E}[\Sigma(x_0)] &= \sigma^2 + \text{E}[h_0]^TB_F\text{E}[h_0] - \text{E}[c_0]^TV^{-1}\text{E}[c_0] - 2 \text{E}[h_0]^TB H V^{-1}\text{E}[c_0] \\ & \quad + \text{tr}\left\{\text{Var}[h_0]B_F\right\} - \text{tr}\left\{\text{Var}[c_0]V^{-1}\right\} - 2\,\text{tr}\left\{\text{Cov}[h_0,c_0]V^{-1}H^TB\right\}\end{split}\]
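Once the required moments are in hand, assembling \(\mu\) and \(\Sigma\) is routine linear algebra. In the sketch below, the emulator quantities and the moments of \((h_0,c_0)\) are arbitrary placeholder values, not the output of any real adjustment; in practice they come from the adjusted emulator and from the belief specification discussed in the next section.

```python
import numpy as np

# Placeholder values standing in for the adjusted-emulator quantities and
# for second-order beliefs about (h0, c0); all numbers are illustrative.
rng = np.random.default_rng(2)
q, n = 2, 6

beta_hat = rng.normal(size=q)            # adjusted E_F[beta]
B = np.eye(q)                            # prior Var[beta]
H = rng.normal(size=(q, n))              # q x n basis matrix over the design
Vw = np.eye(n)                           # stand-in for Var[w(X)]
sigma2 = 1.0                             # Var[w(x)]
V = H.T @ B @ H + sigma2 * Vw            # Var[f(X)]
Vinv = np.linalg.inv(V)
B_F = B - B @ H @ Vinv @ H.T @ B         # adjusted Var_F[beta]
e = rng.normal(size=n)                   # f(X) - E[f(X)]

E_h0 = rng.normal(size=q)                # E[h0]
Var_h0 = 0.1 * np.eye(q)                 # Var[h0]
E_c0 = 0.1 * rng.normal(size=n)          # E[c0]
Var_c0 = 0.01 * np.eye(n)                # Var[c0]
Cov_h0c0 = np.zeros((q, n))              # Cov[h0, c0]

# mu = beta_hat^T E[h0] + E[c0]^T V^{-1} e
mu = beta_hat @ E_h0 + E_c0 @ Vinv @ e

# Var[mu(x0)]
var_mu = (beta_hat @ Var_h0 @ beta_hat
          + e @ Vinv @ Var_c0 @ Vinv @ e
          + 2.0 * beta_hat @ Cov_h0c0 @ Vinv @ e)

# E[Sigma(x0)], with the three trace terms computed separately
E_Sig = (sigma2 + E_h0 @ B_F @ E_h0 - E_c0 @ Vinv @ E_c0
         - 2.0 * E_h0 @ B @ H @ Vinv @ E_c0
         + np.trace(Var_h0 @ B_F) - np.trace(Var_c0 @ Vinv)
         - 2.0 * np.trace(Cov_h0c0 @ Vinv @ H.T @ B))

Sigma = var_mu + E_Sig
```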

Beliefs about \(h_0\) and \(c_0\)

We can see from the expressions given above that, in order to calculate \(\mu\) and \(\Sigma\), we require statements about the expectations, variances, and covariances for the collection \((h_0,c_0)\). In the Bayes linear framework, it is straightforward to obtain expectations, variances, and covariances for \(x_0\) itself; however, since \(h_0\) and \(c_0\) are complex functions of \(x_0\), it can be difficult to use our beliefs about \(x_0\) to obtain beliefs about \(h_0\) or \(c_0\) directly.

In general, we rely on the following strategies:

  • Monomial \(h(\cdot)\) – When the trend basis functions take the form of simple monomials in \(x_0\), the expectation and (co)variance of \(h_0\) can be expressed in terms of higher-order moments of \(x_0\) and so can be found directly. These higher-order moments could be specified directly, or found from lower-order moments using appropriate assumptions. In some cases, where our basis functions \(h(\cdot)\) are not monomials but more complex functions, e.g. \(\sin(x)\), these functions may have a particular physical interpretation or relevance to the model under study. In such cases, it can be effective to work with the transformed inputs themselves, so that \(h(\cdot)\) becomes a monomial in the transformed space.
  • Exploit probability distributions – We construct a range of probability distributions for \(x_0\) which are consistent with our second-order beliefs and our general sources of knowledge about likely values of \(x_0\). We then compute the appropriate integrals over our prior for \(x_0\), either analytically or by simulation, to obtain the corresponding second-order moments. When the correlation function is Gaussian, we can obtain results analytically for certain choices of prior distribution for \(x_0\) – the procedure page on uncertainty analysis using a GP emulator (ProcUAGP) addresses this material in detail.
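The monomial strategy can be sketched for a one-dimensional input with basis \(h(x)=(1,x,x^2)^T\): the moments of \(h_0\) then reduce to moments of \(x_0\) up to fourth order. The second-order beliefs supply only the mean \(m\) and variance \(v\); the third and fourth moments are not determined by them, so the sketch below fills them in under an illustrative normality assumption, as one example of the "appropriate assumptions" mentioned above.

```python
import numpy as np

# Moments of h0 = (1, x0, x0^2)^T from second-order beliefs about x0.
# The third and fourth moments of x0 are taken from a normal distribution
# purely as an illustrative assumption.
m, v = 0.5, 0.04                      # E[x0], Var[x0]

Ex1 = m
Ex2 = m**2 + v                        # E[x0^2]
Ex3 = m**3 + 3*m*v                    # E[x0^3] under normality
Ex4 = m**4 + 6*m**2*v + 3*v**2        # E[x0^4] under normality

E_h0 = np.array([1.0, Ex1, Ex2])      # E[h0]

# Var[h0]_{ij} = E[x0^{i+j}] - E[x0^i] E[x0^j] for powers i, j in {0, 1, 2}
Ex = [1.0, Ex1, Ex2, Ex3, Ex4]
Var_h0 = np.array([[Ex[i + j] - Ex[i] * Ex[j] for j in range(3)]
                   for i in range(3)])
```

The first row and column of `Var_h0` are zero, since the constant basis component has no uncertainty, and the \((1,1)\) entry recovers \(v\) itself.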