Procedure: Uncertainty Analysis using a GP emulator

Description and Background

One of the simpler tasks that is required by users of simulators is uncertainty analysis (UA), which studies the uncertainty in model outputs that is induced by uncertainty in the inputs. Although relatively simple in concept, UA is both important and demanding. It is important because it is the primary way to quantify the uncertainty in predictions made by simulators. It is demanding because in principle it requires us to evaluate the output at all possible values of the uncertain inputs. The MUCM approach of first building an emulator is a powerful way of making UA feasible for complex and computer-intensive simulators.

This procedure describes how to compute some of the UA measures discussed in the definition page of Uncertainty Analysis (DefUncertaintyAnalysis). In particular, we consider the uncertainty mean and variance:

\[\begin{split}\textrm{E}[f(X)] &= \int_{{\cal X}} f(x) \omega(x) \mathrm{d} x \\ \textrm{Var}[f(X)] &= \int_{{\cal X}} (f(x) - \textrm{E}[f(X)])^2 \omega(x) \mathrm{d} x\end{split}\]

Notice that it is necessary to specify the uncertainty about the inputs through a full probability distribution for the inputs. This clearly demands a good understanding of probability and its use to express personal degrees of belief. However, this specification of uncertainty often also requires interaction with relevant experts whose knowledge is being used to specify values for individual inputs. There is a considerable literature on the elicitation of expert knowledge and uncertainty in probabilistic form, and some references are given at the end of this page.

In practice, we cannot evaluate either \(\textrm{E}[f(X)]\) or \(\textrm{Var}[f(X)]\) directly from the simulator because the integrals require us to know \(f(x)\) at every \(x\). Even evaluating them numerically by Monte Carlo integration would require a very large number of runs of the simulator, so this is one task for which emulation is very powerful. We build an emulator from a limited training sample of simulator runs and then use the emulator to evaluate these quantities. We still cannot evaluate them exactly because of uncertainty in the emulator. We therefore present procedures here for calculating the emulator (i.e. posterior) mean of each quantity as an estimate, while the emulator variance provides a measure of the accuracy of that estimate. We use \(\textrm{E}^*\) and \(\textrm{Var}^*\) to denote emulator mean and variance.
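For a cheap toy simulator, these two quantities can of course be approximated directly by brute-force Monte Carlo, which is also a useful check on emulator-based results. The following is a minimal Python sketch of the definitions above; the simulator \(f\) and the input distribution \(\omega\) are arbitrary illustrative choices, not part of this procedure.

    import numpy as np

    # Toy illustration of E[f(X)] and Var[f(X)] by brute-force Monte Carlo.
    # The simulator f and the input distribution omega are arbitrary examples.
    rng = np.random.default_rng(0)

    def f(x):
        # A cheap stand-in for a simulator with two inputs.
        return np.sin(x[:, 0]) + 0.5 * x[:, 1] ** 2

    # omega: independent normal inputs with means m and standard deviations s.
    m, s = np.array([0.0, 1.0]), np.array([1.0, 0.5])
    x = rng.normal(m, s, size=(100_000, 2))

    y = f(x)
    print("E[f(X)]   approx.", y.mean())
    print("Var[f(X)] approx.", y.var())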

We assume here that a Gaussian process (GP) emulator has been built in the form described in the procedure page for building a GP emulator (ProcBuildCoreGP), and that we are only emulating a single output. Note that ProcBuildCoreGP gives procedures for deriving emulators in a number of different forms, and we consider here only the “linear mean and weak prior” case where the GP has a linear mean function, weak prior information is specified on the hyperparameters \(\beta\) and \(\sigma^2\) and the emulator is derived with a single point estimate for the hyperparameters \(\delta\).

Inputs

  • An emulator as defined in ProcBuildCoreGP
  • A distribution \(\omega(\cdot)\) for the uncertain inputs

Outputs

  • The expected value \(\textrm{E}^*[\textrm{E}[f(X)]]\) and variance \(\textrm{Var}^*[\textrm{E}[f(X)]]\) of the uncertainty distribution mean
  • The expected value \(\textrm{E}^*[\textrm{Var}[f(X)]]\) of the uncertainty distribution variance

Procedure

In this section we describe the calculation of the above quantities. We first give their expressions in terms of a number of integral forms, \(U_p, P_p, Q_p, S_p, U, T, R\). We then give the form of these integrals for the general case, when no assumptions are made about the form of the input distribution \(\omega(x)\), the correlation function \(c(x,x')\) or the regression function \(h(x)\). Finally we give their forms for two special cases.

Calculation of \(\textrm{E}^*[\textrm{E}[f(X)]]\)

\[\textrm{E}^*[\textrm{E}[f(X)]] = R\hat{\beta} + Te\]

where

\[e =A^{-1}(f(D)-H\hat{\beta})\]
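As an illustration only, the following Python sketch assembles this estimate with numpy, assuming the quantities \(A\), \(H\), \(f(D)\), \(\hat{\beta}\) from ProcBuildCoreGP and the integral forms \(R\), \(T\) (defined later on this page) are already available as arrays, with vectors stored one-dimensionally; the function name and argument layout are illustrative, not toolkit conventions.

    import numpy as np

    def ua_mean(R, T, A, H, fD, beta_hat):
        """Posterior mean E*[E[f(X)]] = R beta_hat + T e."""
        e = np.linalg.solve(A, fD - H @ beta_hat)   # e = A^{-1}(f(D) - H beta_hat)
        return float(R @ beta_hat + T @ e)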

Calculation of \(\textrm{Var}^*[\textrm{E}[f(X)]]\)

\[\textrm{Var}^*[\textrm{E}[f(X)]] = \hat{\sigma}^2[U - TA^{-1}T^{\mathrm{T}} + (R-TA^{-1}H)W(R-TA^{-1}H)^{\mathrm{T}}]\]

where

\[W = (H^{\mathrm{T}}A^{-1}H)^{-1}\]
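A corresponding sketch for this variance, under the same assumptions about how the arrays are stored (\(T\) and \(R\) held as one-dimensional vectors, \(U\) and \(\hat{\sigma}^2\) as scalars):

    import numpy as np

    def ua_mean_variance(U, R, T, A, H, sigma2_hat):
        """Posterior variance Var*[E[f(X)]] of the uncertainty mean."""
        AinvT = np.linalg.solve(A, T)        # A^{-1} T^T
        AinvH = np.linalg.solve(A, H)        # A^{-1} H
        W = np.linalg.inv(H.T @ AinvH)       # W = (H^T A^{-1} H)^{-1}
        RTH = R - T @ AinvH                  # R - T A^{-1} H
        return float(sigma2_hat * (U - T @ AinvT + RTH @ W @ RTH))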

Calculation of \(\textrm{E}^*[\textrm{Var}[f(X)]]\)

\[\textrm{E}^*[\textrm{Var}[f(X)]] = \textrm{E}^*[\textrm{E}[f(X)^2]] - \textrm{E}^*[\textrm{E}[f(X)]^2]\]

The first term is

\[\begin{split}\begin{array}{r l} \textrm{E}^*[\textrm{E}[f(X)^2]] & = \hat{\sigma}^2[U_p-\mathrm{tr}(A^{-1}P_p) + \mathrm{tr}\{W(Q_p-S_p A^{-1} H-H^{\mathrm{T}}A^{-1}S_p^{\mathrm{T}} + H^{\mathrm{T}}A^{-1}P_p A^{-1}H)\}] \\ & + e^{\mathrm{T}}P_pe + 2\hat{\beta}^{\mathrm{T}}S_pe + \hat{\beta}^{\mathrm{T}}Q_p\hat{\beta} \end{array}\end{split}\]

The second term is

\[\begin{split}\begin{array}{r l} \textrm{E}^*[\textrm{E}[f(X)]^2] & = \hat{\sigma}^2[U-TA^{-1}T^{\mathrm{T}} +\{R - TA^{-1}H\}W\{R - TA^{-1}H\}^\mathrm{T}] \\ & + \left(R\hat{\beta}+Te \right)^2 \end{array}\end{split}\]
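The two terms can be combined in a single routine. The sketch below is again only illustrative, assuming the integral forms \(U_p, P_p, S_p, Q_p, U, R, T\) and the emulator quantities are held as numpy arrays with the dimensions tabulated in the next subsection (vectors one-dimensional):

    import numpy as np

    def ua_variance(Up, Pp, Sp, Qp, U, R, T, A, H, fD, beta_hat, sigma2_hat):
        """Posterior mean E*[Var[f(X)]], assembled from the two terms above."""
        e = np.linalg.solve(A, fD - H @ beta_hat)
        AinvH = np.linalg.solve(A, H)
        AinvT = np.linalg.solve(A, T)
        W = np.linalg.inv(H.T @ AinvH)

        # First term: E*[E[f(X)^2]]
        inner = Qp - Sp @ AinvH - AinvH.T @ Sp.T + AinvH.T @ Pp @ AinvH
        term1 = (sigma2_hat * (Up - np.trace(np.linalg.solve(A, Pp))
                               + np.trace(W @ inner))
                 + e @ Pp @ e + 2 * beta_hat @ Sp @ e + beta_hat @ Qp @ beta_hat)

        # Second term: E*[E[f(X)]^2]
        RTH = R - T @ AinvH
        term2 = (sigma2_hat * (U - T @ AinvT + RTH @ W @ RTH)
                 + (R @ beta_hat + T @ e) ** 2)

        return float(term1 - term2)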

Dimensions

Before describing the terms involved in the above expressions we first give their dimensions. We assume that we have \(n\) observations, \(p\) inputs and \(q\) regression functions. The dimensions of the above quantities are given in the table below.

Symbol               Dimension           Symbol      Dimension
\(\hat{\sigma}^2\)   \(1 \times 1\)      \(U_p\)     \(1 \times 1\)
\(\hat{\beta}\)      \(q \times 1\)      \(P_p\)     \(n \times n\)
\(e\)                \(n \times 1\)      \(S_p\)     \(q \times n\)
\(f(D)\)             \(n \times 1\)      \(Q_p\)     \(q \times q\)
\(A\)                \(n \times n\)      \(U\)       \(1 \times 1\)
\(H\)                \(n \times q\)      \(T\)       \(1 \times n\)
\(W\)                \(q \times q\)      \(R\)       \(1 \times q\)

The terms \(\hat{\sigma}^2\), \(\hat{\beta}\), \(f(D)\), \(A\) and \(H\) are defined in ProcBuildCoreGP, while \(e\) and \(W\) are defined above. The terms in the right-hand pair of columns are specific to uncertainty analysis and are described below.

The integral forms

General case

When no assumptions are made about the distribution of the inputs, the correlation function or the regression functions, we have general expressions for the \(U_p, P_p, S_p, Q_p, U, R, T\) terms. These are

\[\begin{split}U_p &= \int_{{\cal X}} c(x,x)\omega(x) \mathrm{d} x \\ P_p &= \int_{{\cal X}} t(x)t(x)^{\mathrm{T}} \omega(x) \mathrm{d} x \\ S_p &= \int_{{\cal X}} h(x)t(x)^{\mathrm{T}} \omega(x) \mathrm{d} x \\ Q_p &= \int_{{\cal X}} h(x)h(x)^{\mathrm{T}} \omega(x) \mathrm{d} x\end{split}\]

\(h(x)\) is described in the alternatives page on emulator prior mean function (AltMeanFunction). \(c(\cdot,\cdot)\) is the correlation function discussed in the alternatives page on emulator prior correlation function (AltCorrelationFunction).

Also \(t(x) = c(D,x)\), as introduced in ProcBuildCoreGP.

Finally, \(\omega(x)\) is the joint distribution of the \(x\) inputs.

For the \(U,R,T\) we have

\[\begin{split}U &= \int_{{\cal X}}\int_{{\cal X}} c(x,x')\omega(x)\omega(x') \mathrm{d} x \mathrm{d} x' \\ R &= \int_{{\cal X}} h(x)^{\mathrm{T}}\omega(x) \mathrm{d} x \\ T &= \int_{{\cal X}} t(x)^{\mathrm{T}}\omega(x) \mathrm{d} x\end{split}\]

where \(x\) and \(x'\) are two independent realisations of the uncertain inputs.
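In this general case the integrals will not usually have closed forms. One simple option, shown in the sketch below, is to approximate them by Monte Carlo sampling from \(\omega(x)\); the helper functions \(h\), \(c\) and the sampler for \(\omega\) are assumed to be supplied by the user, and the estimates carry Monte Carlo error. This is an illustrative device, not part of the exact procedure.

    import numpy as np

    def integral_forms_mc(D, h, c, sample_omega, n_mc=10_000, seed=0):
        """Monte Carlo approximation of U_p, P_p, S_p, Q_p, U, R and T.

        D            : (n, p) array of training inputs
        h            : h(x) -> (q,) regression vector at one input x
        c            : c(x, xp) -> scalar correlation between two inputs
        sample_omega : sample_omega(rng, n) -> (n, p) draws from omega
        """
        rng = np.random.default_rng(seed)
        X = sample_omega(rng, n_mc)                         # draws from omega
        X2 = sample_omega(rng, n_mc)                        # independent draws for U

        H_x = np.array([h(x) for x in X])                   # (n_mc, q)
        t_x = np.array([[c(d, x) for d in D] for x in X])   # (n_mc, n), rows t(x)^T

        Up = np.mean([c(x, x) for x in X])
        Pp = t_x.T @ t_x / n_mc                             # E[t(x) t(x)^T]
        Sp = H_x.T @ t_x / n_mc                             # E[h(x) t(x)^T]
        Qp = H_x.T @ H_x / n_mc                             # E[h(x) h(x)^T]
        U = np.mean([c(x, x2) for x, x2 in zip(X, X2)])
        R = H_x.mean(axis=0)                                # E[h(x)^T]
        T = t_x.mean(axis=0)                                # E[t(x)^T]
        return Up, Pp, Sp, Qp, U, R, T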

Special case 1

We now show how the above integrals are transformed when we make specific choices about \(\omega(\cdot)\), \(c(\cdot,\cdot)\) and \(h(\cdot)\). We first assume that \(\omega(\cdot)\) is a normal distribution with mean \(m\) and precision matrix \(B\), given by

\[\omega(x) = \frac{1}{(2\pi)^{p/2}|B|^{-1/2}}\exp\left[-\frac{1}{2}(x-m)^{\rm T} B (x-m)\right]\]

We also assume the generalised Gaussian correlation function with nugget (see AltCorrelationFunction)

\[c(x,x') = \nu I_{x=x'} + (1-\nu)\exp\{-(x-x')^{\rm T} C (x-x')\}\]

where \(I_{x=x'}\) equals 1 if \(x=x'\) but is otherwise zero, and \(\nu\) represents the nugget term. The case of a generalised Gaussian correlation function without nugget is simply obtained by setting \(\nu=0\).

We let both \(B,C\) be general positive definite matrices. Also, we do not make any particular assumption for \(h(x)\).
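For reference, these two ingredients can be written directly in code. The minimal sketch below assumes \(m\), \(B\) and \(C\) are supplied as numpy arrays, with \(B\) acting as the precision matrix of \(\omega\); the function names are illustrative.

    import numpy as np

    def omega_pdf(x, m, B):
        """Normal input density with mean m and precision matrix B."""
        p = len(m)
        quad = (x - m) @ B @ (x - m)
        return np.sqrt(np.linalg.det(B)) / (2 * np.pi) ** (p / 2) * np.exp(-0.5 * quad)

    def corr(x, xp, C, nu=0.0):
        """Generalised Gaussian correlation with nugget nu."""
        nugget = nu if np.array_equal(x, xp) else 0.0
        return nugget + (1 - nu) * np.exp(-(x - xp) @ C @ (x - xp))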

We now give the expressions for each of the integrals.


\[U_p = 1\]

Note that this result is independent of the existence of a non-zero nugget \(\nu\). See the discussion page on the nugget effects in uncertainty and sensitivity (DiscUANugget) for more on this point.


\(P_p\) is an \(n \times n\) matrix, whose \((k,l)^{\mathrm{th}}\) entry is

\[P_p(k,l) = (1-\nu)^2\frac{|B|^{1/2}}{|F|^{1/2}} \exp\left\{-\frac{1}{2}\left[ r - g^{\mathrm{T}}F^{-1}g \right]\right\}\]

with

\[F = 4C+B\]

and

\[g = 2C(x_k + x_l - 2m)\]

and

\[r = (x_k - m)^{\mathrm{T}}2C(x_k - m) + (x_l - m)^{\mathrm{T}}2C(x_l - m)\]

The subscripts \(k\) and \(l\) of \(x\) denote training points.
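A direct transcription of this entry (with \(g = 2C(x_k + x_l - 2m)\)) into numpy, assuming the training inputs are stored as the rows of an array D; this is a sketch rather than a toolkit routine.

    import numpy as np

    def Pp_case1(D, m, B, C, nu=0.0):
        """P_p for special case 1: normal omega (precision B), generalised
        Gaussian correlation (matrix C) with nugget nu."""
        n = D.shape[0]
        F = 4 * C + B
        const = (1 - nu) ** 2 * np.sqrt(np.linalg.det(B) / np.linalg.det(F))
        Pp = np.empty((n, n))
        for k in range(n):
            for l in range(n):
                g = 2 * C @ (D[k] + D[l] - 2 * m)
                r = ((D[k] - m) @ (2 * C) @ (D[k] - m)
                     + (D[l] - m) @ (2 * C) @ (D[l] - m))
                Pp[k, l] = const * np.exp(-0.5 * (r - g @ np.linalg.solve(F, g)))
        return Pp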


\(S_p\) is a \(q \times n\) matrix, whose \((k,l)^{\mathrm{th}}\) entry is

\[S_p(k,l) = (1-\nu)\frac{|B|^{1/2}}{|F|^{1/2}} \exp \left\{-\frac{1}{2}\left[r - g^{\mathrm{T}}F^{-1}g\right]\right\} \textrm{E}_*[h_k(x)]\]

with

\[F = 2C+B\]

and

\[g = 2C(x_l - m)\]

and

\[r = (x_l-m)^{\mathrm{T}}2C(x_l-m)\]

The expectation \(\textrm{E}_*[\cdot]\) is w.r.t. the normal distribution \({\cal{N}}(m + F^{-1}g,F^{-1})\). Also \(h_k(x)\) is the \(k\)-th element of \(h(x)\).


\(Q_p\) is a \(q \times q\) matrix, whose \((k,l)^{\mathrm{th}}\) entry is

\[Q_p(k,l) = \textrm{E}_*[h_k(x)h_l(x)]\]

where the expectation \(\textrm{E}_*[\cdot]\) is w.r.t. the normal distribution \(\omega(x)\).


\(U\) is the scalar

\[U = (1-\nu)\frac{|B|}{|F|^{1/2}}\]

with

\[\begin{split}F = \left[ \begin{array}{cc} 2C + B & -2C \\ -2C & 2C + B \end{array}\right]\end{split}\]

\(R\) is the \(1 \times q\) vector whose elements are the means of the elements of \(h(x)\) with respect to \(\omega(x)\), i.e.,

\[R = \int_{{\cal X}} h(x)^{\mathrm{T}}\omega(x) \mathrm{d} x\]

\(T\) is a \(1 \times n\) vector, whose \(k^{\mathrm{th}}\) entry is

\[T(k) = \frac{(1-\nu)|B|^{1/2}}{|2C+B|^{1/2}} \exp\left\{-\frac{1}{2} \left[r-g^{\rm T}F^{-1}g\right]\right\}\]

with

\[F = 2C+B\]
\[g = 2C(x_k-m)\]
\[r = (x_k-m)^{\rm T} 2C(x_k-m)\]
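The entries \(U\) and \(T\) of special case 1 translate similarly; as before, the sketch assumes the training inputs are the rows of D, and that B and C are full positive definite matrices.

    import numpy as np

    def U_case1(B, C, nu=0.0):
        """U for special case 1, using the 2p x 2p block matrix F above."""
        F = np.block([[2 * C + B, -2 * C],
                      [-2 * C, 2 * C + B]])
        return (1 - nu) * np.linalg.det(B) / np.sqrt(np.linalg.det(F))

    def T_case1(D, m, B, C, nu=0.0):
        """T for special case 1; each entry uses F = 2C + B."""
        F = 2 * C + B
        const = (1 - nu) * np.sqrt(np.linalg.det(B) / np.linalg.det(F))
        T = np.empty(D.shape[0])
        for k in range(D.shape[0]):
            g = 2 * C @ (D[k] - m)
            r = (D[k] - m) @ (2 * C) @ (D[k] - m)
            T[k] = const * np.exp(-0.5 * (r - g @ np.linalg.solve(F, g)))
        return T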

Special case 2

In this special case, we further assume that the matrices \(B,C\) are diagonal. We also consider a special form for the vector \(h(x)\), which is the linear function described in AltMeanFunction

\[h(x) = [1,x]^{\mathrm{T}}\]

Hence \(q=p+1\). We now present the form of the integrals under the new assumptions.


\[U_p = 1\]

\(P_p\) is an \(n \times n\) matrix, whose \((k,l)^{\mathrm{th}}\) entry is

\[P_p(k,l) = (1-\nu)^2\prod_{i=1}^p \left(\frac{B_{ii}}{4C_{ii}+B_{ii}}\right)^{1/2} \exp\left\{-\frac{1}{2}\,\frac{4C_{ii}^2(x_{ik}-x_{il})^2 + 2C_{ii} B_{ii}\left[(x_{ik}-m_i)^2 + (x_{il}-m_i)^2\right]}{4C_{ii}+B_{ii}}\right\}\]

where the double-indexed \(x_{ik}\) denotes the \(i^{\mathrm{th}}\) input of the \(k^{\mathrm{th}}\) training point.


\(S_p\) is a \(q \times n\) matrix whose \((k,l)^{\mathrm{th}}\) entry is

\[\begin{split}\begin{array}{r l} S_p(k,l) = &(1-\nu)\textrm{E}_*[h_k(x)] \\ & \prod_{i=1}^p \frac{B_{ii}^{1/2}}{(2C_{ii}+B_{ii})^{1/2}} \exp\left\{-\frac{1}{2}\frac{2C_{ii}B_{ii}}{2C_{ii}+B_{ii}} \left[(x_{il}-m_i)^2\right]\right\} \end{array}\end{split}\]

For the expectation we have

\[\begin{split}\begin{array}{ll} \textrm{E}_*[h_1(x)] = 1 & \\ \textrm{E}_*[h_{j+1}(x)] = \frac{2C_{jj}x_{jl} + m_j B_{jj}}{2C_{jj} + B_{jj}}& \qquad \mathrm{for} \quad j=1,2,\ldots,p \end{array}\end{split}\]

\(Q_p\) is the \(q \times q\) matrix,

\[\begin{split}Q_p = \left[ \begin{array}{cc} 1 &m^{\rm T} \\ m& mm^{\rm T} + B^{-1} \end{array} \right]\end{split}\]

\(U\) is the scalar

\[U = (1-\nu)\prod_{i=1}^p \left(\frac{B_{ii}}{B_{ii} + 2(2C_{ii})}\right)^{1/2}\]

\(R\) is the \(1 \times q\) vector

\[R = [1,m^{\rm T}]\]

\(T\) is a \(1 \times n\) vector, whose \(k^{\mathrm{th}}\) entry is

\[T(k) = (1-\nu) \prod_{i=1}^p \frac{B_{ii}^{1/2}}{(2C_{ii}+B_{ii})^{1/2}} \exp\left\{-\frac{1}{2}\frac{2C_{ii}B_{ii}}{2C_{ii}+B_{ii}} \left(x_{ik}-m_i\right)^2\right\}\]
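Because everything factorises over the inputs in this case, all of the integral forms reduce to short vectorised expressions. The sketch below assumes b and c hold the diagonals of \(B\) and \(C\), m the mean of \(\omega\), and D the training inputs as rows; the names and layout are illustrative. (\(U_p = 1\) needs no computation.)

    import numpy as np

    def case2_forms(D, m, b, c, nu=0.0):
        """Integral forms for special case 2: diagonal B, C and h(x) = [1, x]^T.

        D : (n, p) training inputs;  m : (p,) mean of omega
        b : (p,) diagonal of B;      c : (p,) diagonal of C
        """
        n, p = D.shape

        # Q_p and R depend only on omega
        Qp = np.block([[np.ones((1, 1)), m[None, :]],
                       [m[:, None], np.outer(m, m) + np.diag(1.0 / b)]])
        R = np.concatenate(([1.0], m))

        # U
        U = (1 - nu) * np.prod(np.sqrt(b / (b + 4 * c)))

        # T and S_p share the same exponential factor: S_p(k,l) = E_*[h_k(x)] T(l)
        expo = np.exp(-0.5 * (2 * c * b / (2 * c + b)) * (D - m) ** 2)       # (n, p)
        T = (1 - nu) * np.prod(np.sqrt(b / (2 * c + b)) * expo, axis=1)      # (n,)

        Eh = np.vstack([np.ones(n), ((2 * c * D + m * b) / (2 * c + b)).T])  # (q, n)
        Sp = Eh * T[None, :]                                                  # (q, n)

        # P_p
        Pp = np.empty((n, n))
        for k in range(n):
            for l in range(n):
                num = (4 * c ** 2 * (D[k] - D[l]) ** 2
                       + 2 * c * b * ((D[k] - m) ** 2 + (D[l] - m) ** 2))
                Pp[k, l] = (1 - nu) ** 2 * np.prod(
                    np.sqrt(b / (4 * c + b)) * np.exp(-0.5 * num / (4 * c + b)))
        return Pp, Sp, Qp, U, R, T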

References

The topic of eliciting expert judgements about uncertain quantities is dealt with fully in the book

O’Hagan, A., Buck, C. E., Daneshkhah, A., Eiser, J. R., Garthwaite, P. H., Jenkinson, D. J., Oakley, J. E. and Rakow, T. (2006). Uncertain Judgements: Eliciting Expert Probabilities. John Wiley and Sons, Chichester. 328pp. ISBN 0-470-02999-4.

For those with limited knowledge of probability theory, we recommend the SHELF package, which provides simple templates, software and guidance for carrying out elicitation.

Oakley, J. E. and O'Hagan, A. (2002). Bayesian inference for the uncertainty distribution of computer model outputs. Biometrika, 89(4), 769-784.

Ongoing work

We intend to provide procedures relaxing the assumption of the “linear mean and weak prior” case of ProcBuildCoreGP as part of the ongoing development of the toolkit.