Procedure: Generate random outputs from an emulator at specified inputs

Description and Background

This procedure describes how to randomly sample output values from an emulator. This, of course, requires a fully probabilistic emulator such as the Gaussian process emulator described in the core thread ThreadCoreGP, rather than a Bayes linear emulator. You should refer to ThreadCoreGP, the procedure page on building a GP emulator (ProcBuildCoreGP), the variant thread on analysing a simulator with multiple outputs using GP methods (ThreadVariantMultipleOutputs) and the procedure page on building multivariate GP emulators (ProcBuildMultiOutputGP) for definitions of the various functions used in this procedure.

Inputs

  • An emulator
  • Emulator hyperparameters \(\theta\)
  • A set of points \(D=( x_1',\ldots,x_{n'}' )\) in the input space at which randomly sampled outputs are required.

Outputs

  • A set of jointly sampled values from the distribution of \(f(D')=\{ f(x_1'),\ldots,f(x_{n'}')\}\) .

Procedure

Scalar output general case

  1. Define

    \[m^*(D')=\{m^*(x_1'),\ldots, m^*(x_{n'}')\}^{\rm T}.\]
  2. Define \(v^*(D',D')\), the \(n' \times n'\) matrix with \(i,j\) element given by \(v^*(x_i',x_j')\).

  3. Generate the random vector \(\{f(x_1'),\ldots,f(x_{n'}')\}^{\rm T}\) from \(N_{n'}\{m^*(D'), v^*(D',D')\}\), the multivariate normal distribution with mean vector \(m^*(D')\) and variance matrix \(v^*(D',D')\).

Multivariate output general case

We have a simulator \(f(.)\) that produces a vector \(f(x)\) of \(r\) outputs for a single input value \(x\). To clarify notation, we generate the random vector \(F= \{f(x_1'),\ldots,f(x_{n'}')\}^{\rm T}\), arranged as follows:

\[\begin{split}F=\left(\begin{array}{c} (\mbox{output 1 at input }x'_{1}) \\ \vdots \\ (\mbox{output 1 at input }x'_{n'}) \\ \vdots \mbox{output r at input }x'_{1}) \\ \vdots \\ (\mbox{output r at input }x'_{n'}) \end{array} \right).\end{split}\]
  1. Define

    \[m^*(D')=\{m^*(x_1')^{\rm T},\ldots, m^*(x_{n'}')^{\rm T}\}^{\rm T}.\]

    In the multivariate case this is an \(n' \times r\) matrix with each of the \(r\) columns representing one of the simulator outputs. We arrange this into an \(n' r\times 1\) column vector by stacking the \(r\) columns of the \(n' \times r\) matrix into a single column vector (the \(\mathrm{vec}\) operation):

    \[\mu^*(D')=\mathrm{vec}\{m^*(D')\}.\]
  2. Define \(V\), the \((n'r) \times (n'r)\) matrix with \(i,j\) element defined as the posterior covariance between elements \(i\) and \(j\) of \(F= \{f(x_1'),\ldots,f(x_{n'}')\}^{\rm T}\).

  3. Generate the random vector \(F\) from \(N_{n'r}\{\mu^*(D'),V\}\), the multivariate normal distribution with mean vector \(\mu^*(D')\) and variance matrix :math:` V`.

Multivariate output general case with separable covariance function

Suppose we have a separable covariance function of the form \(\Sigma c(\cdot,\cdot)\), where \(\Sigma\) is the output space covariance matrix, and \(c(\cdot,\cdot)\) is the input space covariance function. Following the notation in ProcBuildMultiOutputGP, we write the posterior covariance function as

\[\textrm{Cov}[f(x),f(x')|f(D)]=\Sigma c^*(x,x')=\Sigma\{c(x,x') -c(x)^{\rm T} A^{-1}c(x)\},\]

with \(A=c(D,D)\). The posterior variance matrix :math:` V` of \(\{f(x_1'),\ldots,f(x_{n'}')\}^{\rm T}\) can then be written as

\(V=\Sigma \otimes c^*(D',D')\), where \(\otimes\) is the kronecker product. We can now generate \(\{f(x_1'),\ldots,f(x_{n'}')\}^{\rm T}\) using the matrix normal distribution:

  1. Define \(U_{\Sigma}\) to be the lower triangular square root of \(\Sigma\)
  2. Define \(U_c\) to be the lower triangular square root of \(c^*(D',D')\)
  3. Generate \(Z_{n',r}\): an \(n'\times r\) matrix of independent standard normal random variables
  4. The random draw from the distribution of \(f(D')\) is given by \(F=U_c Z_{n',r} U_\sigma\)

\(F\) is an \(n'\times r\) matrix, arranged as follows:

\[\begin{split}F=\left(\begin{array}{ccc}(\mbox{output 1 at input }x'_{1}) & \cdots &(\mbox{output r at input }x'_{1}) \\ \vdots & & \vdots \\ (\mbox{output 1 at input }x'_{n'}) & \cdots & (\mbox{output r at input }x'_{n'}) \end{array}\right).\end{split}\]

Scalar output linear mean and weak prior case

  1. Define

    \[m^*(D')=\{m^*(x_1'),\ldots, m^*(x_{n'}')\}^{\rm T}.\]
  2. Define \(v^*(D',D')\), the \(n' \times n'\) matrix with \(i,j\) element given by \(v^*(x_i',x_j')\).

  3. Generate the random vector \(\{f(x_1'),\ldots,f(x_{n'}')\}^{\rm T}\) from a multivariate student t distribution with mean vector \(m^* (D')\), variance matrix \(V^*(D',D')\), and \(n-q\) degrees of freedom.

    As an alternative to sampling from the multivariate student t distribution, you can first first sample a random value of \(\sigma^2\) and then sample from a multivariate normal instead:

    1. Sample \(\tau^2\) from the \(\Gamma\{(n-q)/2,(n-q-2)\hat{\sigma}^2/2\}\) distribution. Note the parameterisation of the gamma distribution here: if \(W\sim Gamma(a,b)\) then the density function is \(p(w)=\frac{b^a}{\Gamma(a)}w^{a-1}\exp(-bw)\).
    2. Set \(\sigma^2=1/\tau^2\) and replace \(\hat{\sigma}^2\) by \(1/\tau^2\) in the formula for \(v^*(x_i',x_j')\).
    3. Compute \(v^*(D',D')\), the \(n' \times n'\) matrix with \(i,j\) element given by \(v^*(x_i',x_j')\).
    4. Generate the random vector \(\{f(x_1'),\ldots,f(x_{n'}')\}^{\rm T}\) from \(N_{n'}\{m^*(D'), v^*(D',D')\}\), the multivariate normal distribution with mean vector \(m^*(D')\) and variance matrix \(v^*(D',D')\).

Multivariate output linear mean and weak prior case

We have a simulator \(f(\cdot)\) that produces a vector \(f(x)\) of \(r\) outputs for a single input value \(x\). To clarify notation, we generate the random vector \(F= \{f(x_1'),\ldots,f(x_{n'}')\}^{\rm T}\), arranged as follows:

\[\begin{split}F=\left(\begin{array}{c} (\mbox{output 1 at input }x'_{1}) \\ \vdots \\ (\mbox{output 1 at input }x'_{n'}) \\ \vdots \mbox{output r at input }x'_{1}) \\ \vdots \\ (\mbox{output r at input }x'_{n'}) \end{array} \right).\end{split}\]
  1. Define

    \[m^*(D')=\{m^*(x_1')^{\rm T},\ldots, m^*(x_{n'}')^{\rm T}\}^{\rm T}.\]

    In the multivariate case this is an \(n'\times r\) matrix with each of the \(r\) columns representing one of the simulator outputs. We arrange this into an \(n' r\times 1\) column vector by performing the \(Vec\) operation:

    \[\mu^*(D')=\mathrm{vec}\{m^*(D')\}.\]
  2. Define \(V\), the \((n'r) \times (n'r)\) matrix with \(i,j\) element defined as the posterior covariance between elements \(i\) and \(j\) of \(F= \{f(x_1'),\ldots,f(x_{n'}')\}^{\rm T}\).

  3. Generate the random vector \(F\) from a multivariate student t distribution with mean vector \(\mu^*(D')\), variance matrix \(V\), and \(n-q\) degrees of freedom.

Multivariate output linear mean, weak prior, and separable covariance function case

Suppose we have a separable covariance function of the form \(\Sigma c(\cdot,\cdot)\), where \(\Sigma\) is the output space covariance matrix, and \(c(\cdot,\cdot)\) is the input space covariance function. Following the notation in ProcBuildMultiOutputGP, we write the posterior covariance function as

\[\textrm{Cov}[f(x),f(x')|f(D)]=\widehat{\Sigma} c^*(x,x')=\widehat\Sigma\,\left\{c(x,x^\prime) - c(x)^{\rm T} A^{-1} c(x^\prime) + R(x) \left( H^{\rm T} A^{-1} H\right)^{-1} R(x^\prime)^{\rm T} \right\},\]

with \(A=c(D,D)\) and \(R(x) = h(x)^{\rm T} - c(x)^{\rm T} A^{-1}H\). The posterior variance matrix \(V\) of \(\{f(x_1'),\ldots,f(x_{n'}')\}^{\rm T}\) can then be written as

\(V=\widehat{\Sigma} \otimes c^*(D',D')\), where \(\otimes\) is the kronecker product. We can now generate \(\{f(x_1'),\ldots,f(x_{n'}')\}^{\rm T}\) using the matrix student \(t\) distribution:

  1. Define \(U_{\Sigma}\) to be the lower triangular square root of \(\widehat{\Sigma}\)

  2. Define \(U_c\) to be the lower triangular square root of \(c^*(D',D')\)

  3. Generate a \(n'\times r\) matrix \(T_{n',r}\) having a multivariate t distribution with uncorrelated elements, in the following three sub-steps.

    1. Generate \(Z_{n',r}\): an \(n'\times r\) matrix of independent standard normal random variables.
    2. Generate \(W\sim Gamma\{(n-q)/2,0.5\}\).
    3. Set \(T_{n',r}=\frac{1}{\sqrt{W/(n-q)}}Z_{n',r}\).

    Note the parameterisation of the gamma distribution here: if \(W\sim Gamma(a,b)\) then the density function is \(p(w)=\frac{b^a}{\Gamma(a)}w^{a-1}\exp(-bw)\).

  1. The random draw from the distribution of \(f(D')\) is given by \(F=U_c T_{n',r} U_\sigma\)

    \(F\) is an \(n'\times r\) matrix, arranged as follows:

    \[\begin{split}F=\left(\begin{array}{ccc}(\mbox{output 1 at input }x'_{1}) & \cdots &(\mbox{output r at input }x'_{1}) \\ \vdots & & \vdots \\ (\mbox{output 1 at input }x'_{n'}) & \cdots & (\mbox{output r at input }x'_{n'}) \end{array}\right).\end{split}\]