Procedure: Predicting a function of multiple outputs

Description and Background

When we have a multiple output simulator, we will sometimes be interested in a deterministic function of one or more of the outputs. Examples include:

  • A simulator with outputs that represent amounts of rainfall at a number of locations, and we wish to predict the total rainfall over a region, which is the sum of the rainfall outputs at the various locations (or perhaps a weighted sum if the locations represent subregions of different sizes).
  • A simulator that outputs the probability of a natural disaster and its consequence in loss of lives, and we are interested in the expected loss of life, which is the product of these two outputs.
  • A simulator that outputs atmospheric \({\rm CO}_2\) concentration and global temperature, and we are interested in using them to compute the gross primary productivity of an ecosystem.
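As a rough illustration, the first two examples can be written as ordinary functions of an output vector. This is only a sketch: the function names and the `weights` argument are hypothetical and chosen here for illustration, and the third example is omitted because its formula is not given in this page.

```python
import numpy as np

def total_rainfall(outputs, weights=None):
    """Weighted sum of the rainfall outputs (plain sum if no weights are given)."""
    outputs = np.asarray(outputs, dtype=float)
    if weights is None:
        # Equal weights reproduce the simple sum over locations.
        weights = np.ones_like(outputs)
    return float(np.dot(weights, outputs))

def expected_loss_of_life(outputs):
    """Product of the two outputs: probability of disaster and its consequence."""
    probability, consequence = outputs
    return float(probability * consequence)
```

Either function could play the role of the function of interest when it is applied to the outputs at a single input point.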

If we know that we are only interested in one particular function of the outputs, then the most efficient emulation method is to build a single-output emulator for the output of that function. However, there are situations in which it is better to first build a multivariate emulator of the raw outputs of the simulator:

  • when we are interested in both the raw outputs and one or more functions of the outputs;
  • when we are interested in function(s) that depend not just on the raw outputs of the simulator, but also on some other auxiliary variables.

In such situations we build the multivariate emulator by following ThreadVariantMultipleOutputs. The multivariate emulator can then be used to predict any function of the outputs, at any set of auxiliary variable values, by following the procedure given here.

We consider a simulator \(f(\cdot)\) that has \(r\) outputs, and a function of the outputs \(g(\cdot)\). The procedure for predicting \(g(f(\cdot))\) is based on generating random samples of output values from the emulator, using the procedure ProcOutputSample.

Inputs

  • A multivariate emulator, which is either a multivariate GP obtained using the procedure ProcBuildMultiOutputGP, or a multivariate t-process obtained using the procedure ProcBuildMultiOutputGPSep, conditional on hyperparameters.
  • \(s\) sets of hyperparameter values.
  • A single point \(x^\prime\) or a set of \(n^\prime\) points \(x^\prime_1, x^\prime_2,\ldots,x^\prime_{n^\prime}\) at which predictions are required for the function \(g(\cdot)\).
  • \(N\), the size of the random sample to be generated.

Outputs

  • Predictions of \(g(\cdot)\) at \(x^\prime_1, x^\prime_2,\ldots,x^\prime_{n^\prime}\) in the form of a sample of size \(N\) of values from the predictive distribution of \(g(\cdot)\).

Procedure

For \(j=1,...,N\),

  1. Pick a set of hyperparameter values at random from the \(s\) sets that are available.
  2. Generate an \(n^\prime r \times 1\) random vector \(F^{j}\) from the emulator, using the procedure set out in the ‘Multivariate output general case’ section of ProcOutputSample.
  3. Form the \(r \times n^\prime\) matrix \(M^{j}\) such that \(\mathrm{vec}[M^{jT}]=F^{j}\).
  4. For \(\ell=1,...,n^\prime\), let \(m^{j}_\ell\) be the \(\ell\)th column of \(M^{j}\) and let \(G^{j}_\ell=g(m^{j}_\ell)\).

The sample is then \(\{G^{j} : j=1,...,N\}\), where \(G^{j}=(G^{j}_1,...,G^{j}_{n^\prime})\).
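The loop above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: `toy_draw_F` is a hypothetical stand-in for ProcOutputSample that draws from a multivariate normal with a Kronecker-structured covariance, the hyperparameter sets are invented toy values, and \(g\) is assumed to return a scalar. The reshape relies on \(\mathrm{vec}[M^{jT}]=F^{j}\), which means \(F^{j}\) lists the \(n^\prime\) values of output 1, then those of output 2, and so on.

```python
import numpy as np

def sample_function_of_outputs(draw_F, hyper_sets, g, r, n_prime, N, rng):
    """Return an (N, n_prime) array whose row j holds G^j."""
    G = np.empty((N, n_prime))
    for j in range(N):
        # Step 1: pick one of the s hyperparameter sets uniformly at random.
        theta = hyper_sets[rng.integers(len(hyper_sets))]
        # Step 2: draw the (n' r)-vector F^j (stand-in for ProcOutputSample).
        F = np.asarray(draw_F(theta, rng), dtype=float)
        # Step 3: form the r x n' matrix M^j with vec(M^{jT}) = F^j.
        # Row i of M^j is output i at all n' points.
        M = F.reshape(r, n_prime)
        # Step 4: apply g to each column, i.e. to the r outputs at one point.
        for l in range(n_prime):
            G[j, l] = g(M[:, l])
    return G

def toy_draw_F(theta, rng):
    """Hypothetical sampler: normal draw with covariance Sigma (x) C."""
    mean, Sigma, C = theta
    return rng.multivariate_normal(mean, np.kron(Sigma, C))

# Toy setting (all values are assumptions): r = 2 outputs, n' = 3 points.
r, n_prime, N = 2, 3, 500
C = 0.5 * np.eye(n_prime) + 0.5 * np.ones((n_prime, n_prime))
mean = np.array([0.1, 0.2, 0.3, 100.0, 200.0, 300.0])
hyper_sets = [
    (mean, np.diag([1e-4, 4.0]), C),
    (mean, np.diag([2e-4, 9.0]), C),
]
rng = np.random.default_rng(0)

# g is the product of the two outputs, as in the expected loss of life example.
G = sample_function_of_outputs(
    toy_draw_F, hyper_sets, lambda m: m[0] * m[1], r, n_prime, N, rng
)
print(G.mean(axis=0))  # sample means of g at the three prediction points
```

Each row of `G` is one draw \(G^{j}\), so summaries of the predictive distribution of \(g(\cdot)\), such as means, variances or quantiles, can be computed column by column.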