Discussion: Uncertainty analysis

In uncertainty analysis, we wish to quantify uncertainty about simulator outputs due to uncertainty about simulator inputs. We define \(X\) to be the uncertain true inputs, and \(f(X)\) to be the corresponding simulator output(s). In the emulator framework, \(f(.)\) is also treated as an uncertain function, and it is important to consider both uncertainty in \(X\) and uncertainty in \(f(.)\) when investigating uncertainty about \(f(X)\). In particular, it is important to distinguish between the unconditional distribution of \(f(X)\), and the distribution of \(f(X)\) conditional on \(f(.)\). For example:

  1. \(\textrm{E}[f(X)]\) is the expected value of \(f(X)\), where the expectation is taken with respect to both \(f(.)\) and \(X\). The value of this expectation can, in principle, be obtained for any emulator and input distribution.
  2. \(\textrm{E}[f(X)|f(.)]\) is the expected value of \(f(X)\), where the expectation is taken with respect to \(X\) only, as \(f(.)\) is given. If \(f(.)\) is a computationally cheap function, we could, for example, obtain the value of this expectation using Monte Carlo, up to an arbitrary level of precision. However, when \(f(.)\) is computationally expensive, such that we require an emulator for \(f(.)\), this expectation is an uncertain quantity. We are uncertain about the value of \(\textrm{E}[f(X)|f(.)]\) because we are uncertain about \(f(.)\).
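
The cheap-simulator case in item 2 can be sketched as follows. This is a minimal illustration only: the function `f`, the helper `mc_expectation` and the choice \(X \sim \textrm{Uniform}(0, 1)\) are all assumptions made for the example, not part of the discussion above.

```python
import numpy as np

def f(x):
    # Hypothetical cheap simulator, chosen purely for illustration.
    return np.sin(3.0 * x) + x**2

def mc_expectation(func, n_samples, rng):
    # Assumed input distribution: X ~ Uniform(0, 1).
    x = rng.uniform(0.0, 1.0, size=n_samples)
    y = func(x)
    # Sample mean of f(X) and its Monte Carlo standard error.
    return y.mean(), y.std(ddof=1) / np.sqrt(n_samples)

rng = np.random.default_rng(0)
for n in (100, 10_000, 1_000_000):
    est, se = mc_expectation(f, n, rng)
    print(f"n={n:>9d}  estimate={est:.4f}  std.err={se:.4f}")
```

Because `f` is known and cheap here, the only error is Monte Carlo error, which shrinks as the number of samples grows. There is no uncertainty about \(f(.)\) itself.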

There is no sense in which \(\textrm{E}[f(X)]\) can be ‘wrong’: it is simply a probability statement resulting from a choice of emulator (good or bad) and input distribution. But an estimate of \(\textrm{E}[f(X)|f(.)]\) obtained using an emulator can be poor if we have a poor emulator (in the validation sense) for \(f(.)\). Even with a valid emulator, we may be very uncertain about \(\textrm{E}[f(X)|f(.)]\) if we do not have sufficient training data for the emulator of \(f(.)\). Hence, in practice, the distinction matters when judging whether we have enough simulator runs for our analysis of interest.
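
The effect of training data on uncertainty about \(\textrm{E}[f(X)|f(.)]\) can be illustrated with a small sketch. Everything here is an assumption made for the example: a zero-mean Gaussian process emulator with a squared-exponential covariance and fixed hyperparameters, a hypothetical `simulator` function, \(X \sim \textrm{Uniform}(0, 1)\), and a midpoint grid in place of Monte Carlo sampling of \(X\). Under these assumptions the grid average of \(f\) is a linear function of \(f(.)\), so its distribution under the emulator is Gaussian, and its mean approximates \(\textrm{E}[f(X)]\).

```python
import numpy as np

def simulator(x):
    # Hypothetical stand-in for an expensive simulator.
    return np.sin(3.0 * x) + x**2

def sq_exp_kernel(a, b, length=0.3, var=1.0):
    # Squared-exponential covariance; hyperparameters are assumed, not estimated.
    d = a[:, None] - b[None, :]
    return var * np.exp(-0.5 * (d / length) ** 2)

def emulator_mean_distribution(n_train, n_grid=200, jitter=1e-6):
    # Training runs at equally spaced inputs (an assumed design).
    x_train = np.linspace(0.0, 1.0, n_train)
    y_train = simulator(x_train)
    # Midpoint grid with equal weights, approximating X ~ Uniform(0, 1).
    x_grid = (np.arange(n_grid) + 0.5) / n_grid
    w = np.full(n_grid, 1.0 / n_grid)
    # Standard zero-mean GP conditioning on the training runs.
    K = sq_exp_kernel(x_train, x_train) + jitter * np.eye(n_train)
    Ks = sq_exp_kernel(x_grid, x_train)
    Kss = sq_exp_kernel(x_grid, x_grid)
    post_mean = Ks @ np.linalg.solve(K, y_train)
    post_cov = Kss - Ks @ np.linalg.solve(K, Ks.T)
    # Mean and standard deviation of the weighted average of f over the grid.
    m = w @ post_mean
    v = max(w @ post_cov @ w, 0.0)  # guard against tiny negative round-off
    return m, np.sqrt(v)

for n in (3, 5, 8):
    m, s = emulator_mean_distribution(n)
    print(f"runs={n}  mean={m:.4f}  sd={s:.5f}")
```

The reported mean plays the role of \(\textrm{E}[f(X)]\), while the standard deviation describes the remaining uncertainty about \(\textrm{E}[f(X)|f(.)]\) due to not knowing \(f(.)\). In this sketch the standard deviation decreases as simulator runs are added, which is one way to judge whether more runs are needed.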