.. _DiscKarhunenLoeveExpansion: Discussion: the Karhunen-Loeve expansion for Gaussian processes =============================================================== Description and Background -------------------------- Let :math:`\{Z(x)\}` be a second order stochastic process with zero mean and continuous covariance function defined on an interval :math:`[a,b]` : .. math:: R(s,t)=Cov(Z(s),Z(t))=E(Z(s)Z(t)). The Karhunen-Loeve (K-L) expansion is a commonly used result of Mercer's theorem and Reproducing Kernel Hilbert Space (RKHS). Using K-L expansion of the covariance function :math:`R(s,t)` is :math:`R(s,t)=\sum_{i=0}^{\infty} \lambda_i\phi_i(s)\phi_i(t) \label{kernel}` where the :math:`\{\lambda_i,i\in N\}` and :math:`\{\phi_i,i\in N\}` are respectively the nonzero eigenvalues and the eigenfunctions of the following integral equation known as "Fredholm integral equation of the second kind". .. math:: \int_a^b R(s,t)\phi_i(t)dt=\lambda_i \phi_i(s). Theory shows that the K-L expansion of :math:`R(s,t)` converges uniformly in both variables :math:`s` and :math:`t`. The stochastic process :math:`Z(x)` itself can then be expressed as .. math:: Z(x) = \sum_{i=0}^{\infty} \lambda_i \zeta_i \phi_i(x) Where we assume that the :math:`\phi_i` orthonormal functions with respect to uniform measure on :math:`[a,b]` and the :math:`\zeta_i` are independent, zero mean, unit variance Gaussian random variables. Discussion: Karhunen-Loeve for the spatial (emulator) case ---------------------------------------------------------- For the spatial case, a given covariance function between :math:`s,t` represented as :math:`R(s,t)=Cov [Z(s_1,s_2, \cdots, s_d),Z(t_1,t_2, \cdots, t_d)]` can be approximated using the truncated K-L expansion (see below). As in the one-dimensional case, a set of eigenfunctions can be used to represent the covariance function and the process. The basic idea, for use in emulation and optimal design, is that the behaviour of the process will be captured by the first :math:`p` eigenvalues and eigenfunctions. In the case that :math:`R(s,t)` is separable let the eigenvalues and eigenfunctions of the correlation function in the :math:`k`-th dimension be :math:`\{\lambda_k^{(i)}\}` and :math:`\{\phi_k^{(i)}(x_k)\}`. Then .. math:: Z(x)=Z(x_1,x_2,\cdots, x_d)=\sum_i\sqrt{\lambda_i}\zeta_i\phi_i(x_1,x_2,\cdots, x_d) where .. math:: \lambda_i=\Pi_{k=1}^{d}\lambda_k^{(i_k)},\;\; \phi_i(x_1,x_2,\cdots, x_d)=\Pi_{k=1}^{d}\phi_k^{(i_k)}(x_k),\; i=(i_1, \cdots, i_d) and the sum is over all such :math:`i`. That is to say we take all products of all one dimensional expansions. For the proposed application to the spatial sampling represented by the GP emulator we replace the problem of estimating the covariance parameters by converting the problem (approximately) to a standard Bayes optimal design problem by truncating the K-L expansion: .. math:: R(s,t)=\sum_{i=0}^{p} \lambda_i \phi_i(s) \phi_i(t) In its simplest form take the same :math:`\lambda` to be the (prior) variances for a Bayesian random regression obtained by the equivalent truncation of the process itself: .. math:: Z(x)=\sum_{i=1}^{p}\sqrt{\lambda_i}\zeta_i \phi(x) Theory shows that the K-L expansion truncated to depth :math:`p` has the minimum integrated mean squared approximation to the full process, among all depth :math:`p` orthogonal expansions. This reduces the problem to finding the first :math:`p` eigenvalues and eigenfunctions of the Fredholm equation. The truncation point :math:`p` should depend on the required level of accuracy in the reconstruction of the covariance function. Because the reconstructed covariance function will typically be singular if the number of design points :math:`n` is greater than :math:`p` one should add a :ref:`nugget` term in this case. One way to interpret this is that all the higher frequency terms hidden in the original covariance function are absorbed into the nugget. To summarise, the original output process in matrix terms, :math:`Y=X\beta + Z`, where :math:`Z` is from the Gaussian process, is replaced by :math:`Y = X \beta + \Phi \gamma + \varepsilon`, where :math:`\Phi` is the design matrix from the :math:`\phi_1, \cdots, \phi_p` and :math:`\varepsilon \sim N(0, \sigma^2)`, with small :math:`\sigma`, is the nugget term. The default covariance matrix for the :math:`\gamma` parameters is to take the diagonal matrix with entries :math:`\lambda_1,\cdots, \lambda_p`. When the approximation is satisfactory and with a suitable nugget attached the problem is reduced to a Bayesian random regression optimal design method as in :ref:`AltOptimalCriteria`, with an appropriate criterion. The principle at work is that the :math:`\phi_j` for larger :math:`j` express the higher frequency components of the process and conversely the smaller :math:`j` the lower frequency components. It is expected that this will be a useful platform for a fully adaptive version of :ref:`ProcASCM`, where the variances attached to the :math:`\gamma` parameters should adapt to the smoothness of the process. Finding the eigenvalues and the eigenfunctions needs numerical solution of the eigen-equations (Fredholm integral equation of the second kind). Analytical solutions only exist for some a limited range of processes. Fast numerical solutions are described in the alternatives page :ref:`AltNumericalSolutionForKarhunenLoeveExpansion`. Additional Comments, References, and Links ------------------------------------------ Although the methods to approximate the K-L expansion, particularly the Haar method, are fast, the number of basis functions in the K-L expansion in :math:`d` dimensions, when the product form described above is taken, can be large: :math:`p^d` if we approximate to order :math:`p` in every dimension. To partly avoid this blow up one can take, not every product basis function, but limit the "degree" of the approximations across all dimensions: e.g. :math:`0 \leq \sum_{j=1}^d p_j \leq p`. This is the analogue of taking a polynomial basis up to a particular "total degree" as in a quadratic response surface. An alternative method is to order the products of the eigenvalues and truncate by the value. Another point to note is that for very smooth function :math:`p` need not be large because the :math:`\lambda_i` decline rapidly. Values of :math:`p` from 3 to 5 are effective. S. P. Huang, S. T. Quek and K. K. Phoon: Convergence study of the truncated Karhunen-Loeve expansion for simulation of stochastic processes, *Intl J. for Numerical Methods in Engineering*, 52:1029-1043, 2001.