## 1941 Re: [ai-geostats] question about kriging with skewed distribution

• Mar 5, 2005
> hello,
> I have a question about what is/should be typically done when kriging is
> used for spatial interpolation of a process X(z) where z gives spatial
> location (e.g. z=(x,y) with cartesian coordinates x,y) and X(z) has a
> skewed continuous distribution with nonnegative support. For instance
> lognormal.

As far as I know, traditional geostatistics, as originated by Matheron, is
distribution-free. The analysis does not require a pre-experimental
probability model for the data, so it does not rely on any
post-experimental likelihood function. For kriging, then, it does not
matter whether the data are skewed or have any other shape. That is the
theory, at least. People may still want to work with symmetrical
distributions because they may not be entirely comfortable with the theory?
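
To make the distribution-free point concrete, here is a minimal sketch of ordinary kriging weights in plain Python. The exponential covariance function, its sill and range, and the helper names (`exp_cov`, `solve`, `ok_weights`, `ok_predict`) are assumptions chosen for illustration, not anything from the original question. The point is that the weights are computed from the site locations and the covariance model alone; the data values, skewed or not, only enter in the final weighted sum.

```python
import math

def exp_cov(h, sill=1.0, corr_range=1.0):
    """Hypothetical exponential covariance model C(h) = sill * exp(-h / range)."""
    return sill * math.exp(-h / corr_range)

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def ok_weights(sites, target, cov=exp_cov):
    """Ordinary kriging weights; note that no data values are needed."""
    n = len(sites)
    # Covariances between sites, bordered by the unbiasedness constraint.
    a = [[cov(math.dist(p, q)) for q in sites] + [1.0] for p in sites]
    a.append([1.0] * n + [0.0])
    b = [cov(math.dist(p, target)) for p in sites] + [1.0]
    return solve(a, b)[:n]  # drop the Lagrange multiplier

def ok_predict(values, weights):
    """The kriging predictor is a weighted sum of the observed values."""
    return sum(w * v for w, v in zip(weights, values))

# Two sites placed symmetrically around the target get equal weights.
weights = ok_weights([(0.0, 0.0), (2.0, 0.0)], (1.0, 0.0))
print(weights, ok_predict([2.0, 4.0], weights))
```

In practice the covariance (or variogram) model is itself estimated from the data, and a heavy-tailed sample can make that estimate unstable, which is one practical reason people still transform.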

> Now,
> if all data are in the form of point samples, X(z)'s can obviously be
> transformed by taking logs to Y(z)=log(X(z)) which are exactly (with
> lognormal X's) or approximately Gaussian, so that kriging can be done
> comfortably (and the result backtransformed with easy correction for the
> fact that E f(X) is generally not equal to f(E X), based on the formula
> for lognormal expected value or Taylor expansion).

Yes, though the data may be only approximately lognormal. If they are exactly
lognormal then the parameter of the Box-Cox transformation is 0, but
values like -0.1 or +0.1 can produce more symmetrical distributions. This
parameter can be estimated along with the spatial correlation function
parameters, letting the data decide which transformation makes them
look most Gaussian. For this you would need to set up a formal statistical
model for the data instead of following the traditional distribution-free
methodology. Check the info on geoR, a contributed package for R.
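
As a rough illustration of the transformation itself, the sketch below implements the Box-Cox family and the lognormal mean correction mentioned in the question. It is plain Python with hypothetical helper names, not geoR code, and it does not attempt the joint likelihood estimation of the parameter with the correlation function; the sample skewness is only a crude stand-in for "looks more Gaussian".

```python
import math

def box_cox(x, lam):
    """Box-Cox transform of a positive value; lam = 0 is the log transform."""
    if x <= 0:
        raise ValueError("Box-Cox requires x > 0")
    if abs(lam) < 1e-12:
        return math.log(x)
    return (x ** lam - 1.0) / lam

def inverse_box_cox(y, lam):
    """Inverse of box_cox for the same lam."""
    if abs(lam) < 1e-12:
        return math.exp(y)
    return (lam * y + 1.0) ** (1.0 / lam)

def skewness(values):
    """Sample skewness (third standardized moment); 0 for a symmetric sample."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def lognormal_mean(mu, sigma2):
    """E[X] when log(X) is Gaussian with mean mu and variance sigma2.

    The naive back-transform exp(mu) is the median of X, not its mean.
    """
    return math.exp(mu + 0.5 * sigma2)

# Compare skewness of a right-skewed sample under a few candidate parameters.
sample = [0.3, 0.5, 0.8, 1.0, 1.4, 2.2, 3.9, 7.5, 15.0]
for lam in (-0.1, 0.0, 0.1, 1.0):
    print(lam, skewness([box_cox(x, lam) for x in sample]))
```

The correction in `lognormal_mean` applies only to the lam = 0 case; for other values of the parameter the back-transformed mean has no such simple closed form and is usually approximated.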

> If at least some data are not point samples, but correspond to
> regional averages, then a problem occurs because: i) a sum
> of lognormals is not lognormal, ii) the log of the sum (or average)
> of lognormals is not normal.

If you don't have raw data but averages, then within the likelihood-based
approach you may want to think of a marginal likelihood model to carry
the uncertainty associated with the averaging over into the final
analysis. I think this is rather complicated. On the other hand, maybe
there is no such problem within the traditional distribution-free school,
because the uncertainty associated with fitting the spatial model
is usually ignored.
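
A small numeric check shows why averaged data do not fit the log-transform route. The log of a regional average is not the average of the point logs, so a logged average cannot be treated as an average of the Gaussian variable Y. The values and function names below are made up for illustration.

```python
import math

def log_of_mean(values):
    """Log of the arithmetic mean, i.e. what you get by logging a regional average."""
    return math.log(sum(values) / len(values))

def mean_of_logs(values):
    """Arithmetic mean of the logs, i.e. the average of Y = log(X) over the region."""
    return sum(math.log(v) for v in values) / len(values)

# Hypothetical point values inside one region.
points = [1.0, 10.0, 100.0]
print(log_of_mean(points), mean_of_logs(points))
```

By Jensen's inequality the first quantity is never smaller than the second, and the gap grows with the spread of the values inside the region, so it cannot be corrected by a single constant.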

[snip the rest for brevity]

Ruben