Here is a summary of the replies to my question about the uncertainty in
our experimental variograms. I also raised the point that this uncertainty
is low for some depositional (mining) types (e.g. sedimentary) and high for
others (e.g. a high-nugget-effect transgressive gold deposit), and asked
what data would look like for different variograms (a question that also
relates to stationarity and the intrinsic hypothesis often referred to in
books about geostatistics).
Thanks for the answers to Lorenz Dobler, Ruben Roa, Tom Mueller,
Sean McKenna, Jeff Myers, Donald Myers and Julian Ortiz.
Jeff Myers and Julian Ortiz also supplied papers on the subject.
Lorenz Dobler wrote:
Maybe you should have a look at the PhD thesis of William L. Wingle,
"Evaluating subsurface uncertainty using modified geostatistical techniques",
and the manual of his software UNCERT.
Ruben Roa wrote:
This might be relevant to your question:
Lahiri, Lee, and Cressie. 2002. On asymptotic distribution and asymptotic
efficiency of least squares estimators of spatial variogram parameters.
Journal of Statistical Planning and Inference 103:65-85.
Tom Mueller wrote:
In a paper (Mueller, T.G., F.J. Pierce, O. Schabenberger, and D.D. Warncke.
2001. Map quality for site-specific management. Soil Sci. Soc. Am. J. 65:
1547-1558), I suggest that the variance of the semivariogram at each lag may
explain why map quality was poorer than expected in this study. I also
suggest that this variance, particularly at earlier lags may be a good
indicator of map quality. Bottom line is that semivariogram models are
still just field-average models of spatial autocorrelation. Since it's not a
Gaussian distribution (more like a chi-square distribution), it may not be
appropriate to use CVs to describe the semivariogram cloud. Something I've
been looking at recently is Hawkins and Cressie's robust estimator. It
transforms the semivariogram distribution into a normal distribution to
calculate the semivariogram and then rescales it back. I've found RMSE
values are smaller with this approach more often than not with jack-knife
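
As a rough illustration of the robust estimator Tom mentions, here is a
minimal Python sketch comparing the classical (Matheron) estimator with the
commonly quoted form of the Cressie-Hawkins estimator at a single lag. The
function names, the transect and its values are hypothetical and chosen only
to show the effect of one outlier; this is not code from Tom's paper.

```python
import math

def lag_differences(coords, values, h, tol):
    """Differences z_i - z_j for all pairs separated by h +/- tol."""
    diffs = []
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if abs(math.dist(coords[i], coords[j]) - h) <= tol:
                diffs.append(values[i] - values[j])
    return diffs

def matheron(diffs):
    """Classical estimator: half the mean squared difference."""
    return sum(d * d for d in diffs) / (2.0 * len(diffs))

def cressie_hawkins(diffs):
    """Robust estimator: fourth power of the mean square-root absolute
    difference, divided by the usual bias-correction term."""
    n = len(diffs)
    mean_root = sum(math.sqrt(abs(d)) for d in diffs) / n
    return 0.5 * mean_root ** 4 / (0.457 + 0.494 / n)

# Hypothetical transect with unit spacing and one outlier at position 5.
coords = [(float(i),) for i in range(10)]
values = [1.0, 1.2, 0.9, 1.1, 1.0, 9.0, 1.1, 0.9, 1.0, 1.2]
diffs = lag_differences(coords, values, h=1.0, tol=0.5)
print(matheron(diffs), cressie_hawkins(diffs))
```

With these made-up values the single outlier dominates the classical
estimate, while the robust estimate is pulled up much less.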
Sean McKenna wrote:
Digby, the following paper looked at variogram uncertainty as a function of
the data by sequentially removing one data point at a time, recalculating
the variogram and then examining the final distribution of gamma values at
each lag. It may not be exactly what you are looking for, but you may find
Wingle, W.L., and E.P. Poeter, 1993, Uncertainty Associated with
Semivariograms Used for Site Simulation, Ground Water, Vol. 31, No. 5, pp.
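
The leave-one-out idea Sean describes (remove one data point, recompute the
variogram, and look at the distribution of gamma at each lag) can be sketched
in a few lines of Python. This is only a sketch under simple assumptions (an
omnidirectional classical estimator, a fixed lag tolerance, hypothetical data
and function names), not the procedure of the cited paper.

```python
import math

def semivariogram(coords, values, lags, tol):
    """Classical semivariogram at each lag; None where a lag has no pairs."""
    out = []
    for h in lags:
        sq = [(values[i] - values[j]) ** 2
              for i in range(len(coords))
              for j in range(i + 1, len(coords))
              if abs(math.dist(coords[i], coords[j]) - h) <= tol]
        out.append(sum(sq) / (2.0 * len(sq)) if sq else None)
    return out

def leave_one_out(coords, values, lags, tol):
    """For each lag, the gamma values obtained with one point removed."""
    per_lag = [[] for _ in lags]
    for k in range(len(values)):
        c = coords[:k] + coords[k + 1:]
        v = values[:k] + values[k + 1:]
        for idx, g in enumerate(semivariogram(c, v, lags, tol)):
            if g is not None:
                per_lag[idx].append(g)
    return per_lag

# Hypothetical transect with unit spacing.
coords = [(float(i),) for i in range(12)]
values = [2.1, 2.4, 2.0, 2.9, 3.1, 2.7, 3.5, 3.3, 4.0, 3.6, 4.2, 4.1]
lags = [1.0, 2.0, 3.0]
full = semivariogram(coords, values, lags, tol=0.5)
spread = [(min(g), max(g))
          for g in leave_one_out(coords, values, lags, tol=0.5)]
for h, g, (lo, hi) in zip(lags, full, spread):
    print(h, g, lo, hi)
```

The width of the (min, max) interval at each lag gives a quick, data-driven
feel for how sensitive that plotted point is to individual samples.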
Jeff Myers wrote:
This is a bit off the subject, but still related to the variogram questions
you've been asking. I've attached a paper that describes a typecasting of
error approach to select optimum variogram/kriging parameters. It can be
used either to select the best variogram model or to select between
different types of models (inverse distance, kriging, etc.) as long as you
have a cutoff value. With this approach, it doesn't matter how good the
model looks, just how well it performs. Apologies if I've sent this before.
Donald Myers wrote:
You have raised some interesting and provocative questions. Let me add a
couple of observations:
1. The data always constitutes an incomplete picture. While we would like to
believe (or at least we often act as though we would) that there is a unique
choice of a variogram or covariance for any given problem, that is clearly
not the case.
2. Whether you use the ordinary sample variogram or the Cressie-Hawkins
estimator or something else, it is at most an estimator.
Moreover, for a given data set there is not even a unique choice for the
sample variogram. That is, one must make choices about the lag distances and
the tolerances. These always represent a compromise: to best determine the
shape of the variogram model it is desirable to have as many plotted points
as possible but on the other hand one would like each plotted point to be as
"reliable" as possible. These two always work against each other (for a
given data set).
Strictly speaking, all of the various variogram estimators (or
covariance estimators) replace an "ensemble" average with a "spatial"
average; the validity of this depends on the ergodicity assumption, which is
not testable for a given data set.
It is obvious that there is some degree of interdependence between the
plotted values for one lag versus the plotted values for another lag (e.g.,
the same data location is used for multiple lags). There was a paper in Math
Geology several years ago that computed a more "appropriate" number of
pairs for each lag; however, this was dependent on an assumption of
It is pretty common practice to not plot the sample variogram for a
distance that exceeds half the maximum distance across the data location
set. In fact in many cases we model the variogram against an even shorter
distance (and for various reasons, some practical and some theoretical).
3. The kriging equations (whichever form you are using) do not use the
variogram (or covariance) estimator, i.e., its values, but rather use a
model. The kriging equations use directly (and indirectly) only two pieces
of information, the spatial correlation function and the mean (the latter
only indirectly). As we all know, the mean and the variance are not
sufficient to characterize a probability distribution so it is not
surprising that the variogram and the mean do not always do a good or
adequate job for a given data set.
4. Some of the points or questions you raise pertain more to how well the
variogram estimator works for a given data set rather than as a property of
the variogram model.
5. Strictly speaking, the plotted values of the sample variogram (or other
estimator) are averages and hence we should consider comparing them to
averaged values of the model. Several years ago I had a student look at this
and found that it didn't seem to make much difference. It is common practice
to plot the sample variogram value against the center point of the tolerance
interval, but there is no theoretical reason why this always has to be done.
6. Scale or support in the data always plays a role and it certainly can
affect the C.V.
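
The compromise described in point 2 (more plotted points versus more pairs
behind each point) is easy to see numerically. The Python sketch below counts
pairs per lag bin for several choices of the number of lags, keeping only
distances below half the maximum distance as is common practice. The sample
locations are randomly generated and purely hypothetical, and the equal-width
binning is just one possible choice of lags and tolerances.

```python
import math
import random

def pairs_per_lag(coords, n_lags):
    """Count pairs in n_lags equal-width bins up to half the max distance."""
    dists = [math.dist(coords[i], coords[j])
             for i in range(len(coords))
             for j in range(i + 1, len(coords))]
    max_lag = max(dists) / 2.0
    width = max_lag / n_lags
    counts = [0] * n_lags
    for d in dists:
        if d < max_lag:
            # Clamp guards against rounding at the upper edge of the last bin.
            counts[min(int(d / width), n_lags - 1)] += 1
    return counts

# Hypothetical irregular sampling pattern: 40 points in a 100 x 100 area.
rng = random.Random(42)
coords = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(40)]
for n_lags in (4, 8, 16):
    counts = pairs_per_lag(coords, n_lags)
    print(n_lags, min(counts), counts)
```

Doubling the number of lags splits each bin in two, so the total number of
pairs is unchanged while the least-populated bin can only get thinner.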
I agree with you that it would be interesting to be able to associate in
some way particular variogram model types or variogram model parameters with
given types of data sets, i.e., data from particular phenomena. First of
all, I am sure that it has never been done (there was a paper about this
idea at a meeting held at UNESCO in Paris in the spring of 1995; I am not
sure it was published), but I really question whether it is
possible. Part of the problem is that even if you collected examples from
all the published papers, there is no standardization as to how data were
collected or how they were treated.
Donald E. Myers
Julian Ortiz wrote:
You might be interested in a paper we published with Dr. Clayton Deutsch
in Math Geology, regarding the calculation of the uncertainty in the
We proposed two different methods (that ended up giving the same
A simulation approach to evaluate the uncertainty (somehow related to a
and an analytical approach, both based on the multi-Gaussian hypothesis.
Hope this helps.
Ortiz C., J. and C. V. Deutsch. "Calculation of uncertainty in the
variogram", Mathematical Geology,
V.34, N.2, Feb. 2002, pp.169-183.
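
To give a feel for what a simulation approach under a multi-Gaussian
assumption can look like, here is a generic Monte Carlo sketch in Python. It
is not the method of the paper above. It draws Gaussian realizations on a 1D
transect from an assumed exponential covariance (via a hand-written Cholesky
factor), recomputes the sample variogram at one lag for each realization, and
reports the spread. The covariance model, its parameters, the transect length
and all function names are assumptions made for illustration.

```python
import math
import random

def exp_cov(h, sill=1.0, a=5.0):
    """Assumed exponential covariance with practical range a."""
    return sill * math.exp(-3.0 * h / a)

def cholesky(m):
    """Lower-triangular Cholesky factor of a positive-definite matrix."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(m[i][i] - s)
            else:
                low[i][j] = (m[i][j] - s) / low[j][j]
    return low

def simulate_gammas(n_points, lag, n_real, seed=0):
    """Sample semivariogram at one lag for n_real Gaussian realizations."""
    rng = random.Random(seed)
    cov = [[exp_cov(abs(i - j)) for j in range(n_points)]
           for i in range(n_points)]
    low = cholesky(cov)
    gammas = []
    for _ in range(n_real):
        w = [rng.gauss(0.0, 1.0) for _ in range(n_points)]
        # Correlated values z = L w on a unit-spaced transect.
        z = [sum(low[i][k] * w[k] for k in range(i + 1))
             for i in range(n_points)]
        sq = [(z[i] - z[i + lag]) ** 2 for i in range(n_points - lag)]
        gammas.append(sum(sq) / (2.0 * len(sq)))
    return gammas

gammas = simulate_gammas(n_points=30, lag=1, n_real=200, seed=1)
mean = sum(gammas) / len(gammas)
var = sum((g - mean) ** 2 for g in gammas) / (len(gammas) - 1)
print(mean, var, 1.0 - math.exp(-0.6))  # last value: model gamma at lag 1
```

The variance across realizations is one way to put a number on how uncertain
a single plotted variogram point is for a given sampling layout and model.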