GEOSTATS: Cross-validations: summary
Greetings again,
here is a summary of the answers I received to my question about papers on
Cross-validation techniques in spatial statistics.
Many thanks to Frank Hardisty, Dan Cornford, Tom Nolan, Maciej Tomczak and
Ana F. Militino.
Papers about Cross-validations:
1) Owosina, A., U. Lall, T. Sangoyomi, and K. Bosworth, Methods for Assessing
the Space and Time Variability of Groundwater Data, Utah Water Res. Lab., Utah
State Univ., 1992.
Owosina et al. compare two multivariate kernel regression estimators,
MARS, LOESS, TPSS and Kriging for reconstructing spatial surfaces from a
variety of irregularly sampled synthetic (with varying signal to noise ratios)
and ground water data sets. Model parameters were chosen automatically using
cross validatory measures in all cases. In terms of RMSE and Mean Absolute
Deviation, overall algorithm ordering (best to worst) across the data sets was
TPSS, LOESS, KERNEL, MARS, KRIGING. The differences between the best and worst
algorithm were dramatic in some cases. Methods for interpolating ground water
data irregularly sampled in space and time were also illustrated.
(Found at http://earth.agu.org/revgeophys/lall01/node7.html)
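To make "model parameters chosen automatically using cross-validatory measures"
concrete, here is a minimal sketch of leave-one-out cross-validation. It is not
the procedure of Owosina et al.: inverse-distance weighting stands in for the
interpolators they compared, and the sample data, the candidate exponents and
all function names are assumptions made up for illustration.

```python
import math

def idw_predict(points, values, target, power):
    """Inverse-distance-weighted estimate at `target` from (x, y) samples."""
    num = 0.0
    den = 0.0
    for (x, y), v in zip(points, values):
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return v  # target coincides with a sample location
        w = 1.0 / d ** power
        num += w * v
        den += w
    return num / den

def loo_errors(points, values, power):
    """Leave-one-out errors: predict each sample from all the others."""
    errors = []
    for i in range(len(points)):
        rest_pts = points[:i] + points[i + 1:]
        rest_vals = values[:i] + values[i + 1:]
        estimate = idw_predict(rest_pts, rest_vals, points[i], power)
        errors.append(estimate - values[i])
    return errors

def rmse(errors):
    """Root mean square error."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def mad(errors):
    """Mean absolute deviation of the errors."""
    return sum(abs(e) for e in errors) / len(errors)

def choose_power(points, values, candidates):
    """Pick the candidate exponent with the smallest leave-one-out RMSE."""
    return min(candidates, key=lambda p: rmse(loo_errors(points, values, p)))

# Hypothetical irregularly spaced samples (made up for this sketch).
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5), (2.0, 1.0)]
values = [1.0, 2.0, 2.0, 3.0, 2.0, 4.0]
candidates = [0.5, 1.0, 2.0, 4.0]

best = choose_power(points, values, candidates)
best_errors = loo_errors(points, values, best)
print("chosen power:", best)
print("LOO RMSE:", rmse(best_errors), "LOO MAD:", mad(best_errors))
```

The same loop works for any interpolator: replace idw_predict with the method
of interest and the exponent with whatever parameter is being tuned.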
2) Davis, B.M., 1987, "Uses and abuses of cross-validation in geostatistics,"
Mathematical Geology, v.19, n.3, p. 241-248.
It discusses some common misconceptions concerning cross-validation. For
example, use of statistical criteria supposedly yields an optimal
semivariogram from among competing models. But Davis states that the
semivariogram is only "best" with respect to "choice of discrepancy measure,
partition set size, predictive function, and number of models to be
3) Isaaks & Srivastava (1989, Applied Geostatistics) discuss the use of
cross-validation on pages 533 and 534. A more applied discussion of CV can be
found on pages 352-368.
4) Ana F. Militino (militino@...) and Lola Ugarte have recently
submitted a closely related paper titled "Assessing the covariance function in
geostatistics". The proposed method improves on the traditional use of
cross-validation in this field when the data are unequally spaced.
Comments on the error measures that should be minimised:
The choice of error measure depends on the application and the data.
If the data are Gaussian, the standard sum of squares (root mean square
error) is probably a good measure to use, but the right choice ultimately
depends on the cost function, which is derived from what one wants to achieve.
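As a small illustration of why the choice of measure matters, the sketch below
scores two sets of cross-validation residuals with both the root mean square
error and the mean absolute error. The residuals and the names model_a and
model_b are invented for the example; they do not come from any of the papers
above.

```python
import math

def root_mean_square_error(residuals):
    """Square root of the mean squared residual; weights large misses heavily."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))

def mean_absolute_error(residuals):
    """Mean of the absolute residuals; every unit of error counts equally."""
    return sum(abs(r) for r in residuals) / len(residuals)

# Hypothetical cross-validation residuals (estimate minus observation).
model_a = [1.0, -1.0, 1.0, -1.0]  # small errors everywhere
model_b = [0.0, 0.0, 0.0, 4.0]    # exact almost everywhere, one large miss

for name, residuals in (("model_a", model_a), ("model_b", model_b)):
    print(name,
          "RMSE =", root_mean_square_error(residuals),
          "MAE =", mean_absolute_error(residuals))
```

Both sets have the same mean absolute error (1.0), but the RMSE of model_b is
twice that of model_a (2.0 against 1.0). Whether the single large miss should
be penalised that heavily is exactly the question the cost function has to
answer.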
I also received a few comments on the use of prior knowledge in defining the
parameters to be used, but further discussion of that is outside the scope of
this summary (though of course welcome on the mailing list).
Thank you again very much for the kind help.
Section of Earth Sciences
Institute of Mineralogy and Petrography
University of Lausanne
Currently detached in Italy