AI-GEOSTATS: summary: analysis, interpretation, software questions
This is a summary of responses to my long, multipart question.
Thanks a lot to everyone who responded. I really appreciate it.
Some general sources suggested:
(Look under spatial statistics and the geostatistics course)
Practical Geostatistics 2000
JC Davis' book on Statistics in Geology
Mathematical Geology vol 31 No 4 1999 p.375-391
"An Experimental Comparison of Ordinary and Universal Kriging and
Inverse Distance Weighting" by Zimmerman et al.
Below, answers are pasted after the question they refer to. A dashed line
indicates a different person responding.
> Hello. I have several fairly basic questions about geostatistical
> analysis, interpretation and software. I received great advice from this
> list on a methods question about a year ago (now I'm analyzing the data),
> so I thought I'd try the list on these questions. I've floundered through
> the archives, and some introductory books, but I still have some
> questions. I don't think it should be relevant for these questions, but
> just in case: I'm studying the ecology, distribution, dispersal of a
> parasitic plant and the data are on host characteristics, parasite
> prevalence etc. If you can help with even one of these questions (or
> correct any erroneous assumptions I have), I would appreciate hearing. I
> will summarize responses.
> 1) In R (S plus clone), I can calculate trend surfaces and use
> the residuals to calculate detrended variograms or correlograms. In
> Geoeas and Variowin I can calculate directional (setting angles and
> tolerance) or omnidirectional variograms or correlograms.
> How does a detrended variogram compare to an omnidirectional or
> directional variogram? What are the pros and cons of these methods? (the
> variograms from the two methods look quite different, although the
> correlograms look quite similar)
If you have a trend, this shows as a parabolic rise in
the semi-variogram graph. Directional semi-variograms
are for picking up differences in continuity in
different directions AFTER the removal of trend. It is
possible to have a trend (a change in mean value) and
anisotropic continuity in the residuals from the
trend.
As an example, imagine you are looking at the
characteristics of deposition on a river bed. You
would expect a trend because values closer to the
banks will not be the same as those in the main flow
bed. After removing this, you will still have more
rapid changes (variability) across the flow of water
than you would expect in the direction of flow. These
are two different parts of the structure.
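To see why a trend shows up as a parabolic rise, here is a minimal Python sketch (not from any of the respondents). It assumes a regularly spaced 1-D transect and a made-up linear trend with slope 0.5 and no noise; the function name semivariogram_1d is hypothetical. For a pure linear trend the semivariance at lag h works out to 0.5 * (slope * h)^2, and it drops to zero once the trend is subtracted.

```python
def semivariogram_1d(z, max_lag):
    """Empirical semivariance at integer lags on a regularly spaced transect."""
    gamma = {}
    for h in range(1, max_lag + 1):
        sq_diffs = [(z[i + h] - z[i]) ** 2 for i in range(len(z) - h)]
        gamma[h] = sum(sq_diffs) / (2 * len(sq_diffs))
    return gamma

slope = 0.5                                 # hypothetical change in mean per unit distance
z = [slope * x for x in range(20)]          # pure trend, no noise (illustrative data)

gamma = semivariogram_1d(z, max_lag=5)
for h, g in gamma.items():
    # empirical value next to the parabola 0.5 * (slope * h)^2
    print(h, g, 0.5 * (slope * h) ** 2)

# After removing the (here exactly known) trend, nothing is left to model.
residuals = [zi - slope * x for x, zi in enumerate(z)]
residual_gamma = semivariogram_1d(residuals, max_lag=5)
print(residual_gamma)
```

With real data the residuals would of course not be zero; whatever structure remains in them is what the directional semi-variograms are meant to pick up.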
What package do you use for the analysis in R?
MASS by Venables/Ripley or sgeostat?
My guess is you calculate a trend surface as a polynomial of the (x,y)
coordinates. If so, a problem arises in the fitting, since you should expect the
residuals to be correlated. The package nlme is more appropriate since it
fits a model while allowing for correlated errors.
Using a "detrended" variogram (i.e. universal kriging?) or
some other type is a choice
of model. In the latter case you assume that spatially close observations
are similar. The first case assumes that the variable of interest is
influenced by variates that don't have to be similar in a spatial sense,
but whose influence on the variable will be spatially similar.
BUT, one question in case you model the trend surface as a function of the
(x,y) coordinates: is this a valid model assumption for your data? I
mean, does the plant under investigation change its frequency of
occurrence with the coordinates (which could well be the case for a study area
with hills and thus different relative humidity and/or precipitation)?
If the answer is no: do you have a sample of your regressors, such as host
characteristics, for the points to be predicted? If yes, nlme with the nlme
function is again your choice of software because you can then flexibly
model the trend as a function of said regressors.
You can also extract empirical variograms of the residuals (~ a detrended
variogram).
> 2) What exactly is the difference/ pros and cons of correlograms
> versus variograms? My correlograms look like almost pure nugget (except
> perhaps at very small lags), but the detrended variograms have interesting
> shapes - decreasing, upside down U, typical rise with plateau - how do I
> interpret this? Does one give me more information than the other?
There is no difference between correlograms and
semi-variograms, but you have to detrend before you do
EITHER. Correlograms are based on slightly more severe
assumptions than semi-variograms, but both demand the
absence of trend.
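The "no difference" point can be made concrete with the standard relation rho(h) = 1 - gamma(h) / C(0), which holds only when the variance C(0) (the sill) is finite. That is the slightly more severe assumption. A minimal Python sketch follows; the semivariance values and the sill are made up for illustration, and the function name is hypothetical.

```python
def correlogram_from_semivariogram(gamma, sill):
    """Convert semivariances to correlations, assuming a finite sill C(0)."""
    return {h: 1.0 - g / sill for h, g in gamma.items()}

# Illustrative (made-up) detrended semivariances by lag, rising to a sill of 1.0
gamma = {1: 0.4, 2: 0.7, 3: 0.9, 4: 1.0}
sill = 1.0

rho = correlogram_from_semivariogram(gamma, sill)
for h in sorted(rho):
    print(h, gamma[h], round(rho[h], 3))
```

Under that assumption the two plots carry the same information, one being the other flipped and rescaled. If the variance is not well defined (for example, because a trend is still present), the conversion breaks down.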
> 3) Where does kriging fit in? My understanding is that this is for
> prediction (and uses the variogram). I am mainly interested in description
> of patterns at this point.
You are absolutely right. The kriging stage is for
predicting values where you don't have samples. The
trend/semi-variogram/correlogram stage is for
diagnosing patterns and structures.
> 4) How do cross variograms work? I have variables that are
> correlated. Would cross variograms be appropriate?
There are two kinds of cross variograms, called
co-located and non-co-located. If you have multiple
measurements on each sample, you can use both. If your
observations on one variable are at different
locations than the other, you have to use
non-co-located. This is planned for volume 2! The
short answer to your question is yes, it is the best
way to study the value and spatial relationship
between two variables.
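For the co-located case, the usual estimator averages the products of paired differences: gamma_uv(h) = sum((u_i - u_j) * (v_i - v_j)) / (2 * N(h)) over the N(h) pairs falling in each distance bin. Below is a minimal Python sketch of that estimator, not taken from any of the packages mentioned. The function name, the binning rule (bins are half-open, (k*w, (k+1)*w]) and the toy data are all assumptions for illustration.

```python
import math

def cross_semivariogram(coords, u, v, bin_width, max_dist):
    """Co-located cross-semivariogram: one (gamma_uv, pair_count) per distance bin."""
    n_bins = math.ceil(max_dist / bin_width)
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.hypot(coords[i][0] - coords[j][0],
                           coords[i][1] - coords[j][1])
            if d == 0 or d > max_dist:
                continue                          # skip duplicates and far pairs
            k = math.ceil(d / bin_width) - 1      # bin k covers (k*w, (k+1)*w]
            sums[k] += (u[i] - u[j]) * (v[i] - v[j])
            counts[k] += 1
    return [(s / (2 * c) if c else None, c) for s, c in zip(sums, counts)]

# Toy data: four samples on a line, both variables measured at every sample
coords = [(0, 0), (1, 0), (2, 0), (3, 0)]
u = [1, 2, 4, 7]
v = [2, 4, 8, 14]                                 # v = 2 * u, so strongly related

result = cross_semivariogram(coords, u, v, bin_width=1.0, max_dist=3.0)
for k, (g, n) in enumerate(result):
    print("bin", k, "pairs", n, "gamma_uv", g)
```

Passing the same variable for both u and v gives the ordinary semi-variogram of that variable, which is a handy sanity check.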
> 5) How does one determine an appropriate degree polynomial to use
> for trend surface analysis? Recommended reading for a beginner in trend
> surface analysis?
> 6) Software: Is there a way in R to determine how many pairs are
> in each bin? Or to truncate variograms (it prints the entire distance and
> it is my understanding that one shouldn't calculate variograms beyond half
> the maximum distance).
There are some packages for geostatistical analysis using R:
The last one is not (yet?) available at CRAN.
However, some of its functionalities can answer some of your questions.
It's available at:
for snapshots and see some (but not all!) features:
If you use sgeostat simply type the object name at the prompt.
It will give you a table like output of the data. You can also set the
max-distance and the bin-width while calculating the empirical variogram.
In addition, you can even specify directional variograms, just plow
through the manual of sgeostat.
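To make the pair-count and truncation idea concrete outside any particular package, here is a minimal Python sketch of an omnidirectional empirical semi-variogram. It is not sgeostat's implementation. The function name, the bin convention and the toy transect values are assumptions, and the default cutoff of half the maximum pair distance is simply the rule of thumb from the question.

```python
import math
from itertools import combinations

def empirical_variogram(coords, z, bin_width, max_dist=None):
    """Return one (bin_upper_edge, n_pairs, semivariance) row per distance bin."""
    pairs = [(math.hypot(a[0] - b[0], a[1] - b[1]), (za - zb) ** 2)
             for (a, za), (b, zb) in combinations(list(zip(coords, z)), 2)]
    if max_dist is None:
        max_dist = max(d for d, _ in pairs) / 2.0   # half the maximum distance
    n_bins = math.ceil(max_dist / bin_width)
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for d, sq in pairs:
        if 0 < d <= max_dist:
            k = math.ceil(d / bin_width) - 1        # bin k covers (k*w, (k+1)*w]
            sums[k] += sq
            counts[k] += 1
    return [((k + 1) * bin_width, counts[k],
             sums[k] / (2 * counts[k]) if counts[k] else None)
            for k in range(n_bins)]

# Toy transect: nine samples one unit apart, made-up values
coords = [(x, 0) for x in range(9)]
z = [3, 5, 4, 6, 8, 7, 9, 12, 10]

rows = empirical_variogram(coords, z, bin_width=1.0)
for upper, n, g in rows:
    print("lag <=", upper, "pairs", n, "gamma", g)
```

Printing the pair count next to each bin makes it easy to spot lags supported by too few pairs to be trusted.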
> Can SAS do spatial statistics? Is it part of the basic
> package or is there some sort of add on? If you've tried it, do you
> recommend it?
Juliann: Yes, you can do much of this in SAS. You may use MIXED and the
SAS macro GLIMMIX (which calls MIXED but under generalized distributional
assumptions) to model spatially correlated outcomes as functions of
other predictors (including trend). You can use PROC G3D to plot raw data
and residuals in 3-D, and PROC VARIOGRAM and PROC GPLOT to generate
empirical correlogram and semivariance info, and plot it.
I'd start by reading the SAS chapter on performing spatial statistics in
PROC MIXED: Littell, RC, GA Milliken, WW Stroup and RD Wolfinger. 1996.
Spatial Variability. In Littell, RC, GA Milliken, WW Stroup and RD
Wolfinger, SAS System for Mixed Models. Cary, NC: SAS Institute, pp.
303-330. Since you're working with prevalence, you may want to assume a
marginal binary or binomial distribution--which can be modeled using
SAS's GLIMMIX macro (some, I believe, on this macro is in the book I've
referenced--altho I have not read that chapter; more, including primary
references, is located in the GLIMMIX macro code/documentation which
you likely loaded when you loaded STAT (if not, search for it on sas.com)).
Note that you should use the version of GLIMMIX that corresponds to the
version of SAS you are using. The latest version of GLIMMIX,
glmm800.sas, is dated 4/11/00.
> How do you make figures for publications? I like Geoeas and
> Variowin because I can see what they are doing and I can change parameters
> easily, but I have not been able to export figures. I find R frustrating
> because I feel like I can't see what it is doing (I'm a serious
> novice in R), but I can export the graphics. Any suggestions? (I don't have
> funds to buy software and I'm running out of time/patience for learning a
> lot of new software for this particular project).
You may be able to do a
quick export of the graphics by holding down the Alt key and hitting the
Print Screen key. Open up Word or PowerPoint and put the cursor at the
point where you would like to insert the graphic. Click Edit, Paste. You should
see a raster picture of whatever program was open on screen. Use the crop
tool to get rid of junk around the edges.
If you import into PowerPoint, you can "Save As"
the current slide as a .jpg or .gif, as well as a number of other raster
formats. If you can "Select all" graphics in the geostats program, you
may be able to paste the results into PowerPoint or Corel Draw or Adobe
Illustrator and do more editing in these formats.
I cannot afford any commercial software and couldn't get Geoeas running
but have tried Gstat on my Linux machine. The variogram modeling is kind
of interactive and the graphs are readily exported as postscript files.
In addition, Gstat does kriging and cross-validation. The manual that
comes with it covers the material fairly well and provides a cookbook of
is the URL.