## AI-GEOSTATS: Fwd: Re: GEOSTATS: spatial sampling

Message 1 of 1 , Mar 30, 2001
Horatio:

I'm with you 100%. We've had this discussion before, and I saved
the responses. The one below is the one I thought was most comprehensive,
from Philippe Aubry. I'm with him until the last paragraph, where he
says that classical statistical tests require spatial independence--
I disagree. There are implicit assumptions that the sample represents
the population, but as long as the correlation or other statistic is
drawn appropriately from the population, why should spatial autocorrelation
matter? Spatial location is simply an unmeasured attribute or variable
describing the sample unit. There may be many other unmeasured attributes
as well. Who are we to tell scientists that this one -- spatial location --
must be considered in every statistical test, regardless of the question
of interest?

Remember all those examples in introductory statistics with
marbles being drawn out of a hat? What, now we can't do the test unless
we know where in the hat they are? (in other words, if you've lost

If someone has a better explanation/argument, I'd like to hear it.

Yetta

------- Forwarded Message
Date: Mon, 02 Nov 1998 17:00:22 +0100 (MET)
From: paubry@...-lyon1.fr (Philippe Aubry)
Subject: Re: GEOSTATS: spatial sampling
To: ai-geostats@...

Dear colleagues (I apologize for my poor grammar and vocabulary)

In response to Yetta:

>Let me ask this though. I hear this often, that using data that may be
>spatially autocorrelated violates the independence assumption. Maybe I'm
>wrong, but my understanding is that the correlation structure of the
>sample has no bearing on the independence assumption of the sampling.
>The main purpose of the independence assumption is to ensure that the
>sample is representative of the population that it will be used to make
>the only requirement is that the sample be drawn in a random manner or
>according to some design where the inclusion probabilities are known.
>There can be all kinds of dependencies and relationships among the actual
>sample points, but as long as these represent dependencies and
>relationships that are also properties of the population, it's not a
>problem.

-------------------

Simple Random Sampling (SRS) theory applies whatever the underlying
population structure, and therefore the SRS estimator for the variance of
the spatial mean holds, even with spatially autocorrelated populations.
Moreover, the sampling distribution of the spatial mean is approximately
Gaussian for large samples (central limit theorem). Consequently, if
sampling is SRS, calculation of the confidence interval of the spatial mean
is straightforward.
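
A small simulation can illustrate this point. The sketch below is only an
illustration under stated assumptions: the population is a hypothetical 1-D
transect built as a moving average of white noise (so neighbouring units are
strongly autocorrelated), and the sample size, window length and number of
repetitions are arbitrary choices. It compares the average of the usual SRS
variance estimate with the variance of the sample mean actually observed
over repeated draws.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical autocorrelated population: moving average of white noise
# over a window of 50 units (an assumption made for illustration only).
N = 2000
noise = rng.normal(size=N + 49)
population = np.convolve(noise, np.ones(50) / 50, mode="valid")  # length N

n = 50        # sample size
reps = 5000   # number of repeated SRS draws
means = np.empty(reps)
var_hats = np.empty(reps)
for r in range(reps):
    # Simple random sampling without replacement
    sample = rng.choice(population, size=n, replace=False)
    means[r] = sample.mean()
    # Usual SRS estimator of the variance of the mean,
    # with finite population correction (1 - n/N)
    var_hats[r] = (1 - n / N) * sample.var(ddof=1) / n

empirical_var = means.var(ddof=1)                     # observed over the draws
mean_var_hat = var_hats.mean()                        # average SRS estimate
true_var = (1 - n / N) * population.var(ddof=1) / n   # design-based value
ratio = mean_var_hat / empirical_var

print("average SRS variance estimate:", mean_var_hat)
print("empirical variance of the mean:", empirical_var)
print("design-based variance:", true_var)
```

The three printed quantities should agree closely even though the population
is far from spatially independent, because the randomness comes from the
sampling design and not from the population values.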

For every sampling design (defined by the set of all possible samples,
first-order inclusion probabilities and second-order inclusion
probabilities) in which all second-order inclusion probabilities are
greater than zero, the Horvitz-Thompson estimator for the variance of the
mean is unbiased, regardless of the spatial autocorrelation structure.
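
The unbiasedness can be checked exactly on a toy design by enumerating every
possible sample. The sketch below is a minimal illustration, not a general
implementation: the five population values and the unequal sample
probabilities are invented, and the names `ht_mean` and `ht_var_hat` are
hypothetical helpers. The inclusion probabilities are computed from the
design itself, and all second-order probabilities are positive.

```python
from itertools import combinations
import numpy as np

# Invented population values (illustration only)
y = np.array([2.0, 3.0, 5.0, 8.0, 13.0])
N = len(y)

# Design: all samples of size 3, with arbitrary unequal probabilities p(s)
samples = list(combinations(range(N), 3))
weights = np.array([1.0 + sum(s) for s in samples])
p = weights / weights.sum()

# pi2[i, j] = probability that units i and j are both in the sample;
# the diagonal holds the first-order inclusion probabilities.
pi2 = np.zeros((N, N))
for s, ps in zip(samples, p):
    idx = np.array(s)
    pi2[np.ix_(idx, idx)] += ps
pi1 = np.diag(pi2).copy()

def ht_mean(s):
    """Horvitz-Thompson estimator of the population mean."""
    idx = np.array(s)
    return np.sum(y[idx] / pi1[idx]) / N

def ht_var_hat(s):
    """Horvitz-Thompson estimator of the variance of ht_mean."""
    idx = np.array(s)
    z = y[idx] / pi1[idx]
    pij = pi2[np.ix_(idx, idx)]
    delta = pij - np.outer(pi1[idx], pi1[idx])
    return np.sum(delta / pij * np.outer(z, z)) / N**2

estimates = np.array([ht_mean(s) for s in samples])
var_hats = np.array([ht_var_hat(s) for s in samples])

# Exact expectations over the design
expected_mean = np.sum(p * estimates)
true_var = np.sum(p * (estimates - expected_mean) ** 2)
expected_var_hat = np.sum(p * var_hats)

print("population mean:", y.mean(), "expected HT mean:", expected_mean)
print("true variance:", true_var, "expected HT variance estimate:", expected_var_hat)
```

Nothing in the calculation refers to the ordering or location of the units,
which is the sense in which the result does not depend on the
autocorrelation structure.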

Of course, with purposive sampling (non-probabilistic sampling), or even
systematic sampling (many second-order inclusion probabilities equal to
zero), using the SRS estimator for the variance of the mean can lead to an
inclusion bias whose magnitude depends on the underlying spatial
autocorrelation. In such a case, one should turn to model-based inference
since design-based inference is either impossible (purposive sampling) or
problematic (systematic sampling).
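
The systematic case can be illustrated by enumerating every possible
systematic sample of a small population. As before, this is a sketch under
assumptions: the population is a hypothetical moving-average transect, and
the sampling interval k is chosen so that N = n * k exactly. The true
design variance of the systematic sample mean is the variance over the k
possible starting points; it is compared with the average value given by
the SRS formula.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical positively autocorrelated population (illustration only)
N, n = 2000, 50
k = N // n  # sampling interval; each start 0..k-1 gives one possible sample
noise = rng.normal(size=N + 49)
population = np.convolve(noise, np.ones(50) / 50, mode="valid")  # length N

# Row j holds the systematic sample with start j: units j, j+k, j+2k, ...
samples = population.reshape(n, k).T  # shape (k, n)

sys_means = samples.mean(axis=1)
true_var = sys_means.var()  # exact design variance over the k equally likely starts

# SRS variance formula applied to each systematic sample
srs_formula = (1 - n / N) * samples.var(axis=1, ddof=1) / n
mean_srs_formula = srs_formula.mean()

ratio = mean_srs_formula / true_var
print("true variance of the systematic mean:", true_var)
print("average SRS-formula estimate:", mean_srs_formula)
print("ratio:", ratio)
```

The systematic sample mean is still design-unbiased, but with this kind of
positive autocorrelation the SRS formula should overstate its variance, and
by how much depends on the autocorrelation of the population.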

-----------------------

Speaking about "stochastic independence" or "statistical independence" for
the data is nonsense, since the data alone do not give a set of random
variables. Random variables are introduced by a stochastic mechanism.
Drawing an SRS is one way of producing stochastically independent random
variables (the first RV is for all the first values we draw by repeating
the sampling scheme, the second RV is for all the second values and so on
... the nth RV is for all the nth values we draw). Assuming a
superpopulation model (geostatistics uses random functions as
superpopulation models) of which the population is one realization is
another way to introduce stochasticity in order to perform statistical
inference. But now the RVs must be statistically dependent (in the model) if
the data are spatially dependent (in reality), or statistical inference
about the population will be very poor.

------------------------

For classical statistical tests, the data are required to be spatially
independent. Spatial independence (= no spatial autocorrelation, for all
lags = pure nugget) is similar to experimental independence in
biostatistics. With spatially autocorrelated data, it is necessary to take
the spatial dependence into account when assessing the p-value of any
statistic (e.g. the Pearson correlation coefficient between two
regionalized variables).
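
The effect on the p-value can be seen in a model-based simulation. The
sketch below makes several assumptions for illustration: the two variables
are independent realizations of a hypothetical 1-D moving-average process
observed at 100 consecutive locations, `rejection_rate` is a hypothetical
helper, and the critical value 0.1966 is the two-sided 5% threshold for the
Pearson r with 98 degrees of freedom under the classical independence
assumption. It counts how often the classical test declares a significant
correlation between series that are unrelated by construction.

```python
import numpy as np

rng = np.random.default_rng(2)

def rejection_rate(window, reps=2000, n=100, r_crit=0.1966):
    """Share of simulations in which |r| exceeds the classical 5% critical
    value, for two independent series smoothed over `window` units.
    window=1 gives white noise (no autocorrelation)."""
    kernel = np.ones(window) / window
    rejections = 0
    for _ in range(reps):
        x = np.convolve(rng.normal(size=n + window - 1), kernel, mode="valid")
        y = np.convolve(rng.normal(size=n + window - 1), kernel, mode="valid")
        r = np.corrcoef(x, y)[0, 1]
        rejections += int(abs(r) > r_crit)
    return rejections / reps

rate_white = rejection_rate(window=1)    # spatially independent data
rate_smooth = rejection_rate(window=10)  # autocorrelated data

print("rejection rate, independent data:", rate_white)
print("rejection rate, autocorrelated data:", rate_smooth)
```

With independent data the rejection rate should sit near the nominal 5%;
with autocorrelated data it should be much higher. This concerns inference
under a superpopulation model, where the two series are treated as
realizations of random functions, and not the design-based argument made
earlier.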

Hope this helps

Best regards

Philippe AUBRY

-----------------------------------------
Laboratoire de Biometrie
UMR CNRS 5558
Universite Claude Bernard - Lyon 1
43 bd. du 11 Novembre 1918
69622 VILLEURBANNE Cedex
FRANCE
-----------------------------------------
private fax number : 04.72.74.47.46
-----------------------------------------
e-mail : paubry@...-lyon1.fr
-----------------------------------------

--
*To post a message to the list, send it to ai-geostats@....
*As a general service to list users, please remember to post a summary
of any useful responses to your questions.
*To unsubscribe, send email to majordomo@... with no subject and
"unsubscribe ai-geostats" in the message body.
DO NOT SEND Subscribe/Unsubscribe requests to the list!
------- End of Forwarded Message

-------------------------------------------------------
Yetta Jager
Environmental Sciences Division
Oak Ridge National Laboratory
P.O. Box 2008, MS 6036
Oak Ridge, TN 37831-6036
U.S.A.

OFFICE: 865/574-8143
FAX: 865/576-8543
Work email: jagerhi@...
Home email: hjager@...
WEBpage: http://www.esd.ornl.gov/~zij/

******************************************************************
There are three kinds of lies: lies, damned lies, and statistics
Mark Twain
******************************************************************
-----------------------------------------------------

--
* Support to the list is provided at http://www.ai-geostats.org