
Re: GEOSTATS: spatial sampling

  • Philippe Aubry
    Message 1 of 5 , Nov 2, 1998
      Dear colleagues (I apologize for my poor grammar and vocabulary)

      In response to Yetta:

      >Let me ask this though. I hear this often, that using data that may be
      >spatially autocorrelated violates the independence assumption. Maybe I'm
      >wrong, but my understanding is that the correlation structure of the
      >sample has no bearing on the independence assumption of the sampling.
      >The main purpose of the independence assumption is to ensure that the
      >sample is representative of the population that it will be used to make
      >inferences about. Therefore, the only requirement is that the sample be
      >drawn in a random manner or according to some design where the inclusion
      >probabilities are known. There can be all kinds of dependencies and
      >relationships among the actual sample points, but as long as these
      >represent dependencies and relationships that are also properties of the
      >population, its not a problem.


      Simple random sampling (SRS) theory applies whatever the underlying
      population structure; therefore, the SRS estimator of the variance of the
      spatial mean holds even for spatially autocorrelated populations. Moreover,
      for a sufficiently large sample, the sampling distribution of the spatial
      mean is approximately Gaussian (central limit theorem). Consequently, if
      the sampling is SRS, calculating a confidence interval for the spatial
      mean is straightforward.
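
      As a rough illustration of this point, the sketch below enumerates every
      possible SRS (without replacement) from a small, strongly autocorrelated
      population and compares the exact variance of the sample mean with the
      average of the usual SRS variance estimator. The population values, the
      sample size and the function names are hypothetical, chosen only for the
      example.

```python
from itertools import combinations
from statistics import mean, variance

# Hypothetical population: 12 values along a transect, strongly autocorrelated
population = [1.0, 1.2, 1.5, 2.1, 2.8, 3.6, 4.1, 4.3, 4.2, 3.9, 3.1, 2.4]
N = len(population)
n = 3  # sample size (assumed for the example)


def srs_variance_of_mean(sample, pop_size):
    """SRS (without replacement) estimator of the variance of the sample mean."""
    size = len(sample)
    return (1 - size / pop_size) * variance(sample) / size


def confidence_interval(sample, pop_size, z=1.96):
    """Gaussian interval for the mean; z = 1.96 presumes a reasonably large sample."""
    half_width = z * srs_variance_of_mean(sample, pop_size) ** 0.5
    centre = mean(sample)
    return centre - half_width, centre + half_width


# Under SRS every subset of size n is equally likely, so enumerate them all.
samples = list(combinations(population, n))
sample_means = [mean(s) for s in samples]
pop_mean = mean(population)

# Exact design variance of the sample mean vs. design expectation of its estimator
true_variance = mean([(m - pop_mean) ** 2 for m in sample_means])
expected_estimate = mean([srs_variance_of_mean(s, N) for s in samples])

print(true_variance, expected_estimate)  # the two agree up to rounding error
```

      The agreement does not depend on the ordering of the values along the
      transect, which is the sense in which the spatial structure is irrelevant
      under SRS.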

      For every sampling design (defined by the set of all possible samples,
      together with the first-order and second-order inclusion probabilities)
      whose second-order inclusion probabilities are all greater than zero,
      the Horvitz-Thompson estimator of the variance of the mean is unbiased,
      regardless of the spatial autocorrelation structure.
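
      A minimal sketch of this property, assuming a toy design of my own
      choosing: four units, every pair of units a possible sample, with unequal
      selection probabilities so that all second-order inclusion probabilities
      are positive. The values, the design and the helper names are
      hypothetical. Averaging over the whole design shows the Horvitz-Thompson
      mean and its variance estimator hitting their targets.

```python
# Hypothetical population values (indexed 0..3)
y = [2.0, 3.0, 7.0, 8.0]
N = len(y)

# Hypothetical design: each possible sample (a pair of unit indices) and its probability
design = {(0, 1): 0.10, (0, 2): 0.15, (0, 3): 0.20,
          (1, 2): 0.25, (1, 3): 0.10, (2, 3): 0.20}


def pi1(i):
    """First-order inclusion probability of unit i."""
    return sum(p for s, p in design.items() if i in s)


def pi2(i, j):
    """Second-order inclusion probability of units i and j (pi2(i, i) = pi1(i))."""
    if i == j:
        return pi1(i)
    return sum(p for s, p in design.items() if i in s and j in s)


def ht_mean(sample):
    """Horvitz-Thompson estimator of the population mean."""
    return sum(y[i] / pi1(i) for i in sample) / N


def ht_variance_estimate(sample):
    """Horvitz-Thompson estimator of the variance of ht_mean."""
    total = 0.0
    for i in sample:
        for j in sample:
            weight = (pi2(i, j) - pi1(i) * pi1(j)) / pi2(i, j)
            total += weight * (y[i] / pi1(i)) * (y[j] / pi1(j))
    return total / N ** 2


pop_mean = sum(y) / N
expected_mean = sum(p * ht_mean(s) for s, p in design.items())
true_variance = sum(p * (ht_mean(s) - pop_mean) ** 2 for s, p in design.items())
expected_variance_estimate = sum(p * ht_variance_estimate(s) for s, p in design.items())

print(expected_mean, pop_mean)                    # equal up to rounding error
print(true_variance, expected_variance_estimate)  # equal up to rounding error
```

      Setting any pair probability to zero in this toy design makes pi2 vanish
      for that pair, and the variance estimator is then no longer unbiased,
      which is the condition stated above.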

      Of course, with purposive (non-probabilistic) sampling, or even
      systematic sampling (where many second-order inclusion probabilities are
      equal to zero), using the SRS estimator of the variance of the mean can
      lead to an inclusion bias whose magnitude depends on the underlying
      spatial autocorrelation. In such cases, one should turn to model-based
      inference, since design-based inference is either impossible (purposive
      sampling) or problematic (systematic sampling).
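
      To make the systematic-sampling case concrete, here is a small sketch on
      a hypothetical autocorrelated transect (values and sampling interval are
      assumptions for the example). Only four systematic samples are possible,
      so the true variance of the sample mean can be computed exactly and
      compared with what the SRS formula would report on average.

```python
from statistics import mean, variance

# Hypothetical population: 12 values along a transect, strongly autocorrelated
population = [1.0, 1.2, 1.5, 2.1, 2.8, 3.6, 4.1, 4.3, 4.2, 3.9, 3.1, 2.4]
N = len(population)
k = 4        # sampling interval (assumed)
n = N // k   # size of each systematic sample

# The only possible systematic samples: one per random start 0..k-1
samples = [population[start::k] for start in range(k)]
sample_means = [mean(s) for s in samples]
pop_mean = mean(population)

# Exact design variance of the systematic sample mean
true_variance = mean([(m - pop_mean) ** 2 for m in sample_means])

# What the SRS variance formula reports, averaged over the possible samples
expected_srs_estimate = mean([(1 - n / N) * variance(s) / n for s in samples])

print(true_variance, expected_srs_estimate)
```

      In this example the SRS formula comes out far larger than the true
      variance, because the regularly spaced points cover the smooth transect
      better than random points would. With a different autocorrelation
      structure (for instance a periodicity matching the sampling interval)
      the bias can go the other way.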


      Speaking of "stochastic independence" or "statistical independence" of
      the data is nonsense, since the data alone do not define a set of random
      variables. Random variables are introduced by a stochastic mechanism.
      Drawing an SRS is one way of producing stochastically independent random
      variables (the first RV describes all the first values we would draw by
      repeating the sampling scheme, the second RV all the second values, and
      so on up to the nth RV for all the nth values). Assuming a
      superpopulation model (geostatistics uses random functions as
      superpopulation models), of which the population is one realization, is
      another way to introduce stochasticity in order to perform statistical
      inference. But then the RVs must be statistically dependent (in the model)
      if the data are spatially dependent (in reality), or statistical inference
      about the population will be very poor.


      Classical statistical tests require the data to be spatially
      independent. Spatial independence (= no spatial autocorrelation at any
      lag = pure nugget) is analogous to experimental independence in
      biostatistics. With spatially autocorrelated data, it is necessary to take
      the spatial dependence into account when assessing the p-value of any
      statistic (e.g. the Pearson correlation coefficient between two
      regionalized variables).
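
      A small Monte Carlo sketch of why this matters, under assumptions of my
      own: two mutually independent variables are simulated along a transect
      of 100 points as AR(1) series (a simple 1-D stand-in for spatial
      autocorrelation), and the classical t-test for the Pearson correlation
      is applied as if the points were independent. The function names, the
      AR(1) model and the parameter values are all hypothetical.

```python
import numpy as np

N_POINTS = 100
T_CRIT = 1.9845  # approx. two-sided 5% critical value of Student's t with 98 df


def ar1(n, rho, rng):
    """Stationary AR(1) series with unit variance and lag-1 correlation rho."""
    x = np.empty(n)
    x[0] = rng.standard_normal()
    eps = rng.standard_normal(n) * np.sqrt(1 - rho ** 2)
    for t in range(1, n):
        x[t] = rho * x[t - 1] + eps[t]
    return x


def naive_rejection_rate(rho, reps=2000, seed=0):
    """Share of simulations where the classical test declares a correlation
    significant at the 5% level, although the two series are independent."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(reps):
        x = ar1(N_POINTS, rho, rng)
        y = ar1(N_POINTS, rho, rng)
        r = np.corrcoef(x, y)[0, 1]
        t_stat = r * np.sqrt((N_POINTS - 2) / (1 - r ** 2))
        if abs(t_stat) > T_CRIT:
            rejections += 1
    return rejections / reps


print(naive_rejection_rate(0.0))  # no autocorrelation: close to the nominal 5%
print(naive_rejection_rate(0.9))  # strong autocorrelation: far above 5%
```

      The second rate shows how the classical p-value becomes much too
      optimistic once both variables are autocorrelated, even though no real
      relationship exists between them.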

      Hope this helps

      Best regards

      Philippe AUBRY

      Laboratoire de Biometrie
      UMR CNRS 5558
      Universite Claude Bernard - Lyon 1
      43 bd. du 11 Novembre 1918
      69622 VILLEURBANNE Cedex
      private fax number :
      e-mail : paubry@...-lyon1.fr

      *To post a message to the list, send it to ai-geostats@....
      *As a general service to list users, please remember to post a summary
      of any useful responses to your questions.
      *To unsubscribe, send email to majordomo@... with no subject and
      "unsubscribe ai-geostats" in the message body.
      DO NOT SEND Subscribe/Unsubscribe requests to the list!