AI-GEOSTATS: when is spatial autocorrelation important?
I am posing two questions to the list: an open-ended question that I hope
will spark some discussion, and a specific question regarding a project
that I am currently working on.
The open-ended question: When estimating a parameter of a spatial
distribution (e.g., the mean), or when comparing estimates made for two or
more geographic regions, when is it important to consider spatial
autocorrelation?
The specific question: An example of the latter of
immediate interest to me involves testing the rates at which air samples
collected in ___ census block groups fail a health-based benchmark. One
of my tasks is to determine if these rates are homogeneous across the
census block groups that are located within the study area. Would it be
valid to simply compare the rates among the census block groups using a
Chi-square test, even if the rates show spatial autocorrelation, provided
the data are from a random sample?
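To make the comparison concrete, here is a minimal sketch of the two quantities involved: the Pearson chi-square statistic for equal failure rates across block groups, and Moran's I as one common check for spatial autocorrelation in the rates. The function names, the list-of-neighbors adjacency format, and the binary weights are assumptions made for illustration, not part of the project. The statistic would be referred to a chi-square distribution with (number of groups - 1) degrees of freedom, which is exactly the step that presumes independent observations.

```python
def chi_square_homogeneity(fails, totals):
    """Pearson chi-square statistic for equal failure rates across groups.

    fails[i] and totals[i] are the failing and total sample counts in
    group i. Returns (statistic, degrees_of_freedom).
    """
    if len(fails) != len(totals) or len(fails) < 2:
        raise ValueError("need matching counts for at least two groups")
    pooled = sum(fails) / sum(totals)  # common failure rate under the null
    if pooled in (0.0, 1.0):
        raise ValueError("undefined when every sample passes or every sample fails")
    stat = 0.0
    for f, t in zip(fails, totals):
        exp_fail = t * pooled
        exp_pass = t * (1.0 - pooled)
        stat += (f - exp_fail) ** 2 / exp_fail
        stat += ((t - f) - exp_pass) ** 2 / exp_pass
    return stat, len(fails) - 1


def morans_i(values, neighbors):
    """Moran's I with binary adjacency weights (an illustrative choice).

    neighbors[i] is a list of the indices of the areas adjacent to area i.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    weight_sum = sum(len(nb) for nb in neighbors)
    if denom == 0 or weight_sum == 0:
        raise ValueError("need varying values and at least one adjacency")
    # Sum of cross-products over all ordered pairs of adjacent areas.
    cross = sum(dev[i] * dev[j] for i, nb in enumerate(neighbors) for j in nb)
    return (n / weight_sum) * (cross / denom)
```

For example, `chi_square_homogeneity([3, 7, 4], [20, 25, 18])` returns the statistic and its degrees of freedom for three hypothetical block groups, and `morans_i(rates, neighbors)` applied to the per-group failure rates gives a rough indication of whether adjacent groups have similar rates.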
In addition to the lattice data described above, I also have event data
that represent multiple observations of pass/fail tests for the same
locations (i.e., each location was cleaned and tested until a 'passing'
result was obtained). In a 'non-spatial' data analysis framework, I
planned to fit a geometric distribution to this data to estimate the
overall pass/fail rate for the entire study area. If the sample size is
sufficient, I would also like to fit a geometric distribution to each
census block group (or some other sub-region of the study area) and then
test for differences in the geometric distribution across the census block
groups.
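The 'non-spatial' geometric fit described above can be sketched as follows, assuming each location contributes the number of tests up to and including its first pass (so every count is at least 1) and that tests are independent with a common pass probability. The function names are hypothetical, and the likelihood-ratio statistic is one possible way to compare groups, not necessarily the test I will end up using.

```python
import math


def geometric_mle(trials):
    """MLE of the per-test pass probability.

    trials[i] is the number of tests at location i up to and including
    the first passing result.
    """
    if not trials or any(k < 1 for k in trials):
        raise ValueError("each location needs a count of at least 1")
    return len(trials) / sum(trials)


def geometric_loglik(trials, p):
    """Log-likelihood of pass probability p under the geometric model."""
    n = len(trials)
    n_failed = sum(trials) - n  # total failed tests across all locations
    ll = n * math.log(p)
    if n_failed > 0:
        ll += n_failed * math.log(1.0 - p)
    return ll


def lr_homogeneity(groups):
    """Likelihood-ratio statistic for a common pass probability across groups.

    groups is a list of per-group trial-count lists. Returns
    (statistic, degrees_of_freedom); under independence the statistic is
    approximately chi-square with that many degrees of freedom.
    """
    pooled = [k for g in groups for k in g]
    ll_null = geometric_loglik(pooled, geometric_mle(pooled))
    ll_alt = sum(geometric_loglik(g, geometric_mle(g)) for g in groups)
    return 2.0 * (ll_alt - ll_null), len(groups) - 1
```

As with the chi-square test, the chi-square approximation for this statistic relies on independence between locations, so it is subject to the same concern about spatial autocorrelation.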
Are 'non-spatial' methods valid if the data were collected using random
sampling methods? For example, there is some literature (e.g., Brus and de
Gruijter, 1997; Cressie, 1996) that argues 'non-spatial' methods provide
valid estimates of the mean, provided the data were collected using random
sampling methods. The authors argue that the 'independent' part of the familiar
'i.i.d.' assumption is satisfied by the random selection method; i.e., it's a
characteristic of the sampling plan, not the variable (or
attribute). [Brus and de Gruijter (1997) and Cressie (1996) both found
the spatial methods were more efficient for small spatial domains (small
relative to the number of available observations), while the non-spatial
methods were more efficient for large spatial domains.] Any comments on this?
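The design-based argument can be illustrated with a small simulation. This is only a sketch under assumed conditions (a one-dimensional sine-wave 'field' standing in for a strongly autocorrelated attribute, and simple random sampling without replacement); the names and parameters are hypothetical and not taken from either paper. It repeatedly draws random samples from the fixed field and compares the behaviour of the sample mean with the textbook simple-random-sampling variance formula.

```python
import math
import random
import statistics


def simulate_srs(field, n, reps, seed=0):
    """Draw `reps` simple random samples of size n (without replacement).

    Returns (mean of the sample means, variance of the sample means).
    """
    rng = random.Random(seed)
    means = [statistics.fmean(rng.sample(field, n)) for _ in range(reps)]
    return statistics.fmean(means), statistics.variance(means)


def srs_variance_of_mean(field, n):
    """Design-based variance of the sample mean: (1 - n/N) * S^2 / n."""
    N = len(field)
    return (1.0 - n / N) * statistics.variance(field) / n


# A smooth, strongly autocorrelated field: neighbouring values are similar.
field = [math.sin(i / 20.0) for i in range(400)]
population_mean = statistics.fmean(field)

avg_of_means, var_of_means = simulate_srs(field, n=20, reps=5000)
theoretical_var = srs_variance_of_mean(field, n=20)

print("population mean:        ", population_mean)
print("average of sample means:", avg_of_means)
print("empirical variance:     ", var_of_means)
print("SRS formula variance:   ", theoretical_var)
```

In this setup the average of the sample means should sit close to the population mean, and the empirical variance close to the formula value, even though the field itself is far from independent. That is the sense in which the randomness comes from the sampling plan rather than from the attribute.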
If the data were not gathered using random sampling methods (they were
not!), do the spatial methods tend to be more robust to violations of the
random sampling assumption than 'non-spatial' methods?
I look forward to hearing your responses. I would really appreciate
* Support to the list is provided at http://www.ai-geostats.org