## GEOSTATS: Regression, aggregation and spatial autocorrelation: advice needed!

Message 1 of 1 , Jun 16, 1999
Dear all -

I am hoping that some of the subscribers to this group will be kind enough
to give me some (geo)statistical advice which is related to a project I am
currently working on. I have only a little background in geostats, so the
simpler the explanations the better :)

My research is based on using remote sensing information to predict the
occurrence of C4 species in a grassland. More specifically, I propose that
the ratio of early to late season productivity (E/L) as derived from
remote sensing information will correlate with C4 abundance. I also
propose that the nature of this relationship will be scale (resolution)
dependent.

In order to test this theory, I defined 3 field sites. Each was a 1 ha grid
in which 72 points were spatially nested (see Webster; Webster and Oliver;
Bellehumeur and Legendre; etc. for nested sampling). At each of these 72
plots (0.5 m resolution), E/L and C4 abundance were measured. I then ran a
simple linear regression of %C4 (Y) on E/L (X) in order to describe the
functional relationship between the variables. I then aggregated these
samples (by averaging) in order to look at the relationship at coarser
resolutions (2.5 m, 10 m and 50 m). Thus, at coarser resolutions I have a
smaller number of points, each of which represents a larger area than the
original points. At each of these resolutions, I then re-ran the regression
analysis on the "new" data points. My results indicate that estimates of
slope and intercept change with resolution, and that R^2 values increase
nonlinearly: they rise from 0.5 m through 2.5 m to 10 m, and then remain
constant to 50 m.
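
To make the aggregate-then-regress procedure concrete, here is a minimal
Python sketch. It is not the analysis from the study: it assumes a regular
100 x 100 grid instead of the nested 72-point design, and the data are
synthetic (a shared coarse-scale signal plus independent fine-scale noise).
The names `block_average`, `ols_fit`, `el_ratio` and `c4_pct` are
hypothetical.

```python
import numpy as np

def block_average(grid, factor):
    """Average non-overlapping factor x factor blocks of a 2-D array."""
    rows, cols = grid.shape
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be divisible by factor")
    blocks = grid.reshape(rows // factor, factor, cols // factor, factor)
    return blocks.mean(axis=(1, 3))

def ols_fit(x, y):
    """Slope, intercept and R^2 of a simple linear regression of y on x."""
    x, y = np.ravel(x), np.ravel(y)
    slope, intercept = np.polyfit(x, y, 1)
    r2 = np.corrcoef(x, y)[0, 1] ** 2
    return float(slope), float(intercept), float(r2)

# Synthetic stand-in data (an assumption, not field measurements):
# both variables share a signal that is constant within 10 x 10 cell
# patches, and each has its own independent cell-level noise.
rng = np.random.default_rng(0)
n = 100
coarse = np.kron(rng.normal(size=(n // 10, n // 10)), np.ones((10, 10)))
el_ratio = coarse + rng.normal(scale=1.0, size=(n, n))      # X
c4_pct = 2.0 * coarse + rng.normal(scale=2.0, size=(n, n))  # Y

# Re-run the regression at successively coarser resolutions.
results = {}
for factor in (1, 5, 10):
    x = block_average(el_ratio, factor)
    y = block_average(c4_pct, factor)
    results[factor] = ols_fit(x, y)
    slope, intercept, r2 = results[factor]
    print(f"factor={factor:2d} n={x.size:5d} slope={slope:.2f} R^2={r2:.2f}")
```

In this synthetic setup R^2 rises with aggregation purely because averaging
removes the independent fine-scale noise, and the slope changes as well
because the noise in X is averaged out. That is consistent with the
"filtering" interpretation, but it says nothing about whether the same
mechanism operates in the field data.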

However, here is my problem: I read a chapter by Bian in "Scaling remote
sensing and GIS", which reports results similar to mine (although with
elevation and biomass as the variables). I would like to interpret my data
as they did: finer-resolution variation is filtered out at coarser
resolutions, which increases the covariation between the variables. BUT
they also mention that spatial autocorrelation can significantly affect
R^2 values, and they do not expand on this. I have read a number of texts
(Ripley, Cressie, Cliff and Ord), which were a little heavy on the theory,
as well as papers on aggregation, but they did not help.

Given this, my question is simply "What can I infer with respect to my
results?" Should I:

(a) not use regression techniques because of the problem of spatial
autocorrelation,

(b) use the techniques and acknowledge their limitations (given that at
present I cannot model the pattern of autocorrelation; see below), or

(c) do something else?

If I could find the "real" R^2 values that would be a big help. I have
analyzed my data in order to get a sense of spatial autocorrelation.
Variogram estimation is inconclusive as to whether a pattern exists, and
nested ANOVA suggests that "patchiness" exists in both datasets, but not
to a great degree (few statistically significant jumps in variance between
resolutions). I have scoured the ecological literature, and I find very
few references to this problem. Surely many people must have faced this
dilemma?
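
One common way to check whether autocorrelation is affecting a regression
is to compute Moran's I on the regression residuals and compare it with
values obtained by randomly permuting the residuals among the sample
locations. The Python sketch below illustrates this under stated
assumptions: the coordinates and values are synthetic, the 25 m distance
band is an arbitrary choice, and the function names are hypothetical. It
does not recover a "real" R^2; it only indicates whether the residuals
look spatially independent.

```python
import numpy as np

def distance_band_weights(coords, max_dist):
    """Binary weights: 1 for distinct points within max_dist, else 0."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    return ((dist > 0) & (dist <= max_dist)).astype(float)

def morans_i(values, w):
    """Moran's I for a 1-D array of values and a spatial weight matrix."""
    z = values - values.mean()
    return float(len(values) / w.sum() * (z @ w @ z) / (z @ z))

def permutation_test(values, w, n_perm=999, seed=0):
    """Observed Moran's I and a one-sided pseudo p-value (positive side)."""
    rng = np.random.default_rng(seed)
    observed = morans_i(values, w)
    permuted = np.array(
        [morans_i(rng.permutation(values), w) for _ in range(n_perm)]
    )
    p_value = (np.sum(permuted >= observed) + 1) / (n_perm + 1)
    return observed, float(p_value)

# Synthetic stand-in for one 1 ha site with 72 sample points (assumption).
rng = np.random.default_rng(1)
coords = rng.uniform(0, 100, size=(72, 2))   # metres
x = rng.normal(size=72)                      # plays the role of E/L
smooth = np.sin(coords[:, 0] / 20.0)         # unmodelled spatial pattern
y = 1.5 * x + 3.0 * smooth + rng.normal(scale=0.5, size=72)

# Ordinary least-squares fit of y on x, then test its residuals.
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (slope * x + intercept)

w = distance_band_weights(coords, max_dist=25.0)
observed, p_value = permutation_test(residuals, w)
print(f"Moran's I of residuals = {observed:.3f}, pseudo p = {p_value:.3f}")
```

A clearly positive I with a small pseudo p-value would suggest that the
usual significance tests for the regression are too optimistic, which
points toward option (b) or (c) rather than reporting the results without
qualification.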

Any help would be appreciated.

Andrew Davidson

Department of Geography, Tel : (416) 978 5070
University of Toronto, email : andy.davidson@...
100 St George Street,