[ai-geostats] Re: Sill versus least-squares classical variance estimate
I understand why it is not appropriate to force the sill to match the
sample variance. My question is: why estimate the overall variance by the
sill value when the data are actually correlated?
On Tue, 7 Dec 2004, Isobel Clark wrote:
> We are talking about estimating the variance of a set
> of samples where spatial dependence exists.
> The classical statistical unbiassed estimator of the
> population variance is s-squared which is the sum of
> the squared deviations from the mean divided by the
> relevant degrees of freedom. If the samples are not
> inter-correlated, the relevant degrees of freedom are
> (n-1). This gives the formula you find in any
> introductory statistics book or course.
> If samples are not independent of one another, the
> degrees of freedom issue becomes a problem and the
> classical estimator will be biassed (generally too
> small on average).
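
A short derivation may make the "too small on average" remark concrete. It is only a sketch: it assumes the samples Z_1, ..., Z_n share a common mean and variance \sigma^2, and the symbols \rho_{ij} (correlation between samples i and j) and \bar{\rho} (average correlation over distinct pairs) are introduced here for illustration and do not appear in the original message.

```latex
\begin{aligned}
E\!\left[\sum_{i=1}^{n} (Z_i - \bar{Z})^2\right]
  &= n\sigma^2 - n\,\operatorname{Var}(\bar{Z})
   = n\sigma^2 - \frac{\sigma^2}{n}\sum_{i=1}^{n}\sum_{j=1}^{n} \rho_{ij}, \\
E[s^2]
  &= \frac{1}{n-1}\,E\!\left[\sum_{i=1}^{n} (Z_i - \bar{Z})^2\right]
   = \sigma^2\,(1 - \bar{\rho}),
\qquad
\bar{\rho} = \frac{1}{n(n-1)}\sum_{i \neq j} \rho_{ij}.
\end{aligned}
```

With independent samples \bar{\rho} = 0 and s^2 is unbiased; with predominantly positive correlation \bar{\rho} > 0, so s^2 falls below \sigma^2 on average.
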
> In theory, pairs of samples beyond the range of
> influence on a semi-variogram graph are independent of
> one another. In theory, the variance of the difference
> between two values which are uncorrelated is twice the
> variance of one sample around the population mean.
> This is thought to be why Matheron defined the
> semi-variogram (one-half the squared difference) so
> that the final sill would be (theoretically) equal to
> the population variance.
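
The argument in the paragraph above can be written out as follows. This is a sketch under the assumption of second-order stationarity; the symbols Z(x), C(h), \sigma^2 and \gamma(h) are notation chosen here for illustration, not taken from the message.

```latex
\begin{aligned}
\operatorname{Var}\!\left[Z(x) - Z(x+h)\right]
  &= \operatorname{Var}[Z(x)] + \operatorname{Var}[Z(x+h)]
     - 2\operatorname{Cov}[Z(x), Z(x+h)] \\
  &= 2\sigma^2 - 2C(h), \\
\gamma(h)
  &= \tfrac{1}{2}\,E\!\left[\bigl(Z(x) - Z(x+h)\bigr)^2\right]
   = \sigma^2 - C(h).
\end{aligned}
```

For lags h beyond the range, C(h) = 0, so \gamma(h) = \sigma^2: the sill equals the population variance, and the factor of one-half is what removes the "twice".
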
> There are computer software packages which will draw a
> line on your experimental semi-variogram at the height
> equivalent to the classically calculated sample
> variance. Some people try to force their
> semi-variogram models to go through this line. This is
> dumb as the experimental sill is a better estimate
> because it does have the degrees of freedom it is
> supposed to have.
> I am not sure whether this is clear enough. If you
> email me off the list, I can recommend publications
> which might help you out.
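
A small simulation can illustrate the point about the sill versus the classical variance. The sketch below is not from the thread: it assumes a one-dimensional, unit-variance moving-average process (so that values at least `k` steps apart are truly independent), and the function names `simulate_ma`, `semivariogram` and `compare` are hypothetical helpers written for this example. It averages, over many realizations, the classical s-squared and an experimental sill taken as the mean semi-variogram value at lags from `k` to `2k`.

```python
import numpy as np

def simulate_ma(n, k, rng):
    """One realization of a unit-variance moving-average process.

    Each value is the scaled sum of k consecutive white-noise terms, so
    values at least k steps apart share no terms and are independent.
    """
    noise = rng.standard_normal(n + k - 1)
    return np.convolve(noise, np.ones(k), mode="valid") / np.sqrt(k)

def semivariogram(z, lag):
    """Experimental semi-variogram: half the mean squared difference at one lag."""
    d = z[lag:] - z[:-lag]
    return 0.5 * np.mean(d ** 2)

def compare(n=100, k=20, reps=2000, seed=0):
    """Average s-squared and experimental sill over many realizations."""
    rng = np.random.default_rng(seed)
    s2_values, sill_values = [], []
    for _ in range(reps):
        z = simulate_ma(n, k, rng)
        # Classical estimator with (n - 1) degrees of freedom.
        s2_values.append(np.var(z, ddof=1))
        # Sill estimate: mean semi-variogram over lags beyond the range.
        lags = range(k, 2 * k + 1)
        sill_values.append(np.mean([semivariogram(z, h) for h in lags]))
    return float(np.mean(s2_values)), float(np.mean(sill_values))

mean_s2, mean_sill = compare()
print(f"true variance: 1.0, mean s^2: {mean_s2:.3f}, mean sill: {mean_sill:.3f}")
```

Under these assumptions the averaged s-squared comes out noticeably below the true variance of 1, while the averaged sill stays close to 1. Real data sets are a single realization with an unknown range, so this only illustrates the expected behaviour, not how well either estimator does on any one data set.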
> --- Meng-Ying Li <mengyl@...> wrote:
> > Hi Isobel,
> > Could you explain why it would be a better estimate of the variance
> > when independence is considered? I'd rather think that we consider
> > the dependence when the overall variance is to be estimated, if there
> > actually is dependence between values.
> > Or are you talking about modeling the sill value by the stabilizing
> > tail of the experimental variogram, instead of modeling it by the
> > calculated overall variance?
> > Or are we talking about variances with different definitions? I'd be
> > concerned if I missed some point of the original definition of
> > variance, such as that the variance should be defined with no
> > dependence between values, or something like that. Frankly, I don't
> > think I took the definition of variance too seriously when I was
> > learning stats.
> > Meng-ying
> > > Digby
> > >
> > > I see where you are coming from on this, but in fact the sill is
> > > composed of those pairs of samples which are independent of one
> > > another - or, at least, have reached some background correlation.
> > > This is why the sill makes a better estimate of the variance than
> > > the conventional statistical measures, since it is based on
> > > independent sampling.
> > >
> > > Isobel