
[ai-geostats] Re: descriptive statistics or inference?

  • Meng-Ying Li
    Message 1 of 9 , Dec 8, 2004
      Isobel,

      I agree with you that no estimation is needed if we have the population,
      and that's what I said at the beginning of my last message. What I'm
      saying is that when the variance and the sill of a population don't
      match, I am concerned about using the sill of a sample to estimate the
      population variance.

      Meng-ying

      On Wed, 8 Dec 2004, Isobel Clark wrote:

      > > And just a personal opinion, I would like to think
      > > geostatistical theories apply to a population of any
      > > size, as small as 27 or as large as 1,000,000. If I'm
      > > making an example where geostatistics doesn't apply,
      > > then there's something to be concerned about in this
      > > approach.
      > Geostatistics applies to any size of sample set but
      > for the theory to work you have to have a relatively
      > enormous population to draw from.
      >
      > Put in plain terms, the assumption is that the
      > withdrawal of the samples does not materially affect
      > the behaviour of the population.
      >
      > If you have the whole population, you don't need to do
      > tests or estimates.
      >
      > Isobel
      > http://geoecosse.bizland.com/books.htm
      >
    • Digby Millikan
      Message 2 of 9 , Dec 8, 2004
        Meng-Ying,

        Even if your population variance and sill do not match identically,
        the sample sill should still be a better estimate than the sample variance,
        when you consider the amount of clustering which occurs in sampling.

        Digby
      • Meng-Ying Li
        Message 3 of 9 , Dec 8, 2004
          On Thu, 9 Dec 2004, Digby Millikan wrote:

          > Meng-Ying,
          >
          > Even if your population variance and sill do not match identically,
          > the sample sill should still be a better estimate than the sample
          > variance, when you consider the amount of clustering which occurs
          > in sampling.
          >
          > Digby
          >

          All right, if you think the clustering of data values (I'm not talking
          about clustering of locations) is not part of the representation of
          the population.

          I just found an example that I can use as population, with 2500 points
          and it's 2-D (in the GSLIB manual, second realization of SGSIM.OUT-- if
          you happen to have this data set) and found the sill and variance of this
          population not matching (sill~20, variance=18.63).
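
The comparison described above (population variance against the variogram
sill) can be sketched in Python for any gridded 2-D data set. This is only a
minimal sketch under stated assumptions: the function names `grid_variogram`
and `population_variance` are hypothetical, lags are counted in grid cells
along rows and columns only, and the demo field is synthetic uncorrelated
noise, not the SGSIM.OUT realization from the GSLIB manual.

```python
import numpy as np

def grid_variogram(z, max_lag):
    """Experimental semivariogram of a 2-D grid, pooling row and column pairs."""
    z = np.asarray(z, dtype=float)
    gamma = []
    for h in range(1, max_lag + 1):
        d_rows = z[h:, :] - z[:-h, :]   # pairs h cells apart along axis 0
        d_cols = z[:, h:] - z[:, :-h]   # pairs h cells apart along axis 1
        sq = np.concatenate([d_rows.ravel() ** 2, d_cols.ravel() ** 2])
        gamma.append(0.5 * sq.mean())   # the "half" in the semivariogram
    return np.array(gamma)

def population_variance(z):
    """Variance of all grid values, treating the grid as the whole population."""
    return float(np.var(z))             # ddof=0: no sampling correction

# Synthetic demo field (assumption: uncorrelated standard normal noise).
rng = np.random.default_rng(0)
demo = rng.normal(size=(50, 50))
gamma = grid_variogram(demo, max_lag=10)
print("population variance:", population_variance(demo))
print("semivariogram by lag:", np.round(gamma, 3))
```

For uncorrelated noise the semivariogram should fluctuate around the variance
at every lag. For a spatially correlated field on a finite grid, the plateau
may sit above the variance of the grid values, which is the kind of gap
reported above (sill~20 against variance=18.63).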

          I intended to use a smaller sample so everyone can have fun playing
          with the data (even if you use M$-Excel to calculate the variogram),
          and it also makes my point more clearly. But it seems people are more
          interested in discussing the size of the population. . . I'll leave it
          here then, if nobody finds any problem estimating the population
          variance using the sill value. Maybe I'm just psychologically not
          comfortable estimating variance like that. . . (I'll probably follow
          you people if I find no theoretical derivation for my thinking.)

          It's fun discussing with you people though, and I'm happy to have this
          much discussion for my debut.


          Meng-ying
        • Digby Millikan
          Message 4 of 9 , Dec 9, 2004
            Meng-ying,

            I did mean the clustering of locations. If the sample is evenly
            and/or randomly spread, your s^2 estimate will be no problem. It's
            when the data are clustered in location that I believe removal of
            these data improves the estimate: the less clustering, the less
            improvement; the higher the clustering, the higher the improvement.
            Spatially clustered data have correlation, which is picked up in
            the sub-range portion of the variogram.

            So the variance is 93% of the sill for the population, which adds
            credence to your argument. What would be interesting is to see the
            results we get from a sample with some clustering, i.e. a sample
            variogram sill closer to 18.63. I'm interested in this method for
            use with mine sampling data. I have GSLIB, but no compiler. Are you
            developing a sample subset with some clustering, or can you send me
            the coordinated SGSIM.OUT?

            Digby

            >
            > All right, if you think the clustering of data values (I'm not talking
            > about clustering of locations) is not part of the representation of
            > the population.
            >
          • Digby Millikan
            Message 5 of 9 , Dec 12, 2004
              Meng-ying,

              Sorry, I don't have time to process the data yet, but here is a
              summary so far. Geostatistics was originally developed in the
              mining field, for mineral resource assessment (in which I am
              involved), where sampling patterns are generally clustered, and
              the sill of the variogram can be used as an estimate of the
              variance at your discretion. As Isobel says, this is where the
              half originates in the equation for the experimental variogram,
              as the sill will often approximate the variance. However, many
              applications of geostatistics use much more detailed and regular
              sampling patterns (even in mining, e.g. soil geochemistry), in
              which case the sill is not an exact estimator (I'm not sure of
              the correct terminology here) of the variance and should be
              treated as such. Isobel seems to be more familiar with the
              mathematics of this, and a further study of it might be
              interesting.

              Digby
            • Isobel Clark
              Message 6 of 9 , Dec 12, 2004
                Digby

                The variance/sill relationship is theoretical and does
                not depend on the layout of the samples, regular or
                clustered. Since the sill only uses pairs where
                samples are uncorrelated from one another, the
                clustering is irrelevant.

                It does depend on the distribution of the samples
                values being 'stationary', that is having constant
                mean and variance over the study area. It also depends
                on that distribution having a valid variance. For
                example, the variance of samples from a lognormal
                distribution depends on the average of those samples -
                hence the proportional effect.
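
The theoretical relationship referred to above can be written out in a few
lines. This is a standard textbook sketch, assuming a second-order stationary
random function $Z$ with constant mean, variance $\sigma^2$ and covariance
$C(h)$; the symbols $N(h)$ (number of pairs at lag $h$) and $a$ (range) are
notation introduced here for illustration.

```latex
\begin{align}
\hat{\gamma}(h) &= \frac{1}{2N(h)} \sum_{i=1}^{N(h)}
                   \bigl[z(x_i) - z(x_i + h)\bigr]^2
  && \text{(experimental semivariogram)} \\
\gamma(h) &= \tfrac{1}{2}\,\mathrm{E}\bigl[(Z(x) - Z(x+h))^2\bigr]
           = \tfrac{1}{2}\bigl[\sigma^2 + \sigma^2 - 2C(h)\bigr]
           = \sigma^2 - C(h) \\
\gamma(h) &= \sigma^2 \quad \text{for } |h| \ge a,
  \text{ where } C(h) = 0 .
\end{align}
```

The squared difference of two uncorrelated values has expectation
$2\sigma^2$, so the factor of one half is what makes the plateau (the sill)
equal to $\sigma^2$ rather than twice that.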

                All of this is explained in any basic geostatistics
                book, including Matheron's original Theory of
                Regionalised Variables and my Practical Geostatistics
                (Chapter 3) which can be freely downloaded from
                http://geoecosse.bizland.com/practica.htm

                Isobel
                http://uk.geocities.com/drisobelclark/seasonsgreetings.htm
              • Meng-Ying Li
                Message 7 of 9 , Dec 12, 2004
                  Like Isobel mentioned, the sill only uses pairs where samples are
                  uncorrelated from one another, and in this case the clustering is
                  irrelevant.

                  And I totally agree with that. The crucial thing, Digby, is that you want
                  to make sure the variance estimated reflects the characteristic of what
                  you actually wanted, regardless of the terminology that may or may not be
                  stated for different purposes.


                  Meng-ying

                  On Sun, 12 Dec 2004, Isobel Clark wrote:

                  > Digby
                  >
                  > The variance/sill relationship is theoretical and does
                  > not depend on the layout of the samples, regular or
                  > clustered. Since the sill only uses pairs where
                  > samples are uncorrelated from one another, the
                  > clustering is irrelevant.
                  >
                  > It does depend on the distribution of the samples
                  > values being 'stationary', that is having constant
                  > mean and variance over the study area. It also depends
                  > on that distribution having a valid variance. For
                  > example, the variance of samples from a lognormal
                  > distribution depends on the average of those samples -
                  > hence the proportional effect.
                  >
                  > All of this is explained in any basic geostatistics
                  > book, including Matheron's original Theory of
                  > Regionalised Variables and my Practical Geostatistics
                  > (Chapter 3) which can be freely downloaded from
                  > http://geoecosse.bizland.com/practica.htm
                  >
                  > Isobel
                  > http://uk.geocities.com/drisobelclark/seasonsgreetings.htm
                  >
                  >
                • Digby Millikan
                  Message 8 of 9 , Dec 12, 2004
                    Isobel,

                    Thank you. In relation to Colin's work, then, the variance
                    may be estimated from the sill of the variograms for
                    the two orebodies. If the two orebodies have lognormal
                    distributions, they may have different means and variances
                    but still display the proportional effect, i.e. similar
                    coefficients of variation. In that case M. David
                    (Geostatistical Ore Reserve Estimation, p. 172) points out
                    that lognormal kriging may be avoided since, from what I
                    understand, relative variograms may be used instead
                    of lognormal variograms.
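
The link between the lognormal distribution, the proportional effect and the
coefficient of variation can be made explicit. This is a sketch using the
standard lognormal moments; the symbols $\mu$ and $s^2$ (mean and variance of
$\ln Z$) and $m$ (mean of $Z$) are notation assumed here, and the relative
variogram is shown in one common form only, not necessarily the one used in
the reference cited above.

```latex
\begin{align}
\ln Z \sim N(\mu, s^2) \;\Rightarrow\;
  m &= \mathrm{E}[Z] = \exp\!\bigl(\mu + \tfrac{1}{2}s^2\bigr), \\
  \mathrm{Var}[Z] &= m^2\bigl(e^{s^2} - 1\bigr), \\
  \mathrm{CV} &= \frac{\sqrt{\mathrm{Var}[Z]}}{m} = \sqrt{e^{s^2} - 1}, \\
  \gamma_R(h) &= \frac{\gamma(h)}{m^2}
  && \text{(a relative variogram, scaled by the squared mean)} .
\end{align}
```

Two orebodies with the same $s^2$ but different $\mu$ would then have
different means and variances but the same coefficient of variation, so their
variograms may coincide once each is divided by its squared mean.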

                    Digby
                  • Meng-Ying Li
                    Message 9 of 9 , Dec 14, 2004
                      Hi people,

                      I finally got the point of the arguments about estimating
                      the population variance.

                      What I had in mind as an "overall" variance is the variance of all
                      possible locations in any realization of the random field, while what
                      Isobel and some other people are trying to explain to me is the variance
                      of all possible realizations at any location of the random field.

                      By noticing this, I realized why stationarity and the existence of a
                      variance have been emphasized throughout the discussion. If the random
                      field is not stationary, then we have no consistent population variance,
                      as Isobel explained. I also learned that what I understood as the
                      population variance has a name: "areal variance."
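
The two variances distinguished above can be compared numerically. The
following is a minimal sketch under stated assumptions: a 1-D stationary
AR(1) series stands in for the random field, and the function names and all
parameter values (`phi=0.9`, 50 locations, 2000 realizations) are
hypothetical choices made only for illustration.

```python
import numpy as np

def simulate_ar1(n_real, n_loc, phi, rng):
    """Simulate stationary AR(1) series with unit variance, one row per realization."""
    z = np.empty((n_real, n_loc))
    z[:, 0] = rng.normal(size=n_real)        # stationary start, variance 1
    innov_sd = np.sqrt(1.0 - phi ** 2)       # keeps the marginal variance at 1
    for t in range(1, n_loc):
        z[:, t] = phi * z[:, t - 1] + innov_sd * rng.normal(size=n_real)
    return z

def areal_and_ensemble_variance(z):
    """Return (mean variance over locations, mean variance over realizations)."""
    areal = z.var(axis=1).mean()      # within each realization, then averaged
    ensemble = z.var(axis=0).mean()   # at each location, then averaged
    return float(areal), float(ensemble)

rng = np.random.default_rng(42)
fields = simulate_ar1(n_real=2000, n_loc=50, phi=0.9, rng=rng)
areal, ensemble = areal_and_ensemble_variance(fields)
print(f"areal variance: {areal:.3f}, ensemble variance: {ensemble:.3f}")
```

With strong correlation relative to the length of the series, the variance
within a single realization should on average fall below the variance across
realizations, because values inside one realization resemble one another.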

                      Now it should be clear why I emphasized the expected variance of the
                      "future samples." This would be the variance of any possible sample
                      taken from the current realization (which I called the "population"
                      previously) by some planned sampling scheme. This variance will
                      depend on the clustering or non-clustering of the future sampling
                      scheme. I'm aware, of course, that in practice "future" samples may no
                      longer be taken from the current realization, since in a real case
                      the study site would be changing. Calling it a "future" sample is
                      just a convenient way of referring to the expected variance based on
                      possible sampling schemes.
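
The dependence of the expected sample variance on the sampling layout can be
sketched as follows. Under an intrinsic stationarity assumption, the expected
value of the usual sample variance s^2 (divisor n-1) equals the average of
the variogram over all distinct pairs of sample locations. The spherical
model, the range of 10 and the two layouts are hypothetical; the sill of 20
only echoes the approximate figure quoted earlier in the thread.

```python
import numpy as np

def spherical_variogram(h, sill, a):
    """Spherical variogram model with no nugget; a is the range."""
    h = np.asarray(h, dtype=float)
    g = sill * (1.5 * h / a - 0.5 * (h / a) ** 3)
    return np.where(h < a, g, sill)

def expected_sample_variance(coords, sill, a):
    """E[s^2] for samples at coords: mean variogram over distinct pairs."""
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    gam = spherical_variogram(d, sill, a)    # diagonal is gamma(0) = 0
    return float(gam.sum() / (n * (n - 1)))

def grid_points(n_side, spacing):
    """Square grid of n_side x n_side sample locations."""
    axis = np.arange(n_side) * spacing
    xx, yy = np.meshgrid(axis, axis)
    return np.column_stack([xx.ravel(), yy.ravel()])

spread = grid_points(5, 12.0)      # every pair farther apart than the range
clustered = grid_points(5, 1.0)    # every pair well inside the range
spread_var = expected_sample_variance(spread, sill=20.0, a=10.0)
clustered_var = expected_sample_variance(clustered, sill=20.0, a=10.0)
print("spread layout:", spread_var)
print("clustered layout:", clustered_var)
```

When every pair of samples is separated by more than the range, the expected
sample variance equals the sill; when the samples are clustered inside the
range, it falls below the sill.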


                      Hope I'm not getting things more confused.


                      Meng-ying