Loading ...
Sorry, an error occurred while loading the content.

Re: AI-GEOSTATS: Cracks in the foundations?

Expand Messages
  • dwmccarn@aol.com
    Dear Chuck: On reflection, I guess that I should have been less absolute in my statement (You never know how your own words may come back to haunt you!). A
    Message 1 of 5 , Oct 27, 2001
    • 0 Attachment
      Dear Chuck:

      On reflection, I guess that I should have been less "absolute" in my
      statement (You never know how your own words may come back to haunt you!). A
      number of companies (including COGEMA) have very successfully applied
      geostatistical techniques to the evaluation of ore zones in roll-front
      environments. One of the approaches is to use a curvilinear block which
      matches the sinuosity of the ore body geometry (a significant handicap if you
      can't model the shape). Although it doesn't directly overcome the problem of
      leading-edge discontinuity of the ore body, with practice, well justified
      estimates can be obtained. I am intrigued, however, by your suggestion of
      using a well-fitting non-linear probability model. But for two companies
      that I've worked with, other means to obtain ore reserve estimates have been
      applied. In other sandstone U environments besides roll-fronts, the ore
      tends to be more or less continuous, and spatial statistical models are quite
      appropriate. One of these ore bodies, however, had a disappointingly very
      "flat" variogram which had very little spatial structure over a large range
      and direction. This more or less justified their use of an "average" grade
      adjusted by windsorising the high outliers when specific criteria are met.

      R = Thickness * Grade * Area * density * probability of encountering ore
      zones in the area

      I think that the above is "out of the book" for Russian & Kazakh
      practitioners.

      Regards,

      Dan ii

      In a message dated 10/27/2001 5:18:11 PM Mountain Daylight Time,
      chuckre@... writes:

      << "For roll-front uranium deposits, a geostatistical approach is avoided
      because of the deposit characteristics. The leading edge of the roll has
      generally the highest concentration of uranium and is bounded by a very
      distinct discontinuity on the reduced side. The trailing edge displays a
      more gradual tapering of grade on the oxidized side." >>


      Dan W. McCarn, AIPG CPG #10245, Wyoming PG#3031
      10228A Admiral Halsey NE
      Albuquerque, NM 87111
      +1 (505) 822-1323
      +1 (505) 710-3600
      http://resumes.yahoo.com/dwmccarn/geologist
      11 Sept 2001

      --
      * To post a message to the list, send it to ai-geostats@...
      * As a general service to the users, please remember to post a summary of any useful responses to your questions.
      * To unsubscribe, send an email to majordomo@... with no subject and "unsubscribe ai-geostats" followed by "end" on the next line in the message body. DO NOT SEND Subscribe/Unsubscribe requests to the list
      * Support to the list is provided at http://www.ai-geostats.org
    • Donald E. Myers
      Re: the article by S. Henley ( Geostatistics-cracks in the foundations? ) There was a debate of sorts in Mathematical Geology in 1987 but I don t think it
      Message 2 of 5 , Oct 28, 2001
      • 0 Attachment
        Re: the article by S. Henley ("Geostatistics-cracks in the foundations?")

        There was a debate of sorts in Mathematical Geology in
        1987 but I don't think it contributed much to understanding geostatistics
        including its strengths and weaknesses. Here are a few more thoughts.

        GENERAL COMMENTS

        All of the concerns raised in the article are well-known in the geostatistical
        literature, in particular in Mathematical Geology and Water Resources
        Research as well as in the publications emanating from the various
        APCOMS. Unfortunately the problems, strengths and weaknesses of
        geostatistics are not well described in the article, in some cases there are
        clear errors or mis-representations.

        I. Although geostatistics is clearly and deservedly associated with G. Matheron
        (and some of his early students such as A. Journel, J.-P. Chiles,
        P. Delfiner as well as others from the Fontainebleau center), his
        work is similar to or duplicates the work of Matern and Ghandin. Matheron
        acknowledges this in some of his writings. In addition as almost always
        happens, his work builds on work done earlier by many people in a
        number of fields. Finally, geostatistics is not a stagnant field, it
        has developed and evolved due to the work of many people, not always
        in directions directly related to Matheron's original work (this is
        not to belittle or denigrate Matheron's contribution but to recognize
        that geostatistics as it exists today is not the same as 30 years ago).
        Henley's article seems to imply that little has changed in the interim.

        II. Although there are aspects of the applications in hydrology that
        incorporate state equations derived from basic principles, geostatistics
        has most often been used for the analysis of spatial data when no
        state equations are available. This means that the problem is really
        ill-posed, i.e., the solution is not unique and to obtain a unique solution
        one must impose some form of model. The stochastic model implicit in
        Matheron's work serves this purpose. However, one should not be mis-led
        in thinking that this is totally artificial. There are clear connections
        with Bayesian statistics (see the work of Wahba as early as 1970),
        Thin Plate Splines and the more general interpolation methodology
        known in the numerical analysis literature as "Radial Basis Functions"
        which is a deterministic approach to the problem. Again Henley seems to
        ignore this background.

        MORE SPECIFIC COMMENTS

        III. Henley makes a great deal out of the point that kriging is a "linear
        method". This is true and not true. The kriging estimator (Simple,
        Ordinary, Universal) for the value at a non-data location or the
        spatial average over an area or volume (e.g., average grade in a mining
        block) is a linear combination of the data. Written as an interpolating
        function it is NOT linear, i.e., not a linear function of the position
        coordinates. Moreover in the case of multivariate normality, the
        Simple Kriging estimator is the conditional expectation (which is THE minimum
        variance estimator, linear or otherwise). Of course multivariate
        normality is a strong condition.

        IV. Henley claims that it is necessary that the error distribution be normal,
        this is absolutely wrong.

        V. While there are individuals who would identify themselves as "geostatisticians"
        it is more likely that individuals using geostatistics as well as
        contributing to new developments in the field would call themselves:
        mining engineers, petroleum engineers, geologists, soil scientists,
        hydrologists, statisticians, mathematicians, ecologists, geographers,
        etc. Hence the constant reference to (and laying blame on) "geostatisticians"
        is mis-leading.

        VI. Henley fails to distinguish between "estimation variance" and
        "kriging variance" (the latter being the minimized value of the former,
        i.e., the weights in the kriging estimator are obtained by this
        minimization).

        VII. Henley fails to distinguish between the sample/experimental
        variogram and the (theoretical) model. The sample variogram is AN
        estimator of the model (and certainly a number of authors would
        say that it is not the only choice). Ultimately however it is the
        variogram (theoretical) that is of interest and which is use in
        the kriging equations (to obtain the coefficients in the kriging
        estimator).

        VIII. It is true that there are similarities between the Inverse
        Distance Weighting (IWD) estimator and the kriging estimator. (1) As
        noted by Henley, both are weighted averages of the data (but in the
        case of IWD the coefficients are always non-negative, this is not
        true for the kriging estimator), (2) in the usual formulation, IWD
        is isotropic (only distance is used, not direction), (3) IWD is
        not a "perfect/exact" interpolator as is the kriging estimator since
        one can estimate at a data location using the data value at that location
        in IWD (this would involve a zero distance), (4) while IWD in some
        sense incorporates the spatial correlation between the value at the location
        to be estimated and a data location (all pairings), IWD does NOT
        incorporate the spatial correlation between the values at pairs of
        data locations hence some information is ignored in IWD. Finally
        one might attempt to optimize the choice of the exponent in IWD, one
        size does NOT fit all (see a paper by Kane et al, Computers & Geosciences
        1982).

        IX. It is certainly true that kriging will "smooth" the data, howeve
        this is true of all interpolation methods/algorithms. This is one
        reason why some advocate the use of conditional simulation as an
        alternative.

        X. All statistical techniques are subject to problems resulting from
        a lack of data, many of the problems Henley identifies or asserts are
        related to insufficient data. Unfortunately, data costs money and one
        will almost never have enough (of either). The related problem is how the
        data is collected. In more typical applications of statistics, one
        "designs" the experiment to ensure that the underlying statistical
        assumptions are satisifed. This simply will not work in geostatistics.

        XI. Henley correctly identifies "stationarity" or rather the lack of it
        as a critical problem. However he does not quite describe it correctly. The
        constant mean condition (which is only part of the definition of
        second order stationarity) pertains to the implied random function, not
        to the data. Since one has data from at most one realization of the
        random function one can not statistically "test" this assumption. This
        is one of the places where the lack of data is a serious problem,e.g,
        to decide whether to partition the region of interest into separate
        regions to obtain "stationarity" on each separately. Note that
        the condition on the variogram/covariance (being only a function of
        the separation vector) is critical and not implied by the constant
        mean condition.

        XII. The statement "The problem with this method was that the semivariogram
        itself was sensitive to the form of the deterministic surface.
        Therefore, it required a number of iterations of kriging, variogram
        computation, and model fitting, to converge towards a consistent
        solution." contains a germ of truth but it is also confused and mis-leading.
        What Henley is presumably referring to is that if one fits a trend surface to
        the data and then computes a sample variogram using the residuals, the
        resulting sample variogram is biased (see a paper by N. Cressie in
        Mathematical Geology for example and a much earlier paper by another
        author in the proceedings of the NATO conference of 1975). Then there
        was a dissertation in the Dept Hydrology (J. Samper) about 1987) on
        a maximum likelihood method (an iterative application) for fitting the
        sample variogram to residuals. This problem is not completely resolved
        because as is frequently the case in geostatistics or rather in the
        application of geostatistics there is a discrepancy between theory and
        application. It is well known (see Matheron's 1971 Fountainebleau Summer
        School Lecture Notes) that the optimal estimator of the drift, i.e., the
        non-constant mean of the random function, is obtained by kriging. However
        to apply this means that one must first have the variogram model, yet
        one can not model the variogram using the original data. The problem is
        circular. In practice the problem in fact often addressed by fitting a
        trend surface to the data and fitting the variogram to the residuals.

        The real problem is the fact that the sample variogram only estimates
        the variogram if the mean of the random function is constant. When the
        mean is not constant and in particular when the fitted trend surface is
        not just a constant, the sample variogram will exhibit a very rapid growth
        rate (quadratic or greater). There are no valid variogram models with
        this property. Hence if the sample variogram exhibits this growth condition
        this property is taken as evidence of a non-stationarity. Note that this
        property or characteristic is not absolute, it may only appear for large lag
        distances and hence one may be able to fit a variogram model to the
        sample variogram using only the information for short lags.


        XIII. The statements "These methods have found little practical application
        because of their complexity, and the inherent instability of their
        solutions. Furthermore, the resulting kriging system is no longer
        linear and thus loses its ideal "BLUE" properties." are partially correct and
        partially incorrect.

        It is well known that the form of the kriging estimator and the kriging
        equations when using generalized covariances is exactly the same as for
        Universal Kriging. It is also well known that the kriging estimator
        (written in the dual form), when using the right choice of a generalized
        covariance, is the same as the thin plate spline. One must use the so-called
        "spline covariance", this was implemented in the BLUEPACK software back
        in the 80's and is also in the current ISATIS. It is NOT correct to
        say that the resulting kriging system is not linear, the form of the kriging
        estimator is unchanged and the kriging equations ARE still linear. The
        kriging estimator is still a BLUE contrary to Henley's claim.

        There are several reasons why generalized covariances are not widely used
        (for a particularly good presentation see the recent book by Delfiner
        and Chiles). One is that to write software is much more complicated, i.e.,
        to determine the order of the non-stationarity. A second is that as
        contrasted with the family of known variograms there are only a few
        known generalized covariances (although every variogram corresponds to
        a generalized covariance) and these are all isotropic. Hence in practice
        one is likely to revert to using a variogram. Third, it is not so difficult
        to use geostatistics/kriging because of the availability of both free
        and moderately inexpensive software, the theoretical questions are largely
        taken care of in writing the software. There is essentially no free
        software that incorporates the use of generalized covariances (ISATIS is
        rather expensive because it is intended primarily by petroleum and mining
        companies)

        XIV. The first half of the statement "Although regionalised variable theory
        does not require normal (gaussian) distribution of the data, it does
        assume normal distribution of error terms." is almost true but the second half
        is completely false. Henley consistently fails to distinguish between
        properties of a particular data set and the properties of the random function.
        The kriging equations are derived without making any distributional
        assumptions. It is true that kriging (and any other interpolation method)
        will tend to smooth the data. This characteristic is exagerated when the
        distribution of the data is not approximately symmetric (but not necessarily
        normal) and if the distribution of the data has "fat tails". Both lognormal
        kriging and indicator kriging are ways to deal with this problem.

        In claiming that the kriging estimator is no longer "BLUE" when using the
        a log transformation, Henley is presumably referring to the fact that
        if one simply exponentiates the result then there is a bias. Under an
        assumption of multivariate lognormality one can compute the bias and hence
        compensate for it. This has received considerable attention in the geostatistics
        literature. The problem of course is that is not possible to absolutely
        know whether the underlying random function (for a particular data set) is
        multivariate lognormal (univariate lognormal is a much weaker condition).
        All one can do is to determine whether that assumption is reasonable.

        Finally, it is quite plausible to ask whether there might be a better way
        to approach the problem. There is nothing really wrong with Matheron's
        work, the problem is knowing whether the assumptions implicit in its
        use are valid. Since those assumptions pertain to the random function
        , RATHER THAN THE DATA, it is not possible to completely answer the
        question. The real question, particularly for the practitioner is whether
        geostatistics produces useful results. "useful results" does not have
        a very precise meaning in general but it will have a very real meaning
        to the practitioner. Geostatistics is not a cure-all nor is it useful
        for all problems. It is not a black box scheme and must be used with
        some care. In some cases, i.e., for some data sets and some objectives,
        a user need only have access to a software package such as GEOEAS,
        G-STAT or even the add-ons available in ARCVIEW, S-PLUS, etc. In other
        cases it may be necessary to seek the advice and assistance of someone
        more experienced and with a stronger understanding of the mathematics/
        statistics in order to adequately apply the geostatistics to the data
        set.

        For a slightly different take on the same general question see a recent book by M.L. Stein

        Donald E. Myers
        http://www.u.arizona.edu/~donaldm


        --
        * To post a message to the list, send it to ai-geostats@...
        * As a general service to the users, please remember to post a summary of any useful responses to your questions.
        * To unsubscribe, send an email to majordomo@... with no subject and "unsubscribe ai-geostats" followed by "end" on the next line in the message body. DO NOT SEND Subscribe/Unsubscribe requests to the list
        * Support to the list is provided at http://www.ai-geostats.org
      Your message has been successfully submitted and would be delivered to recipients shortly.