
GEOSTATS: Cross-validation, Correlation, and Jackknifing

  • Tom Mueller
    Message 1 of 3, Dec 7, 1998
      Dr. Myers did a nice job of listing different statistics that can be
      computed from cross-validation. He offers as one possible measure the
      correlation between predicted and measured values. This, in my opinion, is a
      poor measure of interpolation error. A large correlation coefficient with a
      small slope is worthless. Also, where the regression line intersects the 1:1
      line is important. This measure has caused many poor assessments of
      interpolation quality and in my opinion has been used too much in the
      literature. The RMSE is good because it is a scaled measure of both
      precision and overall interpolation accuracy. Scaled in the sense that the
      units are in the same units as your variable so there is some basis for
      interpretation (e.g. mg kg-1). The average error (bias) is useful because
      it lets you test the "unbiased" part of BLUE (best linear unbiased
      estimator). There is also prediction efficiency,
      which is a comparison of the MSE of two prediction surfaces. But this
      measure I find difficult to interpret because of the assumption of a squared
      loss function.
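
      As a minimal sketch of the statistics discussed above, the Python below
      computes the mean error (bias), RMSE, correlation, and the slope and 1:1
      crossing of the regression of predicted on measured values. The function
      names and the pH numbers are made up for illustration, and the prediction
      efficiency formula (percent reduction in MSE relative to a reference
      predictor) is one common form assumed here, not something specified in
      this thread.

```python
import numpy as np

def validation_stats(measured, predicted):
    """Summary statistics for predicted vs. measured values (hypothetical helper)."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    err = predicted - measured
    # Least-squares line: predicted = slope * measured + intercept
    slope, intercept = np.polyfit(measured, predicted, 1)
    r = np.corrcoef(measured, predicted)[0, 1]
    # Value at which the regression line crosses the 1:1 line (undefined if slope == 1)
    if np.isclose(slope, 1.0):
        crossing = float("nan")
    else:
        crossing = intercept / (1.0 - slope)
    return {
        "mean_error": float(err.mean()),           # bias, in the units of the variable
        "rmse": float(np.sqrt(np.mean(err ** 2))),  # also in the units of the variable
        "r": float(r),
        "slope": float(slope),
        "intercept": float(intercept),
        "crossing_1to1": float(crossing),
    }

def prediction_efficiency(measured, predicted, reference):
    """Percent reduction in MSE of `predicted` relative to `reference` (assumed form)."""
    measured = np.asarray(measured, dtype=float)
    mse = np.mean((np.asarray(predicted, dtype=float) - measured) ** 2)
    mse_ref = np.mean((np.asarray(reference, dtype=float) - measured) ** 2)
    return float(100.0 * (1.0 - mse / mse_ref))

# Illustrative (made-up) pH values: predictions are perfectly correlated with
# the measurements but heavily smoothed toward the mean.
measured = np.array([4.5, 5.0, 5.5, 6.0, 6.5, 7.0])
predicted = 5.75 + 0.1 * (measured - 5.75)
stats = validation_stats(measured, predicted)
# Reference predictor: the sample mean everywhere
pe = prediction_efficiency(measured, predicted, np.full_like(measured, measured.mean()))
print(stats)
print("prediction efficiency (%):", pe)
```

      In this made-up case the correlation is 1 even though the slope is only
      0.1 and the RMSE is about 0.77 pH units, which illustrates why the
      correlation alone says little about interpolation error.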

      It would be great if the geostatistics community could come up with
      standards for expressing map error. I've spent time reviewing the
      literature and found it hard to compare the results of one author to another
      because of this lack of standardization. But geostatistics has been bad
      with standards (e.g. semivariogram model parameterization, or should I say
      variogram model).

      Anyway, let me suggest "jack-knifing" with an independent validation data
      set as a superior approach to cross-validation, especially with a regularly
      gridded data set. But N should be large.
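
      A rough sketch of validation against an independent data set, assuming
      the validation points are simply a random subset withheld from
      interpolation. The inverse distance weighted predictor, the function
      names, the 30% split, and the synthetic data are all illustrative
      assumptions; any interpolator with the same call signature could be
      substituted.

```python
import numpy as np

def idw_predict(xy_train, z_train, xy_new, power=2.0):
    """Inverse distance weighted prediction at xy_new from the training points."""
    xy_train = np.asarray(xy_train, dtype=float)
    z_train = np.asarray(z_train, dtype=float)
    xy_new = np.asarray(xy_new, dtype=float)
    # Pairwise distances, shape (n_new, n_train)
    d = np.linalg.norm(xy_new[:, None, :] - xy_train[None, :, :], axis=2)
    d = np.maximum(d, 1e-12)  # avoid division by zero at coincident points
    w = 1.0 / d ** power
    return (w @ z_train) / w.sum(axis=1)

def holdout_validation(xy, z, predict, frac_valid=0.3, seed=0):
    """Withhold a random validation subset, predict it from the rest, score it."""
    xy = np.asarray(xy, dtype=float)
    z = np.asarray(z, dtype=float)
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(z))
    n_valid = int(round(frac_valid * len(z)))
    valid, train = idx[:n_valid], idx[n_valid:]
    z_hat = predict(xy[train], z[train], xy[valid])
    err = z_hat - z[valid]
    return {
        "n_valid": n_valid,
        "mean_error": float(err.mean()),
        "rmse": float(np.sqrt(np.mean(err ** 2))),
    }

# Synthetic example: 200 random locations with a gentle trend plus noise
rng = np.random.default_rng(42)
xy = rng.uniform(0, 100, size=(200, 2))
z = 6.0 + 0.01 * xy[:, 0] + rng.normal(0, 0.1, size=200)
result = holdout_validation(xy, z, idw_predict)
print(result)
```

      The validation points never enter the interpolation, so the errors are
      not influenced by the sample layout used to build the map. The cost is
      that fewer points remain for interpolation, which is why N needs to be
      large.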

      Give me a quantitative measure of accuracy over only just a qualitative
      hunch of "reasonableness of the eventual spatial surface", any day of the
      week. I work with the spatial variability of soil properties and I'm not
      sure how you can look at an interpolated pH map and assess its quality.

      I'm not an expert, just a geostatistics neophyte trying to make accurate soil
      property maps. I'd appreciate any feedback.

      Thanks,

      Tom Mueller
      muelle26@...


      --
      *To post a message to the list, send it to ai-geostats@....
      *As a general service to list users, please remember to post a summary
      of any useful responses to your questions.
      *To unsubscribe, send email to majordomo@... with no subject and
      "unsubscribe ai-geostats" in the message body.
      DO NOT SEND Subscribe/Unsubscribe requests to the list!
    • Syed Abdul
      Message 2 of 3, Dec 7, 1998
        > Give me a quantitative measure of accuracy over only just a qualitative
        > hunch of "reasonableness of the eventual spatial surface", any day of the
        > week. I work with the spatial variability of soil properties and I'm not
        > sure how you can look at an interpolated pH map and assess its quality.

        This is fine, if you're only working with soil
        properties and samples cost about as much as a Big Mac.
        In the extreme case, for example in an offshore
        environment, two or three delineation wells are all
        that one might afford (e.g., USD 30 million), and
        the subsurface description might have to be built
        deterministically, assuming different "scenarios",
        and incorporating dense (but "soft") information.
        And if one can afford such luxury, perhaps a Boolean
        simulation of sand-shale distributions can be built,
        the basis of which will be a dynamic fluid flow
        model and the detailed design of a USD 500 million
        jackup platform. Whatever the case, samples
        will generally be sparse, and variograms ill-defined,
        and one would have to resort to deterministically
        derived "scenarios" to assess the uncertainty. I think
        the essence is that what might be applicable in one
        situation (e.g. remote sensing, with thousands of
        samples) might hardly be applicable in another
        (oil exploration and reservoir delineation). One can
        generalize, but one will still be hard pressed to
        come up with a "quantitative measure of accuracy"
        when a strong deterministic component comes into
        focus in such a situation. The general impetus
        has always been to quantify such related
        uncertainties, which has been the focus of petroleum
        geostatistics within the last couple of years, but I do not see it
        as "replacing" old fashioned geology; rather, one
        would complement the other, with varying degrees of
        emphasis in different phases of the project (e.g.
        whether exploration, or mature producing areas,
        or fields near abandonment). I think the real danger
        is assuming that you've covered "all bases" with
        the kriged variance or mapped posterior PDFs, while
        ignoring a larger scale (first-order) uncertainty
        that might have to be derived subjectively, or taken
        into account using analogous data from a similarly
        explored basin.

        Syed

      • Tom Mueller
        Message 3 of 3, Dec 9, 1998
          Syed,

          You made a good point about the different costs of sampling.

          But perhaps I don't understand your example about offshore drilling. How
          can you talk about geostatistics with n = 2 or 3? You are talking about
          applying deterministic models to predict, not geostatistics, right?

          One other point. The reason I'm so concerned with map errors is that in my
          research I've been finding, for a number of soil variables, that ordinary
          kriging with second-order stationary data (fairly well structured) yields
          very inaccurate maps, and that inverse distance weighting gave better
          results than ordinary kriging most of the time. I didn't expect this.
          Today, I would not trust an interpolated map without seeing some
          quantification of map accuracy.
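
          One way such a quantification might look for inverse distance
          weighting is leave-one-out cross-validation scored by RMSE. This is
          only a sketch: the function names, the synthetic data, and the
          candidate power values are assumptions for illustration, and the
          code assumes no two samples share the same location.

```python
import numpy as np

def idw_loo_predictions(xy, z, power=2.0):
    """Leave-one-out IDW: predict each point from all the other points."""
    xy = np.asarray(xy, dtype=float)
    z = np.asarray(z, dtype=float)
    # Pairwise distances between all sample locations
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    # An infinite self-distance gives each point zero weight in its own prediction
    np.fill_diagonal(d, np.inf)
    w = 1.0 / d ** power
    return (w @ z) / w.sum(axis=1)

def rmse(measured, predicted):
    """Root mean squared error, in the units of the variable."""
    diff = np.asarray(predicted, dtype=float) - np.asarray(measured, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

# Synthetic example: 150 random locations, smooth signal plus noise
rng = np.random.default_rng(1)
xy = rng.uniform(0, 100, size=(150, 2))
z = 6.0 + 0.5 * np.sin(xy[:, 0] / 15.0) + rng.normal(0, 0.1, size=150)

# Score a few candidate IDW powers on the same data
for p in (1.0, 2.0, 4.0):
    print("power", p, "LOO RMSE", rmse(z, idw_loo_predictions(xy, z, power=p)))
```

          Leave-one-out predictions from a kriging routine could be passed to
          the same rmse function, which would put the two interpolators on a
          common footing.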

          Tom

          "At the extreme case, for example in an offshore
          environment, two or three delineation wells are all
          that one might afford (e.g., USD 30 million), and
          the subsurface description might have to be built
          deterministically, assuming different "scenarios",
          and incorporating dense (but "soft") information.
          And if one can afford such luxury, perhaps a Boolean
          simulation of sand-shale distributions can be built,
          the basis of which will be a dynamic fluid flow
          model and the detailed design of a USD 500 million
          dollar jackup platform. Whatever the case, samples
          will generally be sparse, and variograms ill-defined,
          and one would have to resort to deterministically
          derived "scenarios" to assess the uncertainty."
