
Re: AI-GEOSTATS: mysterious kriging output

  • Monica Palaseanu-Lovejoy
    Message 1 of 5 , Mar 9, 2004
      Hi,

      I am also working with soil pollution data, and I have very high
      values very close to very low values, and a highly skewed
      distribution. I am more and more concerned about kriging
      transformed data, since doing so means we believe the data came
      from only one population. But what if they come from 2 different
      populations representing 2 different polluting processes? Even
      more so if we believe there are no gross measurement errors. The
      fact that high values are very close to low values would tell me that
      spatial autocorrelation is violated locally. I would first try to see
      whether the outliers (local and global) represent a different
      population, whether these values cluster, how significant the
      high-low association is, and whether the global Moran's I increases
      if I eliminate the "outliers". Maybe the majority of the data, which
      have a higher spatial autocorrelation, belong to a "better expressed"
      diffusive process (maybe an older one), while the rest of the data,
      which were identified as outliers before, represent a more patchy or
      point-source pollution process which has not had time to diffuse over
      the entire study area (a younger process, maybe?).
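
The Moran's I check suggested above can be sketched as follows. This is only a minimal illustration under stated assumptions: the 5 x 5 grid, the smooth trend, the two hot spots, the distance-band weights, and the names `morans_i`, `hot_spots`, `i_all` and `i_clean` are all hypothetical, and flagging outliers by hand stands in for whatever local/global outlier test is actually used.

```python
import math

def morans_i(coords, values, bandwidth):
    """Global Moran's I with binary distance-band weights:
    w_ij = 1 if 0 < d_ij <= bandwidth, else 0 (an assumed weighting scheme)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    cross = 0.0   # sum over i != j of w_ij * dev_i * dev_j
    w_sum = 0.0   # sum of all weights
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = math.hypot(coords[i][0] - coords[j][0],
                           coords[i][1] - coords[j][1])
            if d <= bandwidth:
                cross += dev[i] * dev[j]
                w_sum += 1.0
    ss = sum(d * d for d in dev)
    if w_sum == 0.0 or ss == 0.0:
        raise ValueError("no neighbours within bandwidth, or constant values")
    return (n / w_sum) * (cross / ss)

# Hypothetical data: a smooth "diffusive" trend on a 5 x 5 grid ...
coords, values = [], []
for x in range(5):
    for y in range(5):
        coords.append((float(x), float(y)))
        values.append(float(x + y))

# ... plus two hand-placed hot spots standing in for a point-source process.
hot_spots = {(2.0, 2.0): 100.0, (0.0, 4.0): 80.0}
for i, c in enumerate(coords):
    if c in hot_spots:
        values[i] = hot_spots[c]

# Moran's I on all data versus on the data with the flagged points removed.
keep = [i for i, c in enumerate(coords) if c not in hot_spots]
i_all = morans_i(coords, values, bandwidth=1.1)
i_clean = morans_i([coords[i] for i in keep],
                   [values[i] for i in keep], bandwidth=1.1)
print(f"Moran's I, all data:         {i_all:.3f}")
print(f"Moran's I, outliers removed: {i_clean:.3f}")
```

An increase in I after removing the flagged points is consistent with, but does not prove, the idea that the remaining data follow a more continuous process; a permutation test would be needed to judge significance.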

      Of course if you have proof that the data came from only one
      population then .... it is a different story.

      I would really appreciate hearing other opinions about these thoughts.

      Thanks,

      Monica

      --
      * To post a message to the list, send it to ai-geostats@...
      * As a general service to the users, please remember to post a summary of any useful responses to your questions.
      * To unsubscribe, send an email to majordomo@... with no subject and "unsubscribe ai-geostats" followed by "end" on the next line in the message body. DO NOT SEND Subscribe/Unsubscribe requests to the list
      * Support to the list is provided at http://www.ai-geostats.org
    • Ruben Roa Ureta
      Message 2 of 5 , Mar 9, 2004
        > Hi,
        >
        > I am working myself with pollution data in soils and i have very high
        > values very close to very low values, and highly skewed
        > distribution. I am more and more concerned with doing kriging on
        > transformed data. This simply means we believe the data came
        > from only one population. But what if it comes from 2 different
        > populations representing 2 different polluting processes? Much
        > more if we do believe there are no gross error measurements. The
        > fact that high values are very close to low values would tell me that
        > the spatial autocorrelation is violated locally. I would try first to see
        > if the outliers (local and global) represent a different population, if
        > these values cluster or not, how significant is the association high-
        > low values, and if the global Moran's I increases if i eliminate the
        > "outliers". Maybe the majority of the data which have a higher
        > spatial autocorrelation belong to a "better expressed" diffusive
        > process, (maybe an older one) while the rest of the data which
        > were identified as outliers before, represent a more patch-y or point
        > source pollution process which didn't have time to diffuse over the
        > entire study area (a younger process, maybe?).

        Exploratory analysis of the frequency distribution of the data (i.e. the
        aggregated, non-spatial, frequency) could reveal the existence of two (or
        more) populations. To evaluate the evidence in favour of such a
        hypothesis, you could compare the hypothesis that the frequency
        distribution is formed by a mixture of two (or more) specified
        distributions versus the hypothesis that it is formed by only one. The
        general topic in statistics is called 'mixture distribution analysis' (not
        to be confused with 'mixture models'). Useful references are:

        Everitt & Hand, 1981, Mixture distribution analysis. Chapman & Hall
        Chen & Chen, 2001, Statistics and Probability Letters 52:125
        Hawkins et al., 2001, Computational Statistics & Data Analysis 38:15
        http://www.math.mcmaster.ca/peter/mix/mix.html

        Some robust regression methods, for example, are based on treating the
        data as coming from a mixture of two distributions, the main one, and a
        contaminating distribution.

        If you conclude that there are two (or more) distributions, then you can
        compute the conditional probability that any given data point
        belongs to each of the two (or more) distributions, and assign the point
        to the distribution with the maximum probability. After this exploratory analysis, you could treat the two
        (or more) populations differently, if there is evidence for a mixture, and
        maybe even perform separate geostatistical analyses on the separate
        populations.
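
The strategy above can be sketched in a few lines. This is a minimal illustration, not the method of any of the cited references: it assumes log-transformed concentrations, normal components, a plain EM fit, and a BIC comparison in place of a formal test. The synthetic data and the names `fit_one`, `fit_two`, `posterior` and `bic` are hypothetical.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def fit_one(xs):
    """Maximum-likelihood fit of a single normal; returns the log-likelihood."""
    n = len(xs)
    mu = sum(xs) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in xs) / n)
    return sum(math.log(normal_pdf(x, mu, sigma)) for x in xs)

def posterior(xs, pi, mu, sigma):
    """Conditional probability that each point belongs to each component."""
    out = []
    for x in xs:
        p = [pi[k] * normal_pdf(x, mu[k], sigma[k]) for k in (0, 1)]
        total = p[0] + p[1]
        out.append([p[0] / total, p[1] / total])
    return out

def fit_two(xs, iters=200):
    """EM for a two-component normal mixture, started from the quartiles."""
    n = len(xs)
    s = sorted(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    pi, mu, sigma = [0.5, 0.5], [s[n // 4], s[3 * n // 4]], [sd, sd]
    for _ in range(iters):
        resp = posterior(xs, pi, mu, sigma)            # E-step
        for k in (0, 1):                               # M-step
            nk = sum(r[k] for r in resp)
            pi[k] = nk / n
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sigma[k] = max(math.sqrt(var), 1e-3)       # guard against collapse
    loglik = sum(math.log(pi[0] * normal_pdf(x, mu[0], sigma[0]) +
                          pi[1] * normal_pdf(x, mu[1], sigma[1])) for x in xs)
    return pi, mu, sigma, loglik

def bic(loglik, n_params, n):
    return -2.0 * loglik + n_params * math.log(n)

# Hypothetical log-concentrations: 150 "background" and 50 "hot" samples.
random.seed(0)
log_conc = ([random.gauss(1.0, 0.4) for _ in range(150)] +
            [random.gauss(4.0, 0.5) for _ in range(50)])

n = len(log_conc)
bic_one = bic(fit_one(log_conc), 2, n)            # mean, sd
pi, mu, sigma, ll_two = fit_two(log_conc)
bic_two = bic(ll_two, 5, n)                       # 2 means, 2 sds, 1 weight

# Classify each point by its maximum conditional probability.
post = posterior(log_conc, pi, mu, sigma)
labels = [0 if p[0] >= p[1] else 1 for p in post]
print(f"BIC one component:  {bic_one:.1f}")
print(f"BIC two components: {bic_two:.1f}")
print("points assigned to component 1:", sum(labels))
```

A lower BIC for the two-component fit is evidence for a mixture in the aggregated, non-spatial distribution only; the resulting labels could then be mapped to see whether the classes also separate in space before running separate geostatistical analyses.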

        I used this general strategy in the analysis of a time series of an index
        of returns from investments in financial markets. The strategy was
        proposed by Hamilton, 1994, Time Series Analysis, Ch. 22, Princeton U. P.

        Ruben

      • Pierre Goovaerts
        Message 3 of 5 , Mar 9, 2004
          Hello,

          I agree that in many environmental datasets we could question the
          assumption that a single population exists. Although there are
          ways to split the data into several populations, the key issue is
          that the study area also needs to be stratified into several populations.
          In some fields, such as geology, geological maps can provide
          a stratification of the study area and help delineate the boundaries
          between populations. This is far less obvious for environmental
          data sets.

          Looking at Noemi's maps, I would agree with Richard's comment that
          nothing seems to be out of the ordinary. Of course, when dealing with
          streams the data configuration is far from optimal and screening effects
          abound. Also, the strong anisotropy ratio means that we deal with
          a "zonal-like" anisotropy which might cause sudden changes of covariance
          for slight differences in angle. In particular, this covariance model
          could lead to very small correlations off the two main axes of anisotropy,
          which could explain the larger kriging variance observed along the
          diagonal directions.
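
The effect of a strong anisotropy ratio on off-axis correlation can be illustrated with a small sketch. The model actually fitted in this thread is not given, so everything here is an assumption: an exponential correlation model with geometric anisotropy, a practical range of 1000 along the major axis and 50 along the minor axis (ratio 20), a lag of 200, and the function name `aniso_correlation`.

```python
import math

def aniso_correlation(lag, angle_deg, range_major, range_minor):
    """Exponential correlation with geometric anisotropy.

    angle_deg is measured from the major axis. The lag is split into
    components along both axes, each scaled by its own practical range,
    and the model is exp(-3 * reduced distance).
    """
    theta = math.radians(angle_deg)
    h_major = lag * math.cos(theta)
    h_minor = lag * math.sin(theta)
    reduced = math.hypot(h_major / range_major, h_minor / range_minor)
    return math.exp(-3.0 * reduced)

# Assumed parameters, for illustration only.
RANGE_MAJOR = 1000.0
RANGE_MINOR = 50.0
LAG = 200.0

correlations = {
    angle: aniso_correlation(LAG, angle, RANGE_MAJOR, RANGE_MINOR)
    for angle in (0, 5, 15, 45, 90)
}
for angle, rho in correlations.items():
    print(f"angle {angle:2d} deg from major axis: correlation {rho:.4f}")
```

Under these assumed parameters the correlation at a fixed lag falls quickly as the direction rotates even a few degrees away from the major axis, so data lying along diagonal directions contribute little information and the kriging variance there is correspondingly larger.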

          Pierre

          <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>

          Dr. Pierre Goovaerts
          President of PGeostat, LLC
          Chief Scientist with Biomedware Inc.
          710 Ridgemont Lane
          Ann Arbor, Michigan, 48103-1535, U.S.A.

          E-mail: goovaert@...
          Phone: (734) 668-9900
          Fax: (734) 668-7788
          http://alumni.engin.umich.edu/~goovaert/

          <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
