AI-GEOSTATS: data transformation and variograms

  • Juliann Aukema
    Message 1 of 2 , Apr 19, 2001
      Hi. I have a question about transforming data.

      I have infection prevalence data for many points: the proportion of
      trees infected at each point. Values are between 0 and 1. Sample size
      varies among the points (because the density of trees varies). When I plot
      a variogram of the prevalence data, I get a nice sill for about 4000 meters
      and then a rise in the variogram. If I take the residuals of prevalence
      against elevation, the second rise goes away. Biologically this all makes
      sense and makes a nice story.
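
The two variograms described above (raw prevalence, and residuals of prevalence regressed on elevation) can be sketched as follows. This is only an illustration under stated assumptions: the classical (Matheron) estimator, an ordinary least-squares fit on elevation, and synthetic coordinates and values standing in for the real survey. The function names `empirical_variogram` and `linear_residuals` are hypothetical, not from any package used in the thread.

```python
import numpy as np

def empirical_variogram(coords, values, bin_edges):
    """Classical (Matheron) estimator: half the mean squared difference
    of all point pairs whose separation falls in each distance bin."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)       # every pair once
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sqdiff = (values[i] - values[j]) ** 2
    which = np.digitize(dist, bin_edges) - 1       # bin index per pair
    n_bins = len(bin_edges) - 1
    gamma = np.full(n_bins, np.nan)
    counts = np.zeros(n_bins, dtype=int)
    for b in range(n_bins):
        mask = which == b
        counts[b] = mask.sum()
        if counts[b] > 0:
            gamma[b] = 0.5 * sqdiff[mask].mean()
    return gamma, counts

def linear_residuals(y, x):
    """Residuals of an ordinary least-squares fit y ~ a + b*x."""
    X = np.column_stack([np.ones_like(x, dtype=float), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

# Synthetic stand-in data (assumption): elevation trends west to east,
# and prevalence depends linearly on elevation plus noise.
rng = np.random.default_rng(0)
coords = rng.uniform(0, 10000, size=(200, 2))      # meters
elevation = 0.1 * coords[:, 0] + rng.normal(0, 20, 200)
prevalence = np.clip(0.2 + 0.0005 * elevation + rng.normal(0, 0.05, 200), 0, 1)

bin_edges = np.arange(0, 8001, 1000)               # 1000 m lag bins
gamma_raw, n_pairs = empirical_variogram(coords, prevalence, bin_edges)
residuals = linear_residuals(prevalence, elevation)
gamma_res, _ = empirical_variogram(coords, residuals, bin_edges)
```

With a trend like the synthetic one here, `gamma_raw` keeps rising at long lags while `gamma_res` stays roughly flat, which is the same qualitative pattern as the second rise disappearing once elevation is regressed out.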
      However, for some other analyses that I also did with these data, I
      was advised to logit-transform the prevalence because it is a
      proportion and should be binomially distributed.
      If I plot the variogram of the logit-transformed prevalence, the
      first sill is much less distinct, if it is there at all. This seems to be
      mostly due to one point, the last point before the rise, which now goes up
      instead of being about even with the previous point. (I guess this
      difference is due to the stretching of zero-prevalence values that occurs
      with the logit transformation.) And if I look at smaller lags, it looks
      like a power function with no sill. Biologically, that is harder to
      explain. If I plot the variogram of the residuals of the logit-transformed
      prevalence against elevation, it has a nice sill and is similar to, even
      prettier than, the analysis of the untransformed data (but based on the
      previous variogram, I don't have a very good reason for plotting the
      residuals).
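
The "stretching of zero prevalence values" can be made concrete with a small sketch. The plain logit of a zero proportion is minus infinity, so some adjustment is needed before transforming. The post does not say which one was used; the empirical logit, log((y + 0.5) / (n - y + 0.5)), is assumed here purely for illustration, and the counts below are made up.

```python
import numpy as np

def empirical_logit(infected, n_trees):
    """log((y + 0.5) / (n - y + 0.5)); finite even when y == 0 or y == n."""
    infected = np.asarray(infected, dtype=float)
    n_trees = np.asarray(n_trees, dtype=float)
    return np.log((infected + 0.5) / (n_trees - infected + 0.5))

# Hypothetical counts: infected trees out of sampled trees at five points.
infected = np.array([0, 0, 1, 5, 10])
n_trees = np.array([5, 50, 20, 20, 20])
prevalence = infected / n_trees

# The plain logit is undefined (-inf) wherever prevalence is exactly 0.
with np.errstate(divide="ignore"):
    plain = np.log(prevalence / (1 - prevalence))

adjusted = empirical_logit(infected, n_trees)

# Two points with identical prevalence (0) but different sample sizes
# land far apart on the adjusted logit scale.
gap_between_zeros = adjusted[0] - adjusted[1]
```

Under this adjustment the two zero-prevalence points differ by more than two logit units only because their sample sizes differ (5 versus 50 trees). Squared differences involving such points can then dominate a lag bin, which is one possible reason a single variogram point moves after transformation.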
      My question, then, is whether the logit transformation is necessary
      and/or appropriate for the geostatistical analysis. Does it make sense to
      use the transformed data for both variograms, for just the residuals
      (because the residuals are based on a regression, for which the
      transformation ought to be done), or for neither?
      Thank you very much.

      Juliann
      jaukema@...



      --
      * To post a message to the list, send it to ai-geostats@...
      * As a general service to the users, please remember to post a summary of any useful responses to your questions.
      * To unsubscribe, send an email to majordomo@... with no subject and "unsubscribe ai-geostats" followed by "end" on the next line in the message body. DO NOT SEND Subscribe/Unsubscribe requests to the list
      * Support to the list is provided at http://www.ai-geostats.org
    • Nicholas Lewin-Koh
      Message 2 of 2 , Apr 19, 2001
        Hi,
        I think the problem might be even more subtle. Essentially you are looking
        at a marked point process, and trying to apply methods designed
        principally for data that is continuous throughout the sampling domain.

        I would suggest looking at the following paper:
        Stoyan and Waelder 2000. On variograms in point process statistics,
        II: Models of markings and ecological interpretation. Biometrical Journal
        42(2):171-187.

        Another approach you might think about is spatial CDF estimation. Take a
        look at the work of Cressie and friends.

        Nicholas

        [ASCII-art molecule: "The Real Part of Coffee"]

        Nicholas Lewin-Koh
        Dept of Statistics
        Program in Ecology and Evolutionary Biology
        Iowa State University
        Ames, IA 50011
        http://www.public.iastate.edu/~nlewin
        nlewin@...

        Currently:
        Graphics Lab
        School of Computing
        National University of Singapore
        kohnicho@...
