[ai-geostats] natural neighbor applied to indicator transforms

  • seba
    Message 1 of 7 , Aug 30, 2005
      Dear list members

      I would welcome comments, suggestions, or criticism on the following topic:
      building a (preliminary) local uncertainty model of the spatial distribution of discrete (categorical) variables by applying natural neighbor interpolation to indicator transforms.

      From my perspective, interpolating indicator variables (after all, an indicator variable is the probability of occurrence of a given class) with a method like natural neighbor is a quick and easy way to build a (preliminary) model of the local uncertainty of the studied properties while avoiding order relation violations.
      In my specific case I apply natural neighbor interpolation to indicator transforms representing lithological classes, in the same way that direct indicator kriging is applied. Looking at the spatial distribution of the probability of occurrence of each lithology (or at the distribution of the lithological classes, if a classification algorithm is applied) gives me a first idea of the spatial distribution of lithologies. Clearly, this method is used only as an exploratory, preliminary data analysis tool.

      Thank you in advance for your replies.
       
      S. Trevisani
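
The workflow described in this message can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the sample labels and the weights are made up, and the weights are assumed to come from a natural neighbor routine that is not implemented here. The sketch shows the indicator transform (one 0/1 vector per sample) and the weighted average that turns neighbouring indicators into class probabilities at an unsampled location.

```python
# Sketch only: class names, sample labels and weights are invented for illustration.

def indicator_transform(labels, classes):
    """Return one 0/1 indicator vector per sample: 1 for the observed class, 0 otherwise."""
    return [[1.0 if label == c else 0.0 for c in classes] for label in labels]


def interpolate_probabilities(weights, indicators):
    """Weighted average of the indicator vectors of the neighbouring samples.

    The weights are assumed to be supplied by a natural neighbor routine
    (not implemented here), so they are non-negative and sum to one.
    """
    n_classes = len(indicators[0])
    return [sum(w * ind[k] for w, ind in zip(weights, indicators))
            for k in range(n_classes)]


classes = ["clay", "sand", "gravel"]
labels = ["sand", "sand", "clay", "gravel"]  # lithology logged at four neighbouring samples
weights = [0.4, 0.3, 0.2, 0.1]               # hypothetical natural neighbor weights

indicators = indicator_transform(labels, classes)
probabilities = interpolate_probabilities(weights, indicators)
print(dict(zip(classes, probabilities)))
```

Because the same weights are applied to every class indicator, the resulting values can be read directly as a local probability distribution over the lithological classes.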
    • seba
      Message 2 of 7 , Aug 31, 2005
        Hi Gregoire
        Well, I think that classification could be viewed as a way of coding information in sampled areas. For soil properties in particular, continuous or fuzzy classification seems to work properly. Then, leaving aside the non-convexity of kriging, we can interpolate before or after performing classification. But after all, classification algorithms are also a regression problem...

        Bye
        Sebastiano

        At 11.25 31/08/2005, Gregoire Dubois wrote:
        I recently attended a presentation about the mapping of soil properties. Kriging was applied, and I was wondering why a regression technique was used instead of a classification algorithm.
        Delineating soil properties seemed, at first sight, to be more of a classification problem than a regression one. That was only a first impression, and unfortunately we didn't debate the issue much.
        Indicator kriging (IK) is somehow a bridge between these two issues (regression versus classification), and its simplicity in use and concept makes it very attractive for solving many problems.
        Now I wonder (again) whether there are any fundamental papers comparing IK to classification algorithms (e.g. Support Vector Machines, SVM). Likewise, using SVM for regression does not seem to be that uncommon. So where is the borderline? When are we facing a classification problem and when is it a regression problem? I am not sure the borderline is always that obvious.

        I am not answering Sebastiano's mail here, but I would be curious to see a debate on "regression versus classification" on this list... I presume there may be some material on the issue Sebastiano raised as well.
         
        Best regards,
         
        Gregoire
      • Nicolas Gilardi
        Message 3 of 7 , Aug 31, 2005
          To answer Gregoire's question about comparisons between SVM and
          Indicator Kriging, here is a very basic paper (from 1999):

          http://baikal-bangkok.org/~nicolas/publi/acai99-svm.pdf

          and a thesis chapter (chapter 6), perhaps more interesting (from 2002):

          http://baikal-bangkok.org/~nicolas/cartann/these_gilardi.pdf

          My personal feeling about the distinction between using a
          classification algorithm or a regression one is the importance you put
          on the boundaries.
          If you look for smooth boundaries, with uncertainty estimations, etc.,
          then a regression algorithm (like indicator kriging) is certainly a good
          approach.
          Now, if you don't care much about how the categories mix together at the
          interface, or if you want clear decision boundaries, then a real
          classification algorithm (like SVM) is certainly a better choice.

          However, it is true that many algorithms can be used in either case,
          often with little or no modification. The best examples are the
          algorithms for density estimation (RBF, Parzen windows...).
          Algorithms in the SVM family (i.e. large margin classifiers) are
          interesting for classification because they concentrate on finding
          a separation between classes, not on finding the "centre" of classes. In my
          opinion, the interest of this technique for regression isn't obvious...

          Best regards,

          Nico

          --
          Nicolas Gilardi

          Particle Physics Experiment group
          University of Edinburgh, JCMB
          Edinburgh EH9 3JZ, United Kingdom

          tel: +44 (0)131 650 5300 ; fax: +44 (0)131 650 7189
          e-mail: ngilardi@... ; web: http://baikal-bangkok.org/~nicolas
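
The distinction drawn in this message, between keeping smooth probabilities with an uncertainty estimate and drawing clear decision boundaries, can be illustrated with a small sketch. The probabilities below are made up, and the use of Shannon entropy as the uncertainty measure is an assumption chosen for illustration, not something proposed in the thread.

```python
import math


def most_probable_class(probabilities):
    """Hard decision: return the class with the highest estimated probability."""
    return max(probabilities, key=probabilities.get)


def shannon_entropy(probabilities):
    """Entropy in bits: 0 when one class is certain, larger when classes are mixed."""
    return -sum(p * math.log2(p) for p in probabilities.values() if p > 0.0)


# Hypothetical estimated class probabilities at two locations.
inside_unit = {"clay": 0.9, "sand": 0.1, "gravel": 0.0}
near_contact = {"clay": 0.4, "sand": 0.35, "gravel": 0.25}

for name, probs in [("inside_unit", inside_unit), ("near_contact", near_contact)]:
    print(name, most_probable_class(probs), round(shannon_entropy(probs), 3))
```

Both locations receive the same hard label, but the probabilities show that the second one lies in a much more uncertain zone, which is the information a crisp classifier discards.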
        • seba
          Message 4 of 7 , Sep 2 1:07 AM

            Let me try to reformulate my question.
            When performing direct indicator kriging (i.e. without cross-variograms), we are in practice interpolating probability values by means of ordinary kriging. These probability values could represent the probability of occurrence of some category or the probability of exceeding some threshold.
            My question is: is there anything wrong with interpolating these probability values with another interpolation algorithm, for example natural neighbor (or triangulation)?
            In my opinion it is all fine, considering also that we have no problem of order relation violations.
            Again, this technique is applied only for preliminary data analysis.

            Then a short remark about the importance of boundaries.
            Quoting Nicolas Gilardi:
            "My personal feeling about the distinction between using a classification algorithm or a regression one is the importance you put on the boundaries. If you look for smooth boundaries, with uncertainty estimations, etc., then a regression algorithm (like indicator kriging) is certainly a good approach."

            Well, if you use fuzzy classification the boundaries become continuous... fuzzy.

            Bye

            S. Trevisani
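
For the threshold case mentioned in this message, the following sketch may help. It assumes made-up measurements, thresholds and weights; the weights stand in for the output of a natural neighbor routine and are only required to be non-negative and to sum to one. The sample values are coded as cumulative indicators, and the same weights are applied at every threshold.

```python
def threshold_indicators(value, thresholds):
    """Cumulative coding: 1 where the sample value is at or below the threshold, else 0."""
    return [1.0 if value <= z else 0.0 for z in thresholds]


def estimate_cdf(weights, values, thresholds):
    """Weighted average of the cumulative indicators, one estimate per threshold."""
    coded = [threshold_indicators(v, thresholds) for v in values]
    return [sum(w * row[k] for w, row in zip(weights, coded))
            for k in range(len(thresholds))]


thresholds = [10.0, 20.0, 30.0]    # increasing cut-offs (made-up units)
values = [8.0, 25.0, 14.0, 31.0]   # made-up measurements at four neighbours
weights = [0.4, 0.3, 0.2, 0.1]     # hypothetical non-negative weights summing to one

cdf = estimate_cdf(weights, values, thresholds)
exceedance = [1.0 - p for p in cdf]  # probability of exceeding each threshold
print(cdf, exceedance)
```

Each sample's cumulative indicators are non-decreasing with the threshold, so a weighted average with one shared set of non-negative weights is non-decreasing too, which is why no order relation correction is needed in this setting.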
          • Gregoire Dubois
            Message 5 of 7 , Sep 5 4:00 AM
              Ciao Sebastiano,
               
              I realized nobody replied to your question (sorry for having added confusion here).

              I don't see any objection to applying any interpolator to probability values.
              However, you had better use exact interpolators to avoid getting probabilities of occurrence > 1 (or smaller than 0).
               
              Cheers
               
              Gregoire
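
A small numerical sketch of how an interpolated probability can leave the interval [0, 1]. The indicator values and both weight sets are made up. The second set sums to one but contains a negative weight, which ordinary kriging can produce; the first set is non-negative and sums to one, the condition Pierre's reply (quoted in message 6) identifies as sufficient.

```python
def weighted_estimate(weights, indicator_values):
    """Linear estimate: weighted sum of the indicator values at the neighbours."""
    return sum(w * i for w, i in zip(weights, indicator_values))


indicator_values = [1.0, 1.0, 0.0]      # class present at two of three neighbours

convex_weights = [0.5, 0.3, 0.2]        # non-negative, sum to one
mixed_sign_weights = [0.8, 0.4, -0.2]   # sum to one, but one weight is negative

convex_estimate = weighted_estimate(convex_weights, indicator_values)
mixed_estimate = weighted_estimate(mixed_sign_weights, indicator_values)

print(convex_estimate)  # stays within [0, 1]
print(mixed_estimate)   # exceeds 1
```

In this example the estimate goes out of range because of the negative weight, not because of any mismatch at the data points.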
               
               
            • seba
              Message 6 of 7 , Sep 5 8:47 AM
                Dear Pierre and Gregoire

                Thank you for your help.
                In conclusion (considering that the natural neighbor method should be a convex and
                exact interpolator), it seems that the approach has no side effects!

                Sincerely
                Sebastiano

                At 17.19 05/09/2005, you wrote:
                >Hi,
                >
                >In fact, as long as the weights are all positive and sum up to one, your interpolated probability will always be between 0 and 1, so you should be all right.
                >The approach proposed by Sebastiano is similar to median indicator kriging in the sense that the weights assigned to the observations will be the same across all indicators (here, instead of a single indicator semivariogram being used to compute the kriging weights, the same weighting set will be applied to all indicators, since the data configuration, and hence the size of the Thiessen polygons, doesn't change among indicators). Because all the weights are positive and remain the same for the different indicators, this approach should eliminate all order relation deviations (all estimated probabilities will be between 0 and 1, and at each location their sum will be one).
                >
                >
                >Pierre
                >
                >* By using the ai-geostats mailing list you agree to follow its rules
                >( see http://www.ai-geostats.org/help_ai-geostats.htm )
                >
                >* To unsubscribe to ai-geostats, send the following in the subject or in
                >the body (plain text format) of an email message to sympa@...
                >
                >Signoff ai-geostats
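
Pierre's argument, quoted above, can be written out as a short derivation. The notation is an assumption introduced here for illustration: $n$ neighbouring samples $x_i$, $K$ mutually exclusive classes, weights $\lambda_i(x)$ at the estimation location $x$, and $i_k(x_i)$ the 0/1 indicator of class $k$ at sample $x_i$, so that $\sum_k i_k(x_i) = 1$ for every sample.

```latex
\begin{align*}
\hat{p}_k(x) &= \sum_{i=1}^{n} \lambda_i(x)\, i_k(x_i),
  \qquad \lambda_i(x) \ge 0, \quad \sum_{i=1}^{n} \lambda_i(x) = 1, \\
0 &\le \hat{p}_k(x) \le 1
  \quad \text{because } i_k(x_i) \in \{0, 1\}, \\
\sum_{k=1}^{K} \hat{p}_k(x)
  &= \sum_{i=1}^{n} \lambda_i(x) \sum_{k=1}^{K} i_k(x_i)
   = \sum_{i=1}^{n} \lambda_i(x) = 1 .
\end{align*}
```

The second line uses only the non-negativity and unit sum of the weights; the third additionally uses the fact that the same weights are applied to every class indicator.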