[ai-geostats] natural neighbor applied to indicator transforms

  • seba
    Message 1 of 7 , Aug 30, 2005
      Dear list members

      I would like to have some comments, suggestions, or criticism on the following topic:
      building a (preliminary) local uncertainty model of the spatial distribution of discrete (categorical) variables by applying the natural neighbor interpolation method to indicator transforms.

      From my perspective, interpolating indicator variables (after all, an indicator variable is the probability of occurrence of a given class) with a method like natural neighbor is a quick and easy way to build a (preliminary) model of the local uncertainty of the studied properties while avoiding order relation violations.
      In my specific case I apply natural neighbor interpolation to indicator transforms representing lithological classes, in the same way that direct indicator kriging is applied. By looking at the spatial distribution of the probability of occurrence of each lithology (or at the distribution of the lithological classes, if a classification algorithm is applied), I can get a first idea of the spatial distribution of lithologies. Clearly, this method is used only as an exploratory, preliminary data analysis tool.

      Thank you in advance for your replies.
       
      S. Trevisani
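
The workflow described in this message can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the borehole coordinates, the class names, and all function names are hypothetical, and linear interpolation inside a single triangle (barycentric weights) is swapped in for natural neighbor interpolation, which is longer to implement. Both methods give non-negative weights that sum to one inside the data hull, which is the property the idea relies on.

```python
# Hypothetical data: three boreholes, each with one observed lithology.
samples = [
    ((0.0, 0.0), "sand"),
    ((4.0, 0.0), "clay"),
    ((0.0, 4.0), "sand"),
]
classes = ["sand", "clay", "gravel"]


def indicator_transform(label, classes):
    """Code a category as a 0/1 vector: 1 for the observed class, 0 otherwise."""
    return [1.0 if label == c else 0.0 for c in classes]


def barycentric_weights(p, a, b, c):
    """Weights of the triangle vertices a, b, c at point p (linear triangulation)."""
    (x, y), (x1, y1), (x2, y2), (x3, y3) = p, a, b, c
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    return [w1, w2, 1.0 - w1 - w2]


def class_probabilities(p, samples, classes):
    """Interpolate every indicator with the same set of weights."""
    coords = [xy for xy, _ in samples]
    weights = barycentric_weights(p, *coords)
    indicators = [indicator_transform(label, classes) for _, label in samples]
    return [
        sum(w * ind[k] for w, ind in zip(weights, indicators))
        for k in range(len(classes))
    ]


# Estimated probability of each lithology at an unsampled location.
probs = class_probabilities((1.0, 1.0), samples, classes)
print(dict(zip(classes, probs)))
```

For a point inside the triangle, the estimated probabilities stay within [0, 1] and sum to one, so no order relation correction is needed afterwards.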
    • Gregoire Dubois
      Message 2 of 7 , Aug 31, 2005
        I recently attended a presentation about the mapping of soil properties. Kriging was applied, and I was wondering why a regression technique was used instead of a classification algorithm.
        Delineating soil properties seemed, at first sight, to be more of a classification problem than a regression one. That was only a first impression, and unfortunately we didn't debate the issue much.
        Indicator kriging (IK) is in some ways a bridge between the two (regression versus classification), and its simplicity in use and concept makes it very attractive for solving many problems.
        Now I wonder (again) whether there are some fundamental papers comparing IK to classification algorithms (e.g. Support Vector Machines, SVM). Likewise, using SVM for regression does not seem that uncommon either. So where is the borderline? When are we facing a classification problem and when is it a regression problem? I am not sure the borderline is always that obvious.
         
        I am not answering Sebastiano's mail here, but I would be curious to see a debate on "regression versus classification" on this list... I presume there may be some material as well regarding the issue Sebastiano raised.
         
        Best regards,
         
        Gregoire
      • seba
        Message 3 of 7 , Aug 31, 2005
          Hi Gregoire
          Well, I think that classification can be viewed as a way of coding information in sampled areas. For soil properties in particular, continuous or fuzzy classification seems to work well. Then, leaving aside the non-convexity of kriging, we can interpolate either before or after performing classification. But after all, classification algorithms are also a regression problem...

          Bye
          Sebastiano

        • Nicolas Gilardi
          Message 4 of 7 , Aug 31, 2005
            To answer Gregoire's question, for some comparisons between SVM and
            Indicator Kriging, here is a very basic paper (from 1999):

            http://baikal-bangkok.org/~nicolas/publi/acai99-svm.pdf

            and a thesis chapter (chapter 6), perhaps more interesting (from 2002):

            http://baikal-bangkok.org/~nicolas/cartann/these_gilardi.pdf

            My personal feeling about the distinction between using a
            classification algorithm or a regression one is the importance you put
            on the boundaries.
            If you look for smooth boundaries, with uncertainty estimations, etc.,
            then a regression algorithm (like indicator kriging) is certainly a good
            approach.
            Now, if you don't care much about how the categories mix together at the
            interface, or if you want clear decision boundaries, then a real
            classification algorithm (like SVM) is certainly a better choice.

            However, it is true that many algorithms can be used in either case,
            often with little or no modification. The best examples are the
            algorithms for density estimation (RBF, Parzen windows...).
            Algorithms in the SVM family (i.e. large margin classifiers) are
            interesting for classification because they concentrate on finding
            a separation between classes, not on finding the "centre" of each class.
            In my opinion, the value of this technique for regression isn't obvious...

            Best regards,

            Nico

            --
            Nicolas Gilardi

            Particle Physics Experiment group
            University of Edinburgh, JCMB
            Edinburgh EH9 3JZ, United Kingdom

            tel: +44 (0)131 650 5300 ; fax: +44 (0)131 650 7189
            e-mail: ngilardi@... ; web: http://baikal-bangkok.org/~nicolas
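
The contrast drawn in this message, smooth boundaries with uncertainty versus clear decision boundaries, can be illustrated with a small Python sketch. Everything here is hypothetical: the probabilities along the transect are made-up values standing in for an interpolated indicator, and the hard labels come from a plain maximum-probability rule, not from an SVM.

```python
import math

classes = ["sand", "clay"]

# Hypothetical interpolated probabilities of "sand" along a transect
# that crosses the contact between the two lithologies.
p_sand = [1.0, 0.8, 0.6, 0.4, 0.2, 0.0]


def hard_label(p):
    """Classification view: keep only the most probable class."""
    return classes[0] if p >= 0.5 else classes[1]


def entropy(p):
    """Shannon entropy in bits of a two-class probability.

    0 means the class is certain; 1 means both classes are equally likely.
    """
    return -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)


labels = [hard_label(p) for p in p_sand]        # sharp boundary
uncertainty = [entropy(p) for p in p_sand]      # smooth transition zone

for p, label, h in zip(p_sand, labels, uncertainty):
    print(f"P(sand)={p:.1f}  label={label:<4}  entropy={h:.3f}")
```

The labels switch abruptly between the third and fourth locations, while the entropy rises gradually towards the contact. Which of the two outputs is more useful depends on how much the boundary zone matters for the problem at hand.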
          • seba
            Message 5 of 7 , Sep 2, 2005

              Let me try to reformulate my question...
              When performing direct indicator kriging (i.e. without cross-variograms), in practice we interpolate probability values by means of ordinary kriging. These probability values can represent the probability of occurrence of some category or the probability of exceeding some threshold.
              My question is: is there anything wrong with interpolating these probability values with another interpolation algorithm, for example natural neighbor (or triangulation)?
              In my opinion it is all OK, considering also that we have no problem of order relation violations.
              Again, this technique is applied only for preliminary data analysis.

              Then a short remark about the importance of boundaries.
              Quoting Nicolas Gilardi:
              "My personal feeling about the distinction between using a classification algorithm or a regression one is the importance you put on the boundaries. If you look for smooth boundaries, with uncertainty estimations, etc., then a regression algorithm (like indicator kriging) is certainly a good approach."

              Well, if you use fuzzy classification the boundaries become continuous... fuzzy.

              Bye

              S. Trevisani
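
For the threshold case mentioned in this message, a minimal Python sketch may help. The measurements, weights, and thresholds below are hypothetical; the weights stand for whatever a convex interpolator (natural neighbor, triangulation) would assign to four neighbouring samples at one target location.

```python
# Hypothetical measurements at four neighbouring samples and the
# non-negative, sum-to-one weights assigned to them at a target location.
values = [2.1, 3.7, 5.0, 8.4]
weights = [0.4, 0.3, 0.2, 0.1]
thresholds = [3.0, 5.0, 9.0]


def exceedance_indicator(value, threshold):
    """1 if the measurement exceeds the threshold, 0 otherwise."""
    return 1.0 if value > threshold else 0.0


def exceedance_probabilities(values, weights, thresholds):
    """Interpolate each threshold indicator with the same weight set."""
    return [
        sum(w * exceedance_indicator(v, t) for v, w in zip(values, weights))
        for t in thresholds
    ]


probs = exceedance_probabilities(values, weights, thresholds)
for t, p in zip(thresholds, probs):
    print(f"P(Z > {t}) = {p:.2f}")
```

Because the same non-negative weights are used for every threshold, the estimated exceedance probability cannot increase as the threshold increases, which is the order relation that separately kriged indicators can violate.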
            • Gregoire Dubois
              Message 6 of 7 , Sep 5, 2005
                Ciao Sebastiano,
                 
                I realized nobody replied to your question (sorry for having added confusion here).
                 
                I don't see any objection to applying any interpolator to probability values.
                However, you had better use exact interpolators to avoid getting probabilities of occurrence > 1 (or smaller than 0).
                 
                Cheers
                 
                Gregoire
                 
                 
              • seba
                Message 7 of 7 , Sep 5, 2005
                  Dear Pierre and Gregoire

                  Thank you for your help.
                  In conclusion (considering that the natural neighbor method should be a convex and
                  exact interpolator), it seems that the approach has no side effects!

                  Sincerely
                  Sebastiano

                  At 17.19 05/09/2005, you wrote:
                  >Hi,
                  >
                  >In fact, as long as the weights are all positive and sum up to one, your
                  >interpolated probability will always be between 0 and 1, so you should be
                  >all right.
                  >The approach proposed by Sebastiano is similar to median indicator kriging
                  >in the sense that the weights assigned to the observations will be the same
                  >across all indicators (here, instead of a single indicator semivariogram
                  >used to compute the kriging weights, the same weighting set will be applied
                  >to all indicators, since the data configuration, hence the size of the
                  >Thiessen polygons, doesn't change among indicators). Because all the weights
                  >are positive and remain the same for the different indicators, this approach
                  >should eliminate all order relation deviations (all estimated probabilities
                  >will be between 0 and 1, and at each location their sum will be one).
                  >
                  >
                  >Pierre
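
Pierre's argument can be checked numerically with a short Python sketch. The indicator vectors and both weight sets are hypothetical. The second set sums to one but contains a negative weight, as kriging weights sometimes do; it is included only to show what the positivity condition protects against.

```python
def interpolate_indicators(weights, indicator_rows):
    """Apply one shared weight set to every indicator (one column per class)."""
    n_classes = len(indicator_rows[0])
    return [
        sum(w * row[k] for w, row in zip(weights, indicator_rows))
        for k in range(n_classes)
    ]


# Hypothetical indicator vectors of four neighbouring observations, three classes.
# Each row has exactly one 1, because each sample belongs to exactly one class.
indicators = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
]

convex_weights = [0.4, 0.3, 0.2, 0.1]   # all >= 0 and sum to 1
signed_weights = [0.7, 0.5, 0.0, -0.2]  # sum to 1, but one weight is negative

p_convex = interpolate_indicators(convex_weights, indicators)
p_signed = interpolate_indicators(signed_weights, indicators)

print("convex weights:", p_convex)
print("signed weights:", p_signed)
```

With the convex weights every estimate lies in [0, 1] and the estimates sum to one. With the signed weights the sum is still one, because the weights are shared across indicators, but one class receives a negative "probability", which is an order relation violation.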