
R: AI-GEOSTATS: Extreme values?

  • claudio.cocheo
    Message 1 of 10 , Dec 14, 2001
      Dear Chaosheng,

      > Thanks. I think the sampling density is good enough to reveal the spatial
      > structure, and the extreme samples are located within the "hot spots". The
      > problem is that the few values are still extremely high within the "hot
      > spots". This may be what the "nugget effect" means.
      >
      > I'm just wondering if these few extreme values should really be
      > "discarded"/
      > "censored" or replaced. However, this could get some criticism as they may
      > be "real".

      In your opinion, would it be possible to model your variogram excluding those
      few extreme values and then krige all the data, extreme values included?
      That way you would probably lose some spatial information about the
      variability of your data, but you could obtain a more reliable picture of the
      "background" values. It depends on what you are asking of your data.
      What do you, or anybody else, think?
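The first half of this suggestion (fit the variogram without the extremes) can be sketched as follows. This is only an illustration under stated assumptions: the data are synthetic (a 15 x 15 grid at 400 m spacing with a lognormal background and three artificial extremes of 20000 mg/kg), the 98th-percentile cutoff is one of the thresholds mentioned in the thread rather than a recommendation, and `experimental_variogram` is a hypothetical helper written here, not a function from any geostatistics package. Fitting a model to the trimmed variogram and kriging all 225 values with it would still have to be done in the software of your choice.

```python
import numpy as np

def experimental_variogram(coords, values, lag_width, n_lags):
    """Classical semivariogram: mean of 0.5 * (z_i - z_j)^2 per distance bin.

    Bin k collects pairs whose separation lies in (k * lag_width, (k + 1) * lag_width].
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)            # every unordered pair once
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    half_sq_diff = 0.5 * (values[i] - values[j]) ** 2
    bins = np.ceil(dist / lag_width).astype(int) - 1     # bin index of each pair
    gamma = np.full(n_lags, np.nan)
    for k in range(n_lags):
        in_bin = bins == k
        if in_bin.any():
            gamma[k] = half_sq_diff[in_bin].mean()
    return gamma

# Synthetic stand-in for the survey (assumption: not the real data set).
rng = np.random.default_rng(0)
gx, gy = np.meshgrid(np.arange(15) * 400.0, np.arange(15) * 400.0)
coords = np.column_stack([gx.ravel(), gy.ravel()])
values = rng.lognormal(mean=5.0, sigma=1.0, size=len(coords))
values[[10, 87, 160]] = 20000.0                          # three artificial extremes

# Exclude everything above the 98th percentile when computing the variogram only.
cutoff = np.quantile(values, 0.98)
keep = values <= cutoff

gamma_all = experimental_variogram(coords, values, lag_width=400.0, n_lags=6)
gamma_trimmed = experimental_variogram(coords[keep], values[keep],
                                       lag_width=400.0, n_lags=6)

for k, (g_all, g_trim) in enumerate(zip(gamma_all, gamma_trimmed)):
    print(f"lag bin {k}: all data {g_all:.3g}, extremes excluded {g_trim:.3g}")
```

In this toy case the few extremes dominate every lag bin of the full-data variogram, which is the effect the trimmed variogram is meant to avoid.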

      regards
      Claudio

      ----------------------------------------------------------------------------

      Claudio Cocheo
      Fondazione Salvatore Maugeri - IRCCS
      Centro di Ricerche Ambientali
      via Svizzera, 16
      I 35127 - Padova
      ph. (39) 0498064511
      fax (39) 0498064555
      mailto:ccocheo@...
      website: http://www.fsm.it


      --
      * To post a message to the list, send it to ai-geostats@...
      * As a general service to the users, please remember to post a summary of any useful responses to your questions.
      * To unsubscribe, send an email to majordomo@... with no subject and "unsubscribe ai-geostats" followed by "end" on the next line in the message body. DO NOT SEND Subscribe/Unsubscribe requests to the list
      * Support to the list is provided at http://www.ai-geostats.org
    • Marcel Vallée
      Message 2 of 10 , Dec 14, 2001
        Dear Chaosheng Zhang

        This problem can be looked at from various perspectives. You have to fit the data into the
        broader picture and objectives.

        First, what do your soil samples represent? How were they collected, what was their size? Are
        they spot samples, multiple takes in a cross pattern with x metres between takes up to y
        meters away from the centre? Etc.?

        A significant part of the nugget effect when dealing with rock or soil materials may be
        generated by sampling and sample preparation. If these samples were assayed by AA, what
        was the size of the portion used? A one-gram portion is much more liable to generate a
        nugget effect than 5 or 10 grams whenever the pulverisation size was not fine and uniform enough.
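The effect of the assayed portion size can be illustrated with a deliberately simple model. Assume (and this is an assumption of the sketch, not something stated in the thread) that the metal sits in rare discrete grains and that the number of grains landing in a portion follows a Poisson distribution with a mean proportional to the portion mass. The relative standard deviation of the grain count is then 1/sqrt(mean). The figure of 2 grains per gram is hypothetical, and the function name `relative_sd` is made up for this example.

```python
import math

def relative_sd(grains_per_gram, mass_g):
    """Relative standard deviation of a Poisson grain count in a portion of mass_g grams."""
    expected_grains = grains_per_gram * mass_g
    return 1.0 / math.sqrt(expected_grains)

GRAINS_PER_GRAM = 2.0   # hypothetical abundance of metal-bearing grains

for mass in (1.0, 5.0, 10.0):
    rsd = relative_sd(GRAINS_PER_GRAM, mass)
    print(f"{mass:4.0f} g portion: relative SD of grain count = {rsd:.2f}")
```

Under this toy model, going from a 1 g to a 10 g portion shrinks the relative scatter by a factor of sqrt(10), about 3.2, whatever grain abundance is assumed. Real materials also depend on grain size and liberation, which this sketch ignores.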

        Second, what is the purpose of your study? Academic work? Detection,
        remediation-restoration, etc.? The high values might have physical significance in the latter
        perspective, and smoothing them may not be the ideal solution. Lead and arsenic contamination
        cannot be neglected or minimized.

        From an industry or regulatory perspective, the recommendation in that case might be to carry
        out additional sampling around the hot spots to delineate them better, say samples at 100 m
        spacing, as well as checking the original hot spots, with a sampling method designed to be
        representative. I am afraid I may not be easing you out of your problem, but such is physical
        reality.

        Chapter 8 in Jeff Myers's book "Geostatistical Error Management" deals with sampling and
        Chapter 16 with sampling strategy. I published a text on "Sampling Quality Control" in a
        mineral exploration and development perspective in Exploration and Mining Geology, Vol 7,
        No 1-2, p. 107-116 (1998). This issue has several other papers on sampling. If it is not
        available to you, I could send you a file copy of my paper.

        Cheers

        Marcel Vallée

        Geoconseil Marcel Vallée Inc.
        706 Routhier Ave
        Québec, Québec G1X 3J9
        Canada
        Tel: (1) 418 652 3497
        Fax: (1) 418 652 9148
        Email: vallee.marcel@...

        ================================================

        14/12/01 06:33:35, Chaosheng Zhang <Chaosheng.Zhang@...> wrote:

        >Dear Marcel Vallée,
        >
        >Thanks. I think the sampling density is good enough to reveal the spatial
        >structure, and the extreme samples are located within the "hot spots". The
        >problem is that the few values are still extremely high within the "hot
        >spots". This may be what the "nugget effect" means.
        >
        >I'm just wondering if these few extreme values should really be "discarded"/
        >"censored" or replaced. However, this could get some criticism as they may
        >be "real".
        >
        >If it is hard to find the best way, I will have to "replace" all the extreme
        >values with 99% or 98% percentiles. But I'm not sure if it is appropriate to
        >do so.
        >
        >Cheers,
        >
        >Chaosheng Zhang
        >
        >
        >----- Original Message -----
        >From: "Marcel Vallée" <vallee.marcel@...>
        >To: <ai-geostats@...>; "Chaosheng Zhang" <Chaosheng.Zhang@...>
        >Sent: Thursday, December 13, 2001 10:40 PM
        >Subject: Re: AI-GEOSTATS: Extreme values?
        >
        >
        >>
        >> Dear Chaosheng Zhang
        >>
        >> The sampling interval is so wide that the high values could easily be
        >> related to "hot spots" of higher grade contamination, i.e. dumping areas
        >> for particular kinds of slags, mineralized waste, etc. A property map
        >> might help.
        >>
        >> Have you contoured the data? If so, the sampling interval is so wide that
        >> real hot spots of environmental significance might not show a 2D
        >> distribution on such a wide sampling grid, however.
        >>
        >> Regards
        >>
        >> Marcel Vallée, Eng., Geo.
        >> Geoconseil Marcel Vallée Inc.
        >> 706 Routhier Ave
        >> Québec, Québec G1X 3J9
        >> Canada
        >> Tel: (1) 418 652 3497
        >> Fax: (1) 418 652 9148
        >> Email: vallee.marcel@...
        >>
        >> ==============================================
        >> 13/12/01 08:01:48, Chaosheng Zhang <Chaosheng.Zhang@...> wrote:
        >> >
        >> > Date: Thu, 13 Dec 2001 13:01:48 +0000
        >> >
        >> > From: Chaosheng Zhang <Chaosheng.Zhang@...>
        >> > Subject:AI-GEOSTATS: Extreme values?
        >> > To: ai-geostats@...
        >> >
        >> > Dear all,
        >> >
        >> > My question is: How to deal with the extreme/outlying values in a data
        >> > set?
        >> >
        >> > I am dealing with heavy metal concentrations in soils from a mine area.
        >> > The sample number is 223, and the samples are spatially evenly distributed
        >> > with the sampling interval of 400 metres. There are several samples with
        >> > extremely high values, which makes me feel uncomfortable. The
        >> > percentiles of the dataset are listed as follows (in mg/kg):
        >> >
        >> >
        >> >          Zn     Cu     Pb     Cd     As
        >> > Min       4      1     25    0.0      2
        >> > 5%       35      6     35    0.1      6
        >> > 10%      40      7     41    0.2      7
        >> > 25%      65     13     62    0.3      9
        >> > 50%     122     18    168    0.6     15
        >> > 75%     338     27    821    1.5     28
        >> > 90%     907     56   2799    2.8     58
        >> > 95%    1986    116   4490    4.2     80
        >> > 96%    2462    151   4698    4.9     82
        >> > 97%    3493    178   5413    6.2     91
        >> > 98%    4697    207   7609    8.3    111
        >> > 99%    6712    247  11750   12.4    184
        >> > Max   11473   1293  16305   48.5   1060
        >> >
        >> > When doing geostatistical and statistical analyses, we need some confidence
        >> > in dealing with these very high extreme values which account for less
        >> > than 2% of the total sample number.
        >> >
        >> > Any suggestions?
        >> >
        >> > Cheers,
        >> >
        >> > Chaosheng Zhang
        >> > ===================================
        >> > Dr. Chaosheng Zhang
        >> > Department of Geography
        >> > National University of Ireland
        >> > Galway
        >> > IRELAND
        >> >
        >> > Tel: +353-91-524411 ext. 2375
        >> > Fax: +353-91-525700
        >> > Email: Chaosheng.Zhang@...
        >> > ===================================
      • Myers, Jeff
        Message 3 of 10 , Dec 14, 2001
          Chaosheng Zhang -

          I think Marcel Vallée is headed in the right direction on your problem.
          There is a good chance that the problem is one of sample and/or subsample
          support. As mentioned, if you sampled within a foot or two of a location
          that displays an extreme or "outlier" value, you may find values an order of
          magnitude or more below the outlier. Similarly, you may also have
          "inliers", where a sample near a location with a low concentration may
          contain a significantly higher value. Of course, no one gets excited about
          the inliers that may be unrepresentative, but we get very excited about the
          outliers!

          The possibility of extreme values should be planned for in the initial stage
          of the sampling program. Pierre Gy's work has revealed that the physical
          size, volume, and orientation of a sample and subsample (i.e. the support)
          are crucial to the concentration estimate obtained. You are asking a lot to
          have a 10-g sample represent 400 meters between sample locations in any
          case. Unless the support of the original sample and all subsampling stages
          was sufficient, there is little chance that the samples are highly
          representative of the true concentration. Mine areas typically are very
          heterogeneous, and proper sample support is essential when sampling them.
          Perhaps you can provide some details. If the underlying data are not
          representative due to improper support, you are trying to "contour an
          illusion", and typically the results are not pleasing.

          The way in which the data are used in decision-making is also important.
          For instance, if your purpose is to delineate hot spots for risk assessment,
          extreme values do not pose a problem as they will be addressed. You may,
          however, be very interested in getting your best information at an economic
          cutoff value or risk threshold, since the decision for treatment of values
          high above or way below the action level is easy.
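The point about decisions being easy far from the action level can be made concrete with a small sketch. Everything numeric here other than the Pb values is an assumption for illustration: the action level of 1000 mg/kg and the plus-or-minus 50% relative uncertainty are hypothetical, and `classify` is a made-up helper, not part of any regulation or package. The Pb values are the median, 75th percentile, and maximum from the table posted in this thread.

```python
def classify(estimate, action_level, rel_uncertainty=0.5):
    """Compare an estimate, with a symmetric relative uncertainty band, to an action level."""
    lower = estimate * (1.0 - rel_uncertainty)
    upper = estimate * (1.0 + rel_uncertainty)
    if lower > action_level:
        return "above"        # whole band exceeds the action level: easy decision
    if upper < action_level:
        return "below"        # whole band is under the action level: easy decision
    return "uncertain"        # band straddles the action level: needs better data

ACTION_LEVEL = 1000.0          # hypothetical Pb action level in mg/kg

for pb in (168.0, 821.0, 16305.0):
    print(f"Pb = {pb:7.0f} mg/kg -> {classify(pb, ACTION_LEVEL)}")
```

With these assumed numbers, the extreme value is classified without difficulty, while the value near the threshold is the one where estimation quality matters most.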

          Jeff Myers
          Westinghouse Safety Management Solutions
          2131 S. Centennial Ave., SE
          Aiken, SC 29803
          803.502.9747 (direct)
          803.502.9767 (main)
          803.502.2747 (fax)
          jeff.myers@...
          http://www.gemdqos.com


          -----Original Message-----
          From: Chaosheng Zhang [mailto:Chaosheng.Zhang@...]
          Sent: Thursday, December 13, 2001 8:02 AM
          To: ai-geostats@...
          Subject: AI-GEOSTATS: Extreme values?


          Dear all,

          My question is: How to deal with the extreme/outlying values in a data set?

          I am dealing with heavy metal concentrations in soils from a mine area. The
          sample number is 223, and the samples are spatially evenly distributed with
          the sampling interval of 400 metres. There are several samples with
          extremely high values, which makes me feel uncomfortable. The percentiles of
          the dataset are listed as follows (in mg/kg):

                   Zn     Cu     Pb     Cd     As
          Min       4      1     25    0.0      2
          5%       35      6     35    0.1      6
          10%      40      7     41    0.2      7
          25%      65     13     62    0.3      9
          50%     122     18    168    0.6     15
          75%     338     27    821    1.5     28
          90%     907     56   2799    2.8     58
          95%    1986    116   4490    4.2     80
          96%    2462    151   4698    4.9     82
          97%    3493    178   5413    6.2     91
          98%    4697    207   7609    8.3    111
          99%    6712    247  11750   12.4    184
          Max   11473   1293  16305   48.5   1060

          When doing geostatistical and statistical analyses, we need some confidence
          in dealing with these very high extreme values which account for less
          than 2% of the total sample number.

          Any suggestions?


          Cheers,

          Chaosheng Zhang
          ===================================
          Dr. Chaosheng Zhang
          Department of Geography
          National University of Ireland
          Galway
          IRELAND

          Tel: +353-91-524411 ext. 2375
          Fax: +353-91-525700
          Email: Chaosheng.Zhang@...
          ===================================



