
AI-GEOSTATS: Summary: Risk Assessment with Gaussian Simulation?

  • Chaosheng Zhang
    Message 1 of 1, May 2, 2002
      Dear all,

      The following is a summary of the replies to my question on risk assessment
      with Gaussian simulation. Thanks for all the thoughtful replies.

      Generally speaking, there are three ways to produce a probability map for
      the purpose of risk assessment:

      (1) Gaussian simulation with many (e.g.: 100) realisations/maps;
      (2) MultiGaussian or Indicator Kriging;
      (3) Ordinary Kriging + Kriging Variance.

      The first approach is the traditional one, and will not attract much
      criticism if you use it. The second is recommended by many people, as it
      saves CPU time and disk space. The third may have some advantages, as it is
      direct and provides a way to incorporate errors from other sources, but it
      has drawn some criticism. Generally the results are quite similar, but some
      theoretical and practical research is still needed. The replies follow, so
      you may digest them for yourselves.

      Regards,

      Chaosheng Zhang

      ====================================

      Original question:

      Dear list,

      First, I would like to say thank you to Gregoire for keeping this list
      alive.

      I'm trying to do "risk assessment", and I have some questions about risk
      assessment with Gaussian Simulation:

      (1) How to produce a probability map?

      With Gaussian simulation, we can produce many maps/realisations, e.g. 100.
      Based on the 100 maps, a map of the probability of exceeding a threshold can
      be produced. I wonder how to produce such a probability map. My
      understanding is that, for each pixel, we simply count how many of the 100
      values are above the threshold, and that count (as a fraction of 100) is
      taken as the "probability". Am I right? It seems that this is a
      time-consuming procedure with GIS map algebra. Are there any suggestions for
      a quicker calculation?
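
      For illustration, here is a minimal NumPy sketch of this per-pixel count
      (the array names and values are hypothetical); with the realisations held
      in memory as a single stack, no pixel-by-pixel map-algebra loop is needed:

          import numpy as np

          # Hypothetical example: 100 conditional realisations on a 500 x 500 grid,
          # stacked into one array of shape (n_realisations, ny, nx).
          realisations = np.random.lognormal(mean=3.0, sigma=1.0, size=(100, 500, 500))
          threshold = 100.0   # illustrative regulatory limit, e.g. mg/kg

          # For each pixel, the fraction of realisations exceeding the threshold
          # is the simulation-based estimate of the probability of exceedance.
          prob_map = (realisations > threshold).mean(axis=0)

          print(prob_map.shape)                 # (500, 500)
          print(prob_map.min(), prob_map.max())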

      (2) Is a probability map better than a Kriging interpolated map for the
      purpose of risk assessment?

      (3) Is "PCLASS" function in IDRISI 32 Release 2 better/easier than the
      probability map from Gaussian simulation?

      From the online help of IDRISI 32 R2, Section "Kriging and Simulation
      Notes", it says "If the final goal of simulated surfaces will be to directly
      reclassify the surfaces by a threshold value, and calculate a probability of
      occurrence for a process based on that threshold, conditional simulation may
      be unnecessary. Instead kriging and variance images may be created and then
      used together with PCLASS." Any comments?

      (4) How to carry out "PCLASS"?

      Following on from the above question, I have a problem running PCLASS. I
      cannot enter the file name of the kriging variance image in the "Value
      error" field of the documentation file. It seems that this field only
      accepts a numeric value, not an image file name or any other text. Does
      anyone have experience with this?

      Cheers,

      Chaosheng Zhang
      =================================================
      Dr. Chaosheng Zhang
      Lecturer in GIS
      Department of Geography
      National University of Ireland
      Galway
      IRELAND

      Tel: +353-91-524411 ext. 2375
      Fax: +353-91-525700
      Email: Chaosheng.Zhang@...
      ChaoshengZhang@...
      Web: http://www.nuigalway.ie/geography/zhang.html
      =================================================

      Replies:
      ---------------------------------------------------------------------------

      Hello,

      In the past few years stochastic simulation has been used increasingly to
      produce probability maps. In my opinion this is generally a waste of CPU
      time, since similar information can be retrieved using kriging, either in a
      multiGaussian framework or applied to indicator transforms.
      The issue of when to use simulation versus kriging is discussed further in:
      Goovaerts, P. 2001.
      Geostatistical modelling of uncertainty in soil science.
      Geoderma, 103: 3-26.

      I take this opportunity to thank Gregoire
      for a remarkable and often challenging job
      of keeping this e-mail list alive through the years.

      Pierre
      Pierre Goovaerts
      Assistant Professor
      Dept of Civil & Environmental Engineering
      The University of Michigan
      EWRE Building, Room 117
      Ann Arbor, Michigan, 48109-2125, U.S.A.
      E-mail: goovaert@...
      Phone: (734) 936-0141
      Fax: (734) 763-2275
      http://www-personal.engin.umich.edu/~goovaert/

      Dr Zhang:
      If you are interested in assessing uncertainty, simulation offers some
      advantages over kriging. Using the kriging variance to model uncertainty at
      a point or over a block requires the assumption of constant variance
      (homoscedasticity): the kriging variance does not consider where the point
      or block is on the map. For example, assume you have two exposure units of
      interest; one is located in an area characterized by low contaminant
      concentrations and low variance, and the other is located in an area with
      high concentrations and high variance (commonly observed in environmental
      contamination data). You wish to estimate the probability that the mean
      concentration within each of these exposure units exceeds a threshold. If
      the data used to predict the concentration for each of the exposure units
      have the same or similar configurations, the uncertainty in the predictions
      will be the same or similar, even though the variability in concentrations
      in the two areas is very different. The values of the samples used in the
      prediction (not only the geometric arrangement of the locations) are
      considered in the simulation approach to modeling uncertainty. Indicator
      kriging offers some advantages over ordinary kriging, although it requires
      modeling more than one variogram, and modeling the variogram for high
      concentration levels can be problematic.

      If you are interested, I have put a few pages on our web site that compare
      ordinary kriging and simulation: http://esc.syrres.com/geosem

      Bill Thayer
      ----------------------------------------------------------------------------

      Pierre,

      Thanks for the comments. This is my first time using Gaussian simulation to
      do something possibly useful, and I have also found the calculation quite
      slow even though the speed of my computer is not so bad. I'm using Idrisi 32
      (with GStat), and the grid is about 500*500.

      What I worry about is how useful these realizations are. Obviously they are
      not "realistic", even though some people say they want to produce a more
      realistic map instead of the smoothed kriging map. Another concern is that
      the probability map produced from these realisations may not be as good as
      the one from PCLASS (available in Idrisi), as PCLASS may have a better
      probabilistic background or clearer assumptions. In PCLASS, the square root
      (not sure yet???) of the kriging variance can be used as the RMS (root mean
      square) error or standard deviation of the pixel in the kriging map, and the
      probability of exceeding a threshold can then be calculated under the normal
      assumption.
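
      For what it is worth, the calculation described above takes only a few
      lines; a sketch in Python/SciPy, assuming the pixel value is normally
      distributed around the kriging estimate with standard deviation equal to
      the square root of the kriging variance (array names and values are
      hypothetical):

          import numpy as np
          from scipy.stats import norm

          z_hat = np.random.lognormal(3.0, 1.0, size=(500, 500))      # kriging estimates
          kriging_var = np.random.uniform(0.5, 2.0, size=(500, 500))  # kriging variances
          threshold = 100.0

          # P(Z > threshold) at each pixel, assuming Z ~ N(z_hat, kriging_var).
          prob_map = norm.sf(threshold, loc=z_hat, scale=np.sqrt(kriging_var))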

      More comments and suggestions will give me more confidence in doing the risk
      assessment (heavy metal pollution in soils of a mine area).

      Cheers,

      Chaosheng
      ----------------------------------------------------------------------------

      Dear Chaosheng

      I have the same opinion. There is a paper on this published in the Geostats
      Rio 2000 proceedings (the Kluwer volume edited by Dr Margaret Armstrong). I
      think that if your realizations honor the conditioning data, you can
      associate uncertainty measures with your dataset. I have been working on
      this with Isatis, using the Simulation Post Processing option to calculate
      the error for a confidence interval.

      In that paper I use a probability map, a map of the mean of the simulations,
      and the IQR as an uncertainty measure.

      All the Best

      Alessandro Henrique Medeiros Silva
      Geologist - Anglogold Brasil
      alessandro@... <mailto:alessandro@...>

      +55-31-3589-1687
      +55-31-9953-0759

      ----------------------------------------------------------------------------

      I am curious about the use of 100 realizations to generate a probability
      map. Is this a standard approach? If so, is a "small" p-value (such as 0.05)
      used? If so, it would seem that 100 realizations might be a smallish sample
      size for distinguishing, say, 0.05 (i.e. 5 outcomes out of 100) from, say,
      0.01. Is 100 used because it seems like a reasonable number, or because of
      computer time restrictions?

      Also, do geostats folks treat these as realizations or as
      pseudo-realizations?

      brian

      ****************************************************************
      Brian Gray
      USGS Upper Midwest Environmental Sciences Center
      575 Lester Avenue, Onalaska, WI 54650
      ph 608-783-7550 ext 19, FAX 608-783-8058
      brgray@...
      *****************************************************************

      ----------------------------------------------------------------------------

      Chaosheng, I agree with Pierre that if your only goal is to generate a
      probability map, then IK is faster and more straightforward than simulation,
      and that MG kriging will give the same results as MG simulation, faster.

      However, we have found a few practical reasons why it may be advantageous
      to use simulation for soil contamination studies, so I'll add my two cents'
      worth to this discussion:

      1) When trying to explain the concepts of spatial variability and
      uncertainty, we have found that showing example realizations of what the
      possible distribution of contaminants could look like helps the groups
      involved get a more intuitive understanding of these ideas. People
      understand the idea of flipping a coin 100 times to get the probability of
      heads or tails, but have a hard time visualizing what a "coin flip" looks
      like in a 2-D soil contamination problem. Showing some example conditional
      realizations gives them a stronger feel for the nature of the answers
      geostats is providing to their questions.

      2) A number of sites are in the process of designing chemical and/or
      mechanical treatment systems for the soil that will be removed from the site
      while the remediation map is being determined. One set of design parameters
      for these treatment systems is the best and worst case estimates of the
      total amount of contamination (curies, grams, etc.) contained in the soil at
      the site. These best/worst case estimates depend on the joint estimate of
      the contamination at all locations across the site. This is something
      simulation provides, but kriging doesn't (see the sketch after this list).

      3) For soils with radioactive contaminants, there are a number of different
      sensors (e.g., a gamma detector mounted several meters off the ground) being
      deployed at field sites that integrate the activity of the contaminant over
      a larger area/volume. Simulation of the fine scale distribution of the
      activity can be useful in looking at how these sensors scale up the activity
      values to the integrated measurement.
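
      Regarding point 2, a minimal sketch of how the uncertainty in the site-wide
      total might be summarised from a set of realisations (the cell size, depth,
      bulk density and array names below are all assumed for illustration):

          import numpy as np

          # Hypothetical stack of conditional realisations of concentration (mg/kg)
          # on a regular grid: shape (n_realisations, ny, nx).
          realisations = np.random.lognormal(3.0, 1.0, size=(100, 200, 200))

          cell_area = 5.0 * 5.0    # m^2 per grid cell (assumed)
          depth = 0.3              # m of soil considered (assumed)
          bulk_density = 1500.0    # kg/m^3 (assumed)

          # Total contaminant mass in each realisation:
          # concentration [mg/kg] * soil mass [kg] per cell, summed over the site.
          soil_mass_per_cell = cell_area * depth * bulk_density             # kg
          totals_kg = realisations.sum(axis=(1, 2)) * soil_mass_per_cell * 1e-6

          # Best/worst case design values as low/high percentiles.
          print(np.percentile(totals_kg, [5, 50, 95]))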

      Also, when looking at IK vs MG kriging (or simulation), keep in mind that
      rarely do the client, stakeholder(s) and regulator(s) have a single action
      level or threshold that they have all agreed to apply at the site. There are
      usually multiple thresholds corresponding to different future land-use
      scenarios and different health risk models. If you create the probability
      maps through IK, then each different threshold requires a new set of
      indicator variograms. If you use MG kriging or simulation, you only need to
      do the variography once. Keep in mind that the MG assumption does have other
      problems with connectivity of extreme values that may or may not be
      important in your application (this is generally a bigger concern in fluid
      flow problems than in soil contamination problems).

      I'll add my thanks to Gregoire for 7 years of superb work!

      Sean

      Sean A. McKenna Ph.D.
      Geohydrology Department
      Sandia National Laboratories
      PO Box 5800 MS 0735
      Albuquerque, NM 87185-0735
      ph: 505 844-2450
      ----------------------------------------------------------------------------

      Hi Brian,

      One hundred realizations are typically generated mainly for CPU reasons.
      You are perfectly right that this number is too small when looking at small
      probabilities like 0.05 or 0.01. That is why I wouldn't recommend using
      stochastic simulation to derive probabilities of occurrence of events at
      pixel locations. Just use kriging to build your local probability
      distributions. Use simulation if you have a transfer function, such as a
      flow simulator, that requires a model of spatial uncertainty, or if you need
      to derive block probability distributions (upscaling or aggregation
      problems).
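
      As a quick back-of-the-envelope check of this point, the Monte Carlo
      standard error of a proportion estimated from n independent realisations
      is sqrt(p * (1 - p) / n):

          import numpy as np

          n = 100
          for p in (0.05, 0.01):
              se = np.sqrt(p * (1 - p) / n)
              print(p, se)   # ~0.022 for p = 0.05, ~0.010 for p = 0.01

      so with only 100 realisations an estimated probability of 0.05 is barely
      distinguishable from 0.01.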

      More generally, there is more research to be done on the use of stochastic
      simulation for probabilistic assessment, including the question of whether
      the realizations being generated are indeed equally probable.

      Pierre
      ----------------------------------------------------------------------------

      My tuppence worth.

      The major advantages of simulation as a risk
      assessment tool lie in the cases where you are trying
      to derive some conclusion from the data rather than
      just look at the values themselves.

      For example, see Bill's and my papers at the Battelle Conference in 1987,
      or the paper at Geostat Avignon in 1988. There are others. All of these are
      available in Word format for download from my page:
      http://uk.geocities.com/drisobelclark/resume/Publications.html

      We were trying to derive the travel path of a particle
      given the pressure of fluid in an aquifer. Not a
      linear transform by anyone's standards.

      Isobel Clark

      ----------------------------------------------------------------------------

      >1) When trying to explain the concepts of spatial variability and
      >uncertainty, we have found that showing example realizations of what the
      >possible distribution of contaminants could look like provides the groups
      >involved to get a more intuitive understanding of these ideas.

      Taking this a step further, there was a paper in the AAPG Stochastic
      Modeling and Geostatistics Volume entitled "The Visualization
      of Spatial Uncertainty" (R Mohan Srivastava) which proposes the use
      of probability field simulation to generate dynamic animations
      of different realizations. I have yet to see it being implemented in
      commercial software, although in concept I can see the benefit
      of having something like this to illustrate the "equiprobable"
      realizations. The idea was to generate smooth transitions of
      successive "frames" by sampling from adjacent columns of a set of
      probability values, for a movie-like effect.

      Syed
      ----------------------------------------------------------------------------

      Dear all,

      Thanks for so many interesting replies and thoughtful discussion. This is
      not a summary yet, as I am expecting more to come.

      Just to express my feeling about Indicator Kriging. To produce a
      probability map, IK might be one of the choices. However, I always feel that
      too much information is lost in the indicator transformation. When I see so
      many "0"s in a dataset, I just feel that the data quality is too poor.

      Well, the other method, the combination of kriging and the kriging variance
      for risk assessment, has not been discussed much yet, and I would like to
      read more comments.

      My last question "(4) how to carry out "PCLASS" " is now answered by the
      developer of Idrisi. The fact that the file name of Kriging variance cannot
      be entered (with Metadata command) is a bug of the program, which will be
      corrected soon. At present time, a text editor may be used to modify the
      image documentation file.

      Now, let me discuss how I would like to make a probability map based on
      kriging and the kriging variance. For each pixel of the kriging interpolated
      map, there is a value of the kriging variance. The kriging variance is a
      measure of uncertainty (which is related to sampling density, spatial
      variation, etc.???). If we assume that the value at each kriged pixel
      follows a normal distribution with standard deviation equal to the square
      root of the kriging variance, the probability of exceeding any threshold can
      be calculated. Furthermore, to make the risk assessment more realistic, I
      would like to include other errors, such as sampling error and laboratory
      analysis error, in the risk assessment. These errors can hardly be
      quantified, but if we say 10% or 20% of the pixel value (for soil samples),
      perhaps there will be no objection. Therefore, the standard deviation of the
      pixel is increased by adding this kind of error.

      I am not clear how to calculate the total standard deviation from the two
      sources; is it:
      Total standard deviation =
      SQRT( Kriging variance + SQUARE( Sampling error ) ) ?

      Any ideas and comments on this method?

      Chaosheng Zhang

      ----------------------------------------------------------------------------

      Chaosheng,

      The kriging equations themselves are non-parametric. Without a
      transformation of the data, the kriging estimate and the kriging variance do
      not define a Gaussian distribution. If you want to make the Gaussian
      assumption, as you state below, then using multiGaussian (MG) kriging, as
      mentioned a few times in the replies to your question yesterday, is the way
      to go. MG kriging requires transforming the raw data to standard normal
      space and calculating the variograms from the transformed data.
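
      A minimal sketch of the rank-based normal score transform this implies
      (the function name is my own, ties and declustering weights are ignored,
      and a full workflow would also need the back-transform through the sample
      cdf):

          import numpy as np
          from scipy.stats import norm

          def normal_score_transform(z):
              """Map data to standard normal scores via their ranks."""
              z = np.asarray(z, dtype=float)
              n = z.size
              ranks = np.empty(n)
              ranks[np.argsort(z)] = np.arange(1, n + 1)
              p = (ranks - 0.5) / n        # plotting positions in (0, 1)
              return norm.ppf(p)           # standard normal quantiles

          # Variograms for MG kriging would then be computed on these scores.
          y = normal_score_transform(np.random.lognormal(3.0, 1.0, size=200))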

      Good luck

      Sean

      Sean A. McKenna Ph.D.
      Geohydrology Department
      Sandia National Laboratories
      PO Box 5800 MS 0735
      Albuquerque, NM 87185-0735
      ph: 505 844-2450

      ----------------------------------------------------------------------------

      >
      >Well, the other method of combination of Kriging and Kriging variance for
      >risk assessment has not been well discussed yet, and I would like to read
      >more comments.

      I believe it is useful to compare maps of predicted concentrations, or maps
      of the probability of exceeding threshold(s), that have been prepared with
      different methods (e.g., SGS and OK) and to make some judgments/decisions
      about which maps appear reasonable given your knowledge of the data and of
      the processes that created, transported and possibly transformed the
      contaminants. My experience with kriging is that it is very useful for
      central tendency estimates but should be viewed with a high level of
      suspicion if used to model uncertainty.

      >
      >Now, let me discuss how I would like to make a probability map based on
      >Kriging and Kriging variance. For each pixel of the Kriging interpolated
      >map, there is a value of Kriging variance. The Kriging variance is a
      >measure
      >of uncertainty (which is related to sampling density and spatial variation,
      >etc.???). If we assume that the value of the Kriging pixel follow a normal
      >distribution and the standard deviation is equal to the SQRT of Kriging
      >variance, the probability of any threshold can be calculated. Furthermore,
      >to make the risk assessment more realistic, I would like to include other
      >errors, such as sampling error and laboratory analysis error into risk
      >assessment. These errors can hardly be quantified, but if we say 10% or 20%
      >of the pixel value (for soil samples), perhaps there is no objection.
      >Therefore, the standard deviation of the pixel is increased by adding this
      >kind of errors.
      >
      >I am not clear how to calculate the total standard deviation of the two
      >sources, is it:
      > Total standard deviation =
      > SQRT (Kriging Variance + SQUARE (Sampling Errors) ) ?
      >
      >Any ideas and comments on this method?

      In general, you can add the variances if you believe the two sources of
      error are independent. However, the data you use to develop a model of
      spatial autocorrelation already include measurement error (it is
      incorporated in the 'nugget'), unless you have 'filtered it out' by
      estimating its magnitude and variance with quality control samples.
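
      Extending the earlier PCLASS-style sketch, the combination being discussed
      would look something like this, assuming the two error sources really are
      independent (the 10% relative error and the array names are illustrative,
      and note the caveat above that measurement error may already sit in the
      nugget):

          import numpy as np
          from scipy.stats import norm

          z_hat = np.random.lognormal(3.0, 1.0, size=(500, 500))      # kriging estimates
          kriging_var = np.random.uniform(0.5, 2.0, size=(500, 500))  # kriging variances
          threshold = 100.0

          # Independent variances add; assume a 10% relative sampling/analysis error.
          sampling_sd = 0.10 * z_hat
          total_sd = np.sqrt(kriging_var + sampling_sd**2)

          prob_map = norm.sf(threshold, loc=z_hat, scale=total_sd)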

      **************************************************
      William C. Thayer, P.E.

      Environmental Science Center
      Syracuse Research Corporation
      301 Plainfield Road, Suite 350
      Syracuse, NY 13212
      phone: (315) 452-8424
      fax: (315) 452-8440
      email: thayer@...
      web: http://esc.syrres.com/
      http://esc.syrres.com/geosem/
      **************************************************

      The kriging variance at short distances will increase with an increasing
      nugget effect. In any case, the kriging variance cannot be less than the
      nugget, so if you ensure that the nugget includes the measurement error
      (which you can set based on your knowledge of the sampling and measurement
      protocols, as well as the site-specific data and quality control data), the
      minimum kriging variance (i.e., at short distances) will be valid.

      A related point: what do risk estimates over small areas represent? I think
      it is important to keep in mind the exposure unit, i.e. the area over which
      exposure occurs. It seems to me that one first has to estimate the
      likelihood that the risk from exposure to contamination exceeds some
      threshold, which should be done at the exposure-unit scale (i.e., over an
      area where exposure occurs, such as an industrial site or perhaps a
      residential yard). If the risk is unacceptable, calculating the probability
      of exceeding a threshold at a smaller scale may be appropriate for
      determining what areas of the site should be cleaned up. In my opinion, one
      of the shortcomings of the risk assessment/geostatistics programs that the
      U.S. EPA has developed is their inability to assess the uncertainty in the
      mean at the exposure-unit scale. So they have set about answering the second
      question (where to clean up or where to collect more samples) without
      sufficiently answering the first (is there a problem?); again, in my
      opinion.

      I apologize if you are already aware of these issues. It's hard to know
      sometimes with e-mail!

      Best regards,
      Bill
      ----------------------------------------------------------------------------

      For some resource classifications (measured, indicated, inferred) it is
      possible to find kriging variance values that correlate with sample density:
      where you have more samples, you get smaller values of the kriging
      variance...

      But in my opinion the best way to obtain an error measure is to calculate it
      from a set of simulations. What is your geostats area? Environment? Mining?
      Oil?

      All the best

      Alessandro Henrique Medeiros Silva
      Geologist - Anglogold Brasil
      alessandro@... <mailto:alessandro@...>

      +55-31-3589-1687
      +55-31-9953-0759
      ----------------------------------------------------------------------------

      Three comments on your ideas below

      1. I would be very careful about using the kriging variance to generate
      confidence intervals, with or without the assumption that the "errors" are
      normally distributed. The kriging variance is a useful measure of the
      relative reliability of the kriging estimator, but it is certainly not an
      absolute measure. If you multiply the variogram model by a positive number
      (I know this will make the model appear not to fit the graph of the
      experimental variogram), then the kriging variance will change by that
      multiplier BUT the kriging weights will not change, and hence the estimated
      values will not change. Thus the kriging variance cannot be treated as an
      absolute measure of reliability. (A small numerical sketch of this appears
      after these comments.)

      2. If you really want to incorporate the measurement or instrument
      variability, then it should be done another way. You should use the
      smoothing form of the kriging equations; if you want details, send me an
      email. This is also discussed in several, but not all, books on
      geostatistics.

      3. The probability distribution that you would generate, i.e. estimate, by
      using indicator kriging is very different from the distribution you would
      generate by using confidence intervals. There is a reason why the term is
      "confidence" and not "probability", and that is at the heart of the
      distinction between the two approaches (indicator kriging vs confidence
      intervals).
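
      Regarding comment 1, a small numerical sketch (hypothetical coordinates,
      exponential covariance model) showing that multiplying the
      variogram/covariance model by a constant leaves the ordinary kriging
      weights unchanged while scaling the kriging variance by that constant:

          import numpy as np

          def ok_weights_and_variance(coords, x0, sill, rng=50.0):
              """Ordinary kriging weights and variance for one target point,
              using an exponential covariance C(h) = sill * exp(-h / rng)."""
              cov = lambda h: sill * np.exp(-h / rng)
              n = len(coords)
              d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
              A = np.zeros((n + 1, n + 1))
              A[:n, :n] = cov(d)
              A[:n, n] = 1.0          # unbiasedness constraint column
              A[n, :n] = 1.0          # unbiasedness constraint row
              b = np.zeros(n + 1)
              b[:n] = cov(np.linalg.norm(coords - x0, axis=1))
              b[n] = 1.0
              sol = np.linalg.solve(A, b)
              w, mu = sol[:n], sol[n]
              variance = sill - w @ b[:n] - mu
              return w, variance

          coords = np.array([[0.0, 0.0], [30.0, 10.0], [15.0, 40.0]])  # data locations
          x0 = np.array([12.0, 12.0])                                  # estimation point

          w1, v1 = ok_weights_and_variance(coords, x0, sill=1.0)
          w2, v2 = ok_weights_and_variance(coords, x0, sill=3.0)       # model scaled by 3

          print(w1, v1)   # weights and kriging variance for sill = 1
          print(w2, v2)   # same weights, variance multiplied by 3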

      Donald E. Myers
      http://www.u.arizona.edu/~donaldm

      ----------------------------------------------------------------------------

      Dear Dr. Zhang,
      I understand from the discussion on the ai-geostats forum that you are
      dealing with producing a probability map for the purposes of risk
      assessment. I am interested in agricultural hydrology. In this context, I
      would like to apply the IK technique to precipitation fields and produce
      probability maps. As you know, the difficulty comes from the lack of
      literature explaining the subject clearly with examples. I would like to
      know whether you have any kind of documents, such as reports, solved class
      examples, books, etc. on the subject.
      Thanks in advance for your attention.
      Yours Sincerely,
      Dr. Mahmut CETIN

      ---
      Dr. Mahmut CETIN
      Department of Agricultural Structures and Irrigation
      Faculty of Agriculture
      University of Cukurova
      01330 Balcali-Adana/TURKEY
      Tel: ++90 (322)338 6877 (Direct)
      ++90 (322) 338 6318 (Switchboard, ext. : 25)
      Fax: ++90 (322) 338 6386
      E-Mail: mcet64@...

      ----------------------------------------------------------------------------

      Dear Syed, et al.,

      I did much of what you described in the GRASS GIS a while back. (GRASS
      is public domain, not commercial, but it is a very good GIS.) The title
      of the paper is "Visualizing Spatial Data Uncertainty Using Animation"
      and a copy of it is located at:

      http://www.geo.hunter.cuny.edu/~chuck/CGFinal/paper.htm

      The special issue of Computers & Geosciences (Vol. 23, No. 4, pp.
      387-395, 1997) included a CD-ROM that contained some of the animations
      in MPEG form. My web site includes the animations and instructions on
      how to construct them.

      I used spherical interpolation to generate smooth transitions between
      realizations in order to keep the interpolations valid statistically.

      I have more recent work that studies user perception of animated maps
      representing data and application uncertainty. An outline of that work
      from a conference presentation (with all equations and animations) is
      available at:

      http://www.geo.hunter.cuny.edu/~chuck/GIScience2000/paper.html

      The full paper is about to head out for peer review.

      sincerely, chuck

      ===========================================================



      --
      * To post a message to the list, send it to ai-geostats@...
      * As a general service to the users, please remember to post a summary of any useful responses to your questions.
      * To unsubscribe, send an email to majordomo@... with no subject and "unsubscribe ai-geostats" followed by "end" on the next line in the message body. DO NOT SEND Subscribe/Unsubscribe requests to the list
      * Support to the list is provided at http://www.ai-geostats.org