## AI-GEOSTATS: Summary: Risk Assessment with Gaussian Simulation?

May 2, 2002
Dear all,

The following is a summary of replies for my question: Risk assessment with
Gaussian simulation. Thanks for all the thoughtful replies.

Generally speaking, there are three ways to produce a probability map for
the purpose of risk assessment:

(1) Gaussian simulation with many (e.g.: 100) realisations/maps;
(2) MultiGaussian or Indicator Kriging;
(3) Ordinary Kriging + Kriging Variance.

The first approach is traditional and will not attract much criticism if you
use it. The second is recommended by many people as it saves CPU time and
disk space. The third may have some advantages, as it is direct and
provides a way to incorporate errors from other sources, but it has drawn
some criticism. Generally, the results are quite similar, but some
theoretical and practical research is still necessary. Well, the following
are the replies, and you may digest them by yourselves.

Regards,

Chaosheng Zhang

====================================

Original question:

Dear list,

First, I would like to say thank you to Gregoire for keeping this list
alive.

I'm trying to do "risk assessment", and I have some questions about risk
assessment with Gaussian Simulation:

(1) How to produce a probability map?

With Gaussian simulation, we can produce many maps/realisations, e.g., 100.
Based on the 100 maps, a probability map of higher than a threshold can be
produced. I wonder how to produce such a probability map? My understanding
is that for each pixel, we simply count how many of the 100 values exceed
the threshold, and that count (divided by 100) is taken as the
"probability". Am I right? It seems a time-consuming procedure with GIS map
algebra. Are there any suggestions for a quick calculation?
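The per-pixel counting described above need not be slow: outside GIS map algebra, it is a single vectorised operation over the stack of realisations. A minimal sketch on synthetic data (array sizes and values are illustrative, not from the actual study):

```python
import numpy as np

# Synthetic stand-in for 100 conditional realisations of a grid.
# A 100x100 grid is used here for speed; the idea is identical for 500x500.
rng = np.random.default_rng(0)
realisations = rng.lognormal(mean=1.0, sigma=0.5, size=(100, 100, 100))

threshold = 5.0

# For each pixel, the fraction of realisations exceeding the threshold is
# the estimated exceedance probability: one vectorised comparison and mean
# over the realisation axis, instead of 100 passes of GIS map algebra.
prob_map = (realisations > threshold).mean(axis=0)

print(prob_map.shape)  # (100, 100)
```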

(2) Is a probability map better than a Kriging interpolated map for the
purpose of risk assessment?

(3) Is "PCLASS" function in IDRISI 32 Release 2 better/easier than the
probability map from Gaussian simulation?

From the online help of IDRISI 32 R2, Section "Kriging and Simulation
Notes", it says "If the final goal of simulated surfaces will be to directly
reclassify the surfaces by a threshold value, and calculate a probability of
occurrence for a process based on that threshold, conditional simulation may
be unnecessary. Instead kriging and variance images may be created and then
used together with PCLASS." Any comments?

(4) How to carry out "PCLASS"?

Following the above question, I have a problem using PCLASS. I cannot
enter the file name of the Kriging variance image in the "Value error" field
of the documentation file. It seems that this field only accepts a "value",
not an "image file name" or any text. Does anyone have experience with this?

Cheers,

Chaosheng Zhang
=================================================
Dr. Chaosheng Zhang
Lecturer in GIS
Department of Geography
National University of Ireland
Galway
IRELAND

Tel: +353-91-524411 ext. 2375
Fax: +353-91-525700
Email: Chaosheng.Zhang@...
ChaoshengZhang@...
Web: http://www.nuigalway.ie/geography/zhang.html
=================================================

Replies:
---------------------------------------------------------------------------

Hello,

In the past few years stochastic simulation has
been increasingly used to produce probability maps.
In my opinion it's generally a waste of CPU time since
similar information can be retrieved using kriging,
either in a multiGaussian framework or applied to
indicator transforms.
The issue of when using simulation vs kriging
is further discussed in:
Goovaerts, P. 2001.
Geostatistical modelling of uncertainty in soil science.
Geoderma, 103: 3-26.

I take this opportunity to thank Gregoire
for a remarkable and often challenging job
of keeping this e-mail list alive through the years.

Pierre
><><><><><><><><><><><><><><><><><><><><><><><><><><>

Pierre Goovaerts
Assistant Professor
Dept of Civil & Environmental Engineering
The University of Michigan
EWRE Building, Room 117
Ann Arbor, Michigan, 48109-2125, U.S.A.

E-mail: goovaert@...
Phone: (734) 936-0141
Fax: (734) 763-2275
http://www-personal.engin.umich.edu/~goovaert/

><><><><><><><><><><><><><><><><><><><><><><><><><><><><>

Dr Zhang:
If you are interested in assessing uncertainty, simulation offers some
advantages over kriging. Use of the kriging variance to model uncertainty
at a point or over a block requires the assumption of constant variance
(homoscedasticity): the kriging variance does not consider where the point
or block is on the map. For example, assume you have two exposure units
that you are interested in; one is located in an area characterized by low
contaminant concentrations and low variance and the other is located in an
area with high concentrations and high variance (commonly observed in
environmental contamination data). You wish to estimate the probability
that the mean concentration within each of these exposure units exceeds a
threshold. If
the data that are used to predict the concentration for each of the exposure
units have the same or similar configurations, the uncertainty in the
predictions
will be the same or similar, even though the variability in concentrations
in the two areas is very different. The values of the samples used in the
prediction (not only the geometric arrangement of the locations) are
considered
in the simulation approach to modeling uncertainty. Indicator kriging
offers some advantages over ordinary kriging, although it requires modeling
more than one variogram, and modeling the variogram for high concentration
levels can be problematic.

If you are interested, I have put a few pages on our web site that compare
ordinary kriging and simulation: http://esc.syrres.com/geosem

Bill Thayer
----------------------------------------------------------------------------

Pierre,

Thanks for the comments. This is my first time using Gaussian simulation to
do something possibly useful, and I have found the calculation quite slow
even though my computer is not bad. I'm using Idrisi 32 (with GStat), and
the grid is about 500*500.

What I worry about is how useful these realisations are. Obviously they are
not "realistic", even though some people say they want to produce a more
realistic map instead of the smoothed Kriging map. Another concern is that
the probability map produced from these realisations may not be as good as
the one from PCLASS (available in Idrisi), as PCLASS may have a better
probabilistic background or clearer assumptions. In PCLASS, the square root
(not sure yet???) of the Kriging variance can be used as the RMS (root mean
square) or standard deviation of the pixel corresponding to the Kriging map,
and the probability of exceeding a threshold can be calculated under the
normality assumption.
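The PCLASS-style calculation described above (exceedance probability under a normal assumption, with the kriging estimate as mean and the square root of the kriging variance as standard deviation) can be sketched in a few lines; the numbers are purely illustrative:

```python
import math

def prob_exceed(krig_est, krig_var, threshold):
    """P(Z > threshold) assuming Z ~ Normal(krig_est, krig_var)."""
    sd = math.sqrt(krig_var)
    if sd == 0.0:
        return 1.0 if krig_est > threshold else 0.0
    z = (threshold - krig_est) / sd
    # Standard normal survival function written with the error function.
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Illustrative numbers: estimate 80, kriging variance 400 (sd 20), threshold 100
p = prob_exceed(80.0, 400.0, 100.0)
print(round(p, 3))  # 0.159, the upper tail beyond one standard deviation
```

Applied pixel by pixel to the kriging and kriging-variance maps, this yields the probability map directly.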

More comments and suggestions will give me more confidence in doing the risk
assessment (heavy metal pollution in soils of a mine area).

Cheers,

Chaosheng
----------------------------------------------------------------------------

Dear Chaosheng

I have the same opinion. There is a paper on this published in the Geostats
Rio 2000 proceedings (Dr Margaret Armstrong's Kluwer book). I think that if
your realisations honor the conditioning data, you can associate uncertainty
measures with your dataset. I have been working on this with Isatis, using
the Simulation Post Processing to calculate the error for a confidence
interval.

In that paper I use a probability map, a mean-of-simulations map and the IQR
as uncertainty measures.

All the Best

Alessandro Henrique Medeiros Silva
Geologist - Anglogold Brasil
alessandro@... <mailto:alessandro@...>

+55-31-3589-1687
+55-31-9953-0759

----------------------------------------------------------------------------

I am curious about the use of 100 realizations to generate a probability
map. Is this a standard approach? If so, is a "small" p-value (such as
0.05) used? If so, it would seem that 100 iterations might be a smallish
sample size for distinguishing, say, 0.05 (i.e., 5 outcomes out of 100)
from, say, 0.01. Is 100 used because it seems like a reasonable number, or
because of computer time restrictions?

Also, do geostat folks treat these as realizations or as
pseudo-realizations?

brian

****************************************************************
Brian Gray
USGS Upper Midwest Environmental Sciences Center
575 Lester Avenue, Onalaska, WI 54650
ph 608-783-7550 ext 19, FAX 608-783-8058
brgray@...
*****************************************************************

----------------------------------------------------------------------------

Chaosheng, I agree with Pierre that if your only goal is to generate a
probability map, then IK is faster and more straightforward than simulation
and that MG kriging will give the same results, faster, than MG simulation.

However, we have found a couple of practical reasons why simulation may
still be worthwhile, so let me add my two cents' worth to this discussion:

1) When trying to explain the concepts of spatial variability and
uncertainty, we have found that showing example realizations of what the
possible distribution of contaminants could look like helps the groups
involved get a more intuitive understanding of these ideas. People
understand the idea of flipping a coin 100 times to get the probability of
heads or tails, but have a hard time visualizing in their mind what a "coin
flip" looks like in a 2-D soil contamination problem. Showing some example
conditional realizations gives them a stronger feel for the nature of the
answers geostats is providing to their questions.

2) A number of sites are in the process of designing chemical and/or
mechanical treatment systems for the soil that will be removed from the site
while the remediation map is being determined. One set of design parameters
for these treatment systems is the best and worst case estimates of the
total amount of contamination (curies, grams, etc.) contained in the soil at
the site. These best/worst case estimates depend on the joint estimate of
the contamination at all locations across the site. This is something
simulation provides, but kriging doesn't.

3) For soils with radioactive contaminants, there are a number of different
sensors (e.g., a gamma detector mounted several meters off the ground) being
deployed at field sites that integrate the activity of the contaminant over
a larger area/volume. Simulation of the fine scale distribution of the
activity can be useful in looking at how these sensors scale up the activity
values to the integrated measurement.

Also when looking at IK vs MG kriging (or simulation) keep in mind that
rarely do the client, stakeholder(s) and regulator(s) have a single action
level or threshold that they have all agreed to for application at the site.
There are usually multiple thresholds corresponding to different future-land
use scenarios and different health risk models. If creating the probability
maps through IK, then each different threshold requires a new set of
indicator variograms. If you use MG kriging or simulation, you only need do
the variography once. Keep in mind that the MG assumption does have other
problems with connectivity of extreme values that may or may not be
important in your application (this is generally a bigger concern in fluid
flow problems than in soil contamination problems).

I'll add my thanks to Gregoire for 7 years of superb work!

Sean

Sean A. McKenna Ph.D.
Geohydrology Department
Sandia National Laboratories
PO Box 5800 MS 0735
Albuquerque, NM 87185-0735
ph: 505 844-2450
----------------------------------------------------------------------------

Hi Brian,

One hundred realizations are typically generated
mainly for CPU reasons.
You are perfectly right that this number is
too small when looking at small probabilities
like 0.05 or 0.01. That is why I wouldn't recommend
using stochastic simulation to derive the probability of occurrence
of events at pixel locations; just use kriging to build such
probability maps. Use simulation if you have a transfer function, such as
a flow simulator, that requires a model of spatial uncertainty,
or if you need to derive block probability distributions
(upscaling or aggregation problems).

More generally, there is more research to be done on the
use of stochastic simulation for probabilistic assessment,
including the question of the equal probability of the realizations
being generated.
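The caution about 100 realisations can be quantified with the binomial standard error of a Monte Carlo probability estimate: with n = 100, an estimate near p = 0.05 carries a standard error of roughly 0.022, too large to distinguish 0.05 from 0.01 reliably. A small sketch:

```python
import math

def mc_standard_error(p, n):
    """Standard error of a probability estimated from n independent realisations."""
    return math.sqrt(p * (1.0 - p) / n)

# With 100 realisations, the sampling error swamps the difference
# between a true exceedance probability of 0.05 and one of 0.01.
for p in (0.05, 0.01):
    print(f"p = {p}: SE from 100 realisations is about {mc_standard_error(p, 100):.3f}")
```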

Pierre
----------------------------------------------------------------------------

My tuppence worth.

The major advantages of simulation as a risk
assessment tool lie in the cases where you are trying
to derive some conclusion from the data rather than
just look at the values themselves.

For example, see Bill's and my papers at the Battelle
Conference 1987 or the paper at Geostat Avignon in
1988. There are others. All of these are available at
http://uk.geocities.com/drisobelclark/resume/Publications.html

We were trying to derive the travel path of a particle
given the pressure of fluid in an aquifer. Not a
linear transform by anyone's standards.

Isobel Clark

----------------------------------------------------------------------------

>1) When trying to explain the concepts of spatial variability and
>uncertainty, we have found that showing example realizations of what the
>possible distribution of contaminants could look like provides the groups
>involved to get a more intuitive understanding of these ideas.

Taking this a step further, there was a paper in the AAPG Stochastic
Modeling and Geostatistics Volume entitled "The Visualization
of Spatial Uncertainty" (R Mohan Srivastava) which proposes the use
of probability field simulation to generate dynamic animations
of different realizations. I have yet to see it implemented in
commercial software, although in concept I can see the benefit
of having something like this to illustrate the "equiprobable"
realizations. The idea was to generate smooth transitions of
successive "frames" by sampling from adjacent columns of a set of
probability values, for a movie-like effect.

Syed
----------------------------------------------------------------------------

Dear all,

Thanks for so many interesting replies and thoughtful discussion. This is
not a summary yet, as I am expecting more to come.

Just to express my feeling about Indicator Kriging. To produce a probability
map, IK might be one of the choices. However, I always feel that too much
information is lost when doing the indicator transformation. When I see so
many "0"s in a dataset, I just feel the data quality is too poor.

Well, the other method, the combination of Kriging and the Kriging variance
for risk assessment, has not been well discussed yet, and I would like to
read more comments on it.

My last question, "(4) How to carry out PCLASS?", has now been answered by
the developer of Idrisi. The fact that the file name of the Kriging variance
image cannot be entered (with the Metadata command) is a bug in the program,
which will be corrected soon. For the present, a text editor may be used to
modify the image documentation file.

Now, let me discuss how I would like to make a probability map based on
Kriging and Kriging variance. For each pixel of the Kriging interpolated
map, there is a value of Kriging variance. The Kriging variance is a measure
of uncertainty (which is related to sampling density and spatial variation,
etc.???). If we assume that the value of the Kriging pixel follows a normal
distribution with standard deviation equal to the square root of the Kriging
variance, the probability of exceeding any threshold can be calculated.
Furthermore, to make the risk assessment more realistic, I would like to
include other errors, such as sampling error and laboratory analysis error,
in the risk assessment. These errors can hardly be quantified, but if we say
10% or 20% of the pixel value (for soil samples), perhaps there is no
objection. The standard deviation of the pixel is therefore increased by
adding these kinds of errors.

I am not clear how to calculate the total standard deviation of the two
sources, is it:
Total standard deviation =
SQRT (Kriging Variance + SQUARE (Sampling Errors) ) ?

Any ideas and comments on this method?

Chaosheng Zhang

----------------------------------------------------------------------------

Chaosheng,

The kriging equations themselves are non-parametric. Without a
transformation of the data, the kriging estimate and the kriging variance do
not define a Gaussian distribution. If you want to make the Gaussian
assumption, as you state below, then using multiGaussian (MG) Kriging, as
mentioned a few times in the replies to your question yesterday, is the way
to go. MG Kriging requires transforming the raw data to standard normal
space and calculating the variograms on the transformed data.

Good luck

Sean

Sean A. McKenna Ph.D.
Geohydrology Department
Sandia National Laboratories
PO Box 5800 MS 0735
Albuquerque, NM 87185-0735
ph: 505 844-2450

----------------------------------------------------------------------------

>
>Well, the other method of combination of Kriging and Kriging variance for
>risk assessment has not been well discussed yet, and I would like to read

I believe it is useful to compare maps of predicted concentrations, or maps
of the probability of exceeding threshold(s), that have been prepared with
different methods (e.g., SGS and OK), and to make some judgments/decisions
about which maps appear reasonable given your knowledge of the data and of
the processes that created, transported and possibly transformed the
contaminants. My experience with kriging is that it is very useful for
central tendency estimates but should be viewed with a high level of
suspicion if used to model uncertainty.

>
>Now, let me discuss how I would like to make a probability map based on
>Kriging and Kriging variance. For each pixel of the Kriging interpolated
>map, there is a value of Kriging variance. The Kriging variance is a
>measure
>of uncertainty (which is related to sampling density and spatial variation,
>etc.???). If we assume that the value of the Kriging pixel follow a normal
>distribution and the standard deviation is equal to the SQRT of Kriging
>variance, the probability of any threshold can be calculated. Furthermore,
>to make the risk assessment more realistic, I would like to include other
>errors, such as sampling error and laboratory analysis error into risk
>assessment. These errors can hardly be quantified, but if we say 10% or 20%
>of the pixel value (for soil samples), perhaps there is no objection.
>Therefore, the standard deviation of the pixel is increased by adding this
>kind of errors.
>
>I am not clear how to calculate the total standard deviation of the two
>sources, is it:
> Total standard deviation =
> SQRT (Kriging Variance + SQUARE (Sampling Errors) ) ?
>
>Any ideas and comments on this method?

In general, you can add the variances if you believe the two sources of
error are independent. However, the data you use to develop a model of
spatial autocorrelation already includes measurement error (it is
incorporated in the 'nugget'), unless you have 'filtered it out' by
estimating their magnitudes and variances with quality control samples.
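The variance-addition rule (valid only when the error sources are independent) can be sketched as follows; the kriging variance and the 10% sampling error are illustrative numbers, not values from the study:

```python
import math

def total_sd(kriging_variance, sampling_error_sd):
    """Combined standard deviation of two independent error sources
    (variances add; standard deviations do not)."""
    return math.sqrt(kriging_variance + sampling_error_sd ** 2)

# Illustrative numbers: kriging variance 25; sampling/lab error taken as
# 10% of a pixel value of 40, i.e. a standard deviation of 4.
pixel_value = 40.0
sampling_sd = 0.10 * pixel_value
print(round(total_sd(25.0, sampling_sd), 3))  # 6.403 = sqrt(25 + 16)
```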

**************************************************
William C. Thayer, P.E.

Environmental Science Center
Syracuse Research Corporation
Syracuse, NY 13212
phone: (315) 452-8424
fax: (315) 452-8440
email: thayer@...
web: http://esc.syrres.com/
http://esc.syrres.com/geosem/
**************************************************

The kriging variance at short distances will increase with increasing
nugget effect. In any case, the kriging variance can not be less than the
nugget, so if you ensure that the nugget includes the measurement error
(which you can set based on your knowledge of the sampling and measurement
protocols, as well as the site-specific data and quality control data), the
minimum kriging variance (i.e., at short distances) will be valid.

A related point: What do risk estimates over small areas represent? I
think it is important to keep in mind the exposure unit or the area over
which exposure occurs. It seems to me that one first has to estimate the
likelihood that the risk from exposure to contamination exceeds some
threshold, which should be done at the exposure unit scale (i.e., over an
area over which exposure occurs, such as an industrial site or perhaps a
residential yard). If the risk is unacceptable, calculating the
probability of exceeding a threshold at a smaller scale may be appropriate
for determining what areas of the site should be cleaned up. In my
opinion, one of the shortcomings of the risk assessment/geostatistics
programs that the U.S. EPA has developed is its inability to assess the
uncertainty in the mean on the exposure unit scale. So they have set about
answering the second question (where to clean up or where to collect more
samples) without answering the first sufficiently (is there a problem?);
again, in my opinion.

I apologize if you are already aware of these issues. It's hard to know
sometimes with e-mail!

Best regards,
Bill
----------------------------------------------------------------------------

For some resource classifications (measured, indicated, inferred) it is
possible to find that kriging variance values correlate with sample density:
when you have more samples, you have smaller values of KV...

But the best way to obtain an error measure is to calculate it from a set of
simulations (my opinion...). What is your geostats area? Environment?
Mining? Oil?

All the best

Alessandro Henrique Medeiros Silva
Geologist - Anglogold Brasil
alessandro@... <mailto:alessandro@...>

+55-31-3589-1687
+55-31-9953-0759
----------------------------------------------------------------------------

1. I would be very careful about using the kriging variance to generate
confidence intervals, with or without the assumption that the "errors" are
normally distributed. The kriging variance is a useful measure of the
relative reliability of the kriging estimator, but it is certainly not an
absolute measure. If you multiply the variogram model by a positive number
(I know this will make the model appear not to fit the graph of the
experimental variogram), then the kriging variance will change by that
multiplier, BUT the kriging weights will not change and hence the estimated
values will not change. Thus the kriging variance cannot be treated as an
absolute measure of reliability.

2. If you really want to incorporate the measurement or instrument
variability then it should be done another way. You should use the smoothing
form of the kriging equations, if you want details send me an email. This is
also discussed in several but not all books on geostatistics.

3. The probability distribution that you would generate, i.e., estimate,
by using indicator kriging is very different from the distribution you would
generate by using confidence intervals. There is a reason why the term is
"confidence" and not "probability" and that is at the heart of the
distinction between the two approaches (indicator kriging vs confidence
intervals).
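Point 1 above can be checked numerically: scaling the covariance (variogram) model by a positive constant leaves the kriging weights, and hence the estimates, unchanged, while the kriging variance scales by that constant. A minimal simple-kriging sketch with an illustrative three-point covariance model:

```python
import numpy as np

def simple_kriging(C, c0, sill):
    """Simple-kriging weights and variance from the covariances among the
    data (C), the data-to-target covariances (c0), and the sill."""
    w = np.linalg.solve(C, c0)
    variance = sill - w @ c0
    return w, variance

# Illustrative covariance model for three data points and one target location
C = np.array([[1.0, 0.4, 0.2],
              [0.4, 1.0, 0.3],
              [0.2, 0.3, 1.0]])
c0 = np.array([0.5, 0.6, 0.3])
sill = 1.0

w1, v1 = simple_kriging(C, c0, sill)
# Multiply the whole covariance (variogram) model by 3 ...
w2, v2 = simple_kriging(3 * C, 3 * c0, 3 * sill)

print(np.allclose(w1, w2))     # True: weights, hence estimates, unchanged
print(np.isclose(v2, 3 * v1))  # True: the kriging variance scales by 3
```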

Donald E. Myers
http://www.u.arizona.edu/~donaldm

----------------------------------------------------------------------------

Dear Dr. Zhang,
I understand from the discussion on the ai-geostats forum that you are
dealing with producing a probability map for the purposes of risk
assessment. I am interested in agricultural hydrology. In this context, I
would like to apply the IK technique to precipitation fields and produce
probability maps. As you know, the difficulty comes from the lack of
literature explaining the subject clearly with examples. I would like to
know whether you have any kind of documents, such as reports, solved class
examples, books, etc., on the subject.
Yours Sincerely,
Dr. Mahmut CETIN

---
Dr. Mahmut CETIN
Department of Agricultural Structures and Irrigation
Faculty of Agriculture
University of Cukurova
Tel: ++90 (322)338 6877 (Direct)
++90 (322) 338 6318 (Switchboard, ext. : 25)
Fax: ++90 (322) 338 6386
E-Mail: mcet64@...

----------------------------------------------------------------------------

Dear Syed, et al.,

I did much of what you described in the GRASS GIS a while back. (GRASS
is public domain, not commercial, but it is a very good GIS.) The title
of the paper is "Visualizing Spatial Data Uncertainty Using Animation"
and a copy of it is located at:

http://www.geo.hunter.cuny.edu/~chuck/CGFinal/paper.htm

The special issue of Computers & Geosciences (Vol. 23, No. 4, pp.
387-395, 1997) included a CD-ROM that contained some of the animations
in MPEG form. My web site includes the animations and instructions on
how to construct them.

I used spherical interpolation to generate smooth transitions between
realizations in order to keep the interpolations valid statistically.

I have a more recent work that studies user perception of animated maps
representing data and application uncertainty. An outline of that work
from a conference presentation (with all equations and animations) is
available at:

http://www.geo.hunter.cuny.edu/~chuck/GIScience2000/paper.html

sincerely, chuck
