
## RE: AI-GEOSTATS: Summary: Extreme Values?

Message 1 of 5 , Dec 22, 2001
On 22-Dec-01 Chaosheng Zhang wrote:
> Dear list,
>
> Happy Christmas! Many thanks to all those who replied to my question about
> extreme values, especially Isobel Clark, Marcel Vallée, Benjamin Warr,
> Claudio Cocheo, Martin Roseveare, Pierre Goovaerts, Jeff Myers.

Dear Chaosheng Zhang,
Thank you for your comprehensive summary (which now enables
me to delete all the interesting individual replies by others!).

I'd like to add one consideration which seems not to
have been mentioned by others.

Especially in a regulatory ("clean-up") context, the
regulator may want to have a determination of the total
quantity of contaminant on a site.

Your high sample values are typical of "hot-spot" values.

A useful formula (provable by integration by parts)
states that, for a non-negative quantity X,

integral from 0 to inf (1 - F(x)) dx = expectation of X

where F(x) is the cumulative distribution function of X.
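
For readers who want the intermediate step, here is a sketch of the derivation, assuming X is non-negative with finite mean. The second line is one way the formula could be adapted to the part of the mean carried by values above a cut-off q; the message does not say exactly how it was applied, so treat that line as an illustration rather than the method actually used.

```latex
\begin{align*}
E[X] &= \int_0^\infty x \, dF(x)
      = \Bigl[-x\bigl(1 - F(x)\bigr)\Bigr]_0^\infty + \int_0^\infty \bigl(1 - F(x)\bigr)\, dx
      = \int_0^\infty \bigl(1 - F(x)\bigr)\, dx, \\
E\bigl[X \,\mathbf{1}\{X > q\}\bigr] &= \int_q^\infty x \, dF(x)
      = q\bigl(1 - F(q)\bigr) + \int_q^\infty \bigl(1 - F(x)\bigr)\, dx .
\end{align*}
```

The boundary term in the first line vanishes because x(1 - F(x)) tends to 0 as x grows whenever the mean is finite. Dividing the second line by the first gives the share of the total contributed by values above q.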

Applying this (somewhat crudely, numerically speaking)
to your data for Lead shows that the top 2 percent of
values (98-100%) accounts for about 1/3 of the total content,
while the top 5 percent (95-100%) accounts for well
over half the total.
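
A minimal sketch of this kind of calculation, working directly on the sorted sample values rather than on a fitted distribution. The function name `top_share` and the concentrations are hypothetical; they are not the Lead data discussed in this thread, which are not reproduced here.

```python
import math


def top_share(values, top_fraction):
    """Return the fraction of sum(values) contributed by the largest
    `top_fraction` of the samples (e.g. 0.02 for the top 2 percent)."""
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    ordered = sorted(values, reverse=True)
    # Number of samples in the upper tail; round() guards against
    # floating-point noise such as 0.07 * 100 == 7.000000000000001.
    k = max(1, math.ceil(round(top_fraction * len(ordered), 9)))
    return sum(ordered[:k]) / sum(ordered)


# Hypothetical concentrations in mg/kg (illustrative only).
samples = [20.0] * 95 + [300.0] * 3 + [2500.0] * 2

for fraction in (0.02, 0.05):
    print(f"top {fraction:.0%} of samples: {top_share(samples, fraction):.1%} of total")
```

With strongly skewed data such as these, a handful of samples carries most of the sum, which is the pattern described above.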

It is therefore essential, for purposes such as the
above, both to take these extreme values very seriously,
and also to try to get estimates of the percentiles which
are as accurate as possible. The latter is not at all easy.
In fact, I do not know of a satisfactory solution in
the context of "grid sampling" of a contaminated site,
where the amount and density of sampling which is feasible
in practice is usually quite insufficient. How do you
know, for instance, that there are not much larger
extremes still, somewhere, remaining unobserved? Their
probabilities of being sampled may be very small; but if
their values are very high they could dominate everything
else.

However, as far as the data are concerned, you have
what you have got and you _must_ respect what it tells
you. From the above, for purposes of estimating the
total, rather than trying to ignore the high percentiles
you would even do better to ignore the lower percentiles,
since they contribute very little!

And, that being said,

Happy Christmas and Prosperous New Year to All!
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding@...>
Fax-to-email: +44 (0)870 167 1972
Date: 22-Dec-01 Time: 21:38:33
------------------------------ XFMail ------------------------------

--
* To post a message to the list, send it to ai-geostats@...
* As a general service to the users, please remember to post a summary of any useful responses to your questions.
* To unsubscribe, send an email to majordomo@... with no subject and "unsubscribe ai-geostats" followed by "end" on the next line in the message body. DO NOT SEND Subscribe/Unsubscribe requests to the list
* Support to the list is provided at http://www.ai-geostats.org
Message 2 of 5 , Dec 29, 2001
Ted's comments on the regulatory perspective bring up some interesting
issues, assuming this were a hazardous waste site. First, since the upper
5 percent accounts for such a high percentage of the mass of lead, a
surgical cleanup targeting the "hot spots" might be sufficient.
Environmental remediation is focused on reducing risk to human health and
the environment, and if removing the high zones brings the average lead
concentration below the risk-based threshold, then the remediation is
successful.

Next, in remediation, the remediation decision support unit is as important
as the sample and subsample support. A few extreme values can be "diluted"
out if a large decision unit is selected.
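
A small sketch of the dilution effect, assuming a one-dimensional transect of equally spaced samples and decision units made of consecutive samples. The values, the threshold, and the function name `decision_unit_means` are hypothetical, chosen only to illustrate the point.

```python
def decision_unit_means(values, unit_size):
    """Average consecutive samples in groups of `unit_size`;
    each group stands for one decision unit."""
    if unit_size < 1:
        raise ValueError("unit_size must be at least 1")
    means = []
    for start in range(0, len(values), unit_size):
        unit = values[start:start + unit_size]
        means.append(sum(unit) / len(unit))
    return means


# Hypothetical lead concentrations (mg/kg) with a single hot spot.
transect = [30, 40, 35, 2000, 45, 30, 25, 40]
threshold = 400  # hypothetical risk-based threshold, mg/kg

for size in (1, 2, 4, 8):
    means = decision_unit_means(transect, size)
    exceeding = sum(m > threshold for m in means)
    print(f"unit size {size}: {exceeding} of {len(means)} units exceed {threshold}")
```

In this made-up example the single hot spot flags a unit at sizes 1, 2, and 4, but once all eight samples form one decision unit the average falls below the threshold.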

Furthermore, a "hot spot" must be defined with relation to its size and
concentration (at a minimum). Then the impact of the size/support of the
"hot spot" can be determined in relation to the decision unit support. So
basically, the only things you need to remember in environmental
characterization and decision-making are support, support, and support
(sample, subsample, decision unit).

As far as contouring goes, issues still remain. It's hard to contour yourself
out of a situation you sampled yourself into.

Happy Holidays and a Prosperous New Year!

Jeff Myers
Westinghouse Safety Management Solutions
2131 S. Centennial Dr., SE
Aiken, SC 29803
jeff.myers@...
http://www.gemdqos.com

Message 3 of 5 , Dec 29, 2001
On 29-Dec-01 Myers, Jeff wrote:
> Ted's comments on the regulatory perspective bring up some
> interesting issues, assuming this were a hazardous waste site.

Jeff, Thanks for your comments which are very much to the point.

> It's hard to contour yourself out of a situation you sampled
> yourself into.

With your permission (which I assume will not be unreasonably
withheld) I propose to trot out this delightful maxim on
suitable occasions!

Thanks for this too -- just in time to set me smiling for
the New Year.

Best wishes to all,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding@...>
Fax-to-email: +44 (0)870 167 1972
Date: 29-Dec-01 Time: 19:22:06
------------------------------ XFMail ------------------------------

Message 4 of 5 , Jan 1, 2002
Permission granted. And a Happy New Year to all!

Jeff
