- Noemi

I am not too sure what you mean by "mysterious kriging output". Your

attached plans appear okay to me, and the model appears to reflect the

underlying data reasonably well. The "plaid effect" that you refer to,

seen on the variance map, makes sense. There seems to be a general

misconception that kriging variance is related to the actual data values;

this is not the case. For a given variogram model, kriging variance is

related, purely and simply, to data configuration: in areas where there is

a lot of data the kriging variance is low, and where there is little data

the variance is high.
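
The point about data configuration can be checked numerically. What follows is only a minimal sketch of ordinary kriging at a single location, not GSLib's implementation: the exponential covariance, the nugget of 0.06, the range of 20, the sample coordinates and the two sets of values are all made up for illustration. Note that the function that returns the variance never receives the data values at all.

```python
import numpy as np

def exp_covariance(h, sill=1.0, nugget=0.06, practical_range=20.0):
    """Exponential covariance with a nugget (all parameters are illustrative)."""
    h = np.asarray(h, dtype=float)
    c = (sill - nugget) * np.exp(-3.0 * h / practical_range)
    return np.where(h == 0.0, sill, c)

def ok_variance(data_xy, target_xy):
    """Ordinary kriging weights and variance from coordinates only."""
    n = len(data_xy)
    d = np.linalg.norm(data_xy[:, None, :] - data_xy[None, :, :], axis=-1)
    d0 = np.linalg.norm(data_xy - target_xy, axis=-1)
    # Left-hand side: data-to-data covariances plus the unbiasedness constraint.
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = exp_covariance(d)
    A[n, n] = 0.0
    # Right-hand side: data-to-target covariances, then 1 for the constraint.
    b = np.ones(n + 1)
    b[:n] = exp_covariance(d0)
    sol = np.linalg.solve(A, b)
    weights, mu = sol[:n], sol[n]
    variance = exp_covariance(0.0) - weights @ b[:n] - mu
    return weights, float(variance)

# Hypothetical layout: three clustered samples and one isolated sample.
data_xy = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0], [30.0, 30.0]])
near = np.array([2.0, 2.0])    # inside the cluster
far = np.array([18.0, 18.0])   # in the sparsely sampled gap

w_near, var_near = ok_variance(data_xy, near)
w_far, var_far = ok_variance(data_xy, far)

# Two very different sets of values at the same locations.
values_a = np.array([1.0, 1.2, 0.9, 1.1])
values_b = np.array([1.0, 9.0, -4.0, 1.1])
est_a = float(w_near @ values_a)
est_b = float(w_near @ values_b)

print("variance near cluster:", var_near, " variance in gap:", var_far)
print("estimates at the same point:", est_a, est_b)
```

The estimates change with the values, but the weights and the variance are the same for both sets, and the variance in the gap is higher than inside the cluster.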

Hope this helps.

Richard Hague

Original Message:

-----------------

From: Noemi Barabas barabas@...

Date: Mon, 8 Mar 2004 18:18:55 -0500 (EST)

To: ai-geostats@...

Subject: AI-GEOSTATS: mysterious kriging output

Dear list,

I am working on a kriging problem of log-PCB concentrations in

river sediments (the coordinates have been "straightened"), using GSLib.

I have strong anisotropy with a ratio of about 1:6 (x:y). I have some

clustered locations as well as some sparsely sampled areas, and several

instances where the high and low concentrations are found very close to

each other. The distribution is lognormal and I am working with

log-transformed values. The variograms are rather nice in both

directions. Nevertheless, ordinary kriging gives a very peculiar-looking

map (of log-concentrations). It would be too difficult to put into

words, so I have included maps of estimates, variance and local mean as

an attachment.

Does anybody know what causes this "plaid" effect? Looking at the map

of variances, it appears that an estimation location has low variance

if it has a data point directly above and next to it, but intermediate

variance if those same two data points are in a diagonal direction

relative to the axes of anisotropy, even if the new position takes the

estimation point closer to the data points. I would like to understand the

reason for this effect, as well as whether there is something that can be

done about it.
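
One way to look at the "closer" part of this observation: under geometric anisotropy, the kriging system sees separations scaled by the directional ranges, not plain Euclidean distance. The sketch below assumes the anisotropy axes are aligned with x and y and uses ranges of 1 and 6 purely to mimic the 1:6 ratio mentioned above; the function name and the offsets are hypothetical, and this is not a diagnosis of the attached maps.

```python
import numpy as np

def reduced_distance(dx, dy, range_x=1.0, range_y=6.0):
    """Separation scaled by the directional ranges (axes assumed aligned with x and y)."""
    return float(np.hypot(dx / range_x, dy / range_y))

# A sample 3 units away along the long-range (y) direction.
along_y = reduced_distance(0.0, 3.0)    # 0.5
# A sample 3 units away along the short-range (x) direction.
along_x = reduced_distance(3.0, 0.0)    # 3.0
# A sample on the diagonal, closer in Euclidean terms (about 2.12 units).
diagonal = reduced_distance(1.5, 1.5)   # about 1.52

euclid_diagonal = float(np.hypot(1.5, 1.5))
print(along_y, along_x, diagonal, euclid_diagonal)
```

Here the diagonal sample is nearer than the y-direction sample in map units, yet about three times farther in reduced distance, so it carries less information about the estimation point.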

Could the fact that there are high values embedded in low value locations

be partially responsible for these strange maps?

(I did experiment with octant search, various maximum search radii,

various min and max number of data points for estimation, and this effect

persists. I even reversed the angles of anisotropy, tried different

variogram ranges. The variogram ranges are about 20% of the width/length

of the domain, and the relative nugget effect is about 6% in both

directions)

Thanks very much!

Noemi

--

* To post a message to the list, send it to ai-geostats@...

* As a general service to the users, please remember to post a summary of any useful responses to your questions.

* To unsubscribe, send an email to majordomo@... with no subject and "unsubscribe ai-geostats" followed by "end" on the next line in the message body. DO NOT SEND Subscribe/Unsubscribe requests to the list

* Support to the list is provided at http://www.ai-geostats.org

- Hi Ruben,

thanks so much for the references .... and especially the R

routines .... i will look into it. This may really give some good

answers to my data - once and for all - i hope at least. I think we

neglect in the majority of cases to verify if the data come from one or 2

(or more) distributions and just apply a transformation and do a

kriging .... it is just too easy that way ;-))

Again, thank you so much,

Monica

> Exploratory analysis of the frequency distribution of the data (i.e. the

> aggregated, non-spatial, frequency) could reveal the existence of two (or

> more) populations. To evaluate the evidence in favour of such an

> hypothesis, you could compare the hypothesis that the frequency

> distribution is formed by a mixture of two (or more) specified

> distributions versus the hypothesis that it is formed by only one. The

> general topic in statistics is called 'mixture distribution analysis' (not

> to be confused with 'mixture models'). Useful references are:

>

> Everitt & Hand, 1981, Mixture distribution analysis. Chapman & Hall

> Chen & Chen, 2001, Statistics and Probability Letters 52:125

> Hawkins et al., 2001, Computational Statistics & Data Analysis 38:15

> http://www.math.mcmaster.ca/peter/mix/mix.html

>

> Some robust regression methods, for example, are based on treating the

> data as coming from a mixture of two distributions, the main one, and a

> contaminating distribution.

>

> If you conclude that there are two (or more) distributions, then you can

> compute the conditional probability that any given data point belongs to

> each of the two (or more) distributions, and classify the point by its

> maximum probability. After this exploratory analysis, you could treat the two

> (or more) populations differently, if there is evidence for a mixture, and

> maybe even perform separate geostatistical analyses on the separate

> populations.

>

> I used this general strategy in the analysis of a time series of an index

> of returns from investments in financial markets. The strategy was

> proposed by Hamilton, 1994, Time Series Analysis, Ch. 22, Princeton U. P.

>

> Ruben
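
The strategy Ruben describes can be sketched as follows. This is a minimal illustration with assumptions stated up front: the data are synthetic, the two components are assumed Gaussian, the fit is a plain EM loop written here for the purpose (it is not the R routines mentioned above), and BIC is swapped in as an informal way to compare one component against two; it is not the formal test discussed in the references.

```python
import numpy as np

def normal_pdf(x, mean, sd):
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def fit_two_gaussians(x, n_iter=200):
    """Plain EM for a two-component 1-D Gaussian mixture."""
    # Crude starting values: split the data at the median.
    med = np.median(x)
    means = np.array([x[x <= med].mean(), x[x > med].mean()])
    sds = np.array([x.std(), x.std()])
    weights = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: conditional probability of each component given each point.
        dens = weights * normal_pdf(x[:, None], means, sds)
        post = dens / dens.sum(axis=1, keepdims=True)
        # M-step: update proportions, means and standard deviations.
        nk = post.sum(axis=0)
        weights = nk / len(x)
        means = (post * x[:, None]).sum(axis=0) / nk
        sds = np.sqrt((post * (x[:, None] - means) ** 2).sum(axis=0) / nk)
        sds = np.maximum(sds, 1e-6)  # guard against a collapsing component
    dens = weights * normal_pdf(x[:, None], means, sds)
    post = dens / dens.sum(axis=1, keepdims=True)
    loglik = float(np.log(dens.sum(axis=1)).sum())
    return weights, means, sds, post, loglik

def bic(loglik, n_params, n_obs):
    return n_params * np.log(n_obs) - 2.0 * loglik

# Synthetic data: a main population and a smaller, shifted one.
rng = np.random.default_rng(0)
true_labels = np.repeat([0, 1], [300, 100])
x = np.concatenate([rng.normal(0.0, 1.0, 300), rng.normal(5.0, 1.0, 100)])

# One-distribution hypothesis: a single Gaussian (2 parameters).
loglik_one = float(np.log(normal_pdf(x, x.mean(), x.std())).sum())
bic_one = bic(loglik_one, 2, len(x))

# Two-distribution hypothesis: mixture (2 means, 2 sds, 1 proportion).
weights, means, sds, post, loglik_two = fit_two_gaussians(x)
bic_two = bic(loglik_two, 5, len(x))

# Classify each point by its maximum conditional probability.
labels = post.argmax(axis=1)

print("BIC one component:", bic_one, " BIC two components:", bic_two)
print("estimated means:", means, " proportions:", weights)
```

A lower BIC for the two-component fit is evidence in favour of a mixture, and the `labels` array could then be used to split the data before any separate geostatistical analysis.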
