A while ago, I posted the following enquiry:

I have a Digital Elevation Model with around 600,000 data points

(resolution=500 x 500 m). The x and y coordinates are in UTM and the z

coordinate is the elevation above mean sea level.

Using GsLib to model the associated variogram, I am having trouble

choosing the parameters in the gamv.par file. The sensitive parameters

are:

Total number of lags

Unit lag separation distance

It seems that variogram values are very sensitive to these parameters. I

was wondering whether colleagues out there have made similar observations

and have a remedy for this.

In response, I received a few replies, which are reproduced below.

All in all, I found variogram modeling to be still more of an art than

a science. In this particular case, working with a

nonstationary phenomenon, it seems I have to subdivide the spatial domain

into a number of regions and model the process in each region separately.

My sincere thanks to all those who took their time and responded to my

original enquiry.

Date: Fri, 20 Feb 2004 14:11:24 -0700

From: "Donald E. Myers" <myers@...>

Some comments

1. The combination of the number of lags and the lag separation distance

relate to the total distance for which you are going to compute a sample

variogram. Generally speaking you don't want this to exceed about half

the maximum distance. There are several reasons for this (1) the number

of pairs per lag will increase beginning with the first lag (although

not perfectly monotone) until about half the maximum distance; at about

that distance they begin to decrease. More pairs per lag is better. (2)

Depending somewhat on what you want to use the variogram for, e.g.,

kriging, it is the behavior of the variogram for shorter lags that is

most important (3) for very long lag distances the only pairs will be

for data locations that are essentially at the extreme ends, i.e., a

somewhat peculiar pattern, and hence best avoided.

2. There is no absolute best choice but knowing that your resolution is

500 x 500 there is no advantage in choosing a lag spacing smaller than

500 ( you might have lags with no pairs at all). Since your data

locations are on a regular grid, choose a multiple of 500.

3. The total number of possible pairs is fixed by the total number of

data locations, this is not affected by your choice of the number of

lags nor the lag spacing. These pairs will be split up among the lags,

note that the sample variogram at any plotted point is actually an

average hence there is a conflict between wanting more pairs per lag

(presumably a more reliable estimate) and averaging over a wider

spacing. Assuming that you fix the total distance for which you compute

the sample variogram (number of lags x lag spacing), more plotted

points means you are likely to get a better picture of the shape of the

variogram. This is offset perhaps by somewhat less reliable estimates at

each plotted point. Two extremes: (i) use a very short lag spacing so that

each lag contains only a single pair; this way you get the maximum

number of plotted points, i.e., the variogram cloud; (ii) use only one

lag interval, so you get the maximum number of pairs but it is hard to

detect shape from only one plotted point.
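
Points 1 to 3 above can be illustrated with a small numerical sketch. This is not GsLib code: the function names (lag_plan, sample_variogram), the 20 x 20 synthetic grid and the random elevations are all assumptions made for illustration, and the binning rule (each pair goes to the nearest multiple of the lag spacing) is just one simple choice of lag tolerance. It forms every pair explicitly, so it is only practical for a small subsample, not for 600,000 points.

```python
import numpy as np

def lag_plan(max_distance, lag_spacing):
    """Number of lags such that nlag * lag_spacing stays within half the maximum distance."""
    return int((0.5 * max_distance) // lag_spacing)

def sample_variogram(xy, z, lag_spacing, nlag):
    """Omnidirectional sample variogram; each pair is assigned to the nearest lag multiple."""
    i, j = np.triu_indices(len(z), k=1)            # every pair exactly once
    d = xy[i] - xy[j]
    h = np.hypot(d[:, 0], d[:, 1])                 # separation distance of each pair
    half_sq = 0.5 * (z[i] - z[j]) ** 2             # semivariance contribution of each pair
    k = np.rint(h / lag_spacing).astype(int)       # nearest lag index (tolerance = half a lag)
    keep = (k >= 1) & (k <= nlag)
    npairs = np.bincount(k[keep], minlength=nlag + 1)[1:]
    sums = np.bincount(k[keep], weights=half_sq[keep], minlength=nlag + 1)[1:]
    gamma = np.divide(sums, npairs, out=np.full(nlag, np.nan), where=npairs > 0)
    lags = lag_spacing * np.arange(1, nlag + 1)
    return lags, gamma, npairs

# Hypothetical 20 x 20 grid at 500 m spacing with made-up elevations.
grid = np.arange(20) * 500.0
gx, gy = np.meshgrid(grid, grid)
xy = np.column_stack([gx.ravel(), gy.ravel()])
z = np.random.default_rng(0).normal(size=len(xy))

max_distance = np.hypot(grid[-1], grid[-1])        # diagonal of the grid
for spacing in (250.0, 500.0, 1000.0):
    nlag = lag_plan(max_distance, spacing)
    lags, gamma, npairs = sample_variogram(xy, z, spacing, nlag)
    print(f"spacing={spacing:6.0f}  nlag={nlag:2d}  pairs per lag={npairs.tolist()}")
```

With a spacing below the grid resolution (250 m here), the first lag receives no pairs at all, and doubling the spacing roughly trades the number of plotted points for more pairs per point, which is the trade-off described in point 3.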

The bottom line is that you will probably want to experiment a bit to

see how sensitive the plot is to changing the lag spacing. You might

also find the following paper of interest

A. Warrick and D.E. Myers, 1987, Optimization of Sampling Locations for

Variogram Calculations. Water Resources Research 23, 496-500

Donald E. Myers

http://www.u.arizona.edu/~donaldm

From: "Donald E. Myers" <myers@...>

Addendum

The sample variogram only estimates the variogram IF the underlying

assumption of a constant mean is satisfied. Given the size of your data

set (and the large geographic extent as a result) you will likely want

to at least examine the data set for possible evidence of

non-stationarity. Try fitting a trend surface to the data; if the

coefficients other than the constant term are clearly non-zero, then you

will want to consider several possibilities: (1) using residuals, or (2)

splitting the data set up into separate sets where this does not seem to

occur. Another indication of non-stationarity: if the sample variogram

grows at a quadratic or higher rate, there is no valid variogram model

that will fit it.
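
The trend-surface check suggested above might be sketched as follows. This is a minimal illustration, not part of GsLib: fit_trend_surface and variance_explained are hypothetical helper names, the coordinates and elevations are synthetic, and the coordinates are standardized before fitting only because raw UTM values are large enough to make the least-squares problem poorly conditioned. Judging whether a coefficient is "clearly non-zero" is left here to the share of variance the trend explains, which is one possible criterion among several.

```python
import numpy as np

def fit_trend_surface(x, y, z, order=1):
    """Least-squares polynomial trend surface in standardized coordinates.

    Returns the coefficients, the (power of x, power of y) label of each term,
    and the residuals z - trend.
    """
    xs = (x - x.mean()) / x.std()
    ys = (y - y.mean()) / y.std()
    powers = [(p, q) for p in range(order + 1) for q in range(order + 1 - p)]
    design = np.column_stack([xs**p * ys**q for p, q in powers])
    coef, *_ = np.linalg.lstsq(design, z, rcond=None)
    residuals = z - design @ coef
    return coef, powers, residuals

def variance_explained(z, residuals):
    """Fraction of the variance of z removed by the trend surface."""
    return 1.0 - residuals.var() / z.var()

# Synthetic example: a west-to-east slope plus noise on a hypothetical UTM grid.
rng = np.random.default_rng(42)
x, y = np.meshgrid(np.arange(30) * 500.0 + 400000.0,
                   np.arange(20) * 500.0 + 5000000.0)
x, y = x.ravel(), y.ravel()
z = 200.0 + 0.02 * (x - x.min()) + rng.normal(scale=10.0, size=x.size)

coef, powers, residuals = fit_trend_surface(x, y, z, order=1)
for (p, q), c in zip(powers, coef):
    print(f"x^{p} y^{q}: {c:10.3f}")
print("variance explained by trend:", round(variance_explained(z, residuals), 3))
```

If the trend explains a large share of the variance, the residuals rather than the raw elevations would be the natural input to the variogram calculation, in line with option (1) above.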

Donald E. Myers

http://www.u.arizona.edu/~donaldm

From: Adrian Martínez Vargas <amvargas@...>

Date: Sat, 21 Feb 2004 00:05:43 -0500

If you are using a gridded file, it is preferable to use gam.exe. Another key

point is that in this case (a regular grid) the lag must be similar to the

grid separation distance, to avoid smoothing the variogram. For directional

variograms, use the main grid directions and the point separation.
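
The idea of working directly on the grid, with lags equal to whole numbers of cells along the grid axes, can be sketched in a few lines. This is not the gam program itself, only an illustration of the same principle; grid_variogram is a hypothetical name, the elevations are assumed to be held in a 2-D array with NaN marking missing cells, and the mapping of array axes to north/east is an assumption that depends on how the grid was read.

```python
import numpy as np

def grid_variogram(z, cell_size, nlag, axis):
    """Directional sample variogram of a 2-D grid along one grid axis.

    Lag k compares every cell with the cell k steps further along `axis`,
    so the lag distances are exact multiples of the cell size.
    """
    z = np.asarray(z, dtype=float)
    if not 1 <= nlag < z.shape[axis]:
        raise ValueError("nlag must be at least 1 and smaller than the grid size along axis")
    lags = cell_size * np.arange(1, nlag + 1)
    gamma = np.full(nlag, np.nan)
    npairs = np.zeros(nlag, dtype=int)
    for k in range(1, nlag + 1):
        if axis == 0:
            diff = z[k:, :] - z[:-k, :]
        else:
            diff = z[:, k:] - z[:, :-k]
        valid = ~np.isnan(diff)                    # skip pairs touching a missing cell
        npairs[k - 1] = valid.sum()
        if npairs[k - 1] > 0:
            gamma[k - 1] = 0.5 * np.mean(diff[valid] ** 2)
    return lags, gamma, npairs

# Made-up 40 x 60 grid at 500 m resolution; lags limited to half the extent.
rng = np.random.default_rng(7)
dem = rng.normal(size=(40, 60)).cumsum(axis=1)     # correlated along axis 1 only
for axis in (0, 1):
    nlag = dem.shape[axis] // 2
    lags, gamma, npairs = grid_variogram(dem, 500.0, nlag, axis)
    print(f"axis={axis}  first lags={lags[:3]}  gamma={np.round(gamma[:3], 2)}  pairs={npairs[:3]}")
```

Because no distance tolerance is involved, no averaging across neighbouring separations takes place, which is the smoothing effect the reply warns about.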

regards

Adrian...
