AI-GEOSTATS: Summary on Variogram modeling of DEM
Dear Colleagues,
A while ago, I posted the following enquiry:
I have a Digital Elevation Model with around 600,000 data points
(resolution = 500 x 500 m). The x and y coordinates are in UTM and the z
coordinate is the elevation above mean sea level.
Using GSLIB to model the associated variogram, I am having trouble
fixing the parameters in the gamv.par file. The sensitive parameters are:
   Total number of lags
   Unit lag separation distance
It seems that the variogram values are very sensitive to these parameters.
I was wondering whether colleagues out there have made similar
observations and have a remedy for that.
In response, I received a few replies, which are reproduced below.
All in all, I found variogram modeling to be still an art as opposed to
a science. In this particular case, working with a nonstationary
phenomenon, it seems I have to subdivide the spatial domain into a number
of regions and model the process in each region separately.
My sincere thanks to all those who took their time and responded to my
Date: Fri, 20 Feb 2004 14:11:24 -0700
From: "Donald E. Myers" <myers@...>
1. The combination of the number of lags and the lag separation distance
determines the total distance over which you are going to compute a
sample variogram. Generally speaking, you don't want this to exceed about
half the maximum distance. There are several reasons for this: (1) the
number of pairs per lag will increase beginning with the first lag
(although not perfectly monotonically) until about half the maximum
distance; at about that distance it begins to decrease. More pairs per
lag is better. (2) Depending somewhat on what you want to use the
variogram for, e.g., kriging, it is the behavior of the variogram at
shorter lags that is most important. (3) For very long lag distances, the
only pairs will be for data locations that are essentially at the extreme
ends, i.e., a somewhat peculiar pattern, and hence best avoided.
2. There is no absolute best choice, but knowing that your resolution is
500 x 500 m, there is no advantage in choosing a lag spacing smaller than
500 m (you might have lags with no pairs at all). Since your data
locations are on a regular grid, choose a multiple of 500 m.
3. The total number of possible pairs is fixed by the total number of
data locations; it is not affected by your choice of the number of
lags or the lag spacing. These pairs will be split up among the lags.
Note that the sample variogram at any plotted point is actually an
average, hence there is a conflict between wanting more pairs per lag
(presumably a more reliable estimate) and averaging over a wider
spacing. Assuming that you fix the total distance for which you compute
the sample variogram (number of lags x lag spacing), more plotted
points means you are likely to get a better picture of the shape of the
variogram. This is offset perhaps by somewhat less reliable estimates at
each plotted point. Two extremes: (i) use a very short lag spacing so
that every pair appears in exactly one lag; this way you get the maximum
number of plotted points, i.e., the variogram cloud; (ii) use only one
lag interval; you get the maximum number of pairs, but it is hard to
detect shape from only one plotted point.
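The arithmetic behind points 1 and 2 can be sketched in a few lines of Python. This is only an illustration and not part of GSLIB: the function name suggest_lags and the 100 km x 75 km extent are hypothetical, and the "maximum distance" is assumed to be the diagonal of the bounding box of the data.

```python
import math

def suggest_lags(extent_x, extent_y, grid_spacing, multiple=1):
    """Return (lag distance, number of lags) for a regular grid.

    The lag is a whole multiple of the grid spacing, and the number of
    lags is capped so that n_lags * lag stays within half the maximum
    distance (assumed here to be the bounding-box diagonal).
    """
    lag = grid_spacing * multiple
    max_dist = math.hypot(extent_x, extent_y)
    n_lags = int((max_dist / 2) // lag)
    return lag, n_lags

# Hypothetical 100 km x 75 km DEM on a 500 m grid.
for m in (1, 2, 4):
    lag, n_lags = suggest_lags(100_000.0, 75_000.0, 500.0, multiple=m)
    print(f"lag = {lag:.0f} m, lags = {n_lags}, total = {lag * n_lags:.0f} m")
```

The two returned numbers are the quantities the enquiry refers to as "unit lag separation distance" and "total number of lags".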
The bottom line is that you will probably want to experiment a bit to
see how sensitive the plot is to changing the lag spacing. You might
also find the following paper of interest:
A. Warrick and D.E. Myers (1987), Optimization of Sampling Locations for
Variogram Calculations. Water Resources Research 23, 496-500.
Donald E. Myers
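One possible way to run the experiment suggested above is sketched here in Python with NumPy. It is not the GSLIB gamv program: sample_variogram is a hypothetical helper, the bins are assumed to be centred on k x lag with a tolerance of half a lag, and the synthetic 20 x 20 grid with random values merely stands in for a subsample of the DEM (all pairs of 600,000 points would be far too many to hold in memory this way).

```python
import numpy as np

def sample_variogram(coords, values, lag, n_lags):
    """Omnidirectional sample variogram.

    Bin k (k = 1..n_lags) collects pairs whose separation rounds to
    k * lag, i.e. an assumed tolerance of lag / 2.
    Returns (gamma, npairs); gamma is NaN for lags without pairs.
    """
    i, j = np.triu_indices(len(values), k=1)          # every pair once
    h = np.linalg.norm(coords[i] - coords[j], axis=1)
    sq = (values[i] - values[j]) ** 2
    k = np.rint(h / lag).astype(int)
    keep = (k >= 1) & (k <= n_lags)
    npairs = np.bincount(k[keep], minlength=n_lags + 1)[1:]
    sums = np.bincount(k[keep], weights=sq[keep], minlength=n_lags + 1)[1:]
    with np.errstate(invalid="ignore", divide="ignore"):
        gamma = np.where(npairs > 0, sums / (2 * npairs), np.nan)
    return gamma, npairs

# Synthetic stand-in data: 20 x 20 grid, 500 m spacing, random values.
rng = np.random.default_rng(0)
gx, gy = np.meshgrid(np.arange(20) * 500.0, np.arange(20) * 500.0)
coords = np.column_stack([gx.ravel(), gy.ravel()])
values = rng.normal(size=len(coords))

total = 5000.0  # fixed total distance = number of lags x lag spacing
for lag in (250.0, 500.0, 1000.0):
    gamma, npairs = sample_variogram(coords, values, lag, int(total / lag))
    print(f"lag={lag:6.0f}  lags={len(gamma):3d}  "
          f"empty lags={np.sum(npairs == 0)}  min pairs={npairs.min()}")
```

Holding the total distance fixed while varying the lag shows the trade-off directly: more plotted points with fewer pairs each, and empty lags once the spacing drops below the grid resolution.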
From: "Donald E. Myers" <myers@...>
The sample variogram only estimates the variogram IF the underlying
assumption of a constant mean is satisfied. Given the size of your data
set (and the large geographic extent as a result), you will likely want
to at least examine the data set for possible evidence of
non-stationarity. Try fitting a trend surface to the data. If the
coefficients other than the constant term are clearly non-zero, then you
will want to consider several possibilities: (1) using residuals, or (2)
splitting the data set up into separate sets where this does not seem to
occur. Another indication of non-stationarity: if the sample variogram
grows at a quadratic or higher rate, there is no variogram model that
will fit it.
Donald E. Myers
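The trend-surface check described above might look like the following in Python with NumPy. This is a minimal sketch under stated assumptions: a first-order (planar) surface is fitted by ordinary least squares, the coordinates are centred only to keep large UTM values numerically well behaved, and fit_linear_trend and the synthetic data are hypothetical.

```python
import numpy as np

def fit_linear_trend(x, y, z):
    """Least-squares fit of z = b0 + b1*(x - mean x) + b2*(y - mean y).

    Returns (coef, residuals), where coef = [b0, b1, b2].
    """
    A = np.column_stack([np.ones(len(x)), x - x.mean(), y - y.mean()])
    coef, *_ = np.linalg.lstsq(A, z, rcond=None)
    residuals = z - A @ coef
    return coef, residuals

# Synthetic example: elevation rising eastwards plus some noise.
rng = np.random.default_rng(1)
x = rng.uniform(400_000.0, 500_000.0, size=1000)      # UTM easting, m
y = rng.uniform(3_500_000.0, 3_575_000.0, size=1000)  # UTM northing, m
z = 800.0 + 0.004 * (x - 400_000.0) + rng.normal(scale=20.0, size=1000)

coef, residuals = fit_linear_trend(x, y, z)
print("slope in x (m per m):", coef[1])
print("slope in y (m per m):", coef[2])
print("std of residuals:", residuals.std())
```

If the slopes are clearly non-zero, the residuals (option 1) would be the values passed on to the variogram calculation in place of the raw elevations.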
From: Adrian Martínez Vargas <amvargas@...>
Date: Sat, 21 Feb 2004 00:05:43 -0500
If you are using a gridded file, it is preferable to use gam.exe. Another
key point is that in this case (regular grid) the lag should match the
grid separation distance, to avoid smoothing the variogram. For
directional variograms, use the main grid directions and the point
separation along them.
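For a complete regular grid, the directional calculation along the main grid directions can be sketched as follows. This is not the GSLIB gam program, only a small NumPy illustration: grid_variogram is a hypothetical helper, the DEM is assumed to be held as a 2-D array with no missing cells, and which array axis corresponds to easting or northing depends on how the grid was loaded.

```python
import numpy as np

def grid_variogram(z, n_lags, axis):
    """Sample variogram along one grid axis of a 2-D array.

    Lag k pairs every cell with the cell k steps further along `axis`,
    so the lag distance is exactly k times the grid spacing.
    """
    zz = np.moveaxis(z, axis, 0)
    gamma = np.empty(n_lags)
    for k in range(1, n_lags + 1):
        diff = zz[k:] - zz[:-k]
        gamma[k - 1] = 0.5 * np.mean(diff ** 2)
    return gamma

# Synthetic stand-in for a DEM: 200 x 300 cells of random values.
rng = np.random.default_rng(2)
dem = rng.normal(size=(200, 300))
spacing = 500.0  # m
for axis in (0, 1):
    gamma = grid_variogram(dem, n_lags=10, axis=axis)
    print(f"axis {axis}: lags {spacing:.0f}..{10 * spacing:.0f} m, "
          f"gamma[0]={gamma[0]:.3f}, gamma[-1]={gamma[-1]:.3f}")
```

Because every lag is an exact multiple of the grid spacing, no distance tolerance is involved and nothing is averaged across neighbouring separations.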
* Support to the list is provided at http://www.ai-geostats.org