I'm afraid this is going to be a very long mail. I believe it is an
interesting topic which could generate some good discussions on the
ai-geostats mailing list. It summarises a presentation I'm currently preparing
for a small GIS workshop. Although GIS still lack many of the tools and
functions required for a proper analysis of geostatistical data (the new
versions of Idrisi and Arc/Info do, however, finally seem to have proper
geostatistical modules), GIS users can gain some experience in applying
geostatistics by using recently developed supplementary software. But the
progress made in combining geostatistics with GIS seems to have been a one-way
street, since geostatisticians have not shown much interest in using GIS. GIS
users, for their part, are looking for geostatistical software that could be
integrated within their tools, without clearly justifying the need for such
integration beyond improved convenience of use. The intent of this mail is
therefore to identify, through a step-by-step analysis of a standard
geostatistical case study, where interactions between GIS and geostatistics
exist and how these could be used efficiently.

A geostatistician's typical workflow can be summarised as follows:

1) Primary analysis of the data in order to draw a portrait of the analysed
variable(s)

- Univariate statistics

- Multivariate statistics

2) Analysis of the coherence in the sampling of the studied phenomenon

- Declustering techniques

- Thiessen/Voronoi polygons

- Fractal dimension of the sampling network

3) Analysis of the coherence in the spatial structure of the studied variable

- Moving-window statistics: analysis of stationarity

- Exploratory variography

- Modelling of the drift, if any, and variogram modelling

4) Estimations/Interpolation

- Definition of the hypotheses on the basis of the information provided by
the exploratory analysis

- Selection of a spatial estimator from the kriging family

- Estimation/Interpolation

5) Validation of the selected method

- Cross-validations

- Analysis of the errors and residuals

- Correction of the parameters of the interpolation method

- New estimations and cross-validations

So where do GIS operations fit within these steps? My personal experience,
gained during the development of an object-oriented GIS which integrates a
Bayesian geostatistical module developed at the University of Klagenfurt
(Austria), is the following:

1) Primary analysis:

Before any analysis, GIS will clearly facilitate the projection of the
geostatistical dataset. GIS functions also allow efficient logical consistency
checks of the topological information of the data. Soil samples, for example,
cannot be located within a lake or an ocean, and measurements provided by a
national monitoring network are not supposed to be found beyond the political
borders of the country where the measurements are made. In practice, such
filtering requires overlaying digital maps on the geostatistical dataset and
selecting the points that fall within specific polygons, or within specific
cells if the geographic data is in raster format.
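
The overlay check described above can be sketched in a few lines of Python.
This is only an illustration, not the code of any particular GIS: it uses a
standard ray-casting point-in-polygon test, and the function names, the square
"lake" and the three soil samples are all hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    `polygon` is a list of (x, y) vertices; the ring is closed implicitly.
    Points lying exactly on the boundary are not handled specially.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def split_by_polygon(samples, polygon):
    """Separate (x, y, value) samples into those inside and outside a polygon."""
    inside, outside = [], []
    for s in samples:
        if point_in_polygon(s[0], s[1], polygon):
            inside.append(s)
        else:
            outside.append(s)
    return inside, outside


# Hypothetical data: a square lake and three soil samples (x, y, value).
lake = [(2.0, 2.0), (6.0, 2.0), (6.0, 6.0), (2.0, 6.0)]
soil_samples = [(1.0, 1.0, 12.5), (4.0, 4.0, 99.0), (7.0, 3.0, 14.1)]
suspect, plausible = split_by_polygon(soil_samples, lake)
```

Here `suspect` collects the samples that fall inside the lake and therefore
deserve a second look before any statistics are computed.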

It is also clear that interactive histograms, as well as maps of proportional
symbols, will help to identify and locate anomalies and trends.

2) Analysis of the coherence in the sampling of the studied phenomenon

Declustering techniques are usually based either on polygons of influence
(Thiessen/Voronoi polygons) or on a cell-based method. The advantage of the
polygons is that they do not require subjective decisions (a cell size has to
be selected for the second method) and that they can take into account the
geometry of the analysed surface. The latter can be done by "clipping" the
surface generated by the Thiessen polygons with the borders of the analysed
surface. Doing this also avoids the bias generated by the outer polygons when
a simple rectangle or the convex hull is used as the boundary. The problem of
measurements located on the edges is also still not really solved when using
fractal techniques.
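
As a rough illustration of polygon declustering with clipping, the sketch
below approximates the area of each clipped Thiessen polygon by counting
regular grid nodes: every node inside the study area is assigned to its
nearest sample, and the declustering weight of a sample is its share of the
nodes. This is a brute-force approximation rather than a true polygon
construction. The function name, the `inside` predicate standing in for the
border of the analysed surface, and the grid step are all assumptions made for
the example; the step only controls how accurately the areas are approximated.

```python
def thiessen_weights(samples, inside, bbox, step):
    """Approximate declustering weights from clipped Thiessen polygons.

    samples : list of (x, y) sample locations
    inside  : function (x, y) -> bool, True inside the study area
    bbox    : (xmin, ymin, xmax, ymax) enclosing the study area
    step    : spacing of the grid used to approximate polygon areas
    """
    xmin, ymin, xmax, ymax = bbox
    nx = int(round((xmax - xmin) / step))
    ny = int(round((ymax - ymin) / step))
    counts = [0] * len(samples)
    for i in range(nx):
        for j in range(ny):
            # Centre of the current grid cell
            cx = xmin + (i + 0.5) * step
            cy = ymin + (j + 0.5) * step
            if not inside(cx, cy):
                continue  # outside the study area: clipped away
            nearest = min(
                range(len(samples)),
                key=lambda k: (samples[k][0] - cx) ** 2 + (samples[k][1] - cy) ** 2,
            )
            counts[nearest] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no grid node falls inside the study area")
    return [c / total for c in counts]


# Hypothetical example: two samples in a 10 x 10 square.
samples = [(2.5, 5.0), (7.5, 5.0)]
bbox = (0.0, 0.0, 10.0, 10.0)
weights_rectangle = thiessen_weights(samples, lambda x, y: True, bbox, 1.0)
# Same samples, but the study area stops at x = 8: the outer polygon shrinks.
weights_clipped = thiessen_weights(samples, lambda x, y: x < 8.0, bbox, 1.0)
```

With the plain rectangle both samples get equal weight; once the area is
clipped at x = 8, the sample near the border loses part of its polygon and its
weight drops accordingly.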

3) Analysis of the coherence in the spatial structure of the studied variable

Variowin has clearly shown the power of interactivity in the analysis of
spatial correlation. However, locating on a map (together with any additional
geographical information) the sample pairs selected on an h-scatterplot could
in some cases help to explain anomalies. Displaying additional information
should also help in understanding some anisotropies.
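
To make the link between an h-scatterplot and the map concrete, here is a
minimal sketch that keeps track of which samples form each pair at a given
lag, so that the pairs contributing most to the semivariance can be located
afterwards. It is omnidirectional and is not how Variowin works internally;
the function names and the four sample values are hypothetical.

```python
import math


def pairs_for_lag(samples, lag, tol):
    """Index pairs (i, j) whose separation distance lies within lag +/- tol.

    `samples` is a list of (x, y, value) tuples.
    """
    pairs = []
    for i in range(len(samples)):
        xi, yi, _ = samples[i]
        for j in range(i + 1, len(samples)):
            xj, yj, _ = samples[j]
            h = math.hypot(xj - xi, yj - yi)
            if abs(h - lag) <= tol:
                pairs.append((i, j))
    return pairs


def semivariance(samples, pairs):
    """Experimental semivariance: half the mean squared difference of the pairs."""
    if not pairs:
        raise ValueError("no pairs for this lag")
    total = sum((samples[i][2] - samples[j][2]) ** 2 for i, j in pairs)
    return total / (2.0 * len(pairs))


def largest_contributors(samples, pairs, n=1):
    """The n pairs with the largest squared differences, to be shown on a map."""
    def squared_diff(pair):
        i, j = pair
        return (samples[i][2] - samples[j][2]) ** 2
    return sorted(pairs, key=squared_diff, reverse=True)[:n]


# Hypothetical samples on a line, 1 unit apart; the last value is an outlier.
samples = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0), (2.0, 0.0, 3.0), (3.0, 0.0, 10.0)]
pairs = pairs_for_lag(samples, lag=1.0, tol=0.1)
gamma = semivariance(samples, pairs)
worst = largest_contributors(samples, pairs)
```

The indices returned in `worst` give back the coordinates of the two samples,
which is all a GIS needs to highlight the pair on top of other map layers.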

4) Estimations/Interpolation

This is probably where the use of GIS in the analysis of geostatistical data
becomes trickier, but also where it may hold the greatest potential. While
GIS can certainly facilitate the definition of exclusion polygons used during
the estimation, it could in theory also be used to define criteria other than
distance for selecting the measurements used during the estimation. Such
criteria could be the slope, aspect or curvature derived from a DEM or, more
simply, all data falling within a buffer created around a polygon, a point or
a line.
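
The buffer criterion is the simplest of these to illustrate. The sketch below
selects the samples lying within a given distance of a polyline (a river, for
instance) using the usual point-to-segment distance. The function names, the
river geometry and the buffer width are made up for the example, and the line
is assumed to have at least two vertices.

```python
import math


def distance_to_segment(px, py, ax, ay, bx, by):
    """Shortest distance from point P to the segment AB."""
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0.0:
        return math.hypot(px - ax, py - ay)  # degenerate segment
    # Position of the projection of P on AB, clamped to the segment
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len2
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def select_in_buffer(samples, line, width):
    """Samples (x, y, value) lying within `width` of the polyline `line`."""
    selected = []
    for s in samples:
        d = min(
            distance_to_segment(s[0], s[1], a[0], a[1], b[0], b[1])
            for a, b in zip(line, line[1:])
        )
        if d <= width:
            selected.append(s)
    return selected


# Hypothetical river made of two segments, and four samples.
river = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
samples = [(5.0, 1.0, 3.2), (5.0, 5.0, 4.1), (11.0, 5.0, 2.7), (12.0, 12.0, 5.0)]
near_river = select_in_buffer(samples, river, width=1.5)
```

Only the samples returned in `near_river` would then be passed to the
estimator, whatever their distance to the point being estimated.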

Around two years ago I asked on the list whether any work had been done on
using information other than the Euclidean distance in variogram modelling.
As an example, I suggested using the so-called "cost-path distance", the
distance one would effectively travel by walking from one point to another,
when the variable under study is strongly affected by the DEM. The few tests
I made were not convincing, but I still believe that better tests can be made,
since I did not have good code to calculate this cost-path distance and, most
probably, because I did not really take much time to think about it. Another
example could come from studies involving hydrology: the spatial distribution
of particles, pollutants, etc. in suspension will certainly be influenced by
the shape of the river/lake and will most probably show patterns with strong
local anisotropies. The lake/river could however be somehow "transformed" so
that the hypothesis of stationarity can be considered valid, thereby improving
the estimates.
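
For what it is worth, one simple way to compute such a cost-path distance on a
raster DEM is Dijkstra's shortest-path algorithm, sketched below. This is not
the code used in the tests mentioned above. It rests on several assumptions:
the cost of a move is taken to be the 3-D length of the step between two
neighbouring cells, moves are restricted to the 4 direct neighbours (which
overestimates distances along diagonals), and the function name and the small
DEM are hypothetical.

```python
import heapq
import math


def cost_path_distance(dem, start, end, cell_size=1.0):
    """Shortest along-surface distance between two cells of a DEM grid.

    dem        : list of rows of elevations
    start, end : (row, col) indices of the two cells
    cell_size  : horizontal spacing between neighbouring cells

    The cost of a move is sqrt(cell_size^2 + dz^2); only the 4 direct
    neighbours of a cell can be reached in one move.
    """
    n_rows, n_cols = len(dem), len(dem[0])
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        dist, (r, c) = heapq.heappop(queue)
        if (r, c) == end:
            return dist
        if dist > best.get((r, c), math.inf):
            continue  # stale queue entry, a shorter route was already found
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < n_rows and 0 <= nc < n_cols:
                dz = dem[nr][nc] - dem[r][c]
                new_dist = dist + math.hypot(cell_size, dz)
                if new_dist < best.get((nr, nc), math.inf):
                    best[(nr, nc)] = new_dist
                    heapq.heappush(queue, (new_dist, (nr, nc)))
    return math.inf  # end cell not reachable


# Hypothetical DEM: a ridge 10 units high separates the two upper corners.
flat = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
ridge = [[0.0, 10.0, 0.0], [0.0, 10.0, 0.0], [0.0, 0.0, 0.0]]
d_flat = cost_path_distance(flat, (0, 0), (0, 2))
d_ridge = cost_path_distance(ridge, (0, 0), (0, 2))
```

On the flat grid the two corners are 2 units apart; with the ridge in place
the cheapest route goes around it and the distance grows to 6 units, while the
Euclidean distance between the two points is unchanged.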

I suggest readers have a look at:

Zoraster, Steven, 1996. "Imposing Geologic Interpretations on
Computer-Generated Contours Using Distance Transformations".
Mathematical Geology, Vol. 28, No. 8, pp. 969-985.

It takes a different approach, with interesting potential for GIS users, but
is based on the same ideas.

Bayesian geostatistics is a growing field full of promise. The use of
additional information to reduce the uncertainty associated with the estimates
should certainly gain a lot from being combined with GIS. I believe that this
is the field of geostatistics that should see major developments in the near
future and that should encourage geostatisticians to work more with GIS.

This is certainly not an exhaustive list of what GIS can bring to
geostatistics, only the first things that came to my mind.

I would more than welcome any references, ideas and comments on all this.

Sorry for having taken so much of your time.

Best regards

Gregoire

PS: A small comment on the outputs of the estimates: Isaaks and Srivastava
have clearly underlined in their book the importance of the sample support.
So why do most GIS directly convert point estimates into a raster (block
estimates)? I believe this should be corrected in the future.

Gregoire Dubois

Section of Earth Sciences

Institute of Mineralogy and Petrography

University of Lausanne

Switzerland

Currently detached in Italy

http://curie.ei.jrc.it/ai-geostats.htm
