some of the variables involved in the model are only known for a sample,
or additional observation is necessary (for example a ground survey).
Even in this case, there may be ways to use all the information you
have, depending on the target of the study.

I have seen some papers (maybe from 15 years ago) in which pixels were
sampled to save computing time.

Today computing time should not be much of a limiting factor for most
remote sensing algorithms, but you may be in trouble if you want to
compute, for example, a variogram on a set of several million pixels
(a TM image). In this case sampling may be necessary to cope with
computing limitations. By the way, this might be a question for the
GEOSTATS list: what is the largest data set on which you can compute a
variogram or a correlogram in a reasonable time with common computing
means, say a powerful PC or an average workstation?
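
To make the idea of sampling pixels before computing a variogram
concrete, here is a minimal sketch in Python with NumPy. It is only an
illustration under stated assumptions, not the method of any paper
mentioned above: the function name sampled_variogram and its parameters
(n_samples, lag_edges, seed) are hypothetical, distances are measured in
pixel units, and the estimator is the classical omnidirectional
semivariance (half the mean squared difference per lag bin).

```python
import numpy as np

def sampled_variogram(image, n_samples, lag_edges, seed=0):
    """Empirical semivariogram from a simple random sample of pixels.

    image     : 2-D array (one band of the raster)
    n_samples : number of pixels drawn without replacement
    lag_edges : increasing 1-D array of lag bin edges, in pixel units
    Returns (gamma, counts), one value per lag bin.
    """
    rng = np.random.default_rng(seed)
    rows, cols = image.shape

    # Draw pixel positions without replacement from the flattened grid.
    flat = rng.choice(rows * cols, size=n_samples, replace=False)
    r, c = np.unravel_index(flat, (rows, cols))
    z = image[r, c].astype(float)

    # All unordered pairs (i < j) of sampled pixels.
    i, j = np.triu_indices(n_samples, k=1)
    dist = np.hypot(r[i] - r[j], c[i] - c[j])
    half_sq_diff = 0.5 * (z[i] - z[j]) ** 2

    # Assign each pair to a lag bin; drop pairs outside the edges.
    n_lags = len(lag_edges) - 1
    bins = np.digitize(dist, lag_edges) - 1
    valid = (bins >= 0) & (bins < n_lags)

    counts = np.bincount(bins[valid], minlength=n_lags)
    sums = np.bincount(bins[valid], weights=half_sq_diff[valid],
                       minlength=n_lags)

    # Bins without any pair are reported as NaN.
    gamma = np.full(n_lags, np.nan)
    filled = counts > 0
    gamma[filled] = sums[filled] / counts[filled]
    return gamma, counts
```

The number of pairs grows as n(n-1)/2, so a sample of 2000 pixels
already gives about 2 million pairs, while the full set of several
million pixels would give far more pairs than a brute-force loop like
this one can handle.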

Javier Gallego

Subject: GEOSTATS: Sampling Theory

This question revolves around the use of satellite imagery, digital
terrain data, and statistical logistic regression models, not just
geostatistical applications. However, I hope there might be a
geostatistical person on this list who has encountered this question
too. My apologies for posting to the group if this is considered
inappropriate.

I'm having a philosophical discussion with an associate on the merits
of sampling a classified raster map (DEM, land cover, etc.) which will
feed into a statistical analysis.

Should a random sample of a specified sampling intensity (i.e. a
proportion of the cells in the raster domain) be taken, which will then
serve as the data for a statistical model of some response given a set
of independent variables? Or should the entire raster be used in this
statistical test?
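
For concreteness, the first option could look roughly like the sketch
below (Python with NumPy). This is only an illustration: the function
name sample_raster_cells, the layer names, and the default seed are
made up for the example, and it assumes all layers are co-registered
2-D arrays of the same shape.

```python
import numpy as np

def sample_raster_cells(layers, intensity, seed=0):
    """Simple random sample of raster cells at a given sampling intensity.

    layers    : dict mapping a layer name to a 2-D array (all same shape)
    intensity : proportion of cells to draw, 0 < intensity <= 1
    Returns (sample, flat_index); sample maps each layer name to the
    1-D array of values at the sampled cells.
    """
    shapes = {arr.shape for arr in layers.values()}
    if len(shapes) != 1:
        raise ValueError("all layers must have the same shape")
    if not 0 < intensity <= 1:
        raise ValueError("intensity must be in (0, 1]")

    n_rows, n_cols = shapes.pop()
    n_cells = n_rows * n_cols
    n_sample = max(1, int(round(intensity * n_cells)))

    # The same cells are drawn from every layer, so the response and
    # the independent variables stay aligned cell by cell.
    rng = np.random.default_rng(seed)
    flat_index = rng.choice(n_cells, size=n_sample, replace=False)

    sample = {name: arr.ravel()[flat_index] for name, arr in layers.items()}
    return sample, flat_index
```

With intensity set to 1 every cell is returned, which corresponds to the
second option of using the entire raster.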

Following the second argument, since we have a "census" of the
geographic space, why take a sample of that for use in the statistical
model? When the entire population is known, the mean, sigma, etc. are
also known. So why not use the entire set of data for the analysis?
Would a sample correspond to "tossing out good information"?

Thanks for any insight provided.

Steve Friedman

