
RE: GEOSTATS: Sampling Theory

  • Javier Gallego
    Message 1 of 2, Oct 29, 1998
      The main reason I can see to use a sample when you know the whole population is that
      some of the variables involved in the model are only known for a sample, or that
      additional observation (for example a ground survey) is necessary. Even in this case,
      there may be ways to use all the information you have, depending on the target of
      the study.

      I have seen a paper (maybe from 15 years ago) in which pixels were sampled to save
      computing time. Today computing time should not be that much of a limiting factor
      for most remote sensing algorithms, but you may be in trouble if you want to compute,
      for example, a variogram on a set of several million pixels (a TM image). In this
      case sampling may be necessary to cope with computing limitations. By the way, this
      might be one question for the GEOSTATS list: what is the largest data set on which
      you can compute a variogram or a correlogram in a reasonable time with common
      computing means, say a powerful PC or an average workstation?
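
      A minimal sketch of this subsampling approach, assuming NumPy; the grid size,
      sample count, and lag bins below are illustrative stand-ins, not values from
      the original post:

      import numpy as np

      rng = np.random.default_rng(42)

      # Stand-in for one band of a large image; a full TM scene would be
      # several million pixels (3000 x 3000 = 9 million here).
      ny, nx = 3000, 3000
      z = rng.standard_normal((ny, nx))

      # Sample pixel locations at random instead of using every cell, so
      # the O(n^2) pairwise step below stays tractable in memory and time.
      n_sample = 2000
      rows = rng.integers(0, ny, n_sample)
      cols = rng.integers(0, nx, n_sample)
      coords = np.column_stack([cols, rows]).astype(float)
      values = z[rows, cols]

      # Pairwise distances and semivariances on the sample only:
      # gamma(h) = 0.5 * mean of (z_i - z_j)^2 over pairs at lag ~ h.
      d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
      sv = 0.5 * (values[:, None] - values[None, :]) ** 2

      # Bin pair distances to get the empirical variogram; self-pairs
      # (distance 0) fall below the first bin edge and are excluded.
      bins = np.linspace(1.0, 500.0, 21)
      idx = np.digitize(d, bins)
      gamma = [sv[idx == k].mean() for k in range(1, len(bins))]
      lags = 0.5 * (bins[:-1] + bins[1:])

      for h, g in zip(lags, gamma):
          print(f"lag {h:6.1f}  gamma {g:.3f}")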

      Javier Gallego


      -----Original Message-----
      From: Steve Friedman [SMTP:friedman@...]
      Sent: Wednesday, October 28, 1998 11:33 PM
      To: erdas-l@...; ai-geostats@...
      Subject: GEOSTATS: Sampling Theory

      This question revolves around the use of satellite imagery,
      digital terrain data, and statistical logistic regression models,
      not just geostatistical applications. However, I hope there might
      be a geostatistical person on this list who has encountered this
      question too. My apologies for posting to the group if this is
      considered inappropriate.

      I'm having a philosophical discussion with an associate on the
      merits of sampling a classified raster map (DEM, land cover, etc.)
      which will feed into a statistical analysis.

      Should a random sample of a specified sampling intensity (i.e., a
      proportion of the cells in the raster domain) be taken, which would
      then serve as the data for a statistical model of some response
      given a set of independent variables? Or should the entire raster
      be used in this statistical test?

      Following the second argument, since we have a "census" of
      the geographic space, why take a sample of it for use in
      the statistical model? When the entire population is known,
      the mean, sigma, etc. are also known. So why not use the
      entire set of data for the analysis? Would a sample amount
      to "tossing out good information"?

      Thanks for any insight provided.

      Steve Friedman

