Loading ...
Sorry, an error occurred while loading the content.

AI-GEOSTATS: generating 2-dim point pattern / summary

Expand Messages
  • Peter Bossew
    Dear list, this is a summary of the replies I got to my question, I want to generate a set of points in a plane, {(xi,yi)}, such that the points are
    Message 1 of 1 , Mar 18, 2003
    • 0 Attachment
      Dear list,

      this is a summary of the replies I got to my question,

      I want to generate a set of points in a plane, {(xi,yi)}, such that the
      points are distributed according to a chosen probability
      distribution density f, i.e., the infinitesimal rectangle [x,x+dx] X
      [y,y+dy] shall contain f(x,y)*dx*dy points.
      How do I do this ??
      In other words, I am looking for a generalization of the 1-dim method,
      where this is done with F*(z), where F=cumul(f) and F* = inverse of F, and
      z = random out of [0,1] (or any appropriate interval if f is not
      normalized).
      I tried to do it the same way, just using complex numbers, but this does
      not seem to work, and could also not be generalized for n>2 dimensions.

      In statistical terms, I was looking for a way how to sample from a given
      pdf.


      Ted Harding (U.K.) and Ghandi Pawitan (Bandung, Indonesia) sent replies.
      Thanks Ted and Ghandi !


      - Ted Harding wrote:

      The problem with "doing it the same way" is that "the same way" only
      applies in 1 dimension (as a consequence of the result that F(X) has a
      uniform distribution, so that F*(X) has the distribution of X). In more
      than 1 dimension, you still only have the result that F(X,Y) has a uniform
      distribution, and this is only 1 equation. There is no _straightforward_
      way of basing it on two equations

      (F1(X,Y), F2(X,Y)) = (Z1,Z1) where (Z1,Z2) is uniform on [0,1]x[0,1]

      However, there are possible approaches. One is to use conditional
      distributions: let F(X) be the marginal distribution for X, and let G(Y;x)
      be the conditional distribution of Y given that X=x. Then both U=F(X) and,
      for each x, V=G(Y;x) have uniform distributions, so provided you can
      invert these as F*(U) and G*(V;x) you can then sample U=u,V=v and obtain
      x=F*(u) and then y=G*(v;x).

      The practical issues here depend a lot on the form of F(X,Y), since both
      obtaining the conditional distribution and inverting
      the functions F(X), G(Y;x) may be difficult in practice.

      If it is straightforward to obtain the conditional distributions for Y
      given X and X given Y, then Gibbs Sampling can enable you to sample from
      (X,Y) without inverting the functions.

      If you can't obtain the conditional distributions then it may be
      impossible to follow such approaches, and then you may need to fall back
      on simulating a random process with a known mechanism which has the
      property of yielding (X,Y) distributed as f(x,y) -- if you know of such a
      process!

      If you could describe the density f(x,y) you are trying to sample from to
      the list, then perhaps someone can suggest an approach.


      - My reply:

      I was also thinking on something like the conditional distribution and the
      simulation approach, but this is not feasable in this case. What I have is
      a potential field, V(x) = sum over i (a/r^m + b*r^n), with r=|x-xi|, x
      and xi = 2-dim, and I want to assign a probability field to it, like p(x)
      = 1/V(x). With many xi, this is beyond any easy tractability, I think.


      - Ted Harding:

      Yes, not tractable at all in the analytic sense.
      However, this may be well suited to the Metropolis-Hastings algorithm
      which is the basis of Markov Chain Monte Carlo.

      Basically, suppose you want to sample from a density p(x) which is
      proportional to a function g(x) (in your case g(x) = 1/V(x)).

      Choose a "proposal distribution" Q(x) (in your case if sampling over a
      rectangle it may be OK to take Q uniform, constant density, over the
      rectangle).

      For each point you intend to sample from p(x), choose a "starting value"
      x(0) (it could be the same x(0) each time). Then:

      Given any "current point" x(n), choose x(n+1) as follows:

      x is sampled from Q(x) (in general from a distribution Q(x|x(n)) but this
      may not be needed here)

      With probability pA = min[1, (g(x)/g(x(n))*(Q(x(n))/Q(x)) ] in general
      min[1, (g(x)/g(x(n)))*(Q(x(n)|x)/Q(x|x(n)) ])
      adopt the value x as x(n+1); otherwise (with probability 1 - pA) stay with
      x(n+1) = x(n).

      Repeat the above many times. This generates a Markov Chain
      x(0), x(1), ... , x(n), ...
      whose equilibrium probability density is proportional to g(x), i.e. equal
      to p(x).

      Stop at a suitable point (basically, when n is large enough for the
      distribution of X(n) to be sufficiently close to the
      equilbrium). The main issue to be settled (which may require emprical
      investigation) is: When is n "large enough"? The MCMC literature has a lot
      of studies of "convergence diagnostics".

      When you have stopped, at x(n) say, this result is a point sampled from
      p(x) which you can then place in your region. Repeat all the above until
      you have your desired sample.


      - comment:

      This was the hint I needed. The method works well, although probably some
      caution is needed as convergence of the Markov Chain is concerned.
      A google search with keyword "Metropolis Hastings algorithm" gave me a lot
      of useful additional information.



      - Ghandi Pawitan wrote:

      you could try to look at Arbia's book, that is Spatial data configuration
      in Statistical analysis of regional economics and related problem (1989)
      published by Kluwer.


      - comment:

      This book is quite technical and not easy to read for a humble physicist
      like me. It has however a short section on the problem. Furthermore it
      gives extensive references. E.g. J.M. Chambers, Computational methods for
      data analysis, John Wiley & Sons New York 1977, seems to be another good
      source.



      =================================================================

      Peter Bossew
      Institute of Physics and Biophysics,
      University of Salzburg

      home: A-1090 Vienna, Austria, Georg Sigl-Gasse 13/11, ph: +43-1-3177627
      peter.bossew@...
      peter.bossew@...



      --
      * To post a message to the list, send it to ai-geostats@...
      * As a general service to the users, please remember to post a summary of any useful responses to your questions.
      * To unsubscribe, send an email to majordomo@... with no subject and "unsubscribe ai-geostats" followed by "end" on the next line in the message body. DO NOT SEND Subscribe/Unsubscribe requests to the list
      * Support to the list is provided at http://www.ai-geostats.org
    Your message has been successfully submitted and would be delivered to recipients shortly.