This is a summary of the replies I got to my question.

I want to generate a set of points in a plane, {(xi, yi)}, such that the points are distributed according to a chosen probability density f, i.e., the infinitesimal rectangle [x, x+dx] x [y, y+dy] shall contain f(x,y)*dx*dy points. How do I do this?

In other words, I am looking for a generalization of the 1-dim method, where this is done as x = F*(z), with F = cumul(f), F* = the inverse of F, and z drawn at random from [0,1] (or from an appropriate interval if f is not normalized).
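For reference, the 1-dim inverse-CDF method can be sketched in a few lines of Python (my illustration, using the unit exponential density f(z) = exp(-z) as an example, which is not part of the original question):

```python
import math
import random

def sample_exponential():
    # For f(z) = exp(-z), the CDF is F(z) = 1 - exp(-z),
    # so the inverse is F*(u) = -ln(1 - u).
    u = random.random()          # uniform on [0, 1)
    return -math.log(1.0 - u)

samples = [sample_exponential() for _ in range(100_000)]
```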

I tried to do it the same way, just using complex numbers, but this does not seem to work, and it also cannot be generalized to n > 2 dimensions. In statistical terms, I was looking for a way to sample from a given pdf.

Ted Harding (U.K.) and Ghandi Pawitan (Bandung, Indonesia) sent replies. Thanks, Ted and Ghandi!

- Ted Harding wrote:

The problem with "doing it the same way" is that "the same way" only applies in 1 dimension (as a consequence of the result that F(X) has a uniform distribution, so that F*(Z) has the distribution of X). In more than 1 dimension, you still only have the result that F(X,Y) has a uniform distribution, and this is only 1 equation. There is no _straightforward_ way of basing it on two equations

(F1(X,Y), F2(X,Y)) = (Z1, Z2), where (Z1, Z2) is uniform on [0,1] x [0,1].

However, there are possible approaches. One is to use conditional distributions: let F(X) be the marginal distribution of X, and let G(Y;x) be the conditional distribution of Y given that X = x. Then both U = F(X) and, for each x, V = G(Y;x) have uniform distributions, so provided you can invert these as F*(U) and G*(V;x), you can sample U = u, V = v and obtain x = F*(u) and then y = G*(v;x).
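As an illustration of this conditional-distribution recipe (my own toy example, not from the discussion): take f(x, y) = 2 on the triangle 0 < y < x < 1. The marginal of X is F(x) = x^2, and Y given X = x is uniform on [0, x], so both inverses are available in closed form:

```python
import math
import random

def sample_pair():
    # Target density: f(x, y) = 2 on the triangle 0 < y < x < 1.
    # Marginal of X: F(x) = x^2, so x = F*(u) = sqrt(u).
    x = math.sqrt(random.random())
    # Conditional of Y given X = x: uniform on [0, x], so y = G*(v; x) = v * x.
    y = random.random() * x
    return x, y

pairs = [sample_pair() for _ in range(100_000)]
```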

The practical issues here depend a lot on the form of F(X,Y), since both obtaining the conditional distribution and inverting the functions F(X) and G(Y;x) may be difficult in practice.

If it is straightforward to obtain the conditional distributions of Y given X and of X given Y, then Gibbs sampling can enable you to sample from (X,Y) without inverting the functions.
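A minimal Gibbs-sampling sketch (my illustration, again using the toy triangle density f(x, y) = 2 on 0 < y < x < 1, whose full conditionals are Y | X=x ~ Uniform(0, x) and X | Y=y ~ Uniform(y, 1)):

```python
import random

def gibbs_triangle(n_steps=50_000, burn_in=1_000):
    # Gibbs sampling alternates draws from the two full conditionals;
    # no inversion of the joint CDF is needed.
    x, y = 0.5, 0.25              # arbitrary starting point inside the triangle
    out = []
    for i in range(n_steps):
        y = random.uniform(0.0, x)   # Y | X = x ~ Uniform(0, x)
        x = random.uniform(y, 1.0)   # X | Y = y ~ Uniform(y, 1)
        if i >= burn_in:             # discard the burn-in portion of the chain
            out.append((x, y))
    return out

chain = gibbs_triangle()
```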

If you can't obtain the conditional distributions, then it may be impossible to follow such approaches, and you may need to fall back on simulating a random process with a known mechanism which has the property of yielding (X,Y) distributed as f(x,y) -- if you know of such a process!

If you could describe the density f(x,y) you are trying to sample from to the list, then perhaps someone can suggest an approach.

- My reply:

I was also thinking of something like the conditional distribution and the simulation approach, but this is not feasible in this case. What I have is a potential field, V(x) = sum over i (a/ri^m + b*ri^n), with ri = |x - xi|, where x and the xi are 2-dim, and I want to assign a probability field to it, like p(x) = 1/V(x). With many xi, this is beyond any easy tractability, I think.

- Ted Harding:

Yes, not tractable at all in the analytic sense.

However, this may be well suited to the Metropolis-Hastings algorithm, which is the basis of Markov Chain Monte Carlo.

Basically, suppose you want to sample from a density p(x) which is proportional to a function g(x) (in your case g(x) = 1/V(x)). Choose a "proposal distribution" Q(x) (in your case, if sampling over a rectangle, it may be OK to take Q uniform, i.e. constant density, over the rectangle).

For each point you intend to sample from p(x), choose a "starting value" x(0) (it could be the same x(0) each time). Then, given any "current point" x(n), choose x(n+1) as follows:

x is sampled from Q(x) (in general from a distribution Q(x|x(n)), but this may not be needed here).

With probability pA = min[1, (g(x)/g(x(n))) * (Q(x(n))/Q(x))] (in general min[1, (g(x)/g(x(n))) * (Q(x(n)|x)/Q(x|x(n)))]), adopt the value x as x(n+1); otherwise (with probability 1 - pA) stay with x(n+1) = x(n).

Repeat the above many times. This generates a Markov chain

x(0), x(1), ..., x(n), ...

whose equilibrium probability density is proportional to g(x), i.e. equal to p(x).

Stop at a suitable point (basically, when n is large enough for the distribution of X(n) to be sufficiently close to the equilibrium). The main issue to be settled (which may require empirical investigation) is: when is n "large enough"? The MCMC literature has a lot of studies of "convergence diagnostics".

When you have stopped, at x(n) say, this result is a point sampled from p(x), which you can then place in your region. Repeat all of the above until you have your desired sample.
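The recipe above can be sketched in Python for a target of the form g(x) = 1/V(x), with a uniform proposal Q over the unit square so that the Q terms cancel in the acceptance probability. The centres xi and the constants a, b, m, n below are illustrative assumptions, not values from the original problem:

```python
import random

# Hypothetical parameters: V(x) = sum_i (a/ri^m + b*ri^n), ri = |x - xi|.
centres = [(0.3, 0.4), (0.7, 0.6), (0.5, 0.2)]
a, b, m, n = 0.01, 1.0, 2, 2

def V(x, y):
    total = 0.0
    for cx, cy in centres:
        r = ((x - cx) ** 2 + (y - cy) ** 2) ** 0.5
        r = max(r, 1e-9)              # avoid division by zero at a centre
        total += a / r ** m + b * r ** n
    return total

def g(x, y):
    return 1.0 / V(x, y)              # unnormalised target density

def metropolis_sample(n_steps=500):
    # Uniform proposal over the unit square: Q(x'|x) = Q(x|x') = const,
    # so pA reduces to min(1, g(x')/g(x)).
    x, y = 0.5, 0.5                   # starting value x(0)
    for _ in range(n_steps):
        xp, yp = random.random(), random.random()
        if random.random() < min(1.0, g(xp, yp) / g(x, y)):
            x, y = xp, yp             # accept the proposal
        # otherwise keep (x, y), i.e. x(n+1) = x(n)
    return x, y

# one independent chain per point, as suggested above
points = [metropolis_sample() for _ in range(200)]
```

Whether 500 steps is "large enough" is exactly the convergence question raised above; in practice the chain length would need empirical checking.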

- comment:

This was the hint I needed. The method works well, although some caution is probably needed as far as convergence of the Markov chain is concerned. A Google search with the keywords "Metropolis Hastings algorithm" gave me a lot of useful additional information.

- Ghandi Pawitan wrote:

You could try to look at Arbia's book, Spatial Data Configuration in Statistical Analysis of Regional Economics and Related Problems (1989), published by Kluwer.

- comment:

This book is quite technical and not easy to read for a humble physicist like me. It does, however, have a short section on the problem. Furthermore, it gives extensive references; e.g., J.M. Chambers, Computational Methods for Data Analysis, John Wiley & Sons, New York, 1977, seems to be another good source.

=================================================================

Peter Bossew

Institute of Physics and Biophysics,

University of Salzburg

home: A-1090 Vienna, Austria, Georg Sigl-Gasse 13/11, ph: +43-1-3177627

peter.bossew@...

