5191FW: [GP] Re: quasi-random numbers
- May 3, 2010Resending to the list address.
Subject: RE: [GP] Re: quasi-random numbers
Date: Sun, 2 May 2010 13:13:11 -0400
My question that started this discussion was in fact driven by concerns with initializing an EA. I assume that if the genome has 40 real numbers, I would use a QR sequence that was good in 40 dimensions.
I'm not clear on using a QR sequence later as part of a mutation operator. A mutation operator that looks at each real and decides to mutate it - does that use the same 40-dimension sequence as the initialization, or a one dimensional sequence? Do I need to work backward from the expected number of function evaluatons to the expected number of calls to the mutation operator, and then generate a seqence with that number of items? I'm not sure that I can just substitute a call to a QR function for a call to a PRNG.
David vun Kannon
Date: Sat, 1 May 2010 19:30:56 +0900
Subject: [GP] Re: quasi-random numbers
I think the bottom line in this is, what do we really need from the number generator for EAs (I won't use the term RNG here, because I think that presumes the answer). And is it the same as statisticians require from RNGs?
I think the answer to the latter is clear from a variety of sources. Meysenburg, Foster et al. showed that statistical tests were not a good indicator of RNG perfromance; but even more strongly, the experiments of Teytaud, our own, and a number of other people have shown that randomised low-discrepancy sequences give substantially better performance - particularly in initialisation - than more random-appearing sequences. Halton, Foure etc. sequences (even when randomised) are nothing like uniform samples; they are far too regularly spaced - which is exactly why they work better. But they fail the simplest statistical tests for randomness.
So it sounds like we have contradictory requirements - randomness, but even spread. But I think this might be resolvable. Why do we really require randomness? It seems to me that this is mainly so the EA can't cheat, optimising for regularities in the RNG rather than for the solution space. On the other hand, we want evenness to reduce the effects of sample bias in search.
So our requirement for randomness is closer to that of the cryptographer than the statistician. I think it's probably something like, generator A is better than generator B if the K complexity of an algorithm that could better-than-random predict the results of algorithm A is higher than that for algorithm B ( a proper definition would need to be relativised somewhat like in PAC learning - i.e. predict within epsilon, and with a bound on the probability that it would be this correct after seeing some given amount of the sequence).
But then we also have the desire for evenness. I strongly suspect that these are in conflict, and that the trade-off changes both through the evolutionary process (evenness is more important earlier in the run), and also with problem difficulty (i.e. better RNG quality is more important with tougher problems). This might be why a lot of the work comparing low-discrepancy sequences has concentrated on initialisation, while comparisons of RNGs has generally looked at performance throughout the run.
- << Previous post in topic Next post in topic >>