5189Re: quasi-random numbers
- May 1, 2010I think the bottom line in this is, what do we really need from the number generator for EAs (I won't use the term RNG here, because I think that presumes the answer). And is it the same as statisticians require from RNGs?
I think the answer to the latter is clear from a variety of sources. Meysenburg, Foster et al. showed that statistical tests were not a good indicator of RNG perfromance; but even more strongly, the experiments of Teytaud, our own, and a number of other people have shown that randomised low-discrepancy sequences give substantially better performance - particularly in initialisation - than more random-appearing sequences. Halton, Foure etc. sequences (even when randomised) are nothing like uniform samples; they are far too regularly spaced - which is exactly why they work better. But they fail the simplest statistical tests for randomness.
So it sounds like we have contradictory requirements - randomness, but even spread. But I think this might be resolvable. Why do we really require randomness? It seems to me that this is mainly so the EA can't cheat, optimising for regularities in the RNG rather than for the solution space. On the other hand, we want evenness to reduce the effects of sample bias in search.
So our requirement for randomness is closer to that of the cryptographer than the statistician. I think it's probably something like, generator A is better than generator B if the K complexity of an algorithm that could better-than-random predict the results of algorithm A is higher than that for algorithm B ( a proper definition would need to be relativised somewhat like in PAC learning - i.e. predict within epsilon, and with a bound on the probability that it would be this correct after seeing some given amount of the sequence).
But then we also have the desire for evenness. I strongly suspect that these are in conflict, and that the trade-off changes both through the evolutionary process (evenness is more important earlier in the run), and also with problem difficulty (i.e. better RNG quality is more important with tougher problems). This might be why a lot of the work comparing low-discrepancy sequences has concentrated on initialisation, while comparisons of RNGs has generally looked at performance throughout the run.
- Next post in topic >>