## Re: [PrimeNumbers] Re: Optimized sieving

Expand Messages
• ... Ah, OK, I misunderstood. Yes, it makes perfect sense. Depending on archiotectural issues, you could use 2 smaller bitmask templates, for example 3*5*11*13
Message 1 of 5 , Apr 25, 2002
--- biwema <thya@...> wrote:
> (David)
> > I use bitmap representation with 1 bit per odd number.
> > Instead of using an array and storing an index for each
> > number, I precalculate bitmaps of small divisors.
> > Example, bitmap of size n=3*5*7*11*13 holds the reset
> > bits for one complete period of all 5 numbers. These
> > precalculated bitmaps can then be "anded" with the next
> > window of the sieve, starting at the proper offset,
> > and reset the bits for all 5 of those numbers with one
> > anding operation.
>
> That is exactly what I tried to explain...

Ah, OK, I misunderstood. Yes, it makes perfect sense. Depending on
archiotectural issues, you could use 2 smaller bitmask templates, for
example 3*5*11*13 (=2145) and 7*17*19 (=2261) which together are
smaller than the 3*5*7*11*13 (15015). However, one should almost
certainly sieve only 6n+/-1 rather than all odds, and the 3 term can
be removed.

In gensv I use the technique. On the x86 version I do two bitmaps in
parallel, and on the alpha I do 7 in parallel, as I've got 31 GP Int
registers to play with.

> I don't know if these improvements (used a presieved window; accept
> composite p due to performance reasons) can be used in the Atkin
> sieve.

Probably not. Atkin is non-linear, and has no periodic component.
Apart from the removal of the squaresomes, but then the period is
much larger and the optimisation is far less useful. Note - I don't
know how DB does the removal of squaresome numbers yet, it's "guru
code", and I'm not guru enough to unravel it. (yet :-) )
If anyone has any insights, please let me know.

> (Phil)
> > On the whole small primes aren't that important when
> > considering tasks that are to run for days. The 'small'
> > primes (ones that can remove more than one candidate)
> > typically only take a matter of seconds.
>
> In that case I thought of the situation when we try to find
> triplets,
> bitwins etc, where we have to sieve many ranges to 1000000 or
> 1000000000 in order to merge them before continuing. In that case
> that time gets more important.

In that situation, yes it is more important. The only thing that is
important is to be nice to your cache. Your code will be 5 times
faster if you bear that in mind.

> (Reto)
> >> - Sieving without bitmap:
> >> ...
> >> A hashtable would improve the speed of these searches
> >> significantly.
> (Phil)
> > If you can guarantee never putting more than 3 values
> > in any one bucket (and you are on a 64-bit
> > architecture), then assuming your range of values
> > (after dividing by two iff odd-only) is
> > <=2^(22+HASHBITS) and therefore you can store all three
> > values in one word, you can remove values in constant
> > time with no comparisons/jumps (having said that with
> > only 3 values comparisons were't really a huge
> > problem).
>
> This is a very good idea. As the hashfuntion is random, we cannot
> guarantee that there are at most 3 values per cell, even if the
> average is only 0.5. The problem is, that we cannot move to the
> next
> cell, because we cannot separate between the numbers which are
> already in the cell and those which come from the previous one.
> To solve that problem, we can use the remaining bit (64 bits � 3*21
> bits = 1 bit) band store there the information `cell is full'. In
> that case we move to a tree where we store all the numbers which
> don't fit into the hashtable. That is only O(log n) but does not
> happen very often.

Yup. That's a great use of the extra bit.

> (Jack brennen)
> > Sieving k*2^n+1 with constant k
> > A concrete example might show this in a good way:
> > Sieve with k=3, n=[32..99].
> > Here is the step with p = 307:
> > Q = ceil(sqrt(68)) = 9.
> > r_0 = (3*2^a)%307, for a = 0 ... 8.
> > r_1 = (-1/(2^(32+b*9)))%307, for b = 0 ... 8.
> > r_0 = [3, 6, 12, 24, 48, 96, 192, 77, 154].
> > r_1 = [239, 103, 2, 6, 18, 54, 162, 179, 230].
> > Note that r_0[1] = r_1[3], so 3*2^1 == -1/(2^59) (mod 307).
> > This tells us that 3*2^60+1 is divisible by 307.
>
> (Paul Jobling)
> > The comparison check might be tricky though - if we go
> > through each term in r_0 looking for it in r_1, that is
> > sqrt(N)*sqrt(N) = N operations. Two sorts and a pass
> > through each from bottom to top would be
> > better, off the top of my head.

That's one way, but what sorting algorithm to use...

> Good algorithm. By the way we can hers also use a hashtable to
> store
> r_0. After that we can quickly look up all elements of r_1. That
> can
> be done in O(sqrt(N)).

Yes.

One of my rants it to tell people to use radix-sort whenever they
can. So back to my rhetorical question above, the sort that should be
used is a radix sort. This would mean that the second list doesn't
even need to be sorted, as each element can be checked against the
first list on the fly. Which is effectively what a hash is.
There's almost exactly one bit of information in each of the bits, so
the hash function should simply be a bitstring extraction of the
appropriate length, in exactly the same way that a radix sort would
be.

Phil

__________________________________________________
Do You Yahoo!?
Yahoo! Games - play chess, backgammon, pool and more
http://games.yahoo.com/
Your message has been successfully submitted and would be delivered to recipients shortly.