> (David)
> > I use bitmap representation with 1 bit per odd number.
> > Instead of using an array and storing an index for each
> > number, I precalculate bitmaps of small divisors.
> > Example, bitmap of size n=3*5*7*11*13 holds the reset
> > bits for one complete period of all 5 numbers. These
> > precalculated bitmaps can then be "anded" with the next
> > window of the sieve, starting at the proper offset,
> > and reset the bits for all 5 of those numbers with one
> > anding operation.
>
> That is exactly what I tried to explain...

Ah, OK, I misunderstood. Yes, it makes perfect sense. Depending on
architectural issues, you could use 2 smaller bitmask templates, for
example 3*5*11*13 (=2145) and 7*17*19 (=2261), which together are
smaller than 3*5*7*11*13 (=15015). However, one should almost
certainly sieve only 6n+/-1 rather than all odds, and the 3 term can
be removed.
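A minimal sketch of the template idea, using a Python int as the
bitmap (all names here are mine, not gensv's): one template of
3*5*7*11*13 = 15015 bits covers a full period of those five primes,
so clearing their multiples in each window is a rotate plus one AND.

```python
# Sketch only -- helper names are hypothetical, not from gensv.
# One bit per odd number: template bit i stands for the odd number
# 2*i+1, and is CLEAR iff that number is divisible by 3, 5, 7, 11 or 13.

PRIMES = (3, 5, 7, 11, 13)
PERIOD = 3 * 5 * 7 * 11 * 13  # 15015 bits = one full period

def build_template():
    mask = (1 << PERIOD) - 1              # start with every bit set
    for p in PRIMES:
        # odd multiples of p sit at indices (p-1)/2, (p-1)/2 + p, ...
        for i in range((p - 1) // 2, PERIOD, p):
            mask &= ~(1 << i)             # 2*i+1 is a multiple of p
    return mask

def clear_small_multiples(window, window_len, start_index, template):
    """AND a rotated copy of the template over a window whose bit j
    stands for the odd number 2*(start_index + j) + 1."""
    off = start_index % PERIOD
    # rotate so that rot bit j == template bit (off + j) mod PERIOD
    rot = ((template >> off) | (template << (PERIOD - off))) \
          & ((1 << PERIOD) - 1)
    reps = -(-window_len // PERIOD)       # enough copies to tile the window
    tiled = 0
    for r in range(reps):
        tiled |= rot << (r * PERIOD)
    return window & tiled & ((1 << window_len) - 1)
```

In real code the window would be an array of machine words and the
"rotate" just a starting offset into the template, but the effect is
the same: one pass clears the bits of all five primes at once.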

In gensv I use the technique. On the x86 version I do two bitmaps in
parallel, and on the Alpha I do 7 in parallel, as I've got 31 GP
integer registers to play with.

> I don't know if these improvements (use a presieved window; accept
> composite p due to performance reasons) can be used in the Atkin
> sieve.

Probably not. Atkin is non-linear, and has no periodic component -
apart from the removal of the squaresome numbers, but there the
period is much larger and the optimisation is far less useful. Note -
I don't know how DB does the removal of squaresome numbers yet; it's
"guru code", and I'm not guru enough to unravel it. (yet :-) )

If anyone has any insights, please let me know.

> (Phil)
> > On the whole small primes aren't that important when
> > considering tasks that are to run for days. The 'small'
> > primes (ones that can remove more than one candidate)
> > typically only take a matter of seconds.
>
> In that case I thought of the situation when we try to find
> triplets, bitwins etc., where we have to sieve many ranges to
> 1000000 or 1000000000 in order to merge them before continuing. In
> that case that time gets more important.

In that situation, yes it is more important. The only thing that is
important is to be nice to your cache. Your code will be 5 times
faster if you bear that in mind.
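Being nice to the cache mostly means segmenting: sweep all the base
primes over one cache-sized block before touching the next block,
instead of dragging each prime across the whole range. A rough sketch
of the idea (the 32 KiB block size is my guess at a typical L1 data
cache; the real speedup needs C and a bit-packed window):

```python
import math

def simple_sieve(n):
    """Base primes up to n, plain Eratosthenes on a byte array."""
    flags = bytearray([1]) * (n + 1)
    flags[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(n) + 1):
        if flags[p]:
            flags[p * p::p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i, f in enumerate(flags) if f]

def segmented_sieve(limit, seg_size=32 * 1024):
    """Primes up to limit, sieved one cache-sized segment at a time."""
    base = simple_sieve(math.isqrt(limit))
    primes = list(base)
    lo = math.isqrt(limit) + 1
    while lo <= limit:
        hi = min(lo + seg_size - 1, limit)
        flags = bytearray([1]) * (hi - lo + 1)
        for p in base:                 # every prime visits THIS segment
            start = ((lo + p - 1) // p) * p
            for m in range(start, hi + 1, p):
                flags[m - lo] = 0      # ...while it is still cache-hot
        primes.extend(lo + i for i, f in enumerate(flags) if f)
        lo = hi + 1
    return primes
```

Both versions do the same work; the segmented one just does it in an
order the cache likes.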

> (Reto)
> >> - Sieving without bitmap:
> >> ...
> >> A hashtable would improve the speed of these searches
> >> significantly.
>
> (Phil)
> > If you can guarantee never putting more than 3 values
> > in any one bucket (and you are on a 64-bit
> > architecture), then assuming your range of values
> > (after dividing by two iff odd-only) is
> > <=2^(22+HASHBITS) and therefore you can store all three
> > values in one word, you can remove values in constant
> > time with no comparisons/jumps (having said that, with
> > only 3 values comparisons weren't really a huge
> > problem).
>
> This is a very good idea. As the hash function is random, we cannot
> guarantee that there are at most 3 values per cell, even if the
> average is only 0.5. The problem is that we cannot move to the next
> cell, because we cannot separate between the numbers which are
> already in the cell and those which come from the previous one.
> To solve that problem, we can use the remaining bit (64 bits - 3*21
> bits = 1 bit) and store there the information `cell is full'. In
> that case we move to a tree where we store all the numbers which
> don't fit into the hashtable. That is only O(log n) but does not
> happen very often.

Yup. That's a great use of the extra bit.
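A sketch of the packed-bucket layout being discussed: three 21-bit
slots plus one spare bit per 64-bit bucket word. Values are stored +1
so that 0 means "empty slot", and a set top bit diverts lookups to an
overflow structure (a Python set stands in for the tree). The names
and the 10-bit table size are my choices, not from the thread.

```python
HASHBITS = 10                      # 2^10 buckets (illustrative size)
SLOTBITS = 21                      # 3 slots * 21 bits = 63 bits
SLOTMASK = (1 << SLOTBITS) - 1
FULLBIT = 1 << 63                  # the remaining bit: "cell is full"

class PackedTable:
    def __init__(self):
        self.words = [0] * (1 << HASHBITS)
        self.overflow = set()      # stand-in for the O(log n) tree

    def insert(self, key):
        b = key & ((1 << HASHBITS) - 1)
        tag = (key >> HASHBITS) + 1        # +1 so 0 == empty slot
        assert tag <= SLOTMASK             # key must fit the layout
        w = self.words[b]
        for s in range(3):                 # find a free 21-bit slot
            if (w >> (s * SLOTBITS)) & SLOTMASK == 0:
                self.words[b] = w | (tag << (s * SLOTBITS))
                return
        self.words[b] = w | FULLBIT        # 4th value: spill to tree
        self.overflow.add(key)

    def contains(self, key):
        b = key & ((1 << HASHBITS) - 1)
        tag = (key >> HASHBITS) + 1
        w = self.words[b]
        for s in range(3):
            if (w >> (s * SLOTBITS)) & SLOTMASK == tag:
                return True
        # only consult the overflow tree when the full bit says so
        return bool(w & FULLBIT) and key in self.overflow
```

The point of the full bit is visible in `contains`: a bucket that
never overflowed is answered from its one word, with no trip to the
slow structure.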

> (Jack Brennen)
> > Sieving k*2^n+1 with constant k
> > A concrete example might show this in a good way:
> > Sieve with k=3, n=[32..99].
> > Here is the step with p = 307:
> > Q = ceil(sqrt(68)) = 9.
> > r_0 = (3*2^a)%307, for a = 0 ... 8.
> > r_1 = (-1/(2^(32+b*9)))%307, for b = 0 ... 8.
> > r_0 = [3, 6, 12, 24, 48, 96, 192, 77, 154].
> > r_1 = [239, 103, 2, 6, 18, 54, 162, 179, 230].
> > Note that r_0[1] = r_1[3], so 3*2^1 == -1/(2^59) (mod 307).
> > This tells us that 3*2^60+1 is divisible by 307.
>
> (Paul Jobling)
> > The comparison check might be tricky though - if we go
> > through each term in r_0 looking for it in r_1, that is
> > sqrt(N)*sqrt(N) = N operations. Two sorts and a pass
> > through each from bottom to top would be
> > better, off the top of my head.

That's one way, but what sorting algorithm to use... (answered
below...)

> Good algorithm. By the way we can here also use a hashtable to
> store r_0. After that we can quickly look up all elements of r_1.
> That can be done in O(sqrt(N)).

Yes.
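Jack's example can be checked mechanically. This is my reading of the
scheme as baby-step/giant-step: the r_0 list goes into a dict (the
hashtable suggested above) and each r_1 value is looked up on the fly.

```python
import math

def sieve_hits(k, n_lo, n_hi, p):
    """All n in [n_lo, n_hi] with p dividing k*2^n + 1, found with
    O(sqrt(range)) modular operations instead of one test per n."""
    count = n_hi - n_lo + 1
    q = math.isqrt(count)
    if q * q < count:
        q += 1                       # q = ceil(sqrt(count)), 9 for 68
    # baby steps: r_0[a] = k*2^a mod p, stored as value -> exponent
    # (assumes the r_0 values are distinct, as in the example)
    baby = {}
    for a in range(q):
        baby.setdefault(k * pow(2, a, p) % p, a)
    inv2 = pow(2, -1, p)             # modular inverse of 2 (Python 3.8+)
    hits = []
    # giant steps: r_1[b] = -1/2^(n_lo + b*q) mod p
    for b in range(q):
        target = -pow(inv2, n_lo + b * q, p) % p
        a = baby.get(target)
        if a is not None:
            n = n_lo + b * q + a     # then k*2^n == -1 (mod p)
            if n <= n_hi:
                hits.append(n)
    return sorted(hits)
```

A match r_0[a] = r_1[b] means k*2^a == -1/2^(n_lo+b*q) (mod p), so
k*2^(n_lo+b*q+a)+1 is divisible by p; with a=1, b=3 that is the n=60
hit in the example.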

One of my rants is to tell people to use radix-sort whenever they
can. So back to my rhetorical question above, the sort that should
be used is a radix sort. This would mean that the second list
doesn't even need to be sorted, as each element can be checked
against the first list on the fly. Which is effectively what a hash
is.

There's almost exactly one bit of information in each of the bits,
so the hash function should simply be a bitstring extraction of the
appropriate length, in exactly the same way that a radix sort would
be.
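A toy illustration of that last point (function names are mine):
bucket r_0 by a plain bit-slice of each value, then check the
unsorted r_1 against the buckets on the fly, exactly like
distributing keys in one radix pass.

```python
def bit_slice(v, bits, shift=0):
    """The 'hash': just extract `bits` bits of v, as a radix pass would."""
    return (v >> shift) & ((1 << bits) - 1)

def match_on_the_fly(r0, r1, bits=5):
    """Index pairs (i, j) with r0[i] == r1[j]; r1 is never sorted."""
    buckets = [[] for _ in range(1 << bits)]
    for i, v in enumerate(r0):                  # one radix-style pass
        buckets[bit_slice(v, bits)].append((v, i))
    out = []
    for j, v in enumerate(r1):                  # check r1 on the fly
        for w, i in buckets[bit_slice(v, bits)]:
            if w == v:
                out.append((i, j))
    return out
```

Run on the r_0 and r_1 lists from Jack's p=307 example, this finds
exactly the r_0[1] = r_1[3] = 6 collision.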

Phil
