- View SourceHi,

Forgive the intrusion as I'm a physicist with an interest in efficient computing rather than a mathematician who has studied number theory.

In order to benchmark system performance I've been playing with an algorithm I created to calculate primes and was surprised by the results and wondered how it compared to other methods and implementations.

I calculated the first 3.2 trillion primes in 38 hours and 32 minutes on standard intel home PC (4-cores, 4GHz, Linux, mean memory usage using mean of ~300MB of memory). In this pass I merely calculated pi(x) rather than output the data as I don't have 25 terabytes of disk space to hand. My results for pi(x) are correct according to other published data for the number of primes up to 1x10^14.

A little research after the fact shows my method is a multi-threaded variant on a sieve of Eratosthenes with an additional sieve to filter multiples of the first 8 integers coprime with 2.

A bit of tinkering shows the algorithm scales reasonably well; calculation time increases by around 15% as x increases by 10-fold but memory usage creaps up to keep the algorithm efficient at high values of x. Also there is a uint64 limit as I haven't used a big number library.

Since this is my first stab I wondered what I could realistically aim for in terms of cycle time for finding primes with realtively low values of x (<1.8x10^19)?

fdj - View SourceThank you Ben.

I have already optimised down to 18 hours for pi(1e14) however primesieve is consistently running at least 4 times faster on my machine.

I am aware that I might be using an odd method to stitch together my sieve segments. Some Javascript here (when you click on the demo link) explains how I have calculated my boundaries:

http://www.daltonfirth.co.uk/tuts/js/prime-finder/prime-finder.html

I am not seeing the efficiency improvements the primesieve author gets when the sieve memory fits into the L2 cache; I will probably have to switch to Tomás Oliveira e Silva's bucket method.

I had a hunch my method, which allows starting on a prime's square for starting values greater than seg_start/start_val, might be efficient, but alas...

I might try improving the storage compression, I'm currently using modulo 30, but I doubt I'll find another 4x optimisation in my code.

JDF

--- In primenumbers@yahoogroups.com, "Ben Buhrow" wrote:

>

> Nearly two orders of magnitude is possible. YAFU and primesieve both compute pi(x) to 3.2e12 using the sieve of Eratosthenes in about 30 minutes on my machine (a pretty fast server). They are both threadable too - with 4 threads YAFU takes about 8 minutes.

>

> - ben.

>

Apologies, I was confusing pi(1e14) ~= 3.2e12 with pi(3.2e12). After running some experiments, it looks like the two programs I mentioned will take about 18 hrs for pi(1e14), so more like a factor of 2.