> There are IN FACT bulk sequences of primes up to 19 digits in length
> at the MegaNumberS web site.

To quote from the definitive web resource on prime numbers:

http://www.utm.edu/research/primes/notes/faq/LongestList.html

"The problem with answering this question is small primes are too

easy to find. They can be found far faster than they can be read from

a hard disk, so no one bothers to keep long lists (say past 10^9).

Long lists just waste storage, and on the Internet, they just waste

bandwidth."

This quote applies whether you're talking about 15 digit primes or

19 digit primes (yes, 19 digit primes are small); in either case, any

decent prime generation program will generate these numbers faster than

you can ship them across the network.

Also, if Chris Harrison has any intention of these lists of primes

ever being used for any real purpose, a few suggestions:

(1) Be consistent in your data format. You have plain text files,

StuffIt files, and ZIP files -- the one ZIP file I downloaded

had an RTF text file inside. Nobody is going to write an RTF

parser to pull ASCII numbers from a data file.

Use a binary encoding; you'll get better (much better)

compression than any generalized compressor like ZIP or StuffIt.

Make it a simple one, such as encoding the gaps between primes

into individual bytes. Document the format and provide source

code for a sample decoder.

(2) Go much bigger than 19 digit primes -- start with 100 digit

primes.

(3) The best option of all -- provide the program which generates

these lists (source code please).

A carefully constructed dump of a sieve will beat most run-length compression schemes, even with Huffman (or other entropy) compression.

For broad ranges, I'd use a 480/2310 comb to the sieve.

For small ranges (a million primes) a 8/30 comb is fine.
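To make the comparison concrete, here is a rough Python sketch (mine, not anybody's production code) of both formats: the gap-per-byte file suggested in (1), and a sieve dump with an 8/30 comb. The 8/30 comb keeps one bit for each of the 8 residues mod 30 that are coprime to 2, 3 and 5, so 30 integers fit in one byte; the 480/2310 comb is the same idea with 7 and 11 thrown in. All function names are made up for illustration, and the gap format assumes the list is every prime from 2 upward with half-gaps that fit in a byte.

```python
import math

# The 8 residues mod 30 coprime to 2, 3 and 5: one bit each, one byte per 30 integers.
WHEEL = (1, 7, 11, 13, 17, 19, 23, 29)
BIT = {r: i for i, r in enumerate(WHEEL)}


def primes_below(n):
    """Plain sieve of Eratosthenes; returns all primes < n."""
    if n < 3:
        return []
    flags = bytearray([1]) * n
    flags[0] = flags[1] = 0
    for p in range(2, math.isqrt(n - 1) + 1):
        if flags[p]:
            flags[p * p::p] = bytes(len(range(p * p, n, p)))
    return [i for i, f in enumerate(flags) if f]


def encode_gaps(primes):
    """Gap-per-byte format: one byte per odd prime, holding half the gap
    from the previous odd number in the chain (the chain starts at 1).
    Assumes the list is all primes from 2 up; bytes() raises ValueError
    if a half-gap does not fit in a byte."""
    odd = [p for p in primes if p > 2]
    return bytes((b - a) // 2 for a, b in zip([1] + odd, odd))


def decode_gaps(data):
    """Inverse of encode_gaps; the prime 2 is implied."""
    out, p = [2], 1
    for half in data:
        p += 2 * half
        out.append(p)
    return out


def encode_wheel(primes, limit):
    """8/30 sieve dump for primes < limit; 2, 3 and 5 are implied."""
    out = bytearray((limit + 29) // 30)
    for p in primes:
        if p > 5:
            out[p // 30] |= 1 << BIT[p % 30]
    return bytes(out)


def decode_wheel(data, limit):
    """Inverse of encode_wheel; returns the primes in increasing order."""
    out = [p for p in (2, 3, 5) if p < limit]
    for block, byte in enumerate(data):
        for i, r in enumerate(WHEEL):
            if byte >> i & 1:
                out.append(30 * block + r)
    return out


if __name__ == "__main__":
    limit = 100_000
    ps = primes_below(limit)
    print(len(encode_gaps(ps)), "bytes as gaps")
    print(len(encode_wheel(ps, limit)), "bytes as an 8/30 sieve dump")
```

For primes below 10^5 this comes to 9591 bytes as gaps against 3334 bytes for the raw sieve dump, before any general-purpose compressor touches either. The dump costs about ln(n)/30 bytes per prime near n, so by that estimate it only loses to a byte per gap somewhere around 10^13.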

With the right code, I guesstimate that generating primes should still be faster than reading them off disk, but not by a huge margin. Race Dan Bernstein's Primegen and Eratspeed against your hard disk for primes up to a billion if you don't believe me. (There are links on the Prime Pages.)

However, my current sieving job is generating 17 digit numbers, and I store the deltas, which average around 7-8 digits. This gives me roughly 2:1 compression. I then bzip2 them, which shrinks the decimal text by almost exactly the log(10)/log(256) ratio (about 0.415), as the numbers are essentially unpredictable. Even then, I think I could fill a CD in a month.
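A minimal Python sketch of the delta step, for anyone who wants to see the arithmetic. The function names and the sample values are invented for illustration (the values are not claimed to be prime), and the one-number-per-line text layout is an assumption, not a description of my actual files.

```python
import math


def to_delta_text(values):
    """Sorted integers -> ASCII text: the first value, then each
    successive difference, one decimal number per line."""
    prev, lines = 0, []
    for v in values:
        lines.append(str(v - prev))
        prev = v
    return "\n".join(lines) + "\n"


def from_delta_text(text):
    """Inverse of to_delta_text: running sum of the stored differences."""
    total, out = 0, []
    for line in text.split():
        total += int(line)
        out.append(total)
    return out


# Best case for storing unpredictable decimal digits in 8-bit bytes:
# each digit carries log2(10) bits, i.e. log(10)/log(256) of a byte.
DIGIT_RATIO = math.log(10) / math.log(256)

if __name__ == "__main__":
    # Made-up 17 digit values with 7-8 digit gaps between them.
    sample = [10_000_000_000_000_061]
    for gap in (12_345_678, 9_876_543, 23_456_789, 8_765_432):
        sample.append(sample[-1] + gap)

    plain = "".join(f"{v}\n" for v in sample)
    delta = to_delta_text(sample)
    assert from_delta_text(delta) == sample
    print(len(plain), "bytes as plain text,", len(delta), "bytes as deltas")
    print("bytes per random decimal digit:", round(DIGIT_RATIO, 4))
```

On a long list the first full-length value stops mattering, so 17 digits per number drop to 7-8, which is the 2:1. Multiplying 7-8 digits by 0.415 then gives roughly 3 bytes per number after entropy coding, not counting separators.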

Twins fall into this latter category, and their density means that hard disk storage is worthwhile.

Phil

Mathematics should not have to involve martyrdom;

Support Eric Weisstein, see http://mathworld.wolfram.com
