## Re: How hard is it to find an AP-k?

Expand Messages
• ... [snip] I retract yesterday s post. Further reflection and numerical experiments have shown that using (k+3) is no better than using (k+4) , and it is
Message 1 of 14 , Jun 6, 2009
• 0 Attachment
--- In primenumbers@yahoogroups.com, "Mike Oakes" <mikeoakes2@...> wrote:
>
> The "score" for an AP-k with d digits is defined to be:
> s = (k+3)*log(d)
[snip]

I retract yesterday's post.

Further reflection and numerical experiments have shown that using "(k+3)" is no better than using "(k+4)",
and it is the latter that has a theoretical motivation, as follows.

A typical search method is to sieve and then test for primality a block of N numbers of the form m*p#+1,
for some fixed prime p and N consecutive values of m.
If the numbers are of size d digits, then by PNT a fraction of about 1/d of the numbers will be prime.
In this block of N, there will be approximately (N/d) primes, and 0.5*(N/d)^2 prime pairs.
Considering each of these pairs as the start of a potential AP-k, the number of AP-k's will be of order N^2/d^k.
For a search to get an even chance of finding an AP-k, we must have N^2/d^k of order 1, i.e. N=d^(k/2).
To perform a primality test on a number of size d digits is of difficulty approximately d^2 (neglecting terms of order log(d)).
The amount of computation involved in finding an AP-k is therefore of order N*d^2=d^(k/2+2).
The log of this (if we multiply by 2) is (k+4)*log(d).

We therefore define the "score" for an AP-k with d digits to be
s = (k+4)*log(d)

NB If factors of order 1 are neglected, this coincides with Chris Caldwell's definition of weight on his "Arithmetic Progressions of Primes" page
http://primes.utm.edu/top20/page.php?id=14
where he presents a detailed theoretical justification, with references, for the algebraic expression he uses.

Inserting the latest records from Jens's page
http://users.cybercity.dk/~dsl522332/math/aprecords.htm
gives this table:-

k d log(d) s=(k+4)*log(d)
- - ------ --------------
3 137514 11.831 82.817
4 11961 9.3894 75.115
5 7009 8.8550 79.695
6 1606 7.3815 73.815
7 1290 7.1624 78.786
8 1057 6.9632 83.558
9 425 6.0521 78.677
10 265 5.5797 78.116
11 195 5.2730 79.095
12 173 5.1533 82.453
13 78 4.3567 74.064
14 69 4.2341 76.214
15 48 3.8712 73.553
16 38 3.6376 72.752
17 29 3.3673 70.713
18 29 3.3673 74.081
19 27 3.2958 75.803
20 21 3.0445 73.068
21 20 2.9957 74.893
22 19 2.9444 76.554
23 19 2.9444 79.499
24 17 2.8332 79.330
25 17 2.8332 82.163

The mean score is 77.166, the median is 76.554 (for k=22).
The range of values is 70.713..82.817, giving a spread of 12.104,
which is better than the spread with "(k+3)" of 13.604.
This new fit has also a smaller relative variance: 0.002085 compared with 0.002258.

Putting c=76.554, the values of d = exp(c/(k+4)) giving that same score would be as follows:-

k d(theor.) d(actual)
- --------- ---------
3 56178.293 137514
4 14317.674 11961
5 4944.3461 7009
6 2112.0198 1606
7 1053.0590 1290
8 589.63282 1057
9 360.96075 425
10 237.01960 265
11 164.61345 195
12 119.65648 173
13 90.303523 78
14 70.316044 69
15 56.213554 48
16 45.956716 38
17 38.299183 29
18 32.450871 29
19 27.894646 27
20 24.282356 21
21 21.373674 20
22 18.998967 19
23 17.036078 19
24 15.395441 17
25 14.010305 17

The rank-ordering is considerably different than when "(k+3)" was used in the formula. The new ordering is:-

rank k d s=(k+4)*log(d)
---- - - --------------
1 8 1057 83.558 was 3rd
2 3 137514 82.817 was 14th
3 12 173 82.453 was 2nd
4 25 17 82.163 was 1st
5 5 7009 79.695 was 15th
6 23 19 79.499
7 24 17 79.330
8 11 195 79.095
9 7 1290 78.786 was 13th
10 9 425 78.677
11 10 265 78.116
12 22 19 76.554 MEDIAN
13 14 69 76.214
14 19 27 75.803
15 4 11961 75.115 was 23rd
16 21 20 74.893
17 18 29 74.081
18 13 78 74.064
19 6 1606 73.815 was 22nd
20 15 48 73.553
21 20 21 73.068
22 16 38 72.752
23 17 29 70.713

The main changes are that the records with smaller k have moved up in the rankings - for k=3, dramatically so!
The k=4 and k=6 records don't now look quite so relatively weak (which should please Ken:-)

The rank order now agrees with Chris's in his above-cited page, for d > 1000 (the lower limit for inclusion in his table).

Mike
• Hi Mike, You re right I like the second ranking better ;-) rank k d s=(k+4)*log(d) ... 1 8 1057 83.558 5 5 7009 79.695 9 7 1290 78.786
Message 2 of 14 , Jun 7, 2009
• 0 Attachment
Hi Mike,
You're right I like the second ranking better ;-)

rank k d s=(k+4)*log(d)
---- - - --------------
1 8 1057 83.558
5 5 7009 79.695
9 7 1290 78.786
10 9 425 78.677
11 10 265 78.116
15 4 11961 75.115
19 6 1606 73.815

Having found the above APs I tend to agree with your difficulty ranking. The AP8 was definitely the hardest.

cheers
Ken

--- In primenumbers@yahoogroups.com, "Mike Oakes" <mikeoakes2@...> wrote:
>
> --- In primenumbers@yahoogroups.com, "Mike Oakes" <mikeoakes2@> wrote:
> >
> > The "score" for an AP-k with d digits is defined to be:
> > s = (k+3)*log(d)
> [snip]
>
> I retract yesterday's post.
>
> Further reflection and numerical experiments have shown that using "(k+3)" is no better than using "(k+4)",
> and it is the latter that has a theoretical motivation, as follows.
>
> A typical search method is to sieve and then test for primality a block of N numbers of the form m*p#+1,
> for some fixed prime p and N consecutive values of m.
> If the numbers are of size d digits, then by PNT a fraction of about 1/d of the numbers will be prime.
> In this block of N, there will be approximately (N/d) primes, and 0.5*(N/d)^2 prime pairs.
> Considering each of these pairs as the start of a potential AP-k, the number of AP-k's will be of order N^2/d^k.
> For a search to get an even chance of finding an AP-k, we must have N^2/d^k of order 1, i.e. N=d^(k/2).
> To perform a primality test on a number of size d digits is of difficulty approximately d^2 (neglecting terms of order log(d)).
> The amount of computation involved in finding an AP-k is therefore of order N*d^2=d^(k/2+2).
> The log of this (if we multiply by 2) is (k+4)*log(d).
>
> We therefore define the "score" for an AP-k with d digits to be
> s = (k+4)*log(d)
>
> NB If factors of order 1 are neglected, this coincides with Chris Caldwell's definition of weight on his "Arithmetic Progressions of Primes" page
> http://primes.utm.edu/top20/page.php?id=14
> where he presents a detailed theoretical justification, with references, for the algebraic expression he uses.
>
> Inserting the latest records from Jens's page
> http://users.cybercity.dk/~dsl522332/math/aprecords.htm
> gives this table:-
>
> k d log(d) s=(k+4)*log(d)
> - - ------ --------------
> 3 137514 11.831 82.817
> 4 11961 9.3894 75.115
> 5 7009 8.8550 79.695
> 6 1606 7.3815 73.815
> 7 1290 7.1624 78.786
> 8 1057 6.9632 83.558
> 9 425 6.0521 78.677
> 10 265 5.5797 78.116
> 11 195 5.2730 79.095
> 12 173 5.1533 82.453
> 13 78 4.3567 74.064
> 14 69 4.2341 76.214
> 15 48 3.8712 73.553
> 16 38 3.6376 72.752
> 17 29 3.3673 70.713
> 18 29 3.3673 74.081
> 19 27 3.2958 75.803
> 20 21 3.0445 73.068
> 21 20 2.9957 74.893
> 22 19 2.9444 76.554
> 23 19 2.9444 79.499
> 24 17 2.8332 79.330
> 25 17 2.8332 82.163
>
> The mean score is 77.166, the median is 76.554 (for k=22).
> The range of values is 70.713..82.817, giving a spread of 12.104,
> which is better than the spread with "(k+3)" of 13.604.
> This new fit has also a smaller relative variance: 0.002085 compared with 0.002258.
>
> Putting c=76.554, the values of d = exp(c/(k+4)) giving that same score would be as follows:-
>
> k d(theor.) d(actual)
> - --------- ---------
> 3 56178.293 137514
> 4 14317.674 11961
> 5 4944.3461 7009
> 6 2112.0198 1606
> 7 1053.0590 1290
> 8 589.63282 1057
> 9 360.96075 425
> 10 237.01960 265
> 11 164.61345 195
> 12 119.65648 173
> 13 90.303523 78
> 14 70.316044 69
> 15 56.213554 48
> 16 45.956716 38
> 17 38.299183 29
> 18 32.450871 29
> 19 27.894646 27
> 20 24.282356 21
> 21 21.373674 20
> 22 18.998967 19
> 23 17.036078 19
> 24 15.395441 17
> 25 14.010305 17
>
> The rank-ordering is considerably different than when "(k+3)" was used in the formula. The new ordering is:-
>
> rank k d s=(k+4)*log(d)
> ---- - - --------------
> 1 8 1057 83.558 was 3rd
> 2 3 137514 82.817 was 14th
> 3 12 173 82.453 was 2nd
> 4 25 17 82.163 was 1st
> 5 5 7009 79.695 was 15th
> 6 23 19 79.499
> 7 24 17 79.330
> 8 11 195 79.095
> 9 7 1290 78.786 was 13th
> 10 9 425 78.677
> 11 10 265 78.116
> 12 22 19 76.554 MEDIAN
> 13 14 69 76.214
> 14 19 27 75.803
> 15 4 11961 75.115 was 23rd
> 16 21 20 74.893
> 17 18 29 74.081
> 18 13 78 74.064
> 19 6 1606 73.815 was 22nd
> 20 15 48 73.553
> 21 20 21 73.068
> 22 16 38 72.752
> 23 17 29 70.713
>
> The main changes are that the records with smaller k have moved up in the rankings - for k=3, dramatically so!
> The k=4 and k=6 records don't now look quite so relatively weak (which should please Ken:-)
>
> The rank order now agrees with Chris's in his above-cited page, for d > 1000 (the lower limit for inclusion in his table).
>
> Mike
>
• Hi Mike, Having now carefully read your posting and while still thinking that the end result gives a good indication of difficulty I have questions about the
Message 3 of 14 , Jun 9, 2009
• 0 Attachment
Hi Mike,

Having now "carefully" read your posting and while still thinking that the end result gives a good indication of difficulty I have questions about the statement "If the numbers are of size d digits, then by PNT a fraction of about 1/d of the numbers will be prime. If the numbers are of size d digits, then by PNT a fraction of about 1/d of the numbers will be prime"

Firstly doesn't the prime number theorem that for a number N the probability of a random number N being prime is
~= 1/log(N)
~= 1/dlog(10)
so for a random 265 digit number the chance of it being prime would be
~= 1/610.185

Secondly using p# further increases this probability.
= (1/(dlog(10))*(1/2*2/3*4/5*....*p-1/p))
So for a 265 digit number of the form 617#+1 the probability would be
= 1/53.12 (not 1/265)
As a practical comparison for the ap10 mentioned (with 265 digits) I found 18,894,798 prps out of 1,000,000,000 candidates meaning that 1/52.92 numbers were prps
With the 1057 digit numbers used for the ap8 ~= 1/174 were prps

Does this affect the rest of your calculations?

Cheers
Ken
• ... Quite right, Ken. I saw the sloppy wording just after hitting Send , but thought I would wait till someone commented rather than send a trivial
Message 4 of 14 , Jun 12, 2009
• 0 Attachment
>
> Hi Mike,
>
> Having now "carefully" read your posting and while still thinking that the end result gives a good indication of difficulty I have questions about the statement "If the numbers are of size d digits, then by PNT a fraction of about 1/d of the numbers will be prime.
> Firstly doesn't the prime number theorem that for a number N the probability of a random number N being prime is
> ~= 1/log(N)
> ~= 1/dlog(10)
> so for a random 265 digit number the chance of it being prime would be
> ~= 1/610.185

Quite right, Ken.
I saw the sloppy wording just after hitting "Send", but thought I would wait till someone commented rather than send a trivial correction.
In fact, my "score" function would only have gained an overall factor of log(10), and since such factors have been removed (they make no difference to the goodness of fit and there is anyway another suppressed factor from the d^2 difficulty-of-primality-testing aspect), that will not affect anything.

> Secondly using p# further increases this probability.
> = (1/(dlog(10))*(1/2*2/3*4/5*....*p-1/p))
> So for a 265 digit number of the form 617#+1 the probability would be
> = 1/53.12 (not 1/265)
> As a practical comparison for the ap10 mentioned (with 265 digits) I found 18,894,798 prps out of 1,000,000,000 candidates meaning that 1/52.92 numbers were prps
> With the 1057 digit numbers used for the ap8 ~= 1/174 were prps
>
> Does this affect the rest of your calculations?

This is a *much" harder one!
I have been exercised on this aspect ever since starting AP-k searches at the start of 2006 - and no doubt you have too!

We need a description of what is happening when we choose the candidate integers from an AP rather than a contiguous set, don't we?

After about 3 days more or less continuous reflection and computer experimentation I think I have at last completely cracked this problem.
I have a Hardy-Littlewood-type conjecture which does exactly what we want (and is able to predict your AP-8 stats to an extraordinary degree of precision:-)
A post on all that is in preparation.

-Mike Oakes
• ... Yum, yum! My saliva buds are already working. David
Message 5 of 14 , Jun 13, 2009
• 0 Attachment
"Mike Oakes" <mikeoakes2@...> wrote:

> After about 3 days more or less continuous reflection and
> computer experimentation I think I have at last completely
> cracked this problem.
> I have a Hardy-Littlewood-type conjecture which does exactly what
> we want (and is able to predict your AP-8 stats to an extraordinary
> degree of precision:-)
> A post on all that is in preparation.

Yum, yum! My saliva buds are already working.

David
• ... Two key references for this post are Chris s wonderfully informative Arithmetic Progressions of Primes page http://primes.utm.edu/top20/page.php?id=14
Message 6 of 14 , Jun 22, 2009
• 0 Attachment
--- In primenumbers@yahoogroups.com, "Mike Oakes" <mikeoakes2@...> wrote:
> A post on all that is in preparation.

Two key references for this post are Chris's wonderfully informative "Arithmetic Progressions of Primes" page
http://primes.utm.edu/top20/page.php?id=14
and references cited therein; plus Jens's comprehensive "Primes in Arithmetic Progression Records" page
http://users.cybercity.dk/~dsl522332/math/aprecords.htm

It follows from the Hardy-Littlewood prime k-tuple conjecture that the number of arithmetic progressions of k terms, all primes and not larger than x, is asymptotically (for large x):-
c_k(x) = D_k*x^2/(2*(k-1)*log(x)^k) .. (1)
where
D_k={product: p prime}beta(p)
with
beta(p)=(1-1/p)^(1-k)*(1/p) p <= k
=(1-1/p)^(1-k)*(1-(k-1)/p) p >= k

Consider the sequence u(a,d,n)=a+d*n, 0<=n<k.
The count of all such sequences with 0 < u(a,d,n) <= x is approximately:-
t_k(N) = x^2/(2*(k-1)) ..(2)

Comparison of (1) and (2) shows that the probability of a typical sequence consisting of *all* primes is just the product of the probabilities for each term separately to be prime (which by the PNT is 1/log(x), since x is the typical size of a term) times the factor D_k, which incorporates the fact that the terms are not independent from one another, prime-factor-wise.

Apply the above probability argument now to a k-term sequence u(a,d,n), where a is drawn at random from a set of Na values {A}, and d is drawn at random from a set of Nd values {D},
giving a total of Na*Nd sequences.

Let p(n) be the probability that u(n)=(a+n*d) be prime, 0<=n<k.

Conjecture: the probability P that u(n) is a sequence consisting of *all* primes is
P=D(k,d)*product{0<=n<k}p(n)
where
D(k,d)={product: p prime, p does not divide d}beta(p)

===

To see how the number of factors in d affects the "scaling" factor D(k,d), here is a table of computed values, with the number of significant digits displayed being approximately the number that can be trusted.
[D(k,1), D(k,1009#) and D(k,10007#) were computed using all primes to 10^8, the others using primes up to 10^7.]

For all q, we have:-
D(2,q#)=1.00000000000

D(3,1)=1.32032363
D(3,2#)=0.6601618
D(3,3#)=0.8802158
D(3,5#)=0.9388968
D(3,7#)=0.9657224
D(3,17#)=0.9861509
D(3,37#)=0.9943893
D(3,67#)=0.9972169
D(3,1009#)=0.99987387
D(3,10007#)=0.999990194

D(4,1)=2.85824860
D(4,2#)=0.7145622
D(4,3#)=0.6351664
D(4,5#)=0.8130129
D(4,7#)=0.8959735
D(4,17#)=0.9581001
D(4,37#)=0.9830958
D(4,67#)=0.9916287
D(4,1009#)=0.99962152
D(4,10007#)=0.999970581

D(5,1)=4.1511809
D(5,2#)=0.5188976
D(5,3#)=0.3074949
D(5,5#)=0.6297495
D(5,7#)=0.7931539
D(5,17#)=0.9160594
D(5,37#)=0.9661407
D(5,67#)=0.9832363
D(5,1009#)=0.999242915
D(5,10007#)=0.999941160

D(6,1)=10.131795
D(6,2#)=0.6332372
D(6,3#)=0.2501678
D(6,5#)=0.4098749
D(6,7#)=0.6637208
D(6,17#)=0.8608535
D(6,37#)=0.9436427
D(6,67#)=0.9720641
D(6,1009#)=0.99873807
D(6,10007#)=0.99990193

D(7,1)=17.298612
D(7,2#)=0.5405817
D(7,3#)=0.1423754
D(7,5#)=0.1866143
D(7,7#)=0.5180388
D(7,17#)=0.7939335
D(7,37#)=0.9158195
D(7,67#)=0.9581607
D(7,1009#)=0.99810703
D(7,10007#)=0.99985290

D(8,1)=53.971949
D(8,2#)=0.8433118
D(8,3#)=0.1480712
D(8,5#)=0.1552639
D(8,7#)=0.3694376
D(8,17#)=0.7173459
D(8,37#)=0.8829862
D(8,67#)=0.9415983
D(8,1009#)=0.99734991
D(8,10007#)=0.99979405

D(9,1)=148.55163
D(9,2#)=1.1605598
D(9,3#)=0.1358497
D(9,5#)=0.1139590
D(9,7#)=0.2324194
D(9,17#)=0.6336562
D(9,37#)=0.8455522
D(9,67#)=0.9224727
D(9,1009#)=0.99646687
D(9,10007#)=0.99972541

D(10,1)=336.03433
D(10,2#)=1.3126344
D(10,3#)=0.1024339
D(10,5#)=0.0687422
D(10,7#)=0.1201712
D(10,17#)=0.5458231
D(10,37#)=0.8040131
D(10,67#)=0.9009026
D(10,1009#)=0.99545810
D(10,10007#)=0.99964695

D(11,1)=511.42229
D(11,2#)=0.9988719
D(11,3#)=0.0519659
D(11,5#)=0.02789898
D(11,7#)=0.04180406
D(11,17#)=0.4570322
D(11,37#)=0.7589416
D(11,67#)=0.8770283
D(11,1009#)=0.99432385
D(11,10007#)=0.99955869

D(12,1)=1312.3197
D(12,2#)=1.2815626
D(12,3#)=0.04444851
D(12,5#)=0.01909049
D(12,7#)=0.02451887
D(12,17#)=0.3704957
D(12,37#)=0.7109748
D(12,67#)=0.8510107
D(12,1009#)=0.9930644
D(12,10007#)=0.99946063

D(13,1)=2364.5990
D(13,2#)=1.1545898
D(13,3#)=0.02669647
D(13,5#)=0.009172838
D(13,7#)=0.010098114
D(13,17#)=0.2892347
D(13,37#)=0.6607992
D(13,67#)=0.8230294
D(13,1009#)=0.9916802
D(13,10007#)=0.99935277

D(14,1)=7820.6003
D(14,2#)=1.9093271
D(14,3#)=0.02943169
D(14,5#)=0.00809012
D(14,7#)=0.00763387
D(14,17#)=0.2158645
D(14,37#)=0.6091338
D(14,67#)=0.7932804
D(14,1009#)=0.9901715
D(14,10007#)=0.99923512

D(15,1)=22938.910
D(15,2#)=2.8001612
D(15,3#)=0.02877575
D(15,5#)=0.00632785
D(15,7#)=0.00511799
D(15,17#)=0.1524021
D(15,37#)=0.5567125
D(15,67#)=0.7619744
D(15,1009#)=0.9885387
D(15,10007#)=0.99910766

D(16,1)=55651.466
D(16,2#)=3.3966978
D(16,3#)=0.02327069
D(16,5#)=0.00409382
D(16,7#)=0.00283808
D(16,17#)=0.1001208
D(16,37#)=0.5042644
D(16,67#)=0.7293338
D(16,1009#)=0.9867825
D(16,10007#)=0.99897041

D(17,1)=91555.117
D(17,2#)=2.7940422
D(17,3#)=0.01276128
D(17,5#)=0.00179599
D(17,7#)=0.00106722
D(17,17#)=0.0594701
D(17,37#)=0.4524954
D(17,67#)=0.6955905
D(17,1009#)=0.9849033
D(17,10007#)=0.99882337

D(18,1)=256474.88
D(18,2#)=3.9134989
D(18,3#)=0.01191613
D(18,5#)=0.001341636
D(18,7#)=0.000683341
D(18,17#)=0.03007454
D(18,37#)=0.4020697
D(18,67#)=0.6609823
D(18,1009#)=0.9829017
D(18,10007#)=0.9986665

D(19,1)=510992.05
D(19,2#)=3.8985631
D(19,3#)=0.007913765
D(19,5#)=0.0007128086
D(19,7#)=0.0003111922
D(19,17#)=0.01081699
D(19,37#)=0.3535932
D(19,67#)=0.6257508
D(19,1009#)=0.9807783
D(19,10007#)=0.9984999

D(20,1)=1900972.8
D(20,2#)=7.251642
D(20,3#)=0.009813495
D(20,5#)=0.0007071368
D(20,7#)=0.0002646138
D(20,17#)=0.007264514
D(20,37#)=0.3075983
D(20,67#)=0.5901376
D(20,1009#)=0.9785338
D(20,10007#)=0.9983236

D(21,1)=6423764.9
D(21,2#)=12.2523717
D(21,3#)=0.01105392
D(21,5#)=0.0006372150
D(21,7#)=0.0002043846
D(21,17#)=0.004431580
D(21,37#)=0.2645322
D(21,67#)=0.5543818
D(21,1009#)=0.9761689
D(21,10007#)=0.9981374

D(22,1)=18606668
D(22,2#)=17.744721
D(22,3#)=0.01067269
D(22,5#)=0.000492191
D(22,7#)=0.000135316
D(22,17#)=0.002317266
D(22,37#)=0.2247477
D(22,67#)=0.5187169
D(22,1009#)=0.9736844
D(22,10007#)=0.9979415

D(23,1)=38734737
D(23,2#)=18.470185
D(23,3#)=0.007406019
D(23,5#)=0.0002732339
D(23,7#)=0.00006438774
D(23,17#)=0.000870855
D(23,37#)=0.1884984
D(23,67#)=0.4833680
D(23,1009#)=0.9710811
D(23,10007#)=0.9977358

D(24,1)=153217040
D(24,2#)=36.5298376
D(24,3#)=0.009764950
D(24,5#)=0.00028821046
D(24,7#)=0.00005821456
D(24,17#)=0.000621857
D(24,37#)=0.1559370
D(24,67#)=0.4485489
D(24,1009#)=0.9683599
D(24,10007#)=0.99752023

D(25,1)=568632600
D(25,2#)=67.7863864
D(25,3#)=0.01208018
D(25,5#)=0.0002852353
D(25,7#)=0.0000493831
D(25,17#)=0.0004166331
D(25,37#)=0.1271171
D(25,67#)=0.4144603
D(25,1009#)=0.9655215
D(25,10007#)=0.9972951

D(26,1)=1941938900
D(26,2#)=115.74876
D(26,3#)=0.0137517
D(26,5#)=0.00025986
D(26,7#)=0.000038548
D(26,17#)=0.00025686
D(26,37#)=0.1019998
D(26,67#)=0.3812867
D(26,1009#)=0.9625671
D(26,10007#)=0.9970601

For k=3..26, we have:-
D(k,130003#)=1.00000000000

In order to confront this conjecture with experiment, I have computed some expected vs. actual probabilities for three scenarios representative of actually-undertaken searches for AP-k's, in particular, those in which the common difference d has very many factors.

===

Consider arithmetic progressions {u(n)=a+n*d, 0<=d<k},
where every term is of the form m*q#+1, with M1<=m<=M2.

Take M1=99000001, M2 = 101000000, so M_mean=10^8, M_count=2*10^7.

If ck is a count of the number of AP-k's,
and "dig" is the mean number of digits in u(n), viz. log(M_mean*q#)/log(10),
and ck_theor=D(k,q#)*x^2/(2*(k-1)*log(x)^k)
with x=M_mean*q#,
then we find the following theoretically-predicted and experimentally-found counts:-

q dig c1_theor c1_exp
1 8.0 108573.62 108540
2 8.3 209272.58 209370
3 8.8 296846.89 296845
5 9.5 343691.77 343261
7 10.3 368145.32 368052
11 11.4 367848.22 368073
13 12.5 362925.64 363158
17 13.7 350995.75 350821
19 15.0 338882.78 339794
23 16.3 324776.66 324653
29 17.8 308757.02 309100
31 19.3 294398.04 294802
37 20.9 279840.28 279217
41 22.5 266260.73 266648
43 24.1 254136.55 254326
47 25.8 242825.31 241998
53 27.5 231984.23 232039
59 29.3 221713.58 221056
61 31.1 212456.16 212742
67 32.9 203702.70 203981

[Of course c2=c1*(c1-1)/2.]

q c3_theor c3_exp
1 211233780.6 202841435
2 756305702.2 728944254
3 2878038897.6 2773618467
5 4764697012.5 4512210251
7 6023101499.9 5675518754
17 5330389807.1 5022131103
37 2723937336.5 2586439661
67 1053634545.6 1020369385

q c4_theor c4_exp
1 16549544.5 15871419
2 57105514.8 55044966
3 205497365.8 197860176
5 472675149.7 439294189
7 685742445.9 629037681
17 605908579.3 554325985
37 251203375.0 232545604
67 71141821.6 67735580

q c5_theor c5_exp
1 978618.2 898748
2 3254342.9 3012277
3 11074405.9 10254290
5 47188217.0 43775930
7 83805770.3 75904604
17 76252302.8 68614763
37 25906665.1 23613957
67 5388424.9 5081205

q c6_theor c6_exp
1 103731.9 97097
2 332445.4 313857
3 1069809.4 1007089
5 4222271.7 3980463
7 10327165.7 9313827
17 10060520.6 8959287
37 2832363.2 2554552
67 434065.9 406473

q c7_theor c7_exp
1 8012.2 6911
2 24746.6 21828
3 75306.3 66759
5 275294.6 244774
7 1236420.0 1116660
17 1356956.5 1201795
37 320516.4 287877
67 36314.9 33676

q c8_theor c8_exp
1 1163.2 1013
2 3462.4 3033
3 9963.7 8952
5 33737.7 30323
7 139119.3 126565
17 184431.7 163219
37 37061.9 33069
67 3115.5 2800

q c9_theor c9_exp
1 152.1 136
2 436.3 359
3 1187.2 1129
5 3723.4 3350
7 14096.7 13059
17 25017.3 22276
37 4345.1 3883
67 272.0 248

q c10_theor c10_exp
1 16.6 19
2 45.9 37
3 118.1 122
5 343.1 341
7 1192.6 1101
17 3361.7 3041
37 513.9 457
67 24.1 24

q c11_theor c11_exp
1 1.23 0
2 3.29 1
3 8.00 7
5 21.54 22
7 68.73 58
17 444.60 389
37 61.08 59
67 2.15 1

q c12_theor c12_exp
1 0.156 0
2 0.401 0
3 0.924 0
5 2.302 5
7 6.745 4
17 57.502 47
37 7.279 4
67 0.193 0

q c13_theor c13_exp
1 0.0140 0
2 0.0347 0
3 0.0755 0
5 0.1742 0
7 0.4688 0
17 7.2216 9
37 0.8677 1
67 0.0174 0

q c14_theor c14_exp
1 0.00232 0
2 0.00554 0
3 0.01140 0
5 0.02438 0
7 0.06021 0
17 0.87312 1
37 0.10331 0
67 0.00158 0

q c15_theor c15_exp
1 0.00034 0
2 0.00079 0
3 0.00154 0
5 0.00304 0
7 0.00690 0
11 0.01975 0
13 0.06839 1
17 0.10045 0
37 0.01227 0
67 0.00014 0

===

As another set of experiments, this time oriented towards the statistics of actual AP-k records, as per Jens's page cited above, consider, again, arithmetic progressions {u(n)=a+n*d, 0<=d<k},
where every term is of the form m*q#+1, with M1<=m<=M2.

Take M1 = 1, M2=2000000000, so M_mean=10^9, M_count=2*10^9.
We find the following theoretical predictions for counts ck of AP-k's:-

** k=3 **
q dig c1 c3 c4 c5
317363 137513.7 142526.25 361905.02 17.19 0.00092
298409 129237.3 150912.59 429621.89 21.61 0.0012

** k=4 **
q dig c1 c3 c4 c5 c6
27751 11961.6 1323796.29 289982931.39 127958.82 63.52 0.034
23159 10005.8 1555099.09 470092004.06 243678.36 142.10 0.088

** k=5 **
q dig c1 c3 c4 c5 c6 c7
16229 7009.5 2141861.98 1228237350.34 876896.25 704.31 0.60 0.00054
5113 2197.7 6022927.71 27310170302.0 54826899.86 123824.60 298.29 0.75

** k=6 **
q dig c1 c4 c5 c6 c7 c8
3739 1607.1 7938323.71 165450204.95 492482.39 1563.62 5.17 0.018
3529 1503.9 8415847.33 208997272.44 659523.38 2219.91 7.78 0.028
3011 1289.5 9634046.26 358902669.50 1296491.56 4995.46 20.05 0.083

** k=7 **
q dig c1 c5 c6 c7 c8 c9
3011 1289.5 9634046.26 1296491.56 4995.46 20.05 0.083 0.00035
3001 1286.0 9656898.99 1311940.73 5066.98 20.38 0.084 0.00036

** k=8 **
q dig c1 c6 c7 c8 c9 c10
2459 1055.7 11477930.33 14284.51 68.30 0.34 0.0017 0.0000086
2411 1035.4 11674422.61 15815.82 76.92 0.38 0.0020 0.000010

** k=9 **
q dig c1 c7 c8 c9 c10 c11
997 424.3 25284285.59 17171.15 185.93 2.05 0.023 0.00026
941 400.4 26574383.43 24323.82 276.80 3.22 0.038 0.00045
757 324.2 31826848.17 85922.60 1170.78 16.28 0.23 0.0033
701 301.3 33877610.42 132997.69 1928.83 28.55 0.43 0.0065
499 215.1 45063613.24 978740.14 18871.29 371.34 7.42 0.15
463 201.7 47569642.89 1429072.60 29082.78 604.01 12.74 0.27

** k=10 **
q dig c1 c8 c9 c10 c11 c12
617 264.6 37815477.56 4645.93 76.75 1.29 0.022 0.00038
607 259.1 38504142.29 5366.94 90.27 1.54 0.027 0.00047
463 201.7 47569642.89 29082.78 604.01 12.74 0.27 0.0059

** k=11 **
q dig c1 c9 c10 c11 c12 c13
449 193.7 49209748.70 819.14 17.87 0.39 0.0088 0.00020
401 172.6 54198048.78 1950.83 46.86 1.14 0.028 0.00069
349 149.4 61132838.21 5755.11 155.84 4.27 0.12 0.0033
281 124.4 71150930.76 22489.10 708.20 22.57 0.73 0.024

** k=12 **
q dig c1 c10 c11 c12 c13 c14
401 172.6 54198048.78 46.86 1.14 0.028 0.00069 0.000017
283 126.8 70023070.54 603.89 18.94 0.60 0.019 0.00061
229 100.3 84886301.52 4114.22 156.20 5.99 0.23 0.0090
199 90.9 91980354.17 9152.84 376.24 15.61 0.65 0.027
173 77.2 104883968.89 33805.65 1582.07 74.71 3.55 0.17
157 70.5 112761422.84 69464.48 3491.45 177.07 9.04 0.46

** k=13 **
q dig c1 c11 c12 c13 c14 c15
173 77.2 104883968.89 1582.07 74.71 3.55 0.17 0.0082
139 64.0 121865274.43 8151.93 446.18 24.59 1.36 0.076
127 57.6 132424730.49 20177.45 1197.92 71.61 4.30 0.26
103 49.4 149105671.60 73303.06 4883.26 327.42 22.07 1.49

** k=14 **
q dig c1 c12 c13 c14 c15 c16
151 68.4 115642693.09 239.12 12.52 0.66 0.035 0.0019
131 59.7 128712444.82 854.53 49.69 2.91 0.17 0.010
101 47.4 153932686.06 7117.19 492.08 34.19 2.39 0.17

** k=15 **
q dig c1 c13 c14 c15 c16 c17
101 47.4 153932686.06 492.08 34.19 2.39 0.17 0.012

** k=16 **
q dig c1 c14 c15 c16 c17 c18
73 37.6 183222748.55 370.57 30.47 2.51 0.21 0.017
71 35.7 190132666.62 611.87 52.05 4.44 0.38 0.032
43 25.1 244018322.25 17022.92 1801.94 190.50 20.10 2.11

** k=17 **
q dig c1 c15 c16 c17 c18 c19
53 28.5 223848160.09 541.91 53.41 5.26 0.52 0.051
43 25.1 244018322.25 1801.94 190.50 20.10 2.11 0.22

** k=18 **
q dig c1 c16 c17 c18 c19 c20
53 28.5 223848160.09 53.41 5.26 0.52 0.051 0.0050
31 20.3 279897280.35 1305.69 150.81 17.28 1.96 0.22
23 17.3 306055912.16 3761.57 442.62 51.13 5.78 0.64

** k=19 **
q dig c1 c17 c18 c19 c20 c21
31 20.3 279897280.35 150.81 17.28 1.96 0.22 0.024
23 17.3 306055912.16 442.62 51.13 5.78 0.64 0.068

** k=20 **
q dig c1 c18 c19 c20 c21 c22
23 17.3 306055912.16 51.13 5.78 0.64 0.068 0.0071

** k=21 **
q dig c1 c19 c20 c21 c22 c23
23 17.3 306055912.16 5.78 0.64 0.068 0.0071 0.00070

** k=22 **
q dig c1 c20 c21 c22 c23 c24
23 17.3 306055912.16 0.64 0.068 0.0071 0.00070 0.000067

** k=23 **
q dig c1 c21 c22 c23 c24 c25
23 17.3 306055912.16 0.068 0.0071 0.00070 0.000067 0.0000059

** k=24 **
q dig c1 c22 c23 c24 c25
23 17.3 306055912.16 0.0071 0.00070 0.000067 0.0000059

** k=25 **
q dig c1 c23 c24 c25
23 17.3 306055912.16 0.00070 0.000067 0.0000059

Note (1): The k=8, q=2459 stats agree beautifully with Ken's actual counts - see his post:

Note (2): If, instead of M_count=2*10^9, one were to choose a different value, then the above counts are all to be multiplied by (M_count/2*10^9)^2; this is an excellent way of estimating what block size woud be required to obtain a predicted count of 1.0.

===

As a final set of experiments, oriented this time towards the statistics of AP-k records for large k, consider arithmetic progressions {u(n)=a+n*d, 0<=d<k}, where every term is of the form a+q#, with a_min<=a<=a_max.

Taking always a_max=a_min+10^12, we find the following theoretical predictions for counts of AP-k's:-

a_min=10^16, q=23, dig=17:-
c20_theor=131491.513618
c21_theor=15277.0009101
c22_theor=1715.86594350
c23_theor=185.099954827
c24_theor=19.0095562007
c25_theor=1.83517574326
c26_theor=0.163311641770

a_min=10^18, q=43, dig=19:-
c20_theor=605995.069496
c21_theor=87300.2072397
c22_theor=12504.4660274
c23_theor=1779.79385925
c24_theor=251.582566249
c25_theor=35.2976557067
c26_theor=4.91254506617

a_min=10^20, q=53, dig=21:-
c20_theor=111588.394854
c21_theor=14827.7334326
c22_theor=1962.13961648
c23_theor=258.478592621
c24_theor=33.8849092473
c25_theor=4.41903581907
c26_theor=0.573116905364

Comments (pos. or neg.) welcome - especially from AP-k practitioners!

-Mike Oakes
• ... which is nonsense. I meant: .. consider arithmetic progressions {u(n)=a+n*d, 0
Message 7 of 14 , Jun 22, 2009
• 0 Attachment
--- In primenumbers@yahoogroups.com, "Mike Oakes" <mikeoakes2@...> wrote:
>
> As a final set of experiments, oriented this time towards the
> statistics of AP-k records for large k, consider arithmetic
> progressions {u(n)=a+n*d, 0<=d<k}, where every term is of the
> form a+q#, with a_min<=a<=a_max.

which is nonsense. I meant:
.. consider arithmetic progressions {u(n)=a+n*d, 0<=n<k}, where d=q#, with a_min<=a<=a_max.

Mike
• ... http://primes.utm.edu/top20/page.php?id=14 Yes, that Grosswald conjecture is rather remarkable, About your tables of ck_theor and ck_exp for k 2: the
Message 8 of 14 , Jun 22, 2009
• 0 Attachment
"Mike Oakes" <mikeoakes2@...> wrote:

> Chris's wonderfully informative
> "Arithmetic Progressions of Primes" page
http://primes.utm.edu/top20/page.php?id=14

Yes, that Grosswald conjecture is rather remarkable,

the theory side seems to be systematically high.
Not by much, but still more than the 1/sqrt(N) fluctuations.

Can you please spell out the comparison with Ken's data:

> 189,178,618,167 ap3s
> 543,008,443 ap4s
> 3,117,826 ap5s

It seems that you did not so far show your expectations
for c3,c4,c5, where his stats are good. (I take your point
about c6 and c7 looking good.)

Thanks for an interesting posting.

David
• ... My program output says:- counts_theor(3,2459)=189008276732. counts_theor(4,2459)=723074851.939 counts_theor(5,2459)=3111847.13368 Note that Ken qualified
Message 9 of 14 , Jun 22, 2009
• 0 Attachment
>
> Can you please spell out the comparison with Ken's data:
>
> > 189,178,618,167 ap3s
> > 543,008,443 ap4s
> > 3,117,826 ap5s
>
> It seems that you did not so far show your expectations
> for c3,c4,c5, where his stats are good. (I take your point
> about c6 and c7 looking good.)
>

My program output says:-
counts_theor(3,2459)=189008276732.
counts_theor(4,2459)=723074851.939
counts_theor(5,2459)=3111847.13368

Note that Ken qualified his ap4s count by
"(psuedo eg. 1,2,3,5 as only processed AP3's with an even difference)"

Mike
• ... Very nice: congrats! David
Message 10 of 14 , Jun 22, 2009
• 0 Attachment
"Mike Oakes" <mikeoakes2@...> wrote:

> > 189,178,618,167 ap3s
> > 3,117,826 ap5s

> counts_theor(3,2459)=189008276732.
> counts_theor(5,2459)=3111847.13368

Very nice: congrats!

David
• Hi Mike, A few quick comparisons with some of my other searches. AP9 With m_count = 1*10^9 and your formula (M_count/2*10^9)^2 = 1/4 Gives q dig c1 c7 c8 c9
Message 11 of 14 , Jun 22, 2009
• 0 Attachment
Hi Mike,
A few quick comparisons with some of my other searches.

AP9
With m_count = 1*10^9 and your formula (M_count/2*10^9)^2 = 1/4
Gives

q dig c1 c7 c8 c9
Theory 997 424.3 12642142 4293 46 0
Actual 997 424.3 12658662 4331 53 0

If you want to compare further
Ap6s = 409947
Ap5s = 40570738
Ap4s = 3207896756 (this might have been pseudo's I'd have to check my C++ code which I don't have access to at the moment)
Ap3's = lost count

AP10
With m_count = 1*10^9
Gives

q dig c1 c8 c9 c10
Theory 617 264.6 18907739 1161 19 0
Actual 617 264.6 18894798 1133 14 0

Ap7s = 71214
I only have valid counts down to this level as I only processed AP3s with a difference of 3 (looking inside them for an Ap7)

All in all your theory nicely agrees with my results and should help me greatly for future searches.
Thanks Mike!
A very nice result.

Cheers
Ken
• ... counts_theor(1,997)=12642142.7943 counts_theor(2,997)=7.99118808941 E13 counts_theor(3,997)=505064537331. counts_theor(4,997)=4255649343.14
Message 12 of 14 , Jun 23, 2009
• 0 Attachment
>
> Hi Mike,
> A few quick comparisons with some of my other searches.
>
> AP9
> With m_count = 1*10^9 and your formula (M_count/2*10^9)^2 = 1/4
> Gives
>
> q dig c1 c7 c8 c9
> Theory 997 424.3 12642142 4293 46 0
> Actual 997 424.3 12658662 4331 53 0
>
> If you want to compare further
> Ap6s = 409947
> Ap5s = 40570738
> Ap4s = 3207896756 (this might have been pseudo's I'd have to check my C++ code which I don't have access to at the moment)
> Ap3's = lost count
>

counts_theor(1,997)=12642142.7943
counts_theor(2,997)=7.99118808941 E13
counts_theor(3,997)=505064537331.
counts_theor(4,997)=4255649343.14
counts_theor(5,997)=40334993.7203
counts_theor(6,997)=407728.896596
counts_theor(7,997)=4292.73732756
counts_theor(8,997)=46.4810674074
counts_theor(9,997)=0.513708974304
counts_theor(10,997)=0.00576689464294
counts_theor(11,997)=0.0000655399693707
counts_theor(12,997)=0.000000752280002281

> AP10
> With m_count = 1*10^9
> Gives
>
> q dig c1 c8 c9 c10
> Theory 617 264.6 18907739 1161 19 0
> Actual 617 264.6 18894798 1133 14 0
>
> Ap7s = 71214
> I only have valid counts down to this level as I only processed AP3s with a difference of 3 (looking inside them for an Ap7)
>

counts_theor(1,617)=18895165.9208
counts_theor(2,617)=1.78513638140 E14
counts_theor(3,617)=1.68615682803 E12
counts_theor(4,617)=21230924677.6
counts_theor(5,617)=300675388.024
counts_theor(6,617)=4541098.43467
counts_theor(7,617)=71426.2578907
counts_theor(8,617)=1155.29887582
counts_theor(9,617)=19.0717479232
counts_theor(10,617)=0.319764874922
counts_theor(11,617)=0.00542713429128
counts_theor(12,617)=0.0000930206866839
counts_theor(13,617)=0.00000160729956878

Mike
• ... That was as per 6 Jun. Just over 5 months on, the revised table is:- rank k d s=(k+4)*log(d) ... 1 8 1057 83.558 2 3 137514 82.817 3 12
Message 13 of 14 , Nov 17, 2009
• 0 Attachment
--- In primenumbers@yahoogroups.com, "Mike Oakes" <mikeoakes2@...> wrote:
>
> Inserting the latest records from Jens's page
> http://users.cybercity.dk/~dsl522332/math/aprecords.htm
> gives this table:-
>
> rank k d s=(k+4)*log(d)
> ---- - - --------------
> 1 8 1057 83.558
> 2 3 137514 82.817
> 3 12 173 82.453
> 4 25 17 82.163
> 5 5 7009 79.695
> 6 23 19 79.499
> 7 24 17 79.330
> 8 11 195 79.095
> 9 7 1290 78.786
> 10 9 425 78.677
> 11 10 265 78.116
> 12 22 19 76.554 MEDIAN
> 13 14 69 76.214
> 14 19 27 75.803
> 15 4 11961 75.115
> 16 21 20 74.893
> 17 18 29 74.081
> 18 13 78 74.064
> 19 6 1606 73.815
> 20 15 48 73.553
> 21 20 21 73.068
> 22 16 38 72.752
> 23 17 29 70.713
>

That was as per 6 Jun.
Just over 5 months on, the revised table is:-

rank k d s=(k+4)*log(d)
---- - - ------ -------
1 8 1057 83.558
2 3 137514 82.817
3 12 173 82.453
4 25 17 82.163
5 5 7009 79.695
6 24 18 80.930
7 23 19 79.499
8 11 196 79.172
9 7 1290 78.786
10 9 425 78.677
11 10 274 78.584
12 17 42 78.491 was 23rd
13 6 2145 76.709 was 19th
14 22 19 76.554
15 14 69 76.214
16 19 27 75.803
17 15 54 75.791 was 20th
18 4 11961 75.115 was 15th
19 21 20 74.893 was 16th
20 16 42 74.753
21 18 29 74.081 was 17th
22 13 78 74.064 was 18th
23 20 21 73.068

Note that the lowest score was 70.713 and is now 73.068.
The comments are against k values which have changed rank by more than 2.
The first 11 positions are almost unchanged.

The latest table might help in suggesting to people (but perhaps I shouldn't be giving these clues:-) which records to try for next.

-Mike Oakes
Your message has been successfully submitted and would be delivered to recipients shortly.