Re: Lucas tests with Fermat tests for free
- --- In email@example.com, "djbroadhurst" <d.broadhurst@...> wrote:
>In FFT land, this neat version would save 2 forward transforms for each bit, compared to my rather clumsy left-to-right algorithm given in section 4,
> --- In firstname.lastname@example.org,
> "paulunderwooduk" <paulunderwood@> wrote:
> > but surely Grantham's b^((n-1)/2) can be tested first,
> > making it 1 + 2 selfridge?
> Yes. I wrote 2 + 1 for Frobenius only to paraphrase CP
> Algorithm 3.5.9. But batch processing should be done as 1 + 2,
> as per BPSW 1 + 2. Note that in your 1 + 2 method of Section 9
> where "batch testing was sped up by pre-screening with the
> 1 selfridge test", we can use CP to speed up your (less often)
> used 2-selfridge Lucas test, using the CP Lucas chain
> V(2*n) = V(n)^2 - 2 mod N
> V(2*n+1) = V(n+1)*V(n) - c mod N
> where c = (4+x)^2/(2*x+5) - 2 mod N
> is precomputed, using modular inversion.
> Then you would have a squaring and a multiplication,
> as against the two multiplications in your
> left-right binary method of Section 4.
> Squaring and multiplication are taken as equally
> costly for selfridge counting, but a squaring may be
> cheaper than a multiplication, in practice?
> But as that small difference affects only O(1/log(N))
> of the cases tested in batch testing, who really cares?
--- In email@example.com, "paulunderwooduk" <paulunderwood@...> wrote:
> I ran various "minimal \lambda+2" tests on Gilbert Mozzo's 20,000 digit PRP, 5890*10^19996+2^66422-3 (x=1), using a 2.4GHz core:
> 0m32.374s pfgw64 (3-prp)
> 1m9.876s pfgw64 -t
> 1m53.535s pfgw64 -tp
> 3m0.483s pfgw64 -tc
> 5m12.972s pfgw64 scriptify
> 4m4.811s gmp (-O3/no pgo)
> 4m9.148 pari-gp
> 1m15s theoretical Woltman implementation
I compiled a better version of my code with gmp 5.0.5, on a different box running at 3.6GHz and got some better timings
17.505s pfgw (3-prp)
1m1.986s pfgw -tp
1m13.789s gmp (-O3/no pgo)