## Lehman factorization algorithm, theoretical comparison with SQUFOF

• Jan 1, 2012
In this post I explain R.S. Lehman's factorization algorithm, provide a proof and runtime analysis, and compare it with SQUFOF.

1. The algorithm. Let T>0 be a tunable constant. [I will prove validity if T>=1.]

Given odd integer N>2, finds a nontrivial factor (or proves N prime):

  B = floor(N^(1/3) * T);
  trial divide N up to bound B;
  if(factor found) return(factor);
  k = 0;
  LoopForever {
    k++;
    if(k*T^3 > B) { return(N is prime); }
    if(k even) { m=2; r=1; } else { m=4; r=(k+N)%4; }
    for(all integers a with 0 <= a*a - 4*k*N <= B*B and a%m == r) {
      c = a*a - 4*k*N;
      if(c is a perfect square) {
        b = sqrt(c);
        return( gcd(a+b, N) );
      }
    }
  }
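A minimal Python sketch of the above, with T fixed to 1 (the function and helper names are mine):

```python
from math import gcd, isqrt

def icbrt(n):
    """Integer cube root: largest x with x**3 <= n."""
    x = round(n ** (1.0 / 3.0))
    while x ** 3 > n:
        x -= 1
    while (x + 1) ** 3 <= n:
        x += 1
    return x

def lehman(N):
    """Return a nontrivial factor of odd N > 2, or N itself if N is prime.
    Direct transcription of the pseudocode with T = 1."""
    B = icbrt(N)
    for d in range(3, B + 1, 2):          # trial division up to B (N is odd)
        if N % d == 0:
            return d
    k = 0
    while True:
        k += 1
        if k > B:                         # k*T^3 > B with T = 1
            return N                      # N is prime
        if k % 2 == 0:
            m, r = 2, 1
        else:
            m, r = 4, (k + N) % 4
        lo = isqrt(4 * k * N)
        if lo * lo < 4 * k * N:
            lo += 1                       # smallest a with a*a >= 4*k*N
        hi = isqrt(4 * k * N + B * B)     # largest a with a*a - 4*k*N <= B*B
        a = lo + (r - lo) % m             # first a >= lo with a % m == r
        while a <= hi:
            c = a * a - 4 * k * N
            b = isqrt(c)
            if b * b == c:                # c is a perfect square
                return gcd(a + b, N)
            a += m
```

For example, lehman(1037) finds the factor 61 at k = 3, and lehman(101) correctly reports 101 prime.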

2. Proof of validity (mostly due to Don Zagier, with some improvements by me):
If we get past trial division up to B = T*N^(1/3) with T>=1, then N has <=2 prime factors:
each remaining prime factor exceeds N^(1/3), and three such would multiply to more than N.
If 0<T<1 however then N could have 3 or more prime factors (which is also ok).

If N>2 is prime then c cannot ever be square:
indeed b*b = a*a - 4*k*N would give 4*k*N = (a-b)*(a+b), so N divides (a-b)*(a+b), and
if N is PRIME then N must divide either a-b or a+b; since 4*k*N > 0 forces a > b >= 0,
either way N <= a+b <= 2*a.
We know a*a < 4*N^(4/3)/T^2 + T^2*N^(2/3),
hence a = O(N^(2/3)) for fixed T, so this cannot happen
if N is large enough with T fixed.
With T=1 it suffices to check by hand validity for
N=3,5,7,11,13,17,19,23 and the general argument
then works for all N>=29.

If we decrease T below 1 and get too small then we may fail for some N;
I am not sure what the minimum allowable T would be.

So N=p*q with p<=q and both p,q prime. We have
T*N^(1/3) <= p <= q = N/p <= T^(-1)*N^(2/3)
1 <= q/p <= T^(-2)*N^(1/3)
Now consider the regular continued fraction (RCF) for q/p.
Let r/s and r'/s' be two successive convergents
such that r*s <= T^(-2)*N^(1/3) < r'*s' <= N.
Then s/r and s'/r' also are successive convergents to p/q.
|q/p - r/s| <= |r'/s' - r/s| = 1/(s*s') hence |p*r-q*s| <= p/s'
|p/q - s/r| <= |s'/r' - s/r| = 1/(r*r') hence |p*r-q*s| <= q/r'
hence the geometric mean of these two bounds shows
|p*r-q*s| <= sqrt( (p*q)/(r'*s') ) = sqrt(N/(r'*s')) <= N^(1/3) * T
and with
k=r*s
a=p*r+q*s
we have
a*a-4*k*N=(p*r-q*s)^2 <= N^(2/3) * T^2 = B*B
k*T^3 <= T*N^(1/3) = B
hence the squareness test succeeds (the congruences for a are
easily checked by considerations mod 4).
QED.
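A numeric check of this construction (the primes p = 103, q = 131 are my illustrative choice):

```python
from math import gcd, isqrt

p, q = 103, 131        # illustrative primes; N = p*q = 13493
N = p * q              # N^(1/3) ~ 23.8, so with T = 1: B = 23, B*B = 529

# Convergents r/s of the RCF of q/p, via the Euclidean algorithm:
convs = []
h0, k0, h1, k1 = 0, 1, 1, 0
x, y = q, p
while y:
    d = x // y
    x, y = y, x - d * y
    h0, k0, h1, k1 = h1, k1, d * h1 + h0, d * k1 + k0
    convs.append((h1, k1))
# convs: (1,1), (4,3), (5,4), (14,11), (131,103)

# Successive convergents r/s = 5/4 and r'/s' = 14/11 satisfy
# r*s = 20 <= N^(1/3) < r'*s' = 154, as the proof requires (T = 1):
r, s = 5, 4
k = r * s
a = p * r + q * s                  # 103*5 + 131*4 = 1039
c = a * a - 4 * k * N
assert c == (p * r - q * s) ** 2   # the proof's identity; here c = 81 <= B*B
b = isqrt(c)
print(gcd(a + b, N))               # -> 131, a nontrivial factor of N
```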

3. Runtime analysis.
For each value of k there are at most
(1/2) * (1 + sqrt(4*k*N + T^2 * N^(2/3)) - sqrt(4*k*N))
values of a to try, which is both upper bounded by (using
monotonicity of sqrt and of its derivative)
and asymptotically equal to
1/2 + T^2*N^(1/6)*k^(-1/2)/8, [side note: Lehman agrees]
and on average over both k-parities only 3/8 + T^2*N^(1/6)*k^(-1/2)*3/32.
And since sum(k<=x)k^(-1/2)<=2*(x+1)^(1/2) we see the total number of squareness tests should be <=(1/4)*T*N^(1/3) + N^(1/3)/(2*T^2) + 1 even in the worst case.
And should be <=(3/16)*T*N^(1/3) + (3/8)*N^(1/3)/T^2 + 1 averaged over hard N.
The latter is minimized by choosing T=2^(2/3)=1.5874, getting #squareness tests <= 0.4465*N^(1/3) + 1.
The former is minimized by the same T, now getting
#squareness tests (averaged) <= 0.2977*N^(1/3) + 1.
These are rigorous upper bounds on the number of squareness tests.
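These counts are easy to check empirically. The sketch below (instrumentation and the sample N are mine) enumerates every a that Lehman would square-test for one N with T = 1, assuming no test ever succeeds; the observed total comes in well under N^(1/3), comfortably inside the bounds above:

```python
from math import isqrt

def count_tests(N):
    """Count every squareness test Lehman would perform on odd N with T = 1,
    assuming none succeeds (worst case)."""
    B = round(N ** (1.0 / 3.0))
    while B ** 3 > N:
        B -= 1
    while (B + 1) ** 3 <= N:
        B += 1                                   # B = floor(N^(1/3))
    total = 0
    for k in range(1, B + 1):                    # k*T^3 <= B with T = 1
        m, r = (2, 1) if k % 2 == 0 else (4, (k + N) % 4)
        lo = isqrt(4 * k * N)
        if lo * lo < 4 * k * N:
            lo += 1                              # smallest a with a*a >= 4*k*N
        hi = isqrt(4 * k * N + B * B)            # largest a with c <= B*B
        first = lo + (r - lo) % m                # first a >= lo with a % m == r
        if first <= hi:
            total += (hi - first) // m + 1
    return total, B

total, B = count_tests(2**40 + 15)               # arbitrary odd sample N
print(total, B)
```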

Average time (even averaged over hard N, meaning N for which the trial division stage yields nothing) ought to be at least a factor 2 faster than these
worst-case bounds.

If T=1 (or any fixed T) then the trial division stage should be faster
by a log or loglog factor than the N^(1/3) time for
the square-testing stage (since primes are sparse in the integers by
a log factor), so it is negligible in comparison.
It might not really be negligible in practice, since N may not be large enough
for logN or loglogN to be large; in that case
multiply the runtime estimate by maybe 1.5.

4. Programming tricks:
A. In the inner loop, c = a*a - 4*k*N stays bounded between 0 and T^2*N^(2/3).
When we update a to anew = a + stepsize we can update c additively,
cnew = c + stepsize*(2*a + stepsize), which saves a multiplication in
the inner loop.
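Concretely, since (a+m)^2 - 4*k*N = c + m*(2*a + m), the loop can carry c along additively; a sketch (the function name and interface are mine):

```python
from math import isqrt

def scan_k(N, k, B, m, r):
    """Scan the candidate a's for one k, maintaining c = a*a - 4*k*N
    additively so the loop body needs no a*a multiplication."""
    fourkN = 4 * k * N
    a = isqrt(fourkN)
    if a * a < fourkN:
        a += 1                        # smallest a with a*a >= 4*k*N
    a += (r - a) % m                  # align to the residue class a % m == r
    c = a * a - fourkN                # computed once, outside the loop
    while c <= B * B:
        b = isqrt(c)
        if b * b == c:
            return a, b               # square found: a*a - 4*k*N = b*b
        c += m * (2 * a + m)          # c for a+m, via (a+m)^2 = a^2 + m*(2a+m)
        a += m
    return None
```

With N = 1037, k = 3 (so m = 4, r = 0, B = 10) this returns (112, 10), reproducing the hit 112^2 - 12*1037 = 100.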

B. The square-testing can use modular info to avoid computing sqrt most of the time.
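For instance, squares occupy only 12 of the 64 residues mod 64 and 16 of the 63 residues mod 63, so two table lookups reject roughly 95% of random non-squares before any square root is computed (a standard filter; these particular moduli are my choice):

```python
from math import isqrt

# Precomputed tables: True exactly at the residues a square can take.
SQ64 = [False] * 64
SQ63 = [False] * 63
for i in range(64):
    SQ64[i * i % 64] = True
for i in range(63):
    SQ63[i * i % 63] = True

def is_square(c):
    """Perfect-square test for c >= 0 that rejects most non-squares
    with two table lookups before falling back to isqrt."""
    if not SQ64[c % 64] or not SQ63[c % 63]:
        return False
    b = isqrt(c)
    return b * b == c
```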

C. It helps on typical N to try the k with many divisors first.
Any ordering of the k's is valid, but if you try k's with large numbers of
divisors first, then you'll usually factor N sooner.
Lehman argues that a k with D divisors has likelihood proportional to D of
yielding a factor on any particular squareness test.
The average order of the #divisors(N) function is lnN so one
might conjecture that Lehman average time would be sped up
by a log factor by such tricks.

Lehman suggests one way to accomplish such a favorable ordering, but I
think a better programmer could do a better job of this than Lehman did.
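One simple way to realize such an ordering (my sketch, not Lehman's scheme): sieve divisor counts up to the k-limit and sort descending:

```python
def k_order(K):
    """Return 1..K sorted so k's with more divisors come first
    (ties broken by smaller k first)."""
    d = [0] * (K + 1)
    for i in range(1, K + 1):          # divisor-count sieve
        for k in range(i, K + 1, i):
            d[k] += 1
    return sorted(range(1, K + 1), key=lambda k: (-d[k], k))

print(k_order(12))   # -> [12, 6, 8, 10, 4, 9, 2, 3, 5, 7, 11, 1]
```

Any such reordering keeps the algorithm valid, as noted above, provided all k up to B/T^3 eventually get tried.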

5. Comparison with Shanks SQUFOF.
Shanks in unpublished work completed by J.Gower & S.Wagstaff
argues that for average "hard" N (meaning N=product of two primes and N=3 mod 4)
the "average number of forms SQUFOF must examine before finding proper
square form" is 1.775*N^(1/4) asymptotically.

CROSSOVER: The N at which the SQUFOF and Lehman average squareness-test counts
become equal is N=2^(42.9).

For larger N, SQUFOF should win; for smaller N, Lehman should win.

However, "runtime" and "#squareness tests" are not exactly the same thing.

Shanks seems to have a nastier inner loop than Lehman, since it has more operations,
among them a division. On the brighter side, Shanks can live with narrower words: most numbers handled by SQUFOF are at most N^(1/2), while with Lehman they reach N^(2/3).
If we assume Shanks's loop is 4 times nastier at the same number of squareness tests, then the crossover point moves to N=2^(66.9).

The true crossover for hard N would then, I guess, be somewhere between
43 and 67 bits.

For "easy" N I would think Lehman would enjoy further advantages versus SQUFOF
because the trial division would then be a win.

Also, even if we assume SQUFOF pays no
penalty for its nastier inner loop, and demand Lehman use its worst-case bound while SQUFOF is allowed its average-N bound, we'd get a crossover N = 23.9 bits long,
and 16.9 bits if Lehman is further multiplied by 1.5 to account for trial division labor not being negligible.
So I regard it as completely certain that Lehman is better than SQUFOF for small
enough N, and the crossover ought to be somewhere from 17 to 67 bits.
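The crossover arithmetic is easy to reproduce: c1*N^(1/4) = c2*N^(1/3) gives N = (c1/c2)^12, i.e. 12*log2(c1/c2) bits. The figures quoted above come out of the constant choices below (which constant belongs to which scenario is my inference; the 0.2977/2 reflects the factor-2 average-speedup remark in section 3):

```python
from math import log2

def crossover_bits(c_squfof, c_lehman):
    """Bits of N at which c_squfof*N^(1/4) equals c_lehman*N^(1/3)."""
    return 12 * log2(c_squfof / c_lehman)

print(round(crossover_bits(1.775, 0.2977 / 2), 1))      # -> 42.9
print(round(crossover_bits(4 * 1.775, 0.2977 / 2), 1))  # -> 66.9 (SQUFOF 4x nastier)
print(round(crossover_bits(1.775, 0.4465), 1))          # -> 23.9 (Lehman worst-case)
print(round(crossover_bits(1.775, 0.4465 * 1.5), 1))    # -> 16.9 (plus 1.5x trial division)
```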

J.Milan in his INRIA experimental study found SQUFOF was the best factoring method among those he tried on hard N <= 59 bits,
while for hard N >= 59 bits the best was the Quadratic Sieve.
Milan did not try Lehman. My analysis here indicates that omission was a mistake, since Lehman
ought to dominate SQUFOF over much of, if not all of, its home range.

I would be interested if anybody follows up on this by actually programming and testing.
--Warren D. Smith