- Mar 6, 2012My factoring algorithm SMODA has self-destructed to the extent it

now seems less attractive than the quadratic sieve as a factorer.

----More detailed explanation:

The problem was embarrassingly stupid and would have been noticed as soon as an

implementation was built but Andy JENNINGS saved time by spotting the

problem before that.

To review, SMODA depended on the first idea that, if there were a magic oracle

whom you could ask "O Oracle, here is B and N, now please generate for

me a set of B-smooth numbers

near to N or near to multiples of N... such that these numbers are (i)

otherwise pretty random,

(ii) come approximately as close (in terms of additive error)

to being a multiple of N as possible existencewise,

(iii) come at a small cost, e.g. polynomial in logN cost, per number found",

then factoring would be possible at speeds rivalling or exceeding the

Number Field Sieve.

The second idea was that instead of depending on magic, it was

actually possible to build

the oracle by combining a certain fairly small N-independent database

with some simple

greedy construction algorithms. The database was hard to find, hence

still somewhat oracle-like,

but it seemed possible to find it fast enough that challenging world

records might be feasible.

Now various potential problems were pointed out, which actually did

not seem to be problems,

such as

1) database too expensive to find?

but actually just today my computer run is ending, and has constructed

preliminary database up to 2^400 size. Jim White & I were building

datasets of the "world's most extensive Stormer pair collection" and

it seems clear it with feasible number of computers

our current codes could reach 2^500 -- and likely a good deal further

with as-yet-unimplemented algorithms.

2) Why not use LLL lattice reduction methods to construct database?

As far as I can tell they are an uncompetitively poor idea.

The real problem is this. The greedoid which uses the database to

find smooth numbers

(call them S) near to N or near to multiples of N, works in a manner

somewhat analogous to the way binary numbers can be constructed lying

arbitrarily near to any real. One keeps

on adding smaller and smaller "correction vectors" in a certain vector

space, and proves that a certain natural vector logarithmic distance

measure keeps getting smaller, indeed

at least a factor 2 smaller each whack.

That works. BUT, the problem is, when you make that natural

logarithmic distance smaller, that does NOT imply that the actual

difference |S - multipleofN| keeps getting smaller back in the

ordinary world of integer numbers, rather than the vector world.

Instead, this difference gets smaller for a while, but once |S-N| is

of order about squareroot(N), it almost always would start to grow

again.

Simple example: if I had, say,

10111 =approx= 10000 with error=111

and I used multiplication by the rational 101/100 to try to improve

the approximation, I'd get

1011100 =approx= 1010000 with error=1100

which is the opposite of an improvement in the additive error

(although it does improve the *relative* error).

Improving the relative error is easy and was essentially what I was doing.

Although in principle it basically is always possible to improve the

additive error,

(e.g. if we use 271/274 not 100/101 then get 2740081 which is 81 mod 10000,

or if we use 991/1002 we get 10020001 which is 1 mod 10000)

I don't see how SMODA can gaurantee that without an unfeasibly

enormous database

far larger than I had in mind.

So the net effect is that SMODA should stop working approximately at

the point where you try to exceed the performance of the quadratic

sieve. And it probably will be worse than QS since it is more

complicated.

This is a pity because SMODA and associated theory and data was quite big and

aesthetic looking, but now is seen to be a "castle in the air."

Is there any hope to salvage SMODA? I'm not sure. You'd need to

invent a way to use

the database that would be far more powerful than the simple greedoids

I had in mind.

(But it would be permissible to consume a good deal more compute-time

than the greedoid

consumed.) One way to try do that would be to find a rational number

(such as 271/274

in example above) that yields improvement (this is easy by Euclid

alg.), but doing so in

such a way its numerator was smooth (not so easy). For example based

on the two good

rationals above 271/264 and 991/1002, and 91/92 which is also good,

we could look for smooth numbers 271a+991b+91c

in various ways such as sieving or lattice techniques.

Another salvage attempt would be to have stronger guarantees about

the database

than those I'd had in mind. For example, we believe based on both theoretical

models and computer evidence that very small databases based on very small prime

sets [of order only about (logN)^2 primes] should *exist* but I had given up

on trying to construct databases that small because they asymptotically seemed

much too hard to find. If, however, we nevertheless assumed

SMODA had access to such a super-small super-good database, then SMODA

might be able to become a lot more powerful due to "cancellations"

arising from common prime factors re-occuring enough of the time.

So, salvaging SMODA may or may not be possible. It is worth some thought.

If anybody wants to have my SMODA parts I/II paper draft/shipwrecks and/or our

the Smith+White datasets of Stormer pairs, let me know.

Warren D. Smith

http://rangevoting.org