- On Tue, 03 April 2001, "Paul Leyland" wrote:
> If the number is very large, it's usually worthwhile reducing it modulo

Yup, GMP uses 256 rather than 64. In a base-2 representation of your numbers this is simply looking at the lowest byte.

> a few small numbers and checking to see whether that is a quadratic

> residue. As a simple example, we know instantly that

> 89798739759179673149047609731096790317607 is not a perfect square

> because the last two digits are 07 and only 00, 25, e1, e4, o6 and e9

> (where e is an even digit and o is an odd digit) are squares modulo 100.

>

> Using the bases 64, 63, 65 and 11 (in that order) will remove 709/715

> (i.e. over 99%) of non-square candidates with four remainder operations

> and four table look-ups and no heavy square root computations. The

> first remainder might be easy to implemnent with a logical AND.

63 is 3^2*7

65 is 5*13

The reason these were chosen is that they

a) share no common factors

b) have small factors (each makes 2 independent judgements)

c) are calculable using shift and add.

However, I'd deviate slightly from Paul's recommendation, if your processor is sufficiently capable (almost anything past an 8086 should do).

63*65 = (2^6-1)*(2^6+1) = 2^12-1 = 4095

This number too has the property that the residue can be calculated with just shifts and adds.

X (mod 4095) == ([_X/4096_] + (X rem 4096)) (mod 4095)

(with Knuthian [_ _] notation for 'floor' or 'round down', and

'rem' = remainder)

e.g. for a bignum B with hex value ....PQRSTUVW

B (mod 4095) == (....PQRST*4096 + UVW) (mod 4095)

== (....PQRST*1 + UVW) (mod 4095)

iteratively == (... + .PQ + RST + UVW) (mod 4095)

so basically, you keep accumulating 12 bit chunks of the number

(UVW, RST, .PQ, ... are all 12 bits) in a running total.

Note, however, that as you're summing you must make sure you don't overflow.

This is almost always 'memory bound' (i.e the processor can do the shifts and adds faster than the memory can provide the data) on a modern CPU. So basically the optimisation is to sum the 'best' modulus that will fit into your registers nicely.

'best' here means 'has the largest number of discrete odd factors. Pauls examples have 2. Mine has 4.

That _doesn't_ mean my method will be faster. If you can implement Paul's modulae ~25% faster than my modulae, then do Paul's method, as the 63 will give you a reject answer quicker than me 75% of the time. The other 25% of the time he takes twice as long.

Expectation(6-bit modulae) = .75*t6 + .25*(2*t6) = 1.25*t6

Expectation(12-bit modulus) = t12

Paul's modulae immediately give you answers which can be checked using a 64-bit (or 64byte) lookup table (in the mod 65 case don't check to see if 0 is valid, it's obviously a possible square).

My modulus leaves you with a 12 bit value, which you don't want to use in a lookup table. However, you can apply Paul's modulae to my residue in constant time, and end up with the same answer as Paul.

Some time in the past I did create a list of all 2^x+/-1 which are useful for calculating residues, and post it to this list (or was it the primeform list?). However, between {63,65} and {4095}, I'm sure that you have enough to get started, and maybe you can experiment with other modulae too.

Phil

Mathematics should not have to involve martyrdom;

Support Eric Weisstein, see http://mathworld.wolfram.com

Find the best deals on the web at AltaVista Shopping!

http://www.shopping.altavista.com > From: Phil Carmody [mailto:fatphil@...]

...

> Yup, GMP uses 256 rather than 64. In a base-2 representation

and d) they use a very small amount of memory for the tables and very

> of your numbers this is simply looking at the lowest byte.

>

> 63 is 3^2*7

> 65 is 5*13

>

> The reason these were chosen is that they

> a) share no common factors

> b) have small factors (each makes 2 independent judgements)

> c) are calculable using shift and add.

little computation to find them, though the latter only needs to be done

once and the results built directly into your source code.

However, your trick of working 63 and 65 in parallel is rather neat!

I'd not seen that before. Whether the extra cost is worth it, I don't

know. It's almost certainly architecture dependent.

I must give credit where it's due: the 64, 63, 65, 11 trick is very old

and not my invention, though I did independently rediscover it when I

first met computational number theory many years ago. It appears in

Cohen's CCANT from where I stole the 709/715 value rather than

re-calculating it. Actually, Cohen has 6/715 for the probability of

failure and I modified it to suit.

Paul- In a message dated 4/3/2001 7:01:39 AM Pacific Daylight Time,

Mnemonix@... writes:

> Does anybody know of a quick test to see whether a number squareroots to

Observed heuristically, if you sum the digits of a square recursively to a

> give a whole number? sqrt() is too computationally "heavy" and I don't need

> the actual squareroot - I just need the check to see if the number

> squareroots to give a whole number.

>

single digit, the squares will sum to 1, 4, 7, or 9 only. For example 5329-->

19-->10-->1.

Also the square sums have roots with specific sums:

1 squares have roots that sum to 1 or 8

4 squares have roots that sum to 2 or 7

7 squares have roots that sum to 4 or 5

9 squares have roots that sum to 3, 6 or 9

I think this is equivalent to one of the methods shown earlier and only rules

out some numbers.

Steve Wagler

[Non-text portions of this message have been removed] > Observed heuristically, if you sum the digits of a square

Indeed. It's the quadratic residue test with modulus 9. It's subsumed

> recursively to a

> single digit, the squares will sum to 1, 4, 7, or 9 only. For

> example 5329-->

> 19-->10-->1.

>

> Also the square sums have roots with specific sums:

>

> 1 squares have roots that sum to 1 or 8

> 4 squares have roots that sum to 2 or 7

> 7 squares have roots that sum to 4 or 5

> 9 squares have roots that sum to 3, 6 or 9

>

> I think this is equivalent to one of the methods shown

> earlier and only rules out some numbers.

in the mod 63 test I gave and it's equivalent in the other proposals.

Paul