Loading ...
Sorry, an error occurred while loading the content.

Re: [PrimeNumbers] single-->double multiply?

Expand Messages
  • Jack Brennen
    Something like this should work for x86-64 gcc (all variables are uint64_t): asm( mulq %3 : =a (F), =d (G) : 0 (a), g (b)); That should multiply a*b
    Message 1 of 7 , Jan 4, 2012
    • 0 Attachment
      Something like this should work for x86-64 gcc (all variables
      are uint64_t):

      asm("mulq %3" :
      "=a" (F), "=d" (G) :
      "0" (a), "g" (b));

      That should multiply a*b and put the 128-bit result into G (high half)
      and F (low half).

      It won't work on any other architecture or compiler though...



      On 1/4/2012 12:13 PM, WarrenS wrote:
      > Let a and b be 64-bit unsigned integers.
      >
      > Suppose I want to compute a*b
      > which has 128 bits and stick the lower& upper halves into two 64-bit integers F,G.
      >
      > How do I do it? Is there a compiler builtin that will access hardware capability
      > to do this?
      >
      >
      >
      >
      >
      > ------------------------------------
      >
      > Unsubscribe by an email to: primenumbers-unsubscribe@yahoogroups.com
      > The Prime Pages : http://www.primepages.org/
      >
      > Yahoo! Groups Links
      >
      >
      >
      >
      >
    • Jack Brennen
      Slight fix and optimization: asm( mulq %3 : =a (F), =d (G) : %0 (a), rm (b)); Using %0 allows the compiler to swap the position of a and b if that s
      Message 2 of 7 , Jan 4, 2012
      • 0 Attachment
        Slight fix and optimization:

        asm("mulq %3" :
        "=a" (F), "=d" (G) :
        "%0" (a), "rm" (b));

        Using "%0" allows the compiler to swap the position of a and b
        if that's more efficient. Using "rm" instead of "g" prohibits
        the compiler from trying to multiply by an immediate value if
        b happens to be a constant.




        On 1/4/2012 12:54 PM, Jack Brennen wrote:
        > Something like this should work for x86-64 gcc (all variables
        > are uint64_t):
        >
        > asm("mulq %3" :
        > "=a" (F), "=d" (G) :
        > "0" (a), "g" (b));
        >
        > That should multiply a*b and put the 128-bit result into G (high half)
        > and F (low half).
        >
        > It won't work on any other architecture or compiler though...
        >
        >
        >
        > On 1/4/2012 12:13 PM, WarrenS wrote:
        >> Let a and b be 64-bit unsigned integers.
        >>
        >> Suppose I want to compute a*b
        >> which has 128 bits and stick the lower& upper halves into two 64-bit integers F,G.
        >>
        >> How do I do it? Is there a compiler builtin that will access hardware capability
        >> to do this?
        >>
        >>
        >>
        >>
        >>
        >> ------------------------------------
        >>
        >> Unsubscribe by an email to: primenumbers-unsubscribe@yahoogroups.com
        >> The Prime Pages : http://www.primepages.org/
        >>
        >> Yahoo! Groups Links
        >>
        >>
        >>
        >>
        >>
        >
        >
        >
        > ------------------------------------
        >
        > Unsubscribe by an email to: primenumbers-unsubscribe@yahoogroups.com
        > The Prime Pages : http://www.primepages.org/
        >
        > Yahoo! Groups Links
        >
        >
        >
        >
        >
      • Chuck Lasher
        Regarding the use of the multiply, I concur with exploiting the hardware wherever possible. There is also a technique which is hardware & compiler portable
        Message 3 of 7 , Jan 4, 2012
        • 0 Attachment
          Regarding the use of the multiply, I concur with exploiting the
          hardware wherever possible.

          There is also a technique which is hardware & compiler portable if not
          using gcc on the x86 platform (with or w/o sse)

          We all know it as 'Montgomery Math', named after mathematician Peter
          Montgomery (Microsoft Research)

          Not only does the technique allow portability (even to a GPU where 128
          bits is not available in the SDK),
          but 128 bit math on a 32 or 64 bit OS w/ or w/o GCC.

          but you will also find 'Montgomery Modular Reduction'. This allows 32
          -> 64 -> 128 bit and variations on it

          for just about anything. The 'cost' of this is the extra step to add
          the carry bit during the multiply IFF it is set.

          It's a great way to handle modular reductions and/or multiplies when
          searching for prime numbers.


          Please do look it up and I think you'll be pleased at the versatility of
          the techniques credited to Peter.

          Please pay attention to the 'high half' and 'low half' and you'll find
          even a 16 bit machine can do it if you overlay
          the union {} / struct {} definitions and keep them properly aligned.
          a * b (both 64 bit) can be done in 3 steps on any hardware / OS
          even if you want that 128 bit result.

          We use it all the time. I wish I could give you an example of working
          code, but don't have any here to paste (C code).

          Chuck


          -----
          No virus found in this message.
          Checked by AVG - www.avg.com
          Version: 2012.0.1901 / Virus Database: 2109/4722 - Release Date: 01/04/12
        • WarrenS
          Related question: If we have two 64-bit unsigned integers a,b and wish to divide the double-length a*2^64 + b by a single 64-bit unsigned integer c to get
          Message 4 of 7 , Jan 14, 2012
          • 0 Attachment
            Related question:

            If we have two 64-bit unsigned integers a,b and wish to divide the
            double-length
            a*2^64 + b
            by a single 64-bit unsigned integer c to get single-length
            quotient=q and remainder=r...
            how do we do it using hardware?

            And what happens if the quotient q does not fit in a single-length?
          • Warren Smith
            sorry, I m not much of a programmer. Last question re multiply, got a nice answer from Brennan saying how to within C program, to call the assembler to do the
            Message 5 of 7 , Jan 14, 2012
            • 0 Attachment
              sorry, I'm not much of a programmer. Last question re multiply, got a nice
              answer from Brennan saying how to within C program, to call the assembler
              to do the job.

              I assume there is similar answer here.

              You did that, but did not explain how to do it in C. In C you need to
              speak both C and assembler and understand how to get them to talk, but yes.

              Really if the C compiler writers had a smidgen of brains, they'd have
              supplied a hook so that everybody could easily do these things without
              needing to deal with assembler themselves.


              [Non-text portions of this message have been removed]
            • Paul Leyland
              ... What kind of hardware would you like to use? It s pretty easy to do it with almost any FPGA but if you want an ASIC it s likely to cost you serious money.
              Message 6 of 7 , Jan 14, 2012
              • 0 Attachment
                > how do we do it using hardware?

                What kind of hardware would you like to use? It's pretty easy to do it
                with almost any FPGA but if you want an ASIC it's likely to cost you
                serious money. It's also pretty easy to do with an abacus and an
                instruction manual, but likely to be rather slow.

                In other words, what's your real question?

                Paul
              Your message has been successfully submitted and would be delivered to recipients shortly.