Loading ...
Sorry, an error occurred while loading the content.

Specification of integer division in Eiffel.

Expand Messages
  • Robert Will
    hi, [I believe this stuff belongs to eiffel-nice-library, and I only CC d to -language, since the topic showed up there first. I suggest continuing on
    Message 1 of 1 , Aug 7, 2003
    • 0 Attachment
      hi,

      [I believe this stuff belongs to eiffel-nice-library, and I only
      CC'd to -language, since the topic showed up there first. I suggest
      continuing on -library only.]

      Integer division - that is, the functions // and \\ - is not specified
      in the Eiffel Library Kernel Standard. This might at first be no
      surprise, since +, - and their friends have no complete specification,
      either. (Just the axioms from NUMERIC.) But the difference is, that
      there exists a convention for integer addition (which is a subset of
      real addition), but not for integer division. Actually the former is
      only true if we ignore overflows which I will generously do in all of
      this post.

      Now, it is clear that ELKS should specify that, otherwise programs
      can't make any use those functions any more. As a consequence, the
      specification must also be _complete_ and leave no undefined cases,
      otherwise programs can't call the functions in those cases, or at
      least they can't use the result. To design the specification we have
      the usual design criteria (a) to make it useful, and (b) to make it
      simple, so its just like what we do in our day jobs.

      First, let's agree that any sensible // and \\ operator will obey the
      following two laws:

      reversibility: (i // j) * j + (i \\ j) = i

      range_of_remainder: -j.abs < (i \\ j) < +j.abs

      The first equation will define the remainder, if a dividing function
      has been specified. Incidentally these two laws imply a very useful
      third one.

      even_division: (\exists q with q * j = i) = ( (i \\ j) = 0 )


      Also, we'll agree that

      non_negative_case: 0 <= i and 0 < j implies 0 <= (i \\ j) < j

      although this axiom does already exclude the possibility of rounding
      division, i.e. "i // j = (i / j).rounded".

      So far, all is simple. To fix the remaining cases, I'll present you
      the program that pointed me to the problem this Monday. I copy only
      as much code to illustrate the point (and remove the hacks around
      SmartEiffelnesses), nevertheless you may safely skip this, since there
      are plenty of other arguments below ;)

      class RESIZABLE_ARRAY_LIST[T]
      -- Mutable sequences, an implementation of.
      -- Full source code to be published soon,
      -- announcement in comp.lang.eiffel.
      ...
      blocks : LIST[NATIVE_ARRAY[T]]
      block_size, offset : INTEGER
      ...
      -- Representation invariant:
      at( i : INTEGER ) : T is
      do
      Result := blocks.at((offset + i - 1) // block_size + 1)
      .at((offset + i - 1) \\ block_size)
      -- +/-1 required since
      -- - LISTs are indexed from 1
      -- - blocks of memory (NATIVE_ARRAYs) are indexed from 0
      -- - remainder is 0-based
      end
      ...
      -- Consistency invariant:
      block_size > 0
      0 <= offset < block_size
      blocks.for_all( ( b : NATIVE_ARRAY[T] )
      do Result := b.size = block_size end )
      -- Have exactly as many blocks as needed:
      -- (Check via block of last element which has index `size'.)
      blocks.size = (offset + size - 1) // block_size + 1
      ...
      end -- class RESIZABLE_ARRAY_LIST[T]


      Now the use of // and \\ in `at' will already be correct for any
      version of the operators that complies to the axioms above, since `at'
      inherits the precondition `1 <= i' and that makes all division
      operands non-negative. The invariant, however, may experience the
      case `offset = size = 0' resulting in the equation
      `0 = (-1) // block_size + 1', which is true given the right definition
      of //. The actual class is more complicated and contains that
      equation as an assertion and condition in several places including a
      loop invariant in a loop that is one page long. You can imagine what
      happens if all that code needs to include two cases for that statement
      (`is_empty implies ...; not is_empty implies ...') and the code that
      establishes the assertions is adapted accordingly.

      In short, what my code needs is

      positive_divisor_remainder: 0 < j implies 0 <= (i \\ j) < j

      and this is a consequence of

      div_definition: i // j = (i / j).floor
      mod_definition: i \\ j = i - j * (i / j).floor

      which also implies

      negative_divisor_remainder: j < 0 implies j < (i \\ j) <= 0

      straightness: ( (a + b) // j ) = (a // j) + (b // j)
      + ( (a \\ j) + (b \\ j) ) // j

      (Sorry, I didn't find the proof.) And the latter is required in my
      loop: I have `b = -1' and to maintain the loop invariant I will forget
      a block each time `a \\ j = 0'. On the other hand, the semantics
      found in SmartEiffel (and many other places) seem to be:

      quot_definition: i // j = (i / j).truncated_to_integer
      rem_definition: i \\ j = i - j * (i / j).truncated_to_integer

      That function doesn't have a specification in ELKS, either, but I
      guess it should be the following:

      deferred class interface REAL_GENERAL -- as in ETL3
      ...
      truncated_to_integer : INTEGER is
      ensure
      Current >= 0 implies Result = floor
      Current >= 0 implies Result = ceiling
      ...
      end -- deferred class interface REAL_GENERAL

      Anyway this version has the disadvantage, that it "behaves strange
      around zero". For any `j > 0' the division result will be 0 for all
      `-j < i < j', while all other division results `x' occur only for `x*j
      <= i < j'. I really don't know of something positive to say about
      except that it complies to

      philippe_ribets_postulate:
      -i // j = -(i // j) = i // -j = -( -i // -j)

      As for the hardware, I don't want to check all those manuals (neither
      check whether those manuals are correct), so I just had a look at
      Knuth's MMIX which should be a good, general example of a modern
      computer. MMIX's DIV instruction uses the div-semantics as above, and
      Knuth complains "Commercial machines are usually deficient in their
      support for integer arithmetic. For example, they almost never
      produce the true quotient `(i/j).floor' and true remainder `i.mod(j)'
      when `i' is negative or `j' is negative." And he doesn't give any
      hint that other hardware does agree on a fixed other definition.

      Indeed, the lecture of Knuth reminded me that the integer rest, `mod'
      cannot really be "a thing people don't agree on". Modulo arithmetic
      is the base of half of computer science! All additions and
      subtractions on fixed-size integers are already implicit modulo
      operations! What would it be, if the language had a \\ that didn't
      have the same semantics? Of course, modulo arithmetic is only used
      with positive divisors, but if we fix the semantics for those, we can
      as well fix the entire semantics, since the decision between div/mod
      and quot/rem has then been taken. Both variants satisfy the law

      negative_divisor: (i // -j) = (-i // j)

      Finally I think that there are two ways to fix ELKS
      a) do "The Right Thing"
      b) give the choice to the user.

      a) would be to specify \\ and // with div/mod semantics.

      b) would be to
      - offer features `div', `mod', `quot', `rem' with the respective
      semantics (like other many languages do, e.g. Ada, ML, Haskell)
      - and offer // and \\ with the domain constrained to non-negative
      integers (i.e., natural numbers).

      Of course the former approach is much simpler and I think it would
      satisfy most of the people. Should we make the language more
      complicated for the rest? I leave that to you ...


      PS: If someone will update ELKS sometime, he should also include a
      function for integer powers. This is a function sufficiently
      different from real exponentiation to be available separately.
      One is a combinatorial function defined recursively, the other is
      a generalisation of the exponential function.


      Robert

      --
      Eviter tout le mauvais, faire le bien et nettoyer son propre coeur:
      c'est la doctrine des Buddhas.
    Your message has been successfully submitted and would be delivered to recipients shortly.