Loading ...
Sorry, an error occurred while loading the content.

Minimal set of rules to please the GC when interfacing with C

Expand Messages
  • rixed@happyleptic.org
    Hello ! I m coding a small set of programs that do a lot of computation with integers. The bottleneck was in a function that, given two bit vectors, return the
    Message 1 of 6 , Jul 31, 2009
    • 0 Attachment
      Hello !

      I'm coding a small set of programs that do a lot of computation with integers.
      The bottleneck was in a function that, given two bit vectors, return the union
      of the first with the second one shifted by some position (my bit vectors are
      merely int arrays, which sizes never exceed a few hundreds ints).

      After many versions of this function I ended up writing this function in C, (a
      good occasion to test the C <-> Ocaml interface BTW).

      For the curious the resulting function is roughly twice as fast as the OCaml
      version. The key is that I'm able in C to initialize the resulting array in a
      single loop instead of being called for each value as with Array.init.

      So far so good.

      I have several questions, though.

      The function takes three values as a parameter : first int array, second int
      array then the shift (an int).

      In this function I only do one alloc, once (for allocating the result int
      array). this is either a caml_alloc_small() or a caml_alloc_shr() depending of
      the result size).

      According to the manual, I must root the input parameters as well as the return
      value using CAMLparam3/CAMLlocal and unroot them with CAMLreturn. But, given
      that I do only one alloc, the GC can only run once there (I suppose Im not in a
      threaded environment where the CPU could be yield to another OCaml thread,
      given it's even possible to interrupt a C function in that case). So, I do not
      see the point in rooting the return value. Also, the input parameters are
      not the last references of these input arrays, so can not be freed by the GC
      during this allocation (OK this is not true in the general case but it is in
      mine).

      But the GC can certainly move them around (when compacting the heap). In that
      case, is CAMLparam a good protection against this ? I mean, can the GC move a
      value referenced by a CAMLparam ? If it does, then I suppose the parameter
      pointed by the CAMLparam is updated as well, but that means any copy of it in
      another C variable will be wrong, which seams very dangerous. So I suppose the
      GC does not move what's rooted in the CAMLparam. Am I right ?

      Also, when allocating with caml_alloc_shr() the manual insits that the field
      values must then be filled with Store_field() instead of just writing into
      Field(). Looking at the code, its for calling caml_modify which record the
      references from the major to the minor heap. But if you want to write unboxed
      values then this is useless (after some tests the Modify macro check for Is_block
      anyway). Am I right ? Maybe it's obvious, but if not then the manual should
      probably state it.

      To to sum it up :

      - is it safe to assume that values rooted in CAMLparam will not change after
      an alloc (ie the GC will not move the block and update these parameters) ?

      - is it always safe to write into Field() when you write unboxed values (ie
      not a pointer) ?
    • Richard Jones
      On Fri, Jul 31, 2009 at 10:41:50AM +0200, rixed@happyleptic.org wrote: [...] This is definitely a question for caml-list. ... Another OCmal thread cannot
      Message 2 of 6 , Aug 1, 2009
      • 0 Attachment
        On Fri, Jul 31, 2009 at 10:41:50AM +0200, rixed@... wrote:
        [...]

        This is definitely a question for caml-list.

        > According to the manual, I must root the input parameters as well as
        > the return value using CAMLparam3/CAMLlocal and unroot them with
        > CAMLreturn. But, given that I do only one alloc, the GC can only
        > run once there (I suppose Im not in a threaded environment where the
        > CPU could be yield to another OCaml thread, given it's even possible
        > to interrupt a C function in that case).

        Another OCmal thread cannot interrupt unless you give up the global
        lock (caml_enter_blocking_section ... caml_leave_blocking_section
        somewhere in your C function).

        > So, I do not
        > see the point in rooting the return value.

        It seems your reasoning here is sound, however:

        > Also, the input parameters are
        > not the last references of these input arrays, so can not be freed by the GC
        > during this allocation (OK this is not true in the general case but it is in
        > mine).

        ... is not sound. As you say below, the GC can move things about as
        well as freeing them. You may find that the values (which are really
        pointers) for the input parameters end up pointing at some other bit
        of memory.

        > But the GC can certainly move them around (when compacting the heap). In that
        > case, is CAMLparam a good protection against this ? I mean, can the GC move a
        > value referenced by a CAMLparam ? If it does, then I suppose the parameter
        > pointed by the CAMLparam is updated as well, but that means any copy of it in
        > another C variable will be wrong, which seams very dangerous. So I suppose the
        > GC does not move what's rooted in the CAMLparam. Am I right ?

        This is correct.

        Note that it's possible (but not very desirable) to disable
        compaction, so the GC won't move things around. You have to do it on
        a global basis though.

        > Also, when allocating with caml_alloc_shr() the manual insits that
        > the field values must then be filled with Store_field() instead of
        > just writing into Field(). Looking at the code, its for calling
        > caml_modify which record the references from the major to the minor
        > heap. But if you want to write unboxed values then this is useless
        > (after some tests the Modify macro check for Is_block anyway). Am I
        > right ? Maybe it's obvious, but if not then the manual should
        > probably state it.

        Tricky question - better to ask the experts. IMHO the rules
        surrounding the use of Field vs Store_field vs caml_modify are as
        clear as mud.

        > To to sum it up :
        >
        > - is it safe to assume that values rooted in CAMLparam will not change after
        > an alloc (ie the GC will not move the block and update these parameters) ?

        No. AIUI they can be moved around, but if you've rooted them using
        CAMLparam/CAMLlocal then the GC will update your pointers.

        > - is it always safe to write into Field() when you write unboxed values (ie
        > not a pointer) ?

        This bit I'm not sure about.

        Rich.

        --
        Richard Jones
        Red Hat
      • Cedric Cellier
        ... Thank you for clearing this ! I tried to look at the GC compacting code but was unable to find anything to back my expectation (as ocaml internals are
        Message 3 of 6 , Aug 1, 2009
        • 0 Attachment
          > > - is it safe to assume that values rooted in CAMLparam will not change after
          > > an alloc (ie the GC will not move the block and update these parameters) ?
          >
          > No. AIUI they can be moved around, but if you've rooted them using
          > CAMLparam/CAMLlocal then the GC will update your pointers.

          Thank you for clearing this !
          I tried to look at the GC compacting code but was unable to find anything
          to back my expectation (as ocaml internals are still quite obscure to me,
          what I was really looking for was a comment related to not moving
          values accessible from C).

          So, lets consider this case :

          CAMLprim void change_an_array_of_array(value array)
          {
          CAMLparam1(array);

          value *first = Field(array, 0);
          value *last = Field(array, Wosize_val(array));
          for (value *v = first; v < last; v++) {
          // Replace the ith field of the given array by a new array
          *v = alloc_a_new_array_and_initialize_it();
          }

          CAMLreturn0;
          }

          Here we use a pointer (v) to iterate along the array elements and
          do some allocation (thus triggering the GC) while doing so.

          So this code is flawed, because the GC might removes the array from
          under the pointer's head (so to speak) ?
        • Richard Jones
          ... For a true array of array (ie. a array array) the fields in the top array will contain values, _not_ pointers to values. Quick introduction to the
          Message 4 of 6 , Aug 2, 2009
          • 0 Attachment
            On Sun, Aug 02, 2009 at 08:24:32AM +0200, Cedric Cellier wrote:
            > CAMLprim void change_an_array_of_array(value array)
            > {
            > CAMLparam1(array);
            >
            > value *first = Field(array, 0);
            > value *last = Field(array, Wosize_val(array));

            For a true "array of array" (ie. 'a array array) the fields in
            the top array will contain values, _not_ pointers to values.

            Quick introduction to the runtime ...

            A value is either (a) an ordinary pointer to an OCaml block, or (b) an
            integer with the least significant bit set to 1 and the integer
            shifted left by 1 bit (or a value can be some other stuff, but we'll
            ignore that).

            +-------------------------------------------------+
            | a pointer |
            +-------------------------------------------------+

            or:

            +---------------------------------------------+---+
            | an integer (31 or 63 bits usable) | 1 |
            +---------------------------------------------+---+

            An OCaml block is an array of words preceeded by a header. The value
            points to the zeroth word in the array. The header is at a negative
            wordsize offset, ie. value[-4] or value[-8].

            +--------------+--------------+--------------+ - - - - - -
            | header | word 0 | word 1 |
            +--------------+--------------+--------------+ - - - - - -
            (see ^
            mlvalues.h) |
            value

            The number of words is Wosize_val(value), which is a C macro that gets
            this out of the header. On 32 bit architectures, the header is 4
            bytes long, but it contains a tag byte and some bits used by the
            garbage collector, and this is the reason for the infamous limit on
            arrays to 16 MBytes (== 4 M-words == 2^22). Solution: use 64 bit.

            The number of words may be 0 (for a zero-length array, written [| |]).
            It may be 1 -- eg. for storing an Int32. It may be 1, 2, 3, .. --
            quite a common case for storing some variants ("Foo of int"). Or it
            may be large, for storing large arrays and long strings.

            One question which the garbage collector needs to answer, is whether
            those words are themselves values (as in your array-of-array case), or
            are opaque data (eg. a string). That makes a difference when doing
            garbage collection, because the GC mustn't try to follow references if
            they're really opaque string data. The header contains this
            information in the tag byte. If the tag byte is >= No_scan_tag (251)
            then it's opaque data like a string.

            OCaml blocks are located in one of two places: briefly in the minor
            heap, which is a static chunk of memory (256K?) that is continuously
            recycled for small short-lived allocations, or in the major heap. The
            major heap is made up of large chunks of memory allocated out of the C
            heap using malloc. (Many OCaml blocks usually live inside one C
            malloc).

            Most long-lived allocations will be moved at least once (from the
            minor to the major heap), and may be moved around the major heap at
            will during compaction. The exception is for large allocations which
            go straight to the major heap, but these might still be moved during
            compaction (although IIRC in the current implementation this doesn't
            ever happen).

            Because values are just pointers, you can have a value which points
            outside the OCaml heap -- eg. a C pointer to some other memory that
            you malloced yourself from C and then just cast into a value using
            (value)ptr. The garbage collector uses a bitmap [a hash in OCaml >=
            3.11] so that it can quickly determine if a pointer points into the
            OCaml heap, and it will not follow pointers outside its own heap. It
            just ignores them.

            The upshot is:

            (1) Use CAMLparam and CAMLlocal to register any values, including
            local 'value' variables.

            (2) Don't worry about pointers to C malloced stuff. You can treat
            those as pointers or as values, and the GC will ignore them.

            Having said that, it's often useful to associate an OCaml finalizer
            and a C pointer together, so many libraries will wrap them together.
            For example, if you want to call a C free function automagically when
            an OCaml value is garbage collected. The OCaml runtime has a thing
            called a Custom block to handle this case.

            > So this code is flawed, because the GC might removes the array from
            > under the pointer's head (so to speak) ?

            If I'm reading the code correctly, you want to use 'CAMLlocal1(v)', and
            remember that a value _is_ a pointer already.

            Rich.

            --
            Richard Jones
            Red Hat
          • Cedric Cellier
            -[ Sun, Aug 02, 2009 at 02:50:34PM +0100, Richard Jones ]---- ... OK, it answered my question. Thank you very much. Anyway, is there some documentation
            Message 5 of 6 , Aug 2, 2009
            • 0 Attachment
              -[ Sun, Aug 02, 2009 at 02:50:34PM +0100, Richard Jones ]----
              > If I'm reading the code correctly, you want to use 'CAMLlocal1(v)',

              OK, it answered my question.
              Thank you very much.

              Anyway, is there some documentation available that explain the ocaml internals ?
              I mean, beside the papers about the GC algorithms and the original ideas behind
              Ocaml abstract machine, someone ever wrote an ocaml code walkthrough documenting
              the various types, modules, etc, used by ocaml implementation ?
            • Richard Jones
              ... Not especially. I found the C code used to implement the GC is relatively easy to follow. Chapter 18 of the manual which I guess you ve already read is
              Message 6 of 6 , Aug 2, 2009
              • 0 Attachment
                On Sun, Aug 02, 2009 at 06:11:50PM +0200, Cedric Cellier wrote:
                > -[ Sun, Aug 02, 2009 at 02:50:34PM +0100, Richard Jones ]----
                > > If I'm reading the code correctly, you want to use 'CAMLlocal1(v)',
                >
                > OK, it answered my question.
                > Thank you very much.
                >
                > Anyway, is there some documentation available that explain the ocaml
                > internals ? I mean, beside the papers about the GC algorithms and
                > the original ideas behind Ocaml abstract machine, someone ever wrote
                > an ocaml code walkthrough documenting the various types, modules,
                > etc, used by ocaml implementation ?

                Not especially.

                I found the C code used to implement the GC is relatively easy to
                follow.

                Chapter 18 of the manual which I guess you've already read is
                invaluable for understanding the memory format:

                http://caml.inria.fr/pub/docs/manual-ocaml/manual032.html

                Xavier Clerc documented the bytecode format and some other internals
                here:

                http://cadmium.x9c.fr/downloads.html

                There are various useful list posts which document things such as the
                string type:

                http://caml.inria.fr/pub/ml-archives/caml-list/2002/08/e109df224ff0150b302033e2002dbf87.en.html

                and how the '2n+1' integer representation is optimized:

                http://caml.inria.fr/pub/ml-archives/caml-list/2004/07/e86a25aa6c6a6a7d08dd7eb50cfd5d52.fr.html

                The header files are well documented - start with <mlvalues.h>

                Expert FAQ: http://caml.inria.fr/pub/old_caml_site/FAQ/FAQ_EXPERT-eng.html
                and speed FAQ: http://caml.inria.fr/pub/old_caml_site/ocaml/speed.html

                Rich.

                --
                Richard Jones
                Red Hat
              Your message has been successfully submitted and would be delivered to recipients shortly.