Loading ...
Sorry, an error occurred while loading the content.
 

OCaml for efficient virtual machines

Expand Messages
  • Ian Zerny
    A friend and I are currently evaluating OCaml as the implementation language for a virtual machine. The interpreted language is yet undecided, but we expect it
    Message 1 of 6 , Nov 1, 2008
      A friend and I are currently evaluating OCaml as the implementation
      language for a virtual machine. The interpreted language is yet
      undecided, but we expect it to be some intermediate bytecode for a
      simple language such as Scheme or Self.

      Currently we have a few issues and would greatly appreciate any
      pointers or comments on them.

      * Dynamically Compiled Code.
      We need to dynamically compile code in the running VM. Our current
      attack on this is to use a native ia32 emitter in OCaml and pass
      the emitted code to a C function that can then start
      executing it. We have looked at arrays and buffers as byte containers,
      but the fact that OCaml tags non-pointers in these containers
      makes this method cumbersome. We would need to remove the code tags in
      C before executing it. Also, if we do not undo the removal of tags we
      would hinder the garbage collector from collecting some unused
      objects. Now if OCaml has some collection that can contain raw bytes
      and is garbage collection aware we could use that. Something like
      Self's byte array. We have not been able to find a reference to such a
      container in OCaml.

      * Threaded Code.
      We would like our bytecode interpreter to be as fast as possible in
      order to reduce the amount of compiling that needs to be done. A
      common technique is to use a ``threaded code'' interpreter. Here every
      operation is given by a label and the current operation can jump
      directly to the following operation. This can greatly reduce the
      amount of miss-predictions in the branch prediction unit.
      We can implement this with a (named or unnamed) function for each
      opcode and in each of them the invocation of the next opcode is in
      tail-position.

      A simple interpreter:

      let op0 r pc ops = r;;
      let op1 r pc ops = ops.(pc) (r+1) (pc+1) ops;;
      let op2 r pc ops = ops.(pc) (r-1) (pc+1) ops;;
      let interp ops = ops.(0) 0 1 ops;;

      We have omitted the appropriate function to create the initial opcode
      array and the code requires the -rectypes in order to compile. This compiles
      to near optimal code and all parameters reside in registers. However,
      the tail calls are indirect jumps that must first go through
      `caml_apply3'.

      Assembly output for op0 and op1:

      op0:
      ret
      op1:
      movl -2(%ecx, %ebx, 2), %edx
      addl $2, %ebx
      addl $2, %eax
      jmp caml_apply3
      # movl 8(%edx), %esi
      # jmp *%esi

      Here we have noted in the comments what we would have liked to see.
      Why OCaml does not do this is still not clear to us. In the code for
      caml_apply3 there is a check on some property of the invoked
      function. It seems that under some situations the tail call can result
      in a call instead of a jump. This puzzles us a bit as the fact that
      the application is in tail position of the caller should be enough, no?
      We would really like to understand why `caml_apply3' works this way.

      If we perform our optimization by hand we beat the C version of the
      same interpreter compiled with GCC (with labels-as-values extension).

      The most efficient interpreter we have been able to produce directly
      with OCaml is the defunctionalized version of the above.
      Unfortunately, that again causes all operations to jump to the same
      `apply' method dispatching over an ADT of operations. This is however
      still faster than the double dispatch incurred by `caml_apply3'.


      Thank you for taking the time to read this post. We look forward to
      any remarks.

      Kind regards, Ian and Thomas.


      Assembly of caml_apply3:
      caml_apply3:
      subl $8, %esp
      movl 4(%edx), %esi
      cmpl $7, %esi # what are we checking?
      jne .L447
      movl 8(%edx), %esi
      addl $8, %esp
      jmp *%esi # we always end here
      .align 16
      .L447:
      movl %ecx, 4(%esp)
      movl %ebx, 0(%esp)
      movl (%edx), %ecx
      movl %edx, %ebx
      call *%ecx
      .L449:
      movl %eax, %ebx
      movl (%ebx), %ecx
      movl 0(%esp), %eax
      call *%ecx
      .L450:
      movl %eax, %ebx
      movl (%ebx), %ecx
      movl 4(%esp), %eax
      addl $8, %esp
      jmp *%ecx
    • Jon Harrop
      ... What about an OCaml string or Bigarray for your data structure? This is perhaps not what you re after but I would recommend using LLVM to write a JIT in
      Message 2 of 6 , Nov 1, 2008
        On Saturday 01 November 2008 15:21:40 Ian Zerny wrote:
        > A friend and I are currently evaluating OCaml as the implementation
        > language for a virtual machine. The interpreted language is yet
        > undecided, but we expect it to be some intermediate bytecode for a
        > simple language such as Scheme or Self.

        What about an OCaml string or Bigarray for your data structure?

        This is perhaps not what you're after but I would recommend using LLVM to
        write a JIT in OCaml instead. That is almost as easy, is likely to give much
        better performance and, in particular, completely evades these tagging woes.

        --
        Dr Jon Harrop, Flying Frog Consultancy Ltd.
        http://www.ffconsultancy.com/?e
      • remi.vanicat@gmail.com
        ... Firstly, you will probably have more precise answer on the main caml-list, your question are not really beginner level question. ... Did you look at
        Message 3 of 6 , Nov 1, 2008
          Ian Zerny <ian@...> writes:

          > A friend and I are currently evaluating OCaml as the implementation
          > language for a virtual machine. The interpreted language is yet
          > undecided, but we expect it to be some intermediate bytecode for a
          > simple language such as Scheme or Self.

          Firstly, you will probably have more precise answer on the main
          caml-list, your question are not really beginner level question.

          >
          > Currently we have a few issues and would greatly appreciate any
          > pointers or comments on them.
          >
          > * Dynamically Compiled Code.

          Did you look at metaocaml ? it probably one of the next way to
          dynamically generate code for ocaml.


          --
          RĂ©mi Vanicat
        • Jon Harrop
          ... FWIW, I found MetaOCaml to be fine for staging a term-level interpreter but failed to get any kind of performance improvement out of it when dealing with
          Message 4 of 6 , Nov 1, 2008
            On Saturday 01 November 2008 19:19:08 remi.vanicat@... wrote:
            > Ian Zerny <ian@...> writes:
            > > * Dynamically Compiled Code.
            >
            > Did you look at metaocaml ? it probably one of the next way to
            > dynamically generate code for ocaml.

            FWIW, I found MetaOCaml to be fine for staging a term-level interpreter but
            failed to get any kind of performance improvement out of it when dealing with
            lower-level languages like bytecodes.

            --
            Dr Jon Harrop, Flying Frog Consultancy Ltd.
            http://www.ffconsultancy.com/?e
          • Ian Zerny
            ... Of course. I have no idea how we managed not to see that string fulfils our requirements. Thank you for the pointer. ... Yes. We have looked at the LLVM
            Message 5 of 6 , Nov 12, 2008
              Jon Harrop wrote:
              > On Saturday 01 November 2008 15:21:40 Ian Zerny wrote:
              >> A friend and I are currently evaluating OCaml as the implementation
              >> language for a virtual machine. The interpreted language is yet
              >> undecided, but we expect it to be some intermediate bytecode for a
              >> simple language such as Scheme or Self.
              >
              > What about an OCaml string or Bigarray for your data structure?

              Of course. I have no idea how we managed not to see that string fulfils
              our requirements. Thank you for the pointer.

              > This is perhaps not what you're after but I would recommend using LLVM to
              > write a JIT in OCaml instead. That is almost as easy, is likely to give much
              > better performance and, in particular, completely evades these tagging woes.

              Yes. We have looked at the LLVM support in OCaml and it looks splendid.
              However, our project is to build the optimizing VM so using LLVM could
              be considered cheating.

              With respect to our question about tail-calls. We have still not found a
              remedy to the problem that can directly be expressed in OCaml. We can
              however hack our build system to manipulate the intermediate assembly
              output of OCaml to remove the indirect jump in well defined places. We
              would still be interested in learning of any other approach to this problem.

              Thank you for the quick reply and sorry our response has been delayed.
              Kind regards, Ian
            • Ian Zerny
              ... Yes. We did not really know where our questions would be appropriate. This list seemed like the safe choice. ... Ok, we will take look at it. Thank you for
              Message 6 of 6 , Nov 12, 2008
                remi.vanicat@... wrote:
                > Ian Zerny <ian@...> writes:
                >
                >> A friend and I are currently evaluating OCaml as the implementation
                >> language for a virtual machine. The interpreted language is yet
                >> undecided, but we expect it to be some intermediate bytecode for a
                >> simple language such as Scheme or Self.
                >
                > Firstly, you will probably have more precise answer on the main
                > caml-list, your question are not really beginner level question.

                Yes. We did not really know where our questions would be appropriate.
                This list seemed like the safe choice.

                >> Currently we have a few issues and would greatly appreciate any
                >> pointers or comments on them.
                >>
                >> * Dynamically Compiled Code.
                >
                > Did you look at metaocaml ? it probably one of the next way to
                > dynamically generate code for ocaml.

                Ok, we will take look at it.

                Thank you for you answers.
                Kind regards, Ian
              Your message has been successfully submitted and would be delivered to recipients shortly.