Loading ...
Sorry, an error occurred while loading the content.
 

Making Electrons Jitter ..

Expand Messages
  • zuluzombie1
    I m a long time programmer (mainly OO) and an Ocaml newbie. I wrote the following code to perform the business model of a program to randomly jitter lots of
    Message 1 of 18 , Sep 17, 2004
      I'm a long time programmer (mainly OO) and an Ocaml newbie.

      I wrote the following code to perform the 'business model'
      of a program to randomly jitter lots of 'electrons' on a 2D
      grid, similar to the old Python demo program Electrons.py.
      I tried to be strictly 'ML', no objects, no imperative
      (Ocaml 3.08.1).
      It seems to work, but it's about the same speed as Java
      (1.4.2, Sun) ..

      My main question is ..

      Is this reasonable ML code?

      I don't mean indentation (although critiques of that
      are gratefully accepted) but is the program structured
      properly .. or did I miss the point of Ocaml ..

      I've
      been doing OO imperative for about 15 years (Smalltalk too)
      so it's difficult to wrap my mind around Ocaml.

      Any comments appreciated. (Any comments on why it's
      slow also welcome .. Note: I tried mutable arrays and
      that was slower) ..

      Thanks in Advance,

      Tom



      open Unix

      type position = {x:float; y:float}
      type electron = {oldPosition:position; newPosition:position}

      let main () =

      let aPosition = {x=1.0; y=2.0} in
      let anElectron = {oldPosition=aPosition; newPosition=aPosition} in
      let new_position_from aPosition dx dy =
      {x=aPosition.x+.dx; y=aPosition.y+.dy} in
      let new_electron_from anElectron dx dy =
      {oldPosition=anElectron.oldPosition;
      newPosition=new_position_from anElectron.oldPosition dx dy}; in

      let new_array_from anArray =
      let jitter = 2.0 in
      let addJitter anElectron =
      new_electron_from anElectron
      (jitter -. 2.0*.jitter*.Random.float(1.0))
      (jitter -. 2.0*.jitter*.Random.float(1.0));
      and updated_electron_from anElectron =
      {oldPosition=anElectron.newPosition;
      newPosition=anElectron.newPosition}
      in
      List.map addJitter anArray;
      List.map updated_electron_from anArray
      in

      let rec move_electron_times count anArray =
      if count <=0 then anArray
      else
      move_electron_times (count-1) (new_array_from anArray)
      in

      let rec create_electrons_aux theList = function
      | 0 -> theList
      | n -> create_electrons_aux (anElectron :: theList) (n - 1)
      in

      let create_electrons n = create_electrons_aux [] n in
      let actualList = create_electrons 200 in

      let t1 = gettimeofday ()
      and t2() = gettimeofday ()
      in
      t1;
      move_electron_times 200000 actualList;
      print_string("total time = " ^ string_of_float(t2() -. t1) ^ "\n")
      ;;

      main();;
    • Remi Vanicat
      On Fri, 17 Sep 2004 08:43:59 -0000, zuluzombie1 ... There are several strange thing in it, I believe they are mistake (or at least inefficiency) ... Only a
      Message 2 of 18 , Sep 17, 2004
        On Fri, 17 Sep 2004 08:43:59 -0000, zuluzombie1
        <zuluzombie1@...> wrote:
        >
        > I'm a long time programmer (mainly OO) and an Ocaml newbie.
        >
        > I wrote the following code to perform the 'business model'
        > of a program to randomly jitter lots of 'electrons' on a 2D
        > grid, similar to the old Python demo program Electrons.py.
        > I tried to be strictly 'ML', no objects, no imperative
        > (Ocaml 3.08.1).
        > It seems to work, but it's about the same speed as Java
        > (1.4.2, Sun) ..
        >
        > My main question is ..
        >
        > Is this reasonable ML code?

        There are several strange thing in it, I believe they are mistake (or
        at least inefficiency)

        > open Unix
        >
        > type position = {x:float; y:float}

        Only a precision about indentation : I prefer { x:float; y:float }
        (look at the added space), but it is a question of style.

        > type electron = {oldPosition:position; newPosition:position}
        >
        > let main () =
        >
        > let aPosition = {x=1.0; y=2.0} in
        > let anElectron = {oldPosition=aPosition; newPosition=aPosition} in
        > let new_position_from aPosition dx dy =
        > {x=aPosition.x+.dx; y=aPosition.y+.dy} in

        Here I would also add space around the = and +, and after the { and before the }

        > let new_electron_from anElectron dx dy =
        > {oldPosition=anElectron.oldPosition;
        > newPosition=new_position_from anElectron.oldPosition dx dy}; in
        >
        > let new_array_from anArray =

        if it is a list, it's better to name the variable aList (array do
        exist in ocaml).

        > let jitter = 2.0 in
        > let addJitter anElectron =
        > new_electron_from anElectron
        > (jitter -. 2.0*.jitter*.Random.float(1.0))
        > (jitter -. 2.0*.jitter*.Random.float(1.0));
        > and updated_electron_from anElectron =
        > {oldPosition=anElectron.newPosition;
        > newPosition=anElectron.newPosition}
        > in
        > List.map addJitter anArray;

        List.map build a new list, there you put this new list to trash. Did
        you want to use List.iter, or is there a mistake ?

        > List.map updated_electron_from anArray
        > in
        >
        > let rec move_electron_times count anArray =
        > if count <=0 then anArray
        > else
        > move_electron_times (count-1) (new_array_from anArray)
        > in
        >
        > let rec create_electrons_aux theList = function
        > | 0 -> theList
        > | n -> create_electrons_aux (anElectron :: theList) (n - 1)
        > in
        >
        > let create_electrons n = create_electrons_aux [] n in
        > let actualList = create_electrons 200 in
        >
        > let t1 = gettimeofday ()
        > and t2() = gettimeofday ()
        > in
        > t1;
        > move_electron_times 200000 actualList;
        > print_string("total time = " ^ string_of_float(t2() -. t1) ^ "\n")
        > ;;
        >
        > main();;
        >


        For matter of efficiency, There is an inefficiency with ocaml usage of
        float : they are boxed. The compiler try to optimize this, but It may
        be there that you lost most of your speed....
      • Daniel.Andor@physics.org
        ... For such simulations (diffusion of electrons, right?), even if you use the imperative features of Ocaml, you can benefit from the type checking and
        Message 3 of 18 , Sep 17, 2004
          On Friday 17 September 2004 9:43 am, zuluzombie1 wrote:
          > I wrote the following code to perform the 'business model'
          > of a program to randomly jitter lots of 'electrons' on a 2D
          > grid, similar to the old Python demo program Electrons.py.
          > I tried to be strictly 'ML', no objects, no imperative
          > (Ocaml 3.08.1).

          For such simulations (diffusion of electrons, right?), even if you use the
          imperative features of Ocaml, you can benefit from the type checking and
          functional aspects "on the way".

          As a simple example, in your jitter routine

          let move_all temp =
          let move temperature pos =
          let jit = 2.0 *. (sqrt temperature) in
          ....
          List.iter (move temp) particles ;
          ..

          you can benefit from closures.

          > List.map addJitter anArray;
          > List.map updated_electron_from anArray
          > in

          Are you sure this actually does what you want it to do? You've thrown away
          the result of the first List.map.

          D
        • Daniel Andor
          ... This is worth bearing in mind, but it is also true that arrays of floats and records of floats are unboxed. So the only way you would benefit from using
          Message 4 of 18 , Sep 17, 2004
            On Friday 17 September 2004 10:12 am, Remi Vanicat wrote:
            > For matter of efficiency, There is an inefficiency with ocaml usage of
            > float : they are boxed. The compiler try to optimize this, but It may
            > be there that you lost most of your speed....

            This is worth bearing in mind, but it is also true that arrays of floats and
            records of floats are unboxed. So the only way you would benefit from using
            an array in the electron case is if you stored the particles in a contiguous
            array of size 2*N.

            Otherwise, my guess is that the slowness is due to the continuous
            re-allocation of large data structures.

            I think Tom should use an imperative approach as described in section 1.5 of
            the Ocaml manual.

            D
          • Richard Jones
            I think your code is fairly reasonable. If speed is of the essence, I would use a mutable structure, which avoids making so many allocations[1]: type position
            Message 5 of 18 , Sep 17, 2004
              I think your code is fairly reasonable.

              If speed is of the essence, I would use a mutable structure, which
              avoids making so many allocations[1]:

              type position = { mutable x : float; mutable y : float };;

              let move_electron e dx dy =
              e.x <- e.x +. dx; e.y <- e.y +. dy;;

              Of course, this is an imperative approach, so you'll have to rewrite a
              lot of your other code.

              If you're unsure where the time is going, try profiling your code
              using the gprof method. See this page:

              http://www.merjis.com/developers/ocaml_tutorial/ch11#Profiling

              Rich.

              [1] I think: someone can tell me whether structures of floats are
              unboxed, like 2-element arrays of floats ...?

              --
              Richard Jones. http://www.annexia.org/ http://www.j-london.com/
              Merjis Ltd. http://www.merjis.com/ - improving website return on investment
              Perl4Caml lets you use any Perl library in your type-safe Objective
              CAML programs. http://www.merjis.com/developers/perl4caml/


              [Non-text portions of this message have been removed]
            • Daniel Andor
              ... Half way down: http://caml.inria.fr/ocaml/numerical.html
              Message 6 of 18 , Sep 17, 2004
                On Friday 17 September 2004 1:08 pm, Richard Jones wrote:
                > [1] I think: someone can tell me whether structures of floats are
                > unboxed, like 2-element arrays of floats ...?

                Half way down:
                http://caml.inria.fr/ocaml/numerical.html
              • William D. Neumann
                ... The problem here isn t really mutable v. immutable, it s that the pseudo-random number generator is just too slow. A profile of the original code shows:
                Message 7 of 18 , Sep 17, 2004
                  On Fri, 17 Sep 2004, Richard Jones wrote:

                  > I think your code is fairly reasonable.
                  >
                  > If speed is of the essence, I would use a mutable structure, which
                  > avoids making so many allocations[1]:

                  The problem here isn't really mutable v. immutable, it's that the
                  pseudo-random number generator is just too slow. A profile of the
                  original code shows:

                  granularity: each sample hit covers 4 byte(s) for 0.08% of 11.97 seconds

                  % cumulative self
                  time seconds seconds name
                  36.8 4.40 4.40 _camlRandom__rawfloat_131 [1]
                  35.9 8.70 4.30 _camlRandom__bits_94 [2]
                  8.7 9.74 1.04 _camlElectron0__addJitter_346 [3]
                  3.4 10.15 0.41 _camlList__map_86 [4]
                  2.8 10.48 0.33 _camlElectron0__new_electron_from_339 [5]
                  2.0 10.72 0.24 _sweep_slice [6]
                  1.5 10.90 0.18 _caml_oldify_one [7]
                  1.4 11.07 0.17 _caml_alloc_shr [8]
                  1.3 11.23 0.16 _camlElectron0__updated_electron_from_347 [9]
                  1.3 11.38 0.15 _mark_slice [10]
                  1.2 11.52 0.14 _caml_oldify_mopup [11]
                  1.1 11.65 0.13 _caml_fl_allocate [12]
                  0.9 11.76 0.11 _caml_fl_merge_block [13]
                  0.4 11.81 0.05 _caml_darken [14]
                  0.3 11.85 0.04 _caml_oldify_local_roots [15]
                  0.3 11.88 0.03 _caml_do_local_roots [16]
                  0.3 11.91 0.03 _caml_do_roots [17]
                  0.2 11.93 0.02 _allocate_block [18]
                  0.2 11.95 0.02 _caml_call_gc [19]
                  0.1 11.96 0.01 _camlElectron0__new_array_from_343 [20]
                  0.1 11.97 0.01 _malloc [21]

                  While a profile of a version useing mutable fields and an array of
                  electrons shows:

                  granularity: each sample hit covers 4 byte(s) for 0.10% of 10.36 seconds

                  % cumulative self
                  time seconds seconds name
                  40.3 4.18 4.18 _camlRandom__rawfloat_131 [1]
                  39.1 8.23 4.05 _camlRandom__bits_94 [2]
                  6.9 8.95 0.72 _caml_modify [3]
                  6.3 9.60 0.65 _camlStd_exit__code_end [4]
                  2.1 9.82 0.22 _camlArray__iter_127 [5]
                  1.9 10.02 0.20 _camlElectron2__addJitter_348 [6]
                  1.8 10.21 0.19 _camlElectron2__iterElectron_334 [7]
                  1.4 10.35 0.14 _caml_darken [8]
                  0.1 10.36 0.01 _mark_slice [9]

                  Bottom line, use a faster PRNG.

                  William D. Neumann

                  ---

                  "Well I could be a genius, if I just put my mind to it.
                  And I...I could do anything, if only I could get 'round to it.
                  Oh we were brought up on the space-race, now they expect you to clean toilets.
                  When you've seen how big the world is, how can you make do with this?
                  If you want me, I'll be sleeping in - sleeping in throughout these glory days."

                  -- Jarvis Cocker

                  Think of XML as Lisp for COBOL programmers.

                  -- Tony-A (some guy on /.)
                • Brian Hurt
                  ... This shouldn t be that expensive. The list is short enough (200 elements) that it should stay nicely in cache. He should never kick off a major
                  Message 8 of 18 , Sep 17, 2004
                    On Fri, 17 Sep 2004, Daniel Andor wrote:

                    > Otherwise, my guess is that the slowness is due to the continuous
                    > re-allocation of large data structures.

                    This shouldn't be that expensive. The list is short enough (200 elements)
                    that it should stay nicely in cache. He should never kick off a major
                    collection, as nothing should ever leave the youngest generation.

                    --
                    "Usenet is like a herd of performing elephants with diarrhea -- massive,
                    difficult to redirect, awe-inspiring, entertaining, and a source of
                    mind-boggling amounts of excrement when you least expect it."
                    - Gene Spafford
                    Brian
                  • zuluzombie1
                    Wow, thanks for all the responses! I ll respond in detail tomorrow (just got home from work .. it s 2 am local time) .. I put together a quick web page with
                    Message 9 of 18 , Sep 18, 2004
                      Wow, thanks for all the responses!

                      I'll respond in detail tomorrow (just got home from work .. it's
                      2 am local time) ..

                      I put together a quick web page with some timings (and code)

                      http://www.faroth.com/pub/ocaml/Electrons.html#modelonly

                      which shows, as discussed, that the PRNG time is significant.
                      I removed both PRNG calls, in both the Java code and the Ocaml
                      code which unfortunately resulted in a dead heat in execution speed
                      I'm hoping Ocaml will yield a speed increase ..


                      Again, thanks a lot for making my learning experience more pleasant.

                      Tom
                    • Alwyn
                      ... Unfortunately, I don t see any way of improving the speed, apart from changing the random number algorithm. I ve rewritten your program twice in a
                      Message 10 of 18 , Sep 18, 2004
                        On 18 Sep 2004, at 10:11, zuluzombie1 wrote:

                        > I put together a quick web page with some timings (and code)
                        >
                        > http://www.faroth.com/pub/ocaml/Electrons.html#modelonly
                        >
                        > which shows, as discussed, that  the PRNG time is significant.
                        > I removed both PRNG calls, in both the Java code and the Ocaml
                        > code which unfortunately resulted in a dead heat in execution speed
                        > I'm hoping Ocaml will yield a speed increase ..

                        Unfortunately, I don't see any way of improving the speed, apart from
                        changing the random number algorithm. I've rewritten your program twice
                        in a shamelessly 'imperative' way (once with a pure array of 'float'
                        and once with records containing 'float') and got little or no speed
                        advantage from that. The only comfort I can give you is that all O'Caml
                        versions run faster than unoptimised C++ compiled with GCC.

                        You're welcome to compare these on your own system, if you like:

                        ~/Scrap $ cat electrons.ml
                        open Unix

                        type position = {x:float; y:float}
                        type electron = {oldPosition:position; newPosition:position}
                        let main () =

                        let aPosition = {x=1.0; y=2.0} in
                        let anElectron = {oldPosition=aPosition; newPosition=aPosition} in
                        let new_position_from aPosition dx dy =
                        {x=aPosition.x+.dx; y=aPosition.y+.dy} in
                        let new_electron_from anElectron dx dy =
                        {oldPosition=anElectron.oldPosition;
                        newPosition=new_position_from anElectron.oldPosition dx dy} in

                        let new_array_from anArray =
                        let jitter = 2.0 in
                        let rand () = float_of_int (Random.int 10000000) /. 10000000.0 in
                        let addJitter anElectron =
                        new_electron_from anElectron
                        (jitter -. 2.0*.jitter*.rand ())
                        (jitter -. 2.0*.jitter*.rand ())
                        and updated_electron_from anElectron =
                        {oldPosition=anElectron.newPosition;
                        newPosition=anElectron.newPosition}
                        in
                        List.map updated_electron_from (List.map addJitter anArray)

                        in

                        let rec move_electron_times count anArray =
                        if count <=0 then ()
                        else
                        move_electron_times (count-1) (new_array_from anArray)
                        in

                        let rec create_electrons_aux theList = function
                        | 0 -> theList
                        | n -> create_electrons_aux (anElectron :: theList) (n - 1)
                        in

                        let create_electrons n = create_electrons_aux [] n in
                        let actualList = create_electrons 200 in

                        let t1 = gettimeofday ()
                        and t2 = gettimeofday
                        in
                        Random.self_init ();
                        move_electron_times 20000 actualList;
                        print_string("total time = " ^ string_of_float(t2() -. t1) ^ "\n")
                        ;;

                        main();;
                        ~/Scrap $ ocamlopt -o electrons unix.cmxa electrons.ml
                        ~/Scrap $ ./electrons
                        total time = 6.98878383636
                        ~/Scrap $ cat electrons2.ml
                        let no_of_electrons = 200
                        let iterations = 20000

                        type position = {mutable x : float; mutable y : float}
                        type electron = {mutable old_position : position;
                        mutable new_position : position}

                        let move_electrons num_times =

                        let init_elements i =
                        {old_position = {x = 1.0; y = 2.0};
                        new_position = {x = 1.0; y = 2.0}} in

                        let add_jitter x =
                        let jitter = 2.0 and
                        rand () = float_of_int (Random.int 10000000) /. 10000000.0 in
                        x +. jitter -. 2.0 *. jitter *. rand () in

                        for time = 1 to num_times do
                        let electrons = Array.init no_of_electrons init_elements in
                        for i = 0 to no_of_electrons - 1 do
                        let new_position = electrons.(i).new_position in
                        electrons.(i).old_position <- new_position;
                        new_position.x <- add_jitter new_position.x;
                        new_position.y <- add_jitter new_position.y
                        done
                        done

                        let main () =
                        let t1 = Unix.gettimeofday () in
                        Random.self_init ();
                        move_electrons iterations;
                        let t2 = Unix.gettimeofday () in
                        print_string ("total time = " ^ string_of_float(t2 -. t1) ^ "\n");;

                        main ()
                        ~/Scrap $ ocamlopt -o electrons2 unix.cmxa electrons2.ml
                        ~/Scrap $ ./electrons2
                        total time = 7.56373596191
                        ~/Scrap $ cat electrons.cc
                        #include <iostream>
                        #include <vector>
                        #include <cstdlib>
                        #include <ctime>

                        using namespace std;

                        const size_t no_of_electrons = 200;
                        const size_t iterations = 20000;

                        struct Position {float x; float y;};
                        struct Electron {Position old_position; Position new_position;};

                        struct Position position = {1.0, 2.0};
                        struct Electron electron = {position, position};

                        double frandom()
                        {
                        return double(rand()) / RAND_MAX;
                        }

                        double add_jitter(double x)
                        {
                        const double jitter = 2.0;
                        return x + jitter - 2.0 * jitter * frandom();
                        }

                        void move_electrons(size_t num_times)
                        {
                        for (size_t time = 1; time <= num_times; time++) {
                        vector<Electron> electrons(no_of_electrons, electron);
                        for (vector<Electron>::iterator i = electrons.begin();
                        i != electrons.end(); ++i) {
                        (*i).old_position = (*i).new_position;
                        (*i).new_position.x = add_jitter((*i).new_position.x);
                        (*i).new_position.y = add_jitter((*i).new_position.y);
                        }
                        }
                        }

                        int main()
                        {
                        const clock_t t1 = clock();
                        srand(t1);
                        move_electrons(iterations);
                        const clock_t t2 = clock();
                        cout << "total time = " << double(t2 - t1) / CLOCKS_PER_SEC << endl;
                        }
                        ~/Scrap $ c++ -o electrons3 electrons.cc
                        ~/Scrap $ ./electrons3
                        total time = 8.46
                        ~/Scrap $ c++ -O3 -o electrons3 electrons.cc
                        ~/Scrap $ ./electrons3
                        total time = 2.08
                        ~/Scrap $


                        Alwyn
                      • Alwyn
                        ... Actually, this slightly more functional C++ version does a little better: ~/Scrap $ cat electrons.cc #include #include #include
                        Message 11 of 18 , Sep 18, 2004
                          On 18 Sep 2004, at 20:41, Alwyn wrote:

                          > The only comfort I can give you is that all O'Caml
                          > versions run faster than unoptimised C++ compiled with GCC.
                          >
                          > You're welcome to compare these on your own system, if you like:

                          Actually, this slightly more 'functional' C++ version does a little
                          better:

                          ~/Scrap $ cat electrons.cc
                          #include <iostream>
                          #include <vector>
                          #include <algorithm>
                          #include <cstdlib>
                          #include <ctime>

                          using namespace std;

                          const size_t no_of_electrons = 200;
                          const size_t iterations = 20000;

                          struct Position {float x; float y;};
                          struct Electron {Position old_position; Position new_position;};

                          struct Position position = {1.0, 2.0};
                          struct Electron electron = {position, position};

                          class adjust_electron {
                          static double frandom()
                          {
                          return double(rand()) / RAND_MAX;
                          }

                          static double add_jitter(double x)
                          {
                          const double jitter = 2.0;
                          return x + jitter - 2.0 * jitter * frandom();
                          }

                          public:
                          void operator()(Electron& e) {
                          e.old_position = e.new_position;
                          e.new_position.x = add_jitter(e.new_position.x);
                          e.new_position.y = add_jitter(e.new_position.y);
                          }
                          };

                          void move_electrons(size_t num_times)
                          {
                          for (size_t time = 1; time <= num_times; time++) {
                          vector<Electron> electrons(no_of_electrons, electron);
                          for_each(electrons.begin(), electrons.end(), adjust_electron());
                          }
                          }

                          int main()
                          {
                          const clock_t t1 = clock();
                          srand(t1);
                          move_electrons(iterations);
                          const clock_t t2 = clock();
                          cout << "total time = " << double(t2 - t1) / CLOCKS_PER_SEC << endl;
                          }
                          ~/Scrap $ c++ -o electrons3 electrons.cc
                          ~/Scrap $ ./electrons3
                          total time = 6.36
                          ~/Scrap $ cat electrons2.ml
                          let no_of_electrons = 200
                          let iterations = 20000

                          type position = {mutable x : float; mutable y : float}
                          type electron = {mutable old_position : position;
                          mutable new_position : position}

                          let move_electrons num_times =

                          let init_elements i =
                          {old_position = {x = 1.0; y = 2.0};
                          new_position = {x = 1.0; y = 2.0}} in

                          let add_jitter x =
                          let jitter = 2.0 and
                          rand () = float_of_int (Random.int 10000000) /. 10000000.0 in
                          x +. jitter -. 2.0 *. jitter *. rand () in

                          let adjust_electron e =
                          e.old_position <- e.new_position;
                          e.new_position.x <- add_jitter e.new_position.x;
                          e.new_position.y <- add_jitter e.new_position.y in

                          for time = 1 to num_times do
                          let electrons = Array.init no_of_electrons init_elements in
                          Array.iter adjust_electron electrons
                          done

                          let main () =
                          let t1 = Unix.gettimeofday () in
                          Random.self_init ();
                          move_electrons iterations;
                          let t2 = Unix.gettimeofday () in
                          print_string ("total time = " ^ string_of_float(t2 -. t1) ^ "\n");;

                          main ()
                          ~/Scrap $ ocamlopt -o electrons2 unix.cmxa electrons2.ml
                          ~/Scrap $ ./electrons2
                          total time = 7.72092485428
                          ~/Scrap $


                          Alwyn
                        • Alwyn
                          ... I ve been tinkering with this a little more... My previous examples created the array on each iteration, which is not what you want. I ve also put in a
                          Message 12 of 18 , Sep 19, 2004
                            On 18 Sep 2004, at 10:11, zuluzombie1 wrote:

                            > I put together a quick web page with some timings (and code)
                            >
                            > http://www.faroth.com/pub/ocaml/Electrons.html#modelonly
                            >
                            > which shows, as discussed, that  the PRNG time is significant.
                            > I removed both PRNG calls, in both the Java code and the Ocaml
                            > code which unfortunately resulted in a dead heat in execution speed
                            > I'm hoping Ocaml will yield a speed increase ..

                            I've been tinkering with this a little more...

                            My previous examples created the array on each iteration, which is not
                            what you want. I've also put in a somewhat faster PRNG algorithm based
                            on Ramdom.bits, which, I presume, should be good enough for present
                            purposes.

                            Here are some representative timings:

                            This is your original program with a couple of modifications. I don't
                            think you wanted:
                            List.map addJitter anArray;
                            List.map updated_electron_from anArray
                            since the result of the first 'map' is thrown away. I've substituted;
                            List.map updated_electron (List.map addJitter anArray)
                            ~/Scrap $ ./electrons
                            total time = 6.94855690002

                            This is an imperative version using an array of 'float'.
                            ~/Scrap $ ./electrons1
                            total time = 4.1938958168

                            This has an array using records of type 'electron'. It is thus written
                            at a somewhat higher level but is still mainly imperative in style.
                            ~/Scrap $ ./electrons2
                            total time = 4.67642593384

                            This is C++ using std::vector<electron> and std::for_each with a
                            function object, The compiler is GCC 3.3; no optimisations specified.
                            ~/Scrap $ ./electrons3
                            total time = 5.21
                            With optimisation at level 3, I get:
                            ~/Scrap $ ./electrons3
                            total time = 1.93

                            This uses lists of 'electron' records and is functional in style:
                            ~/Scrap $ ./electrons4
                            total time = 4.71771407127

                            For comparison, this is how the Java program on your Web page runs for
                            me:
                            ~/Scrap $ java -jar javaelectrons/electrons.jar
                            It took 11474 milliseconds to do 20000 generations

                            This takes well over twice the time of all O'Caml versions except the
                            first, which is what I'd expect from Java. It is also around the speed
                            I got for your original version using Random.float.

                            Assuming I have got the algorithm right this time, it is clear that
                            there is little difference between the O'Caml versions; the benefit of
                            using a low-level array of 'float' is relatively small, and a
                            moderately 'functional' approach with lists extracts so little a time
                            penalty that this would be the approach that I would choose, all other
                            things being equal. If speed really is of the essence, I doubt if well
                            optimised C++ can be beaten - and there are much better optimising
                            compilers available than GCC.

                            Here are the listings, so that you can do your own comparisons:

                            ~/Scrap $ cat electrons.ml
                            open Unix

                            type position = {x:float; y:float}
                            type electron = {oldPosition:position; newPosition:position}
                            let main () =

                            let aPosition = {x=1.0; y=2.0} in
                            let anElectron = {oldPosition=aPosition; newPosition=aPosition} in
                            let new_position_from aPosition dx dy =
                            {x=aPosition.x+.dx; y=aPosition.y+.dy} in
                            let new_electron_from anElectron dx dy =
                            {oldPosition=anElectron.oldPosition;
                            newPosition=new_position_from anElectron.oldPosition dx dy} in

                            let new_array_from anArray =
                            let jitter = 2.0 in
                            let rand () = float_of_int (Random.int 10000000) /. 10000000.0 in
                            let addJitter anElectron =
                            new_electron_from anElectron
                            (jitter -. 2.0*.jitter*.rand ())
                            (jitter -. 2.0*.jitter*.rand ())
                            and updated_electron_from anElectron =
                            {oldPosition=anElectron.newPosition;
                            newPosition=anElectron.newPosition}
                            in
                            List.map updated_electron_from (List.map addJitter anArray)

                            in

                            let rec move_electron_times count anArray =
                            if count <=0 then ()
                            else
                            move_electron_times (count-1) (new_array_from anArray)
                            in

                            let rec create_electrons_aux theList = function
                            | 0 -> theList
                            | n -> create_electrons_aux (anElectron :: theList) (n - 1)
                            in

                            let create_electrons n = create_electrons_aux [] n in
                            let actualList = create_electrons 200 in

                            let t1 = gettimeofday ()
                            and t2 = gettimeofday
                            in
                            Random.self_init ();
                            move_electron_times 20000 actualList;
                            print_string("total time = " ^ string_of_float(t2() -. t1) ^ "\n")
                            ;;

                            main();;
                            ~/Scrap $ cat electrons1.ml
                            let no_of_electrons = 200
                            let iterations = 20000

                            let array_length = no_of_electrons * 4
                            let electrons = Array.make array_length 0.0

                            let initialise_array () =
                            let rec loop i =
                            if i < array_length - 1 then begin
                            electrons.(i) <- 1.0;
                            electrons.(i + 1) <- 2.0;
                            electrons.(i + 2) <- 1.0;
                            electrons.(i + 3) <- 2.0;
                            loop (i + 4)
                            end in
                            loop 0

                            let move_electrons num_times =
                            let max_rand = 2.0 ** 30.0 -. 1.0 in
                            let add_jitter x =
                            let jitter = 2.0 and
                            (* rand () = Random.float (1.0) in *)
                            rand () = float_of_int (Random.bits ()) /. max_rand in
                            x +. jitter -. 2.0 *. jitter *. rand () in

                            let rec adjust_electrons i =
                            if i < array_length - 1 then begin
                            electrons.(i) <- electrons.(i + 2);
                            electrons.(i + 1) <- electrons.(i + 3);
                            electrons.(i + 2) <- add_jitter electrons.(i + 2);
                            electrons.(i + 3) <- add_jitter electrons.(i + 3);
                            adjust_electrons (i + 4)
                            end in

                            let rec main_loop i =
                            if i <= num_times then begin
                            adjust_electrons 0;
                            main_loop (succ i);
                            end in
                            main_loop 1

                            let main () =
                            let t1 = Unix.gettimeofday () in
                            Random.self_init ();
                            initialise_array ();
                            move_electrons iterations;
                            let t2 = Unix.gettimeofday () in
                            print_string ("total time = " ^ string_of_float(t2 -. t1) ^ "\n");;

                            main ()
                            ~/Scrap $ cat electrons2.ml
                            let no_of_electrons = 200
                            let iterations = 20000

                            type position = {mutable x : float; mutable y : float}
                            type electron = {mutable old_position : position;
                            mutable new_position : position}

                            let move_electrons electrons num_times =

                            let max_rand = 2.0 ** 30.0 -. 1.0 in
                            let add_jitter x =
                            let jitter = 2.0 and
                            (* rand () = Random.float (1.0) in *)
                            rand () = float_of_int (Random.bits ()) /. max_rand in
                            x +. jitter -. 2.0 *. jitter *. rand () in

                            let adjust_electron e =
                            e.old_position <- e.new_position;
                            e.new_position.x <- add_jitter e.new_position.x;
                            e.new_position.y <- add_jitter e.new_position.y in

                            for time = 1 to num_times do
                            Array.iter adjust_electron electrons
                            done

                            let main () =
                            let t1 = Unix.gettimeofday () in
                            Random.self_init ();
                            let init_elements i =
                            {old_position = {x = 1.0; y = 2.0};
                            new_position = {x = 1.0; y = 2.0}} in
                            let electrons = Array.init no_of_electrons init_elements in
                            move_electrons electrons iterations;
                            let t2 = Unix.gettimeofday () in
                            print_string ("total time = " ^ string_of_float(t2 -. t1) ^ "\n");;

                            main ()
                            ~/Scrap $ cat electrons.cc
                            #include <iostream>
                            #include <vector>
                            #include <algorithm>
                            #include <cstdlib>
                            #include <ctime>

                            using namespace std;

                            const size_t no_of_electrons = 200;
                            const size_t iterations = 20000;

                            struct Position {float x; float y;};
                            struct Electron {Position old_position; Position new_position;};

                            struct Position position = {1.0, 2.0};
                            struct Electron electron = {position, position};

                            class adjust_electron {
                            static double frandom()
                            {
                            return double(rand()) / RAND_MAX;
                            }

                            static double add_jitter(double x)
                            {
                            const double jitter = 2.0;
                            return x + jitter - 2.0 * jitter * frandom();
                            }

                            public:
                            void operator()(Electron& e) {
                            e.old_position = e.new_position;
                            e.new_position.x = add_jitter(e.new_position.x);
                            e.new_position.y = add_jitter(e.new_position.y);
                            }
                            };

                            void move_electrons(vector<Electron>& electrons, size_t num_times)
                            {
                            for (size_t time = 1; time <= num_times; time++)
                            for_each(electrons.begin(), electrons.end(), adjust_electron());
                            }

                            int main()
                            {
                            const clock_t t1 = clock();
                            srand(t1);
                            vector<Electron> electrons(no_of_electrons, electron);
                            move_electrons(electrons, iterations);
                            const clock_t t2 = clock();
                            cout << "total time = " << double(t2 - t1) / CLOCKS_PER_SEC << endl;
                            }
                            ~/Scrap $ cat electrons4.ml
                            let no_of_electrons = 200
                            let iterations = 20000

                            type position = {x : float; y : float}
                            type electron = {old_position : position; new_position : position}

                            let make_list () =
                            let make_element () =
                            {old_position = {x = 1.0; y = 2.0};
                            new_position = {x = 1.0; y = 2.0}} in
                            let rec loop j e =
                            if j > no_of_electrons then
                            e
                            else
                            loop (succ j) (make_element () :: e) in
                            loop 1 []

                            let move_electrons electrons =

                            let max_rand = 2.0 ** 30.0 -. 1.0 in
                            let add_jitter x =
                            let jitter = 2.0 and
                            (* rand () = Random.float (1.0) in *)
                            rand () = float_of_int (Random.bits ()) /. max_rand in
                            x +. jitter -. 2.0 *. jitter *. rand () in

                            let adjust_electron e =
                            {old_position = e.new_position;
                            new_position = {x = add_jitter e.new_position.x;
                            y = add_jitter e.new_position.y}} in

                            List.map adjust_electron electrons

                            let main () =
                            let t1 = Unix.gettimeofday () in
                            Random.self_init ();
                            let electrons = make_list () in
                            for i = 1 to iterations do
                            ignore (move_electrons electrons)
                            done;
                            let t2 = Unix.gettimeofday () in
                            print_string ("total time = " ^ string_of_float(t2 -. t1) ^ "\n");;

                            main ()


                            Alwyn
                          • zuluzombie1
                            ... [of the business model for Electron jitter ala Electrons.py] http://www.faroth.com/pub/ocaml/Electrons.html#modelonly ... Thanks again for everyone s help
                            Message 13 of 18 , Sep 20, 2004
                              > I put together a quick web page with some timings (and code)
                              [of the business model for Electron jitter ala Electrons.py]

                              http://www.faroth.com/pub/ocaml/Electrons.html#modelonly

                              > which shows, as discussed, that the PRNG time is significant.
                              > I removed both PRNG calls, in both the Java code and the Ocaml
                              > code which unfortunately resulted in a dead heat in execution speed
                              > I'm hoping Ocaml will yield a speed increase ..

                              Thanks again for everyone's help especially David Alwyn who rewrote
                              the code several ways to improve the speed, and Chris who also
                              rewrote it.

                              I've updated the website with David's code (and timings on my
                              machine). I'll put Chris's code up tomorrow.
                              at:

                              http://www.faroth.com/pub/ocaml/Electrons.html#modelonly

                              I'd like to summarize here what I've learned (with everyone's help)

                              1) I had an error in my code .. the List was being thrown away ..
                              Oops ..

                              2) Floats are boxed so it was suggested that an Array of floats
                              would be faster .. David's experiment implies that this is not
                              significant (in this case at least)

                              3) The imperative approach was suggested (that would certainly
                              make me feel more comfortable :-) since I've still not had
                              the Ocaml 'eureka' experience). This also appears, in this
                              case, to not be significant.

                              4) Profiling was suggested (I did originally try the ocamlprof
                              which at least helped tell me that the number of executions
                              for each line was correct) using gprof.

                              5) This led to the PRNG issue which turns out to be quite slow
                              (at least compared to Java)
                              I got an email from Chris who indicated that Java's PRNG is ..

                              public double nextDouble() {
                              return (((long)next(26) 27) + next(27))
                              / (double)(1L 53);
                              }
                              and a lot simpler than Ocaml's.

                              6) Even with no PRNG's (in neither the Java not the Ocaml code)
                              it appears that Java and Ocaml are similar in speed for this
                              type of app. This was a surprise to me in light of the
                              Great Computer Shootout page

                              http://shootout.alioth.debian.org/bench/ary/

                              which indicated that 'Array Access' was about 6 times faster
                              in Ocaml than Java.

                              Java 0.27 secs
                              Ocaml 0.043 secs

                              I noted that the test was performed with
                              a command line parameter of 9000. I ran both programs on
                              my machine and got similar results. When I ran both programs
                              again (Java and Ocaml Array Access code at the Shootout Site)
                              with 1000000 as a command line parameter I got execution times
                              of:

                              Java 15 secs
                              Ocaml 12 secs

                              I'm guessing that this means either the VM Load time (although
                              I believe he filters this out) is now less significant,
                              or the hotspot interpreter now has the time to 'figure'
                              out the 'hotspot'.

                              It looks like I won't get a speed increase by switching to
                              Ocaml but (if I ever get the 'eureka' experience) I
                              might get a 'power' increase from closures.

                              Thanks again for all the help,

                              Tom
                            • Alwyn
                              ... The relative speed is a largely a quality of implementation issue. Java will always be limited because it has to run from bytecode, but the Hotspot engine
                              Message 14 of 18 , Sep 20, 2004
                                On 20 Sep 2004, at 10:33, zuluzombie1 wrote:

                                >   I'm guessing that this means either the VM Load time (although
                                >   I believe he filters this out) is now less significant,
                                >   or the hotspot interpreter now has the time to 'figure'
                                >   out the 'hotspot'.
                                >
                                > It looks like I won't get a speed increase by switching to
                                > Ocaml but (if I ever get the 'eureka' experience)  I
                                > might get a 'power' increase from closures.

                                The relative speed is a largely a quality of implementation issue. Java
                                will always be limited because it has to run from bytecode, but the
                                Hotspot engine can make quite aggressive optimisations, given time. On
                                the other hand, the only optimisation 'ocamlopt' does is to inline
                                short functions. (I note that the Ocaml test in the 'shootout' was
                                compiled in 'unsafe' mode (i.e. without bounds checking); this is
                                slightly unfair, as you cannot do this in Java!)

                                As for a 'eureka moment', I doubt that will come. Some problem domains
                                are much easier to tackle using a functional approach, but yours isn't
                                one of them. Even so, `O'Caml is as good an 'all-rounder' as I've ever
                                seen. With C++, I often have to fight with the language to get it to do
                                what I want, while O'Caml is much more docile.


                                Alwyn
                              • Richard Jones
                                ... It s fair to say that you ll only get the near C performance from OCaml if you: * use an imperative approach (arrays, for loops) * compile with -unsafe
                                Message 15 of 18 , Sep 20, 2004
                                  On Mon, Sep 20, 2004 at 09:33:32AM -0000, zuluzombie1 wrote:
                                  > It looks like I won't get a speed increase by switching to
                                  > Ocaml but (if I ever get the 'eureka' experience) I
                                  > might get a 'power' increase from closures.

                                  It's fair to say that you'll only get the "near C" performance from
                                  OCaml if you:

                                  * use an imperative approach (arrays, for loops)
                                  * compile with -unsafe

                                  Also when compiling code which uses floats, it appears you may need to
                                  inline some functions by hand if they can't be inlined automatically
                                  by ocamlopt, or at least that is my reading of
                                  http://caml.inria.fr/ocaml/numerical.html

                                  Rich.

                                  --
                                  Richard Jones. http://www.annexia.org/ http://www.j-london.com/
                                  Merjis Ltd. http://www.merjis.com/ - improving website return on investment
                                  Perl4Caml lets you use any Perl library in your type-safe Objective
                                  CAML programs. http://www.merjis.com/developers/perl4caml/


                                  [Non-text portions of this message have been removed]
                                • Remi Vanicat
                                  ... Welln not only. There are severall optimization, like the unboxing one (that is do not rebox all float after all operation, but only when needed), the tail
                                  Message 16 of 18 , Sep 20, 2004
                                    On Mon, 20 Sep 2004 12:51:01 +0100, Alwyn <dt015a1979@...> wrote:
                                    > On 20 Sep 2004, at 10:33, zuluzombie1 wrote:
                                    >
                                    > > I'm guessing that this means either the VM Load time (although
                                    > > I believe he filters this out) is now less significant,
                                    > > or the hotspot interpreter now has the time to 'figure'
                                    > > out the 'hotspot'.
                                    > >
                                    > > It looks like I won't get a speed increase by switching to
                                    > > Ocaml but (if I ever get the 'eureka' experience) I
                                    > > might get a 'power' increase from closures.
                                    >
                                    > The relative speed is a largely a quality of implementation issue. Java
                                    > will always be limited because it has to run from bytecode, but the
                                    > Hotspot engine can make quite aggressive optimisations, given time. On
                                    > the other hand, the only optimisation 'ocamlopt' does is to inline
                                    > short functions.

                                    Welln not only. There are severall optimization, like the unboxing one
                                    (that is do not rebox all float after all operation, but only when
                                    needed), the tail call and tail recursive optimization... By the way,
                                    I don't know how good or poorly is optimized the code we are speaking
                                    of with regard to the unboxing optimizatation.
                                  • Alwyn
                                    ... Je vous l accorde, monsieur. :-) :-) ... Well, when I write code like this: let electrons = Array.make array_length 0.0 let initialise_array () =
                                    Message 17 of 18 , Sep 20, 2004
                                      On 20 Sep 2004, at 15:30, Remi Vanicat wrote:

                                      > On Mon, 20 Sep 2004 12:51:01 +0100, Alwyn
                                      > <dt015a1979@...> wrote:
                                      > > The relative speed is a largely a quality of implementation issue.
                                      > Java
                                      > > will always be limited because it has to run from bytecode, but the
                                      > > Hotspot engine can make quite aggressive optimisations, given time.
                                      > On
                                      > > the other hand, the only optimisation 'ocamlopt' does is to inline
                                      > > short functions.
                                      >
                                      > Welln not only. There are severall optimization, like the unboxing one
                                      > (that is do not rebox all float after all operation, but only when
                                      > needed), the tail call and tail recursive optimization...

                                      Je vous l'accorde, monsieur. :-) :-)

                                      > By the way,
                                      > I don't know how good or poorly is optimized the code we are speaking
                                      > of with regard to the unboxing optimizatation.

                                      Well, when I write code like this:

                                      let electrons = Array.make array_length 0.0

                                      <snip>

                                      let initialise_array () =
                                         let rec loop i =
                                           if i < array_length - 1 then begin
                                             electrons.(i) <- 1.0;
                                             electrons.(i + 1) <- 2.0;
                                             electrons.(i + 2) <- 1.0;
                                             electrons.(i + 3) <- 2.0;
                                             loop (i + 4)
                                           end in
                                         loop 0

                                      <snip>

                                         let rec adjust_electrons i =
                                           if i < array_length - 1 then begin
                                             electrons.(i) <- electrons.(i + 2);
                                             electrons.(i + 1) <- electrons.(i + 3);
                                             electrons.(i + 2) <- add_jitter electrons.(i + 2);
                                             electrons.(i + 3) <- add_jitter electrons.(i + 3);
                                             adjust_electrons (i + 4)
                                           end

                                      it is not elegant, but I would at least expect the floating point
                                      accesses to be well optimised. The first function has manual loop
                                      unrolling too! (And it is rather faster than passing a function to
                                      Array.init.) But I must admit it did not occur to me to me to inline
                                      'add_jitter' by hand.


                                      Alwyn
                                    • Alwyn
                                      ... Yes, but such loops have quite limited applicability in O Caml. ... Yes, the effect of this can vary from nil to something quite substantial. ... I was
                                      Message 18 of 18 , Sep 21, 2004
                                        On 20 Sep 2004, at 13:08, Richard Jones wrote:

                                        > It's fair to say that you'll only get the "near C" performance from
                                        > OCaml if you:
                                        >
                                        > * use an imperative approach (arrays, for loops)

                                        Yes, but such loops have quite limited applicability in O'Caml.

                                        > * compile with -unsafe

                                        Yes, the effect of this can vary from nil to something quite
                                        substantial.

                                        > Also when compiling code which uses floats, it appears you may need to
                                        > inline some functions by hand if they can't be inlined automatically
                                        > by ocamlopt, or at least that is my reading of
                                        > http://caml.inria.fr/ocaml/numerical.html

                                        I was inclined to disbelieve you on this one, but I tried:
                                        let max_rand = 2.0 ** 30.0 -. 1.0 and
                                        jitter = 2.0 in
                                        let rec adjust_electrons i =
                                        if i < array_length then begin
                                        electrons.(i) <- electrons.(i + 2);
                                        electrons.(i + 1) <- electrons.(i + 3);
                                        electrons.(i + 2) <-
                                        electrons.(i + 2) +. jitter -. 2.0 *. jitter *.
                                        float_of_int (Random.bits ()) /. max_rand;
                                        electrons.(i + 3) <-
                                        electrons.(i + 3) +. jitter -. 2.0 *. jitter *.
                                        float_of_int (Random.bits ()) /. max_rand;
                                        adjust_electrons (i + 4)
                                        end in

                                        and together with '-unsafe' got a pretty substantial performance
                                        increase:

                                        ~/Scrap $ ./electrons1
                                        total time = 3.09126091003

                                        However, it is still nowhere near as good as optimised C++ with GCC:

                                        ~/Scrap $ ./electrons3
                                        total time = 1.94

                                        Seeing that GCC has relatively poor optimisation support, at least on
                                        my platform (the IBM compiler is reputed to much better, and so is the
                                        Intel compiler on the Intel architecture), I would suggest that people
                                        who want performance at all costs should look at C or C++, or, in the
                                        case of pure numerical computing, Fortran.


                                        Alwyn
                                      Your message has been successfully submitted and would be delivered to recipients shortly.