Loading ...
Sorry, an error occurred while loading the content.
 

Re: "ocaml_beginners"::[] Making Electrons Jitter ..

Expand Messages
  • Remi Vanicat
    On Fri, 17 Sep 2004 08:43:59 -0000, zuluzombie1 ... There are several strange thing in it, I believe they are mistake (or at least inefficiency) ... Only a
    Message 1 of 18 , Sep 17, 2004
      On Fri, 17 Sep 2004 08:43:59 -0000, zuluzombie1
      <zuluzombie1@...> wrote:
      >
      > I'm a long time programmer (mainly OO) and an Ocaml newbie.
      >
      > I wrote the following code to perform the 'business model'
      > of a program to randomly jitter lots of 'electrons' on a 2D
      > grid, similar to the old Python demo program Electrons.py.
      > I tried to be strictly 'ML', no objects, no imperative
      > (Ocaml 3.08.1).
      > It seems to work, but it's about the same speed as Java
      > (1.4.2, Sun) ..
      >
      > My main question is ..
      >
      > Is this reasonable ML code?

      There are several strange thing in it, I believe they are mistake (or
      at least inefficiency)

      > open Unix
      >
      > type position = {x:float; y:float}

      Only a precision about indentation : I prefer { x:float; y:float }
      (look at the added space), but it is a question of style.

      > type electron = {oldPosition:position; newPosition:position}
      >
      > let main () =
      >
      > let aPosition = {x=1.0; y=2.0} in
      > let anElectron = {oldPosition=aPosition; newPosition=aPosition} in
      > let new_position_from aPosition dx dy =
      > {x=aPosition.x+.dx; y=aPosition.y+.dy} in

      Here I would also add space around the = and +, and after the { and before the }

      > let new_electron_from anElectron dx dy =
      > {oldPosition=anElectron.oldPosition;
      > newPosition=new_position_from anElectron.oldPosition dx dy}; in
      >
      > let new_array_from anArray =

      if it is a list, it's better to name the variable aList (array do
      exist in ocaml).

      > let jitter = 2.0 in
      > let addJitter anElectron =
      > new_electron_from anElectron
      > (jitter -. 2.0*.jitter*.Random.float(1.0))
      > (jitter -. 2.0*.jitter*.Random.float(1.0));
      > and updated_electron_from anElectron =
      > {oldPosition=anElectron.newPosition;
      > newPosition=anElectron.newPosition}
      > in
      > List.map addJitter anArray;

      List.map build a new list, there you put this new list to trash. Did
      you want to use List.iter, or is there a mistake ?

      > List.map updated_electron_from anArray
      > in
      >
      > let rec move_electron_times count anArray =
      > if count <=0 then anArray
      > else
      > move_electron_times (count-1) (new_array_from anArray)
      > in
      >
      > let rec create_electrons_aux theList = function
      > | 0 -> theList
      > | n -> create_electrons_aux (anElectron :: theList) (n - 1)
      > in
      >
      > let create_electrons n = create_electrons_aux [] n in
      > let actualList = create_electrons 200 in
      >
      > let t1 = gettimeofday ()
      > and t2() = gettimeofday ()
      > in
      > t1;
      > move_electron_times 200000 actualList;
      > print_string("total time = " ^ string_of_float(t2() -. t1) ^ "\n")
      > ;;
      >
      > main();;
      >


      For matter of efficiency, There is an inefficiency with ocaml usage of
      float : they are boxed. The compiler try to optimize this, but It may
      be there that you lost most of your speed....
    • Daniel.Andor@physics.org
      ... For such simulations (diffusion of electrons, right?), even if you use the imperative features of Ocaml, you can benefit from the type checking and
      Message 2 of 18 , Sep 17, 2004
        On Friday 17 September 2004 9:43 am, zuluzombie1 wrote:
        > I wrote the following code to perform the 'business model'
        > of a program to randomly jitter lots of 'electrons' on a 2D
        > grid, similar to the old Python demo program Electrons.py.
        > I tried to be strictly 'ML', no objects, no imperative
        > (Ocaml 3.08.1).

        For such simulations (diffusion of electrons, right?), even if you use the
        imperative features of Ocaml, you can benefit from the type checking and
        functional aspects "on the way".

        As a simple example, in your jitter routine

        let move_all temp =
        let move temperature pos =
        let jit = 2.0 *. (sqrt temperature) in
        ....
        List.iter (move temp) particles ;
        ..

        you can benefit from closures.

        > List.map addJitter anArray;
        > List.map updated_electron_from anArray
        > in

        Are you sure this actually does what you want it to do? You've thrown away
        the result of the first List.map.

        D
      • Daniel Andor
        ... This is worth bearing in mind, but it is also true that arrays of floats and records of floats are unboxed. So the only way you would benefit from using
        Message 3 of 18 , Sep 17, 2004
          On Friday 17 September 2004 10:12 am, Remi Vanicat wrote:
          > For matter of efficiency, There is an inefficiency with ocaml usage of
          > float : they are boxed. The compiler try to optimize this, but It may
          > be there that you lost most of your speed....

          This is worth bearing in mind, but it is also true that arrays of floats and
          records of floats are unboxed. So the only way you would benefit from using
          an array in the electron case is if you stored the particles in a contiguous
          array of size 2*N.

          Otherwise, my guess is that the slowness is due to the continuous
          re-allocation of large data structures.

          I think Tom should use an imperative approach as described in section 1.5 of
          the Ocaml manual.

          D
        • Richard Jones
          I think your code is fairly reasonable. If speed is of the essence, I would use a mutable structure, which avoids making so many allocations[1]: type position
          Message 4 of 18 , Sep 17, 2004
            I think your code is fairly reasonable.

            If speed is of the essence, I would use a mutable structure, which
            avoids making so many allocations[1]:

            type position = { mutable x : float; mutable y : float };;

            let move_electron e dx dy =
            e.x <- e.x +. dx; e.y <- e.y +. dy;;

            Of course, this is an imperative approach, so you'll have to rewrite a
            lot of your other code.

            If you're unsure where the time is going, try profiling your code
            using the gprof method. See this page:

            http://www.merjis.com/developers/ocaml_tutorial/ch11#Profiling

            Rich.

            [1] I think: someone can tell me whether structures of floats are
            unboxed, like 2-element arrays of floats ...?

            --
            Richard Jones. http://www.annexia.org/ http://www.j-london.com/
            Merjis Ltd. http://www.merjis.com/ - improving website return on investment
            Perl4Caml lets you use any Perl library in your type-safe Objective
            CAML programs. http://www.merjis.com/developers/perl4caml/


            [Non-text portions of this message have been removed]
          • Daniel Andor
            ... Half way down: http://caml.inria.fr/ocaml/numerical.html
            Message 5 of 18 , Sep 17, 2004
              On Friday 17 September 2004 1:08 pm, Richard Jones wrote:
              > [1] I think: someone can tell me whether structures of floats are
              > unboxed, like 2-element arrays of floats ...?

              Half way down:
              http://caml.inria.fr/ocaml/numerical.html
            • William D. Neumann
              ... The problem here isn t really mutable v. immutable, it s that the pseudo-random number generator is just too slow. A profile of the original code shows:
              Message 6 of 18 , Sep 17, 2004
                On Fri, 17 Sep 2004, Richard Jones wrote:

                > I think your code is fairly reasonable.
                >
                > If speed is of the essence, I would use a mutable structure, which
                > avoids making so many allocations[1]:

                The problem here isn't really mutable v. immutable, it's that the
                pseudo-random number generator is just too slow. A profile of the
                original code shows:

                granularity: each sample hit covers 4 byte(s) for 0.08% of 11.97 seconds

                % cumulative self
                time seconds seconds name
                36.8 4.40 4.40 _camlRandom__rawfloat_131 [1]
                35.9 8.70 4.30 _camlRandom__bits_94 [2]
                8.7 9.74 1.04 _camlElectron0__addJitter_346 [3]
                3.4 10.15 0.41 _camlList__map_86 [4]
                2.8 10.48 0.33 _camlElectron0__new_electron_from_339 [5]
                2.0 10.72 0.24 _sweep_slice [6]
                1.5 10.90 0.18 _caml_oldify_one [7]
                1.4 11.07 0.17 _caml_alloc_shr [8]
                1.3 11.23 0.16 _camlElectron0__updated_electron_from_347 [9]
                1.3 11.38 0.15 _mark_slice [10]
                1.2 11.52 0.14 _caml_oldify_mopup [11]
                1.1 11.65 0.13 _caml_fl_allocate [12]
                0.9 11.76 0.11 _caml_fl_merge_block [13]
                0.4 11.81 0.05 _caml_darken [14]
                0.3 11.85 0.04 _caml_oldify_local_roots [15]
                0.3 11.88 0.03 _caml_do_local_roots [16]
                0.3 11.91 0.03 _caml_do_roots [17]
                0.2 11.93 0.02 _allocate_block [18]
                0.2 11.95 0.02 _caml_call_gc [19]
                0.1 11.96 0.01 _camlElectron0__new_array_from_343 [20]
                0.1 11.97 0.01 _malloc [21]

                While a profile of a version useing mutable fields and an array of
                electrons shows:

                granularity: each sample hit covers 4 byte(s) for 0.10% of 10.36 seconds

                % cumulative self
                time seconds seconds name
                40.3 4.18 4.18 _camlRandom__rawfloat_131 [1]
                39.1 8.23 4.05 _camlRandom__bits_94 [2]
                6.9 8.95 0.72 _caml_modify [3]
                6.3 9.60 0.65 _camlStd_exit__code_end [4]
                2.1 9.82 0.22 _camlArray__iter_127 [5]
                1.9 10.02 0.20 _camlElectron2__addJitter_348 [6]
                1.8 10.21 0.19 _camlElectron2__iterElectron_334 [7]
                1.4 10.35 0.14 _caml_darken [8]
                0.1 10.36 0.01 _mark_slice [9]

                Bottom line, use a faster PRNG.

                William D. Neumann

                ---

                "Well I could be a genius, if I just put my mind to it.
                And I...I could do anything, if only I could get 'round to it.
                Oh we were brought up on the space-race, now they expect you to clean toilets.
                When you've seen how big the world is, how can you make do with this?
                If you want me, I'll be sleeping in - sleeping in throughout these glory days."

                -- Jarvis Cocker

                Think of XML as Lisp for COBOL programmers.

                -- Tony-A (some guy on /.)
              • Brian Hurt
                ... This shouldn t be that expensive. The list is short enough (200 elements) that it should stay nicely in cache. He should never kick off a major
                Message 7 of 18 , Sep 17, 2004
                  On Fri, 17 Sep 2004, Daniel Andor wrote:

                  > Otherwise, my guess is that the slowness is due to the continuous
                  > re-allocation of large data structures.

                  This shouldn't be that expensive. The list is short enough (200 elements)
                  that it should stay nicely in cache. He should never kick off a major
                  collection, as nothing should ever leave the youngest generation.

                  --
                  "Usenet is like a herd of performing elephants with diarrhea -- massive,
                  difficult to redirect, awe-inspiring, entertaining, and a source of
                  mind-boggling amounts of excrement when you least expect it."
                  - Gene Spafford
                  Brian
                • zuluzombie1
                  Wow, thanks for all the responses! I ll respond in detail tomorrow (just got home from work .. it s 2 am local time) .. I put together a quick web page with
                  Message 8 of 18 , Sep 18, 2004
                    Wow, thanks for all the responses!

                    I'll respond in detail tomorrow (just got home from work .. it's
                    2 am local time) ..

                    I put together a quick web page with some timings (and code)

                    http://www.faroth.com/pub/ocaml/Electrons.html#modelonly

                    which shows, as discussed, that the PRNG time is significant.
                    I removed both PRNG calls, in both the Java code and the Ocaml
                    code which unfortunately resulted in a dead heat in execution speed
                    I'm hoping Ocaml will yield a speed increase ..


                    Again, thanks a lot for making my learning experience more pleasant.

                    Tom
                  • Alwyn
                    ... Unfortunately, I don t see any way of improving the speed, apart from changing the random number algorithm. I ve rewritten your program twice in a
                    Message 9 of 18 , Sep 18, 2004
                      On 18 Sep 2004, at 10:11, zuluzombie1 wrote:

                      > I put together a quick web page with some timings (and code)
                      >
                      > http://www.faroth.com/pub/ocaml/Electrons.html#modelonly
                      >
                      > which shows, as discussed, that  the PRNG time is significant.
                      > I removed both PRNG calls, in both the Java code and the Ocaml
                      > code which unfortunately resulted in a dead heat in execution speed
                      > I'm hoping Ocaml will yield a speed increase ..

                      Unfortunately, I don't see any way of improving the speed, apart from
                      changing the random number algorithm. I've rewritten your program twice
                      in a shamelessly 'imperative' way (once with a pure array of 'float'
                      and once with records containing 'float') and got little or no speed
                      advantage from that. The only comfort I can give you is that all O'Caml
                      versions run faster than unoptimised C++ compiled with GCC.

                      You're welcome to compare these on your own system, if you like:

                      ~/Scrap $ cat electrons.ml
                      open Unix

                      type position = {x:float; y:float}
                      type electron = {oldPosition:position; newPosition:position}
                      let main () =

                      let aPosition = {x=1.0; y=2.0} in
                      let anElectron = {oldPosition=aPosition; newPosition=aPosition} in
                      let new_position_from aPosition dx dy =
                      {x=aPosition.x+.dx; y=aPosition.y+.dy} in
                      let new_electron_from anElectron dx dy =
                      {oldPosition=anElectron.oldPosition;
                      newPosition=new_position_from anElectron.oldPosition dx dy} in

                      let new_array_from anArray =
                      let jitter = 2.0 in
                      let rand () = float_of_int (Random.int 10000000) /. 10000000.0 in
                      let addJitter anElectron =
                      new_electron_from anElectron
                      (jitter -. 2.0*.jitter*.rand ())
                      (jitter -. 2.0*.jitter*.rand ())
                      and updated_electron_from anElectron =
                      {oldPosition=anElectron.newPosition;
                      newPosition=anElectron.newPosition}
                      in
                      List.map updated_electron_from (List.map addJitter anArray)

                      in

                      let rec move_electron_times count anArray =
                      if count <=0 then ()
                      else
                      move_electron_times (count-1) (new_array_from anArray)
                      in

                      let rec create_electrons_aux theList = function
                      | 0 -> theList
                      | n -> create_electrons_aux (anElectron :: theList) (n - 1)
                      in

                      let create_electrons n = create_electrons_aux [] n in
                      let actualList = create_electrons 200 in

                      let t1 = gettimeofday ()
                      and t2 = gettimeofday
                      in
                      Random.self_init ();
                      move_electron_times 20000 actualList;
                      print_string("total time = " ^ string_of_float(t2() -. t1) ^ "\n")
                      ;;

                      main();;
                      ~/Scrap $ ocamlopt -o electrons unix.cmxa electrons.ml
                      ~/Scrap $ ./electrons
                      total time = 6.98878383636
                      ~/Scrap $ cat electrons2.ml
                      let no_of_electrons = 200
                      let iterations = 20000

                      type position = {mutable x : float; mutable y : float}
                      type electron = {mutable old_position : position;
                      mutable new_position : position}

                      let move_electrons num_times =

                      let init_elements i =
                      {old_position = {x = 1.0; y = 2.0};
                      new_position = {x = 1.0; y = 2.0}} in

                      let add_jitter x =
                      let jitter = 2.0 and
                      rand () = float_of_int (Random.int 10000000) /. 10000000.0 in
                      x +. jitter -. 2.0 *. jitter *. rand () in

                      for time = 1 to num_times do
                      let electrons = Array.init no_of_electrons init_elements in
                      for i = 0 to no_of_electrons - 1 do
                      let new_position = electrons.(i).new_position in
                      electrons.(i).old_position <- new_position;
                      new_position.x <- add_jitter new_position.x;
                      new_position.y <- add_jitter new_position.y
                      done
                      done

                      let main () =
                      let t1 = Unix.gettimeofday () in
                      Random.self_init ();
                      move_electrons iterations;
                      let t2 = Unix.gettimeofday () in
                      print_string ("total time = " ^ string_of_float(t2 -. t1) ^ "\n");;

                      main ()
                      ~/Scrap $ ocamlopt -o electrons2 unix.cmxa electrons2.ml
                      ~/Scrap $ ./electrons2
                      total time = 7.56373596191
                      ~/Scrap $ cat electrons.cc
                      #include <iostream>
                      #include <vector>
                      #include <cstdlib>
                      #include <ctime>

                      using namespace std;

                      const size_t no_of_electrons = 200;
                      const size_t iterations = 20000;

                      struct Position {float x; float y;};
                      struct Electron {Position old_position; Position new_position;};

                      struct Position position = {1.0, 2.0};
                      struct Electron electron = {position, position};

                      double frandom()
                      {
                      return double(rand()) / RAND_MAX;
                      }

                      double add_jitter(double x)
                      {
                      const double jitter = 2.0;
                      return x + jitter - 2.0 * jitter * frandom();
                      }

                      void move_electrons(size_t num_times)
                      {
                      for (size_t time = 1; time <= num_times; time++) {
                      vector<Electron> electrons(no_of_electrons, electron);
                      for (vector<Electron>::iterator i = electrons.begin();
                      i != electrons.end(); ++i) {
                      (*i).old_position = (*i).new_position;
                      (*i).new_position.x = add_jitter((*i).new_position.x);
                      (*i).new_position.y = add_jitter((*i).new_position.y);
                      }
                      }
                      }

                      int main()
                      {
                      const clock_t t1 = clock();
                      srand(t1);
                      move_electrons(iterations);
                      const clock_t t2 = clock();
                      cout << "total time = " << double(t2 - t1) / CLOCKS_PER_SEC << endl;
                      }
                      ~/Scrap $ c++ -o electrons3 electrons.cc
                      ~/Scrap $ ./electrons3
                      total time = 8.46
                      ~/Scrap $ c++ -O3 -o electrons3 electrons.cc
                      ~/Scrap $ ./electrons3
                      total time = 2.08
                      ~/Scrap $


                      Alwyn
                    • Alwyn
                      ... Actually, this slightly more functional C++ version does a little better: ~/Scrap $ cat electrons.cc #include #include #include
                      Message 10 of 18 , Sep 18, 2004
                        On 18 Sep 2004, at 20:41, Alwyn wrote:

                        > The only comfort I can give you is that all O'Caml
                        > versions run faster than unoptimised C++ compiled with GCC.
                        >
                        > You're welcome to compare these on your own system, if you like:

                        Actually, this slightly more 'functional' C++ version does a little
                        better:

                        ~/Scrap $ cat electrons.cc
                        #include <iostream>
                        #include <vector>
                        #include <algorithm>
                        #include <cstdlib>
                        #include <ctime>

                        using namespace std;

                        const size_t no_of_electrons = 200;
                        const size_t iterations = 20000;

                        struct Position {float x; float y;};
                        struct Electron {Position old_position; Position new_position;};

                        struct Position position = {1.0, 2.0};
                        struct Electron electron = {position, position};

                        class adjust_electron {
                        static double frandom()
                        {
                        return double(rand()) / RAND_MAX;
                        }

                        static double add_jitter(double x)
                        {
                        const double jitter = 2.0;
                        return x + jitter - 2.0 * jitter * frandom();
                        }

                        public:
                        void operator()(Electron& e) {
                        e.old_position = e.new_position;
                        e.new_position.x = add_jitter(e.new_position.x);
                        e.new_position.y = add_jitter(e.new_position.y);
                        }
                        };

                        void move_electrons(size_t num_times)
                        {
                        for (size_t time = 1; time <= num_times; time++) {
                        vector<Electron> electrons(no_of_electrons, electron);
                        for_each(electrons.begin(), electrons.end(), adjust_electron());
                        }
                        }

                        int main()
                        {
                        const clock_t t1 = clock();
                        srand(t1);
                        move_electrons(iterations);
                        const clock_t t2 = clock();
                        cout << "total time = " << double(t2 - t1) / CLOCKS_PER_SEC << endl;
                        }
                        ~/Scrap $ c++ -o electrons3 electrons.cc
                        ~/Scrap $ ./electrons3
                        total time = 6.36
                        ~/Scrap $ cat electrons2.ml
                        let no_of_electrons = 200
                        let iterations = 20000

                        type position = {mutable x : float; mutable y : float}
                        type electron = {mutable old_position : position;
                        mutable new_position : position}

                        let move_electrons num_times =

                        let init_elements i =
                        {old_position = {x = 1.0; y = 2.0};
                        new_position = {x = 1.0; y = 2.0}} in

                        let add_jitter x =
                        let jitter = 2.0 and
                        rand () = float_of_int (Random.int 10000000) /. 10000000.0 in
                        x +. jitter -. 2.0 *. jitter *. rand () in

                        let adjust_electron e =
                        e.old_position <- e.new_position;
                        e.new_position.x <- add_jitter e.new_position.x;
                        e.new_position.y <- add_jitter e.new_position.y in

                        for time = 1 to num_times do
                        let electrons = Array.init no_of_electrons init_elements in
                        Array.iter adjust_electron electrons
                        done

                        let main () =
                        let t1 = Unix.gettimeofday () in
                        Random.self_init ();
                        move_electrons iterations;
                        let t2 = Unix.gettimeofday () in
                        print_string ("total time = " ^ string_of_float(t2 -. t1) ^ "\n");;

                        main ()
                        ~/Scrap $ ocamlopt -o electrons2 unix.cmxa electrons2.ml
                        ~/Scrap $ ./electrons2
                        total time = 7.72092485428
                        ~/Scrap $


                        Alwyn
                      • Alwyn
                        ... I ve been tinkering with this a little more... My previous examples created the array on each iteration, which is not what you want. I ve also put in a
                        Message 11 of 18 , Sep 19, 2004
                          On 18 Sep 2004, at 10:11, zuluzombie1 wrote:

                          > I put together a quick web page with some timings (and code)
                          >
                          > http://www.faroth.com/pub/ocaml/Electrons.html#modelonly
                          >
                          > which shows, as discussed, that  the PRNG time is significant.
                          > I removed both PRNG calls, in both the Java code and the Ocaml
                          > code which unfortunately resulted in a dead heat in execution speed
                          > I'm hoping Ocaml will yield a speed increase ..

                          I've been tinkering with this a little more...

                          My previous examples created the array on each iteration, which is not
                          what you want. I've also put in a somewhat faster PRNG algorithm based
                          on Ramdom.bits, which, I presume, should be good enough for present
                          purposes.

                          Here are some representative timings:

                          This is your original program with a couple of modifications. I don't
                          think you wanted:
                          List.map addJitter anArray;
                          List.map updated_electron_from anArray
                          since the result of the first 'map' is thrown away. I've substituted;
                          List.map updated_electron (List.map addJitter anArray)
                          ~/Scrap $ ./electrons
                          total time = 6.94855690002

                          This is an imperative version using an array of 'float'.
                          ~/Scrap $ ./electrons1
                          total time = 4.1938958168

                          This has an array using records of type 'electron'. It is thus written
                          at a somewhat higher level but is still mainly imperative in style.
                          ~/Scrap $ ./electrons2
                          total time = 4.67642593384

                          This is C++ using std::vector<electron> and std::for_each with a
                          function object, The compiler is GCC 3.3; no optimisations specified.
                          ~/Scrap $ ./electrons3
                          total time = 5.21
                          With optimisation at level 3, I get:
                          ~/Scrap $ ./electrons3
                          total time = 1.93

                          This uses lists of 'electron' records and is functional in style:
                          ~/Scrap $ ./electrons4
                          total time = 4.71771407127

                          For comparison, this is how the Java program on your Web page runs for
                          me:
                          ~/Scrap $ java -jar javaelectrons/electrons.jar
                          It took 11474 milliseconds to do 20000 generations

                          This takes well over twice the time of all O'Caml versions except the
                          first, which is what I'd expect from Java. It is also around the speed
                          I got for your original version using Random.float.

                          Assuming I have got the algorithm right this time, it is clear that
                          there is little difference between the O'Caml versions; the benefit of
                          using a low-level array of 'float' is relatively small, and a
                          moderately 'functional' approach with lists extracts so little a time
                          penalty that this would be the approach that I would choose, all other
                          things being equal. If speed really is of the essence, I doubt if well
                          optimised C++ can be beaten - and there are much better optimising
                          compilers available than GCC.

                          Here are the listings, so that you can do your own comparisons:

                          ~/Scrap $ cat electrons.ml
                          open Unix

                          type position = {x:float; y:float}
                          type electron = {oldPosition:position; newPosition:position}
                          let main () =

                          let aPosition = {x=1.0; y=2.0} in
                          let anElectron = {oldPosition=aPosition; newPosition=aPosition} in
                          let new_position_from aPosition dx dy =
                          {x=aPosition.x+.dx; y=aPosition.y+.dy} in
                          let new_electron_from anElectron dx dy =
                          {oldPosition=anElectron.oldPosition;
                          newPosition=new_position_from anElectron.oldPosition dx dy} in

                          let new_array_from anArray =
                          let jitter = 2.0 in
                          let rand () = float_of_int (Random.int 10000000) /. 10000000.0 in
                          let addJitter anElectron =
                          new_electron_from anElectron
                          (jitter -. 2.0*.jitter*.rand ())
                          (jitter -. 2.0*.jitter*.rand ())
                          and updated_electron_from anElectron =
                          {oldPosition=anElectron.newPosition;
                          newPosition=anElectron.newPosition}
                          in
                          List.map updated_electron_from (List.map addJitter anArray)

                          in

                          let rec move_electron_times count anArray =
                          if count <=0 then ()
                          else
                          move_electron_times (count-1) (new_array_from anArray)
                          in

                          let rec create_electrons_aux theList = function
                          | 0 -> theList
                          | n -> create_electrons_aux (anElectron :: theList) (n - 1)
                          in

                          let create_electrons n = create_electrons_aux [] n in
                          let actualList = create_electrons 200 in

                          let t1 = gettimeofday ()
                          and t2 = gettimeofday
                          in
                          Random.self_init ();
                          move_electron_times 20000 actualList;
                          print_string("total time = " ^ string_of_float(t2() -. t1) ^ "\n")
                          ;;

                          main();;
                          ~/Scrap $ cat electrons1.ml
                          let no_of_electrons = 200
                          let iterations = 20000

                          let array_length = no_of_electrons * 4
                          let electrons = Array.make array_length 0.0

                          let initialise_array () =
                          let rec loop i =
                          if i < array_length - 1 then begin
                          electrons.(i) <- 1.0;
                          electrons.(i + 1) <- 2.0;
                          electrons.(i + 2) <- 1.0;
                          electrons.(i + 3) <- 2.0;
                          loop (i + 4)
                          end in
                          loop 0

                          let move_electrons num_times =
                          let max_rand = 2.0 ** 30.0 -. 1.0 in
                          let add_jitter x =
                          let jitter = 2.0 and
                          (* rand () = Random.float (1.0) in *)
                          rand () = float_of_int (Random.bits ()) /. max_rand in
                          x +. jitter -. 2.0 *. jitter *. rand () in

                          let rec adjust_electrons i =
                          if i < array_length - 1 then begin
                          electrons.(i) <- electrons.(i + 2);
                          electrons.(i + 1) <- electrons.(i + 3);
                          electrons.(i + 2) <- add_jitter electrons.(i + 2);
                          electrons.(i + 3) <- add_jitter electrons.(i + 3);
                          adjust_electrons (i + 4)
                          end in

                          let rec main_loop i =
                          if i <= num_times then begin
                          adjust_electrons 0;
                          main_loop (succ i);
                          end in
                          main_loop 1

                          let main () =
                          let t1 = Unix.gettimeofday () in
                          Random.self_init ();
                          initialise_array ();
                          move_electrons iterations;
                          let t2 = Unix.gettimeofday () in
                          print_string ("total time = " ^ string_of_float(t2 -. t1) ^ "\n");;

                          main ()
                          ~/Scrap $ cat electrons2.ml
                          let no_of_electrons = 200
                          let iterations = 20000

                          type position = {mutable x : float; mutable y : float}
                          type electron = {mutable old_position : position;
                          mutable new_position : position}

                          let move_electrons electrons num_times =

                          let max_rand = 2.0 ** 30.0 -. 1.0 in
                          let add_jitter x =
                          let jitter = 2.0 and
                          (* rand () = Random.float (1.0) in *)
                          rand () = float_of_int (Random.bits ()) /. max_rand in
                          x +. jitter -. 2.0 *. jitter *. rand () in

                          let adjust_electron e =
                          e.old_position <- e.new_position;
                          e.new_position.x <- add_jitter e.new_position.x;
                          e.new_position.y <- add_jitter e.new_position.y in

                          for time = 1 to num_times do
                          Array.iter adjust_electron electrons
                          done

                          let main () =
                          let t1 = Unix.gettimeofday () in
                          Random.self_init ();
                          let init_elements i =
                          {old_position = {x = 1.0; y = 2.0};
                          new_position = {x = 1.0; y = 2.0}} in
                          let electrons = Array.init no_of_electrons init_elements in
                          move_electrons electrons iterations;
                          let t2 = Unix.gettimeofday () in
                          print_string ("total time = " ^ string_of_float(t2 -. t1) ^ "\n");;

                          main ()
                          ~/Scrap $ cat electrons.cc
                          #include <iostream>
                          #include <vector>
                          #include <algorithm>
                          #include <cstdlib>
                          #include <ctime>

                          using namespace std;

                          const size_t no_of_electrons = 200;
                          const size_t iterations = 20000;

                          struct Position {float x; float y;};
                          struct Electron {Position old_position; Position new_position;};

                          struct Position position = {1.0, 2.0};
                          struct Electron electron = {position, position};

                          class adjust_electron {
                          static double frandom()
                          {
                          return double(rand()) / RAND_MAX;
                          }

                          static double add_jitter(double x)
                          {
                          const double jitter = 2.0;
                          return x + jitter - 2.0 * jitter * frandom();
                          }

                          public:
                          void operator()(Electron& e) {
                          e.old_position = e.new_position;
                          e.new_position.x = add_jitter(e.new_position.x);
                          e.new_position.y = add_jitter(e.new_position.y);
                          }
                          };

                          void move_electrons(vector<Electron>& electrons, size_t num_times)
                          {
                          for (size_t time = 1; time <= num_times; time++)
                          for_each(electrons.begin(), electrons.end(), adjust_electron());
                          }

                          int main()
                          {
                          const clock_t t1 = clock();
                          srand(t1);
                          vector<Electron> electrons(no_of_electrons, electron);
                          move_electrons(electrons, iterations);
                          const clock_t t2 = clock();
                          cout << "total time = " << double(t2 - t1) / CLOCKS_PER_SEC << endl;
                          }
                          ~/Scrap $ cat electrons4.ml
                          let no_of_electrons = 200
                          let iterations = 20000

                          type position = {x : float; y : float}
                          type electron = {old_position : position; new_position : position}

                          let make_list () =
                          let make_element () =
                          {old_position = {x = 1.0; y = 2.0};
                          new_position = {x = 1.0; y = 2.0}} in
                          let rec loop j e =
                          if j > no_of_electrons then
                          e
                          else
                          loop (succ j) (make_element () :: e) in
                          loop 1 []

                          let move_electrons electrons =

                          let max_rand = 2.0 ** 30.0 -. 1.0 in
                          let add_jitter x =
                          let jitter = 2.0 and
                          (* rand () = Random.float (1.0) in *)
                          rand () = float_of_int (Random.bits ()) /. max_rand in
                          x +. jitter -. 2.0 *. jitter *. rand () in

                          let adjust_electron e =
                          {old_position = e.new_position;
                          new_position = {x = add_jitter e.new_position.x;
                          y = add_jitter e.new_position.y}} in

                          List.map adjust_electron electrons

                          let main () =
                          let t1 = Unix.gettimeofday () in
                          Random.self_init ();
                          let electrons = make_list () in
                          for i = 1 to iterations do
                          ignore (move_electrons electrons)
                          done;
                          let t2 = Unix.gettimeofday () in
                          print_string ("total time = " ^ string_of_float(t2 -. t1) ^ "\n");;

                          main ()


                          Alwyn
                        • zuluzombie1
                          ... [of the business model for Electron jitter ala Electrons.py] http://www.faroth.com/pub/ocaml/Electrons.html#modelonly ... Thanks again for everyone s help
                          Message 12 of 18 , Sep 20, 2004
                            > I put together a quick web page with some timings (and code)
                            [of the business model for Electron jitter ala Electrons.py]

                            http://www.faroth.com/pub/ocaml/Electrons.html#modelonly

                            > which shows, as discussed, that the PRNG time is significant.
                            > I removed both PRNG calls, in both the Java code and the Ocaml
                            > code which unfortunately resulted in a dead heat in execution speed
                            > I'm hoping Ocaml will yield a speed increase ..

                            Thanks again for everyone's help especially David Alwyn who rewrote
                            the code several ways to improve the speed, and Chris who also
                            rewrote it.

                            I've updated the website with David's code (and timings on my
                            machine). I'll put Chris's code up tomorrow.
                            at:

                            http://www.faroth.com/pub/ocaml/Electrons.html#modelonly

                            I'd like to summarize here what I've learned (with everyone's help)

                            1) I had an error in my code .. the List was being thrown away ..
                            Oops ..

                            2) Floats are boxed so it was suggested that an Array of floats
                            would be faster .. David's experiment implies that this is not
                            significant (in this case at least)

                            3) The imperative approach was suggested (that would certainly
                            make me feel more comfortable :-) since I've still not had
                            the Ocaml 'eureka' experience). This also appears, in this
                            case, to not be significant.

                            4) Profiling was suggested (I did originally try the ocamlprof
                            which at least helped tell me that the number of executions
                            for each line was correct) using gprof.

                            5) This led to the PRNG issue which turns out to be quite slow
                            (at least compared to Java)
                            I got an email from Chris who indicated that Java's PRNG is ..

                            public double nextDouble() {
                            return (((long)next(26) 27) + next(27))
                            / (double)(1L 53);
                            }
                            and a lot simpler than Ocaml's.

                            6) Even with no PRNG's (in neither the Java not the Ocaml code)
                            it appears that Java and Ocaml are similar in speed for this
                            type of app. This was a surprise to me in light of the
                            Great Computer Shootout page

                            http://shootout.alioth.debian.org/bench/ary/

                            which indicated that 'Array Access' was about 6 times faster
                            in Ocaml than Java.

                            Java 0.27 secs
                            Ocaml 0.043 secs

                            I noted that the test was performed with
                            a command line parameter of 9000. I ran both programs on
                            my machine and got similar results. When I ran both programs
                            again (Java and Ocaml Array Access code at the Shootout Site)
                            with 1000000 as a command line parameter I got execution times
                            of:

                            Java 15 secs
                            Ocaml 12 secs

                            I'm guessing that this means either the VM Load time (although
                            I believe he filters this out) is now less significant,
                            or the hotspot interpreter now has the time to 'figure'
                            out the 'hotspot'.

                            It looks like I won't get a speed increase by switching to
                            Ocaml but (if I ever get the 'eureka' experience) I
                            might get a 'power' increase from closures.

                            Thanks again for all the help,

                            Tom
                          • Alwyn
                            ... The relative speed is a largely a quality of implementation issue. Java will always be limited because it has to run from bytecode, but the Hotspot engine
                            Message 13 of 18 , Sep 20, 2004
                              On 20 Sep 2004, at 10:33, zuluzombie1 wrote:

                              >   I'm guessing that this means either the VM Load time (although
                              >   I believe he filters this out) is now less significant,
                              >   or the hotspot interpreter now has the time to 'figure'
                              >   out the 'hotspot'.
                              >
                              > It looks like I won't get a speed increase by switching to
                              > Ocaml but (if I ever get the 'eureka' experience)  I
                              > might get a 'power' increase from closures.

                              The relative speed is a largely a quality of implementation issue. Java
                              will always be limited because it has to run from bytecode, but the
                              Hotspot engine can make quite aggressive optimisations, given time. On
                              the other hand, the only optimisation 'ocamlopt' does is to inline
                              short functions. (I note that the Ocaml test in the 'shootout' was
                              compiled in 'unsafe' mode (i.e. without bounds checking); this is
                              slightly unfair, as you cannot do this in Java!)

                              As for a 'eureka moment', I doubt that will come. Some problem domains
                              are much easier to tackle using a functional approach, but yours isn't
                              one of them. Even so, `O'Caml is as good an 'all-rounder' as I've ever
                              seen. With C++, I often have to fight with the language to get it to do
                              what I want, while O'Caml is much more docile.


                              Alwyn
                            • Richard Jones
                              ... It s fair to say that you ll only get the near C performance from OCaml if you: * use an imperative approach (arrays, for loops) * compile with -unsafe
                              Message 14 of 18 , Sep 20, 2004
                                On Mon, Sep 20, 2004 at 09:33:32AM -0000, zuluzombie1 wrote:
                                > It looks like I won't get a speed increase by switching to
                                > Ocaml but (if I ever get the 'eureka' experience) I
                                > might get a 'power' increase from closures.

                                It's fair to say that you'll only get the "near C" performance from
                                OCaml if you:

                                * use an imperative approach (arrays, for loops)
                                * compile with -unsafe

                                Also when compiling code which uses floats, it appears you may need to
                                inline some functions by hand if they can't be inlined automatically
                                by ocamlopt, or at least that is my reading of
                                http://caml.inria.fr/ocaml/numerical.html

                                Rich.

                                --
                                Richard Jones. http://www.annexia.org/ http://www.j-london.com/
                                Merjis Ltd. http://www.merjis.com/ - improving website return on investment
                                Perl4Caml lets you use any Perl library in your type-safe Objective
                                CAML programs. http://www.merjis.com/developers/perl4caml/


                                [Non-text portions of this message have been removed]
                              • Remi Vanicat
                                ... Welln not only. There are severall optimization, like the unboxing one (that is do not rebox all float after all operation, but only when needed), the tail
                                Message 15 of 18 , Sep 20, 2004
                                  On Mon, 20 Sep 2004 12:51:01 +0100, Alwyn <dt015a1979@...> wrote:
                                  > On 20 Sep 2004, at 10:33, zuluzombie1 wrote:
                                  >
                                  > > I'm guessing that this means either the VM Load time (although
                                  > > I believe he filters this out) is now less significant,
                                  > > or the hotspot interpreter now has the time to 'figure'
                                  > > out the 'hotspot'.
                                  > >
                                  > > It looks like I won't get a speed increase by switching to
                                  > > Ocaml but (if I ever get the 'eureka' experience) I
                                  > > might get a 'power' increase from closures.
                                  >
                                  > The relative speed is a largely a quality of implementation issue. Java
                                  > will always be limited because it has to run from bytecode, but the
                                  > Hotspot engine can make quite aggressive optimisations, given time. On
                                  > the other hand, the only optimisation 'ocamlopt' does is to inline
                                  > short functions.

                                  Welln not only. There are severall optimization, like the unboxing one
                                  (that is do not rebox all float after all operation, but only when
                                  needed), the tail call and tail recursive optimization... By the way,
                                  I don't know how good or poorly is optimized the code we are speaking
                                  of with regard to the unboxing optimizatation.
                                • Alwyn
                                  ... Je vous l accorde, monsieur. :-) :-) ... Well, when I write code like this: let electrons = Array.make array_length 0.0 let initialise_array () =
                                  Message 16 of 18 , Sep 20, 2004
                                    On 20 Sep 2004, at 15:30, Remi Vanicat wrote:

                                    > On Mon, 20 Sep 2004 12:51:01 +0100, Alwyn
                                    > <dt015a1979@...> wrote:
                                    > > The relative speed is a largely a quality of implementation issue.
                                    > Java
                                    > > will always be limited because it has to run from bytecode, but the
                                    > > Hotspot engine can make quite aggressive optimisations, given time.
                                    > On
                                    > > the other hand, the only optimisation 'ocamlopt' does is to inline
                                    > > short functions.
                                    >
                                    > Welln not only. There are severall optimization, like the unboxing one
                                    > (that is do not rebox all float after all operation, but only when
                                    > needed), the tail call and tail recursive optimization...

                                    Je vous l'accorde, monsieur. :-) :-)

                                    > By the way,
                                    > I don't know how good or poorly is optimized the code we are speaking
                                    > of with regard to the unboxing optimizatation.

                                    Well, when I write code like this:

                                    let electrons = Array.make array_length 0.0

                                    <snip>

                                    let initialise_array () =
                                       let rec loop i =
                                         if i < array_length - 1 then begin
                                           electrons.(i) <- 1.0;
                                           electrons.(i + 1) <- 2.0;
                                           electrons.(i + 2) <- 1.0;
                                           electrons.(i + 3) <- 2.0;
                                           loop (i + 4)
                                         end in
                                       loop 0

                                    <snip>

                                       let rec adjust_electrons i =
                                         if i < array_length - 1 then begin
                                           electrons.(i) <- electrons.(i + 2);
                                           electrons.(i + 1) <- electrons.(i + 3);
                                           electrons.(i + 2) <- add_jitter electrons.(i + 2);
                                           electrons.(i + 3) <- add_jitter electrons.(i + 3);
                                           adjust_electrons (i + 4)
                                         end

                                    it is not elegant, but I would at least expect the floating point
                                    accesses to be well optimised. The first function has manual loop
                                    unrolling too! (And it is rather faster than passing a function to
                                    Array.init.) But I must admit it did not occur to me to me to inline
                                    'add_jitter' by hand.


                                    Alwyn
                                  • Alwyn
                                    ... Yes, but such loops have quite limited applicability in O Caml. ... Yes, the effect of this can vary from nil to something quite substantial. ... I was
                                    Message 17 of 18 , Sep 21, 2004
                                      On 20 Sep 2004, at 13:08, Richard Jones wrote:

                                      > It's fair to say that you'll only get the "near C" performance from
                                      > OCaml if you:
                                      >
                                      > * use an imperative approach (arrays, for loops)

                                      Yes, but such loops have quite limited applicability in O'Caml.

                                      > * compile with -unsafe

                                      Yes, the effect of this can vary from nil to something quite
                                      substantial.

                                      > Also when compiling code which uses floats, it appears you may need to
                                      > inline some functions by hand if they can't be inlined automatically
                                      > by ocamlopt, or at least that is my reading of
                                      > http://caml.inria.fr/ocaml/numerical.html

                                      I was inclined to disbelieve you on this one, but I tried:
                                      let max_rand = 2.0 ** 30.0 -. 1.0 and
                                      jitter = 2.0 in
                                      let rec adjust_electrons i =
                                      if i < array_length then begin
                                      electrons.(i) <- electrons.(i + 2);
                                      electrons.(i + 1) <- electrons.(i + 3);
                                      electrons.(i + 2) <-
                                      electrons.(i + 2) +. jitter -. 2.0 *. jitter *.
                                      float_of_int (Random.bits ()) /. max_rand;
                                      electrons.(i + 3) <-
                                      electrons.(i + 3) +. jitter -. 2.0 *. jitter *.
                                      float_of_int (Random.bits ()) /. max_rand;
                                      adjust_electrons (i + 4)
                                      end in

                                      and together with '-unsafe' got a pretty substantial performance
                                      increase:

                                      ~/Scrap $ ./electrons1
                                      total time = 3.09126091003

                                      However, it is still nowhere near as good as optimised C++ with GCC:

                                      ~/Scrap $ ./electrons3
                                      total time = 1.94

                                      Seeing that GCC has relatively poor optimisation support, at least on
                                      my platform (the IBM compiler is reputed to much better, and so is the
                                      Intel compiler on the Intel architecture), I would suggest that people
                                      who want performance at all costs should look at C or C++, or, in the
                                      case of pure numerical computing, Fortran.


                                      Alwyn
                                    Your message has been successfully submitted and would be delivered to recipients shortly.