getting rid of a global variable

  • Martin DeMello
    Message 1 of 12 , Feb 2, 2008
      I'm writing a wordsearch program that reads a dictionary file into a
      global array:

      (*--------------------------------------------------------------------------------------------------*)

      open Bigarray
      open Printf

      let dawg =
        let fd = Unix.openfile "csw.dwg" [ Unix.O_RDONLY ] 0 in
        Array1.map_file fd int32 c_layout false (-1);;

      (* bitfield accessors *)
      let w_pos = Int32.shift_left Int32.one 23;;
      let n_pos = Int32.shift_left Int32.one 22;;
      let ptr_mask = Int32.of_string "0b0000000000011111111111111111111";;
      let start_node = 1;;

      let _letter node = Char.chr(Int32.to_int(Int32.shift_right_logical node 24));;
      let _wordp node = (Int32.logand node w_pos) <> Int32.zero;;
      let _lastp node = (Int32.logand node n_pos) <> Int32.zero;;
      let _ptr node = Int32.to_int (Int32.logand node ptr_mask);;

      (* access nodes via their dawg index *)
      let lastp ix = _lastp dawg.{ix};;
      let wordp ix = _wordp dawg.{ix};;
      let letter ix = _letter dawg.{ix};;
      let ptr ix = _ptr dawg.{ix};;

      (*--------------------------------------------------------------------------------------------------*)

      and thereafter uses the lastp, wordp, letter and ptr functions as the
      public interface for the rest of the code. This saves me having to
      pass the dawg as a parameter to every function in the program, but is
      getting in the way of refactoring the program now, and makes the code
      feel unmaintainable. My instinctive reaction is to wrap the dawg and
      its accessors into a class, but from what I've gathered, most OCaml
      programmers don't really use classes, especially for speed-critical
      code. What's the best way to go about doing this?

      martin
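
A small sketch of how the bitfield accessors above decode a single node, for readers who want to try them without the dictionary file. The layout (letter in the top 8 bits, end-of-word flag at bit 23, last-sibling flag at bit 22, pointer in the low bits) is read off the posted code; `make_node` and `sample` are hypothetical names added only for illustration.

```ocaml
(* Masks and accessors as posted in the thread. *)
let w_pos = Int32.shift_left Int32.one 23
let n_pos = Int32.shift_left Int32.one 22
let ptr_mask = Int32.of_string "0b0000000000011111111111111111111"

let _letter node = Char.chr (Int32.to_int (Int32.shift_right_logical node 24))
let _wordp node = Int32.logand node w_pos <> Int32.zero
let _lastp node = Int32.logand node n_pos <> Int32.zero
let _ptr node = Int32.to_int (Int32.logand node ptr_mask)

(* Hypothetical helper, not in the original post: pack the four fields
   into one int32 node so the accessors can be exercised in isolation. *)
let make_node ~letter ~wordp ~lastp ~ptr =
  List.fold_left Int32.logor Int32.zero
    [ Int32.shift_left (Int32.of_int (Char.code letter)) 24;
      (if wordp then w_pos else Int32.zero);
      (if lastp then n_pos else Int32.zero);
      Int32.logand (Int32.of_int ptr) ptr_mask ]

(* A node for the letter 'a' that ends a word and points at index 42. *)
let sample = make_node ~letter:'a' ~wordp:true ~lastp:false ~ptr:42
```

The accessors take a node value rather than an index, so they do not depend on the global array at all; only the four index-based wrappers do.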
    • Richard Jones
      Message 2 of 12 , Feb 2, 2008
        On Sat, Feb 02, 2008 at 06:42:26PM +0530, Martin DeMello wrote:
        > I'm writing a wordsearch program that reads a dictionary file into a
        > global array:
        >
        > (*--------------------------------------------------------------------------------------------------*)
        >
        > open Bigarray
        > open Printf
        >
        > let dawg =
        > let fd = Unix.openfile "csw.dwg" [ Unix.O_RDONLY ] 0 in
        > Array1.map_file fd int32 c_layout false (-1);;
        >
        > (* bitfield accessors *)
        > let w_pos = Int32.shift_left Int32.one 23;;
        > let n_pos = Int32.shift_left Int32.one 22;;
        > let ptr_mask = Int32.of_string "0b0000000000011111111111111111111";;
        > let start_node = 1;;
        >
        > let _letter node = Char.chr(Int32.to_int(Int32.shift_right_logical node 24));;
        > let _wordp node = (Int32.logand node w_pos) <> Int32.zero;;
        > let _lastp node = (Int32.logand node n_pos) <> Int32.zero;;
        > let _ptr node = Int32.to_int (Int32.logand node ptr_mask);;
        >
        > (* access nodes via their dawg index *)
        > let lastp ix = _lastp dawg.{ix};;
        > let wordp ix = _wordp dawg.{ix};;
        > let letter ix = _letter dawg.{ix};;
        > let ptr ix = _ptr dawg.{ix};;
        >
        > (*--------------------------------------------------------------------------------------------------*)
        >
        > and thereafter uses the lastp, wordp, letter and ptr functions as the
        > public interface for the rest of the code. This saves me having to
        > pass the dawg as a parameter to every function in the program, but is
        > getting in the way of refactoring the program now, and makes the code
        > feel unmaintainable. My instinctive reaction is to wrap the dawg and
        > its accessors into a class, but from what I've gathered, most OCaml
        > programmers don't really use classes, especially for speed-critical
        > code. What's the best way to go about doing this?

        I'm not quite sure where your pain point is, but if I'm understanding
        this you want to hide the _symbols? As you said, avoid classes.
        Instead, use nesting:

        let lastp, wordp, letter, ptr =
          (* All the above code, indented by 2 spaces, with ';;' replaced by 'in' *)
          lastp, wordp, letter, ptr ;;

        Then the following code has access to only lastp, wordp, letter and
        ptr, and no access to the other symbols.

        If I've missed the point please let me know.
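
Spelled out, the nesting suggestion might look like the sketch below. To keep it runnable without the dictionary file, the memory-mapped array is replaced by a two-element in-memory Bigarray; that stand-in and its contents are assumptions, not part of the original program. (On OCaml older than 4.07, Bigarray has to be linked as a separate library.)

```ocaml
let lastp, wordp, letter, ptr =
  (* Assumption: an in-memory array stands in for the mapped "csw.dwg".
     Node 1 encodes letter 'a', end-of-word set, pointer 5. *)
  let dawg =
    Bigarray.(Array1.of_array int32 c_layout [| 0l; 0x61800005l |]) in
  let w_pos = Int32.shift_left Int32.one 23 in
  let n_pos = Int32.shift_left Int32.one 22 in
  let ptr_mask = Int32.of_string "0b0000000000011111111111111111111" in
  let _letter node =
    Char.chr (Int32.to_int (Int32.shift_right_logical node 24)) in
  let _wordp node = Int32.logand node w_pos <> Int32.zero in
  let _lastp node = Int32.logand node n_pos <> Int32.zero in
  let _ptr node = Int32.to_int (Int32.logand node ptr_mask) in
  let lastp ix = _lastp dawg.{ix} in
  let wordp ix = _wordp dawg.{ix} in
  let letter ix = _letter dawg.{ix} in
  let ptr ix = _ptr dawg.{ix} in
  (* Only this tuple escapes; dawg and the _helpers stay private. *)
  (lastp, wordp, letter, ptr)
```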

        Rich.

        --
        Richard Jones
        Red Hat
      • Martin DeMello
        Message 3 of 12 , Feb 2, 2008
          On Feb 3, 2008 1:48 AM, Richard Jones <rich@...> wrote:
          >
          > I'm not quite sure where your pain point is, but if I'm understanding
          > this you want to hide the _symbols?

          No, I now have

          let dawg =
            let fd = Unix.openfile "csw.dwg" [ Unix.O_RDONLY ] 0 in
            Array1.map_file fd int32 c_layout false (-1);;

          which is a global array initialized when the program is run. Then I have, say,

          let ptr ix = _ptr dawg.{ix};;

          which depends on the global variable 'dawg', and later on, some client
          function that uses ptr. Now I want to move all the code I posted into
          its own compilation unit, in order to separate out what in an OOP
          model would be the member functions of the dawg object from the code
          that actually uses the dawg to do stuff. So I'd need to have more
          control over when the dawg itself was initialised:

          let read_dawg filename =
            let dawg =
              let fd = Unix.openfile filename [ Unix.O_RDONLY ] 0 in
              Array1.map_file fd int32 c_layout false (-1)
            in dawg;;

          and in my main file:

          let dawg = read_dawg "csw.dwg"

          but then my accessor signatures would need to change from

          let ptr ix = _ptr dawg.{ix};;

          to

          let ptr dawg ix = _ptr dawg.{ix};;

          which would ripple out until every function that used the dawg would
          need to have it as an explicit parameter. The exact problem is I'd
          like dawg to be defined as a global variable in main.ml but be visible
          from functions in another file, though if there's a better way to do
          things I'd love to hear that too.

          (Reading your previous email, I realise I can say

          let lastp =
            let dawg = read_dawg "csw.dwg" in
            let lastp ix = _lastp dawg.{ix} in
            lastp;;

          but that's a boilerplate repeating of my interface definition in the
          main file, which seems ugly.)

          martin
        • Jon Harrop
          Message 4 of 12 , Feb 2, 2008
            On Saturday 02 February 2008 20:52:57 Martin DeMello wrote:
            > The exact problem is I'd
            > like dawg to be defined as a global variable in main.ml but be visible
            > from functions in another file, though if there's a better way to do
            > things I'd love to hear that too.

            Yes: take "dawg" out of the "main" file and put it in its own file that
            everything using it can depend upon.

            This is typically done for the definition of an "expr" type that must be
            visible both in a parser and in an evaluator.

            In more complicated situations you can also parameterize your code over the
            definitions that it uses via functors rather than higher-order functions.
            This was discussed in the latest OCaml Journal article.
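
A rough sketch of the "own file" layout. Three modules in one file stand in for three compilation units; the file names dawg.ml, search.ml and main.ml, the `Search.describe` function, and the in-memory array replacing the mapped file are all assumptions made for illustration.

```ocaml
(* Stands in for dawg.ml: owns the array and its accessors. *)
module Dawg = struct
  (* Assumption: in-memory data instead of the memory-mapped file. *)
  let data =
    Bigarray.(Array1.of_array int32 c_layout [| 0l; 0x61800005l |])
  let ptr_mask = 0xFFFFFl
  let letter ix =
    Char.chr (Int32.to_int (Int32.shift_right_logical data.{ix} 24))
  let ptr ix = Int32.to_int (Int32.logand data.{ix} ptr_mask)
end

(* Stands in for search.ml: depends on Dawg, knows nothing about main. *)
module Search = struct
  let describe ix = Printf.sprintf "%c -> %d" (Dawg.letter ix) (Dawg.ptr ix)
end

(* Stands in for main.ml. *)
let () = print_endline (Search.describe 1)
```

With real files, the link order (dawg, then search, then main) determines when `Dawg.data` is initialised: the top-level definitions of each unit run in link order, before anything in later units.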

            --
            Dr Jon D Harrop, Flying Frog Consultancy Ltd.
            http://www.ffconsultancy.com/products/?e
          • Martin DeMello
            Message 5 of 12 , Feb 3, 2008
              On Feb 3, 2008 4:13 AM, Jon Harrop <jon@...> wrote:
              >
              > On Saturday 02 February 2008 20:52:57 Martin DeMello wrote:
              > > The exact problem is I'd
              > > like dawg to be defined as a global variable in main.ml but be visible
              > > from functions in another file, though if there's a better way to do
              > > things I'd love to hear that too.
              >
              > Yes: take "dawg" out of the "main" file and put it in its own file that
              > everything using it can depend upon.

              How does that work with code that has side effects? Would appreciate a
              pointer to the relevant section of the docs - I didn't know where to
              look.

              martin
            • Richard Jones
              Message 6 of 12 , Feb 3, 2008
                On Sun, Feb 03, 2008 at 02:22:57AM +0530, Martin DeMello wrote:
                > which would ripple out until every function that used the dawg would
                > need to have it as an explicit parameter. The exact problem is I'd
                > like dawg to be defined as a global variable in main.ml but be visible
                > from functions in another file, though if there's a better way to do
                > things I'd love to hear that too.

                Hmmm, still not entirely clear to me.

                There is only one 'dawg'? It must be initialized once when the
                program starts up? If so, just make it a global variable. OCaml will
                initialize it at the right time (ie. before any functions which
                require it). And there is no need for 'dawg' to be passed into
                functions -- they can just reference the global.

                Of course this does require that the database file exists / is
                readable when the program starts up, and the file will need to have a
                fixed name. This may be a problem.

                If you want the database file to be read at a controlled point later,
                or you want to allow the user to interactively set the filename, then
                you can do it with a global reference initialized like this:

                let dawg = ref None ;;

                and later:

                let open_dawg filename =
                  let fd = Unix.openfile filename [ Unix.O_RDONLY ] 0 in
                  dawg := Some (Array1.map_file fd int32 c_layout false (-1)) ;;

                and then the accessor functions have to check explicitly for the
                uninitialized case:

                let ptr ix =
                match !dawg with
                | None -> failwith "sorry, you need to call open_dawg first"
                | Some dawg -> _ptr dawg.{ix} ;;
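
The pieces above can be put together into something runnable along these lines. To avoid needing the file and the Unix library, the sketch uses a hypothetical `install_dawg` that takes an already-loaded array in place of `open_dawg`; `get_dawg` is likewise an added helper so each accessor does not repeat the match.

```ocaml
open Bigarray

type dawg = (int32, int32_elt, c_layout) Array1.t

(* The global starts out empty and is filled in at a controlled point. *)
let dawg : dawg option ref = ref None

(* Hypothetical stand-in for open_dawg: installs an in-memory array. *)
let install_dawg arr = dawg := Some arr

(* Added helper: one place that checks for the uninitialized case. *)
let get_dawg () =
  match !dawg with
  | None -> failwith "sorry, you need to call open_dawg first"
  | Some d -> d

let ptr_mask = 0xFFFFFl
let ptr ix = Int32.to_int (Int32.logand (get_dawg ()).{ix} ptr_mask)

let () =
  (* Before initialization the accessor fails with the message above. *)
  (match ptr 1 with
   | _ -> print_endline "unexpected: dawg already set"
   | exception Failure msg -> print_endline msg);
  install_dawg (Array1.of_array int32 c_layout [| 0l; 0x61800005l |]);
  Printf.printf "%d\n" (ptr 1)
```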

                Rich.

                --
                Richard Jones
                Red Hat
              • Jon Harrop
                Message 7 of 12 , Feb 3, 2008
                  On Sunday 03 February 2008 08:39:30 Martin DeMello wrote:
                  > On Feb 3, 2008 4:13 AM, Jon Harrop <jon@...> wrote:
                  > > On Saturday 02 February 2008 20:52:57 Martin DeMello wrote:
                  > > > The exact problem is I'd
                  > > > like dawg to be defined as a global variable in main.ml but be visible
                  > > > from functions in another file, though if there's a better way to do
                  > > > things I'd love to hear that too.
                  > >
                  > > Yes: take "dawg" out of the "main" file and put it in its own file that
                  > > everything using it can depend upon.
                  >
                  > How does that work with code that has side effects? Would appreciate a
                  > pointer to the relevant section of the docs - I didn't know where to
                  > look.

                  Same as C/C++/Java/...

                  --
                  Dr Jon D Harrop, Flying Frog Consultancy Ltd.
                  http://www.ffconsultancy.com/products/?e
                • Martin DeMello
                  Message 8 of 12 , Feb 3, 2008
                    On Feb 3, 2008 3:31 PM, Richard Jones <rich@...> wrote:
                    >
                    > If you want the database file to be read at a controlled point later,
                    > or you want to allow the user to interactively set the filename, then
                    > you can do it with a global reference initialized like this:
                    >
                    > let dawg = ref None ;;

                    Thanks, that works nicely. I'm taking an extra dereference hit every
                    time I access the dawg, but I can always optimise later.

                    martin
                  • Jon Harrop
                    Message 9 of 12 , Feb 3, 2008
                      On Sunday 03 February 2008 14:06:11 Martin DeMello wrote:
                      > On Feb 3, 2008 3:31 PM, Richard Jones <rich@...> wrote:
                      > > If you want the database file to be read at a controlled point later,
                      > > or you want to allow the user to interactively set the filename, then
                      > > you can do it with a global reference initialized like this:
                      > >
                      > > let dawg = ref None ;;
                      >
                      > Thanks, that works nicely. I'm taking an extra dereference hit every
                      > time I access the dawg, but I can always optimise later.

                      You might also like to try:

                      let dawg = lazy (Array. ...)

                      and accessing it with "Lazy.force dawg". The code is then evaluated only the
                      first time it is used.
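
A minimal sketch of the lazy variant. The body of the suspension here builds an in-memory array (an assumption, standing in for the real file mapping), and the `loads` counter is added only to make the evaluate-once behaviour observable.

```ocaml
(* Counts how many times the suspended loading code actually runs. *)
let loads = ref 0

(* Nothing inside the parentheses runs until the first Lazy.force. *)
let dawg = lazy (
  incr loads;
  (* Assumption: in-memory data instead of mapping the dictionary file. *)
  Bigarray.(Array1.of_array int32 c_layout [| 0l; 0x61800005l |]))

(* Accessors force the suspension; later forces reuse the cached array. *)
let ptr ix = Int32.to_int (Int32.logand (Lazy.force dawg).{ix} 0xFFFFFl)
```

Calling `ptr` any number of times leaves `!loads` at 1, and it stays at 0 if `ptr` is never called. Unlike the `ref None` version, the file name still has to be known where `dawg` is defined.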

                      --
                      Dr Jon D Harrop, Flying Frog Consultancy Ltd.
                      http://www.ffconsultancy.com/products/?e
                    • Martin DeMello
                      Message 10 of 12 , Feb 4, 2008
                        On Feb 3, 2008 2:19 PM, Jon Harrop <jon@...> wrote:
                        > You might also like to try:
                        >
                        > let dawg = lazy (Array. ...)
                        >
                        > and accessing it with "Lazy.force dawg". The code is then evaluated only
                        > the
                        > first time it is used.

                        That's pretty neat!

                        martin
                      • Peng Zang
                        Message 11 of 12 , Feb 4, 2008

                          On Saturday 02 February 2008 08:12:26 am Martin DeMello wrote:
                          > feel unmaintainable. My instinctive reaction is to wrap the dawg and
                          > its accessors into a class, but from what I've gathered, most OCaml
                          > programmers don't really use classes, especially for speed-critical
                          > code. What's the best way to go about doing this?
                          >
                          > martin

                          Just would like to point out there's nothing wrong with using classes. One of
                          the best points of OCaml in my opinion is the ability to express your code
                          the way it best suits you. As to speed, someone on this list (can't
                          recall who just now) mentioned that objects can be quite fast. Method calls,
                          for example, are cached (although I don't think they can be inlined, so you'll
                          always incur a function application cost -- but that's pretty low).

                          So if you think classes work well here (eg. if you may have multiple
                          dictionaries open at once and so a global variable would be bad), a
                          class-based approach like this is fine:

                          open Bigarray

                          (* bitfield accessors *)
                          let w_pos = Int32.shift_left Int32.one 23;;
                          let n_pos = Int32.shift_left Int32.one 22;;
                          let ptr_mask = Int32.of_string "0b0000000000011111111111111111111";;
                          let start_node = 1;;

                          let _letter node = Char.chr(Int32.to_int(Int32.shift_right_logical node 24));;
                          let _wordp node = (Int32.logand node w_pos) <> Int32.zero;;
                          let _lastp node = (Int32.logand node n_pos) <> Int32.zero;;
                          let _ptr node = Int32.to_int (Int32.logand node ptr_mask);;

                          (* access nodes via their dawg index *)
                          class wordsearcher dictfilename = object
                          val dawg = Array1.map_file (Unix.openfile dictfilename [ Unix.O_RDONLY ] 0)
                          int32 c_layout false (-1)
                          method lastp ix = _lastp dawg.{ix}
                          method wordp ix = _wordp dawg.{ix}
                          method letter ix = _letter dawg.{ix}
                          method ptr ix = _ptr dawg.{ix}
                          end
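
To illustrate the "multiple dictionaries open at once" point, here is a cut-down sketch of the same class idea. It differs from the class above in one deliberate way: the constructor takes an already-loaded array rather than a file name, so that it runs without any dictionary file. The two sample arrays and the names `english` and `other` are made up.

```ocaml
open Bigarray

let ptr_mask = 0xFFFFFl

(* Assumption: takes the array directly instead of a file name. *)
class wordsearcher (dawg : (int32, int32_elt, c_layout) Array1.t) = object
  method letter ix =
    Char.chr (Int32.to_int (Int32.shift_right_logical dawg.{ix} 24))
  method ptr ix = Int32.to_int (Int32.logand dawg.{ix} ptr_mask)
end

(* Two independent dictionaries, no shared global state. *)
let english =
  new wordsearcher (Array1.of_array int32 c_layout [| 0l; 0x61800005l |])
let other =
  new wordsearcher (Array1.of_array int32 c_layout [| 0l; 0x62000007l |])
```

Each object carries its own array, so `english#letter 1` and `other#letter 1` can differ.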
                        • yami_no_shoryuu
                          Message 12 of 12 , Feb 6, 2008
                            I would propose functor-based code which does pretty much the same :)
                            as Jon suggested in
                            http://tech.groups.yahoo.com/group/ocaml_beginners/message/9311

                            (* code starts *)
                            module Ops_fd(S: sig val fd: Unix.file_descr end) = struct
                            open Bigarray
                            open Printf

                            let dawg = Array1.map_file S.fd int32 c_layout false (- 1);;

                            (* bitfield accessors *)
                            let w_pos = Int32.shift_left Int32.one 23;;
                            let n_pos = Int32.shift_left Int32.one 22;;
                            let ptr_mask = Int32.of_string "0b0000000000011111111111111111111";;
                            let start_node = 1;;

                            let _letter node = Char.chr(Int32.to_int(Int32.shift_right_logical node 24));;
                            let _wordp node = (Int32.logand node w_pos) <> Int32.zero;;
                            let _lastp node = (Int32.logand node n_pos) <> Int32.zero;;
                            let _ptr node = Int32.to_int (Int32.logand node ptr_mask);;

                            (* access nodes via their dawg index *)
                            let lastp ix = _lastp dawg.{ ix };;
                            let wordp ix = _wordp dawg.{ ix };;
                            let letter ix = _letter dawg.{ ix };;
                            let ptr ix = _ptr dawg.{ ix };;

                            (* Close operation *)
                            let close () =
                            Unix.close S.fd
                            ;;

                            end

                            module Ops(S: sig val filename: string end) = struct
                            include Ops_fd(struct
                            let fd = Unix.openfile S.filename [ Unix.O_RDONLY ] 0
                            end)
                            end

                            (* Usage example *)
                            let () =
                            let module Op = Ops(struct let filename="csw.dwg" end ) in
                            ignore (Op.letter 1) ;
                            Op.close ()
                            ;;
                            (* Or *)
                            module Op' = Ops(struct let filename="csw.dwg" end )
                            open Op'
                            let () =
                            ignore (letter 1) ;
                            close ()
                            ;; (* I'd prefer first variant to avoid namespace cluttering *)
                            (* code ends *)

                            The purely functional approach is to avoid using implicit state at all and use something like
                            let lastp a ix = _lastp a.{ ix };;
                            let wordp a ix = _wordp a.{ ix };;
                            let letter a ix = _letter a.{ ix };;
                            let ptr a ix = _ptr a.{ ix };;
                            probably making shorthands for actual usage like
                            let a = .... in
                            ...
                            let ( ~! ) = lastp a in ...

                            And objects are also quite a natural solution here, so it is a question of taste entirely :)
                            --
                            Good luck, Igor Borski