Loading ...
Sorry, an error occurred while loading the content.
 

I need help deciphering Martin's hash table code

Expand Messages
  • Jean Saint-Remy
    Hi Everyone, Happy New Year 2012! I have been trying to decipher hash table code of Martin Jambon. The code can be found at
    Message 1 of 3 , Jan 1, 2012
      Hi Everyone,


      Happy New Year 2012!

      I have been trying to decipher hash table code of Martin Jambon. The code can be found at http://martin.jambon.free.fr/hashtbl2.ml.html.

      type ('a, 'b) t = ('a, 'b list ref) Hashtbl.t

      let add tbl key data =
      let r =
      try Hashtbl.find tbl key
      with Not_found ->
      let r = ref [] in
      Hashtbl.add tbl key r;
      r in
      r := data :: !r


      The type definition on the left side of the "=" sign is the same as in the OCaml Hashtbl reference.
      The right side of the "=" sign indicates that of any two types that are a hash, the second is bound as a list reference. The problem I am having is understanding the segment "let r =" in the add function. When adding a key to the hash table, first we check to see if it already exists in the table, if it is not found, we treat it as a list reference, assigning the "new" key to the table, and consing the value part into the ref list (here the k,v pairs are denoted by variables 'key' and 'data' respectively).

      Why are we assigning the "try...with" block to the variable 'r'? Do we need to emit some kind of a message that the key already exists in the hash table, or are we simply going to overwrite the value? What about the second 'let r = ref [] in", is it shadowing the variable 'r'? Meaning, the "try..with' block is no longer valid, instead 'r' is now a reference list, the table takes the "key" and "data" is inserted at the head of the list, and the content of 'r' is the tail of the list.

      I have faith that this is the most efficient add function as far as hash tables are concerned, but it is the first time that I come across a dangling variable "r in" in a statement. This seems to me very strange. Could we omit the "try...with" block and simply add the k,v pair to the hash table directly? In other words, skip to the second "let r = ref [] in" and assign the k,v pairs to the table?

      Any insight will be much appreciated.

      With kind regards,

      Jean
    • Dan Bensen
      ... That just means they both take two type arguments (and have the same name). Hashtbl2.t is a new type. ... The right side is a description of the new type.
      Message 2 of 3 , Jan 1, 2012
        > The type definition on the left side of the "=" sign is the same as in the
        >OCaml Hashtbl reference.

        That just means they both take two type arguments (and have the same name).
        Hashtbl2.t is a new type.

        > The right side of the "=" sign indicates that of any two types that are a hash,
        >the second is bound as a list reference.

        The right side is a description of the new type.
        'a is the type of the keys in a value of type Hashtbl2.t, and ('b list ref) is
        the type of the values.

        > The problem I am having is understanding the segment "let r =" in the add
        >function.

        First r is bound to a list ref, and then data is pushed onto it.
        This is the overall structure of the function:

        let add tbl key data = let r = ... in r := data :: !r

        This code should be equivalent (untested):

        (* add a new empty entry if there isn't one already,
        and return the value *)
        let find_or_make tbl key =
        try Hashtbl.find tbl key
        with Not_found ->
        let r = ref [] in
        Hashtbl.add tbl key r;
        r

        (* push data onto the (possibly new) entry for key *)
        let add tbl key data =
        let r = find_or_make tbl key in
        r := data :: !r

        > When adding a key to the hash table, first we check to see if it already exists
        >in the table

        When pushing a new element onto the *value* (which is a list ref), we first
        check to see if there's an entry for the *key*.

        > if it is not found, we treat it as a list reference

        If it's not found, we make a new one. We always treat the values as list refs,
        because that's what they are by the definition of Hashtbl2.t.

        > Why are we assigning the "try...with" block to the variable 'r'?

        r is bound to the value *returned* by the try expression.
        If tbl already has an entry for key,
        the try expression returns the value of (Hashtbl.find tbl key).
        If there's not an entry, the try expression returns the value of
        (let r = ref [] in Hashtbl.add tbl key r; r)

        > Do we need to emit some kind of a message that the key already exists in the
        >hash table

        No, the try form just returns the value.

        > or are we simply going to overwrite the value?

        The code overwrites the value in tbl after it creates a list with data consed
        onto the original list (or the new empty one).

        > What about the second 'let r = ref [] in", is it shadowing the variable 'r'?

        Yes, temporarily. That code is executed only if there's no pre-existing entry.

        [Non-text portions of this message have been removed]
      • Toby Kelsey
        ... This is what it looks like with clearer names and indentation, hopefully you can now see how this works. let add tbl key data = let val_list = try
        Message 3 of 3 , Jan 2, 2012
          On 02/01/12 01:47, Jean Saint-Remy wrote:
          > type ('a, 'b) t = ('a, 'b list ref) Hashtbl.t
          >
          > let add tbl key data =
          > let r =
          > try Hashtbl.find tbl key
          > with Not_found ->
          > let r = ref [] in
          > Hashtbl.add tbl key r;
          > r in
          > r := data :: !r

          This is what it looks like with clearer names and indentation, hopefully you can
          now see how this works.

          let add tbl key data =
          let val_list =
          try
          Hashtbl.find tbl key
          with Not_found ->
          let new_val_list = ref [] in
          (Hashtbl.add tbl key new_val_list; new_val_list)
          in
          val_list := data :: !val_list


          Toby
        Your message has been successfully submitted and would be delivered to recipients shortly.