Loading ...
Sorry, an error occurred while loading the content.

What is the best way to code this data structure?

Expand Messages
  • Max Hayden Chiz
    I have a structured list that I m trying to figure out how to code in a way that takes advantage of the type checker and pattern matching exhaustiveness
    Message 1 of 12 , Oct 31, 2012
    • 0 Attachment
      I have a structured list that I'm trying to figure out how to code in
      a way that takes advantage of the type checker and pattern matching
      exhaustiveness checks.

      The rules for structuring the list are that:
      The "end" of the list is always an "S".
      The head of the list is either an "X" or an "O".
      An "X" can be followed by either an "O" or an "S".
      An "O" can be followed by either an "X" or an "S".

      Put another way, "X" and "O" alternate until you reach the end of the
      list (where there is an "S").

      My first cut at coding this data structure just used a tuple to hold
      the data and had a field in the tuple that indicated which of the
      three types it was. The problem with this is that the type checker
      can't confirm that my functions aren't building the structure in
      disallowed ways, and that the exhaustiveness check for pattern
      matching throws lots of spurious warnings.

      So I'd like to alter my code to fix this, but the "obvious" way of
      doing it seems like a kludge, so I'm wondering if I'm doing this
      right.

      My idea was to have two mutually recursive types and a wrapper type to
      allow my functions to work on any list:

      type lu = X of (data * ld) | Su of data
      and ld = O of (data * lu) | Sd of data

      type mylist = Ul of lu | Dl of ld

      That ensures that you can't have two Xs or two Os in a row, but it
      adds some verbosity to the pattern matching to undo the wrapper and
      results in having two identical cases for "S" instead of just one.

      So is what I'm thinking of the "right" way to tell OCaml about my data
      structure's invariants, or is there a better way to do this?

      Thanks for any help you can provide.

      --Max H. Chiz
    • Dan Bensen
      ... list (where there is an S ). This will code all the information in the list: type xos_list = { bool starts_with_o; int num_xos; bool ends_with_x } Is that
      Message 2 of 12 , Oct 31, 2012
      • 0 Attachment
        > "X" and "O" alternate until you reach the end of the
        list (where there is an "S").

        This will code all the information in the list:

        type xos_list = { bool starts_with_o; int num_xos; bool ends_with_x }

        Is that what you're trying to do?


        [Non-text portions of this message have been removed]
      • Dan Bensen
        ... Oh. I thought those were strings. You don t need quotation marks around type names. type xos_list = { o_opt: O option; xos: (X*O) array; x_opt: X option;
        Message 3 of 12 , Oct 31, 2012
        • 0 Attachment
          > The "end" of the list is always an "S".
          > The head of the list is either an "X" or an "O".
          > An "X" can be followed by either an "O" or an "S".
          > An "O" can be followed by either an "X" or an "S".

          Oh. I thought those were strings.
          You don't need quotation marks around type names.

          type xos_list = {
          o_opt: O option;
          xos: (X*O) array;
          x_opt: X option;
          s: S }

          o_opt holds the first O if the list starts with an O.
          x_opt holds the last X if the list ends with an X.


          [Non-text portions of this message have been removed]
        • Jeff Massung
          This seems pretty mathematical in nature (meaning everything can be inferred from just a little information). Can t your data structure just be: type val = X |
          Message 4 of 12 , Oct 31, 2012
          • 0 Attachment
            This seems pretty mathematical in nature (meaning everything can be
            inferred from just a little information). Can't your data structure just be:

            type val = X | O

            type sequence =
            { start : val
            ; len : int
            }

            let inverse = function
            | X -> O
            | O -> X

            let get seq i =
            if i >= seq.len
            then None
            else
            if odd i then inverse seq.start else seq.start

            let seq_of_list = function ... (* can raise Invalid_sequence here *)

            Anyway, I'm not sure you actually need a list or array, etc. You just need
            a start and a length and if you want to validate anything, just validate in
            the seq_of_list function. Unless I misunderstood something...

            Jeff M.

            On Wed, Oct 31, 2012 at 1:24 PM, Max Hayden Chiz <max.chiz@...> wrote:

            > **
            >
            >
            > I have a structured list that I'm trying to figure out how to code in
            > a way that takes advantage of the type checker and pattern matching
            > exhaustiveness checks.
            >
            > The rules for structuring the list are that:
            > The "end" of the list is always an "S".
            > The head of the list is either an "X" or an "O".
            > An "X" can be followed by either an "O" or an "S".
            > An "O" can be followed by either an "X" or an "S".
            >
            > Put another way, "X" and "O" alternate until you reach the end of the
            > list (where there is an "S").
            >
            > My first cut at coding this data structure just used a tuple to hold
            > the data and had a field in the tuple that indicated which of the
            > three types it was. The problem with this is that the type checker
            > can't confirm that my functions aren't building the structure in
            > disallowed ways, and that the exhaustiveness check for pattern
            > matching throws lots of spurious warnings.
            >
            > So I'd like to alter my code to fix this, but the "obvious" way of
            > doing it seems like a kludge, so I'm wondering if I'm doing this
            > right.
            >
            > My idea was to have two mutually recursive types and a wrapper type to
            > allow my functions to work on any list:
            >
            > type lu = X of (data * ld) | Su of data
            > and ld = O of (data * lu) | Sd of data
            >
            > type mylist = Ul of lu | Dl of ld
            >
            > That ensures that you can't have two Xs or two Os in a row, but it
            > adds some verbosity to the pattern matching to undo the wrapper and
            > results in having two identical cases for "S" instead of just one.
            >
            > So is what I'm thinking of the "right" way to tell OCaml about my data
            > structure's invariants, or is there a better way to do this?
            >
            > Thanks for any help you can provide.
            >
            > --Max H. Chiz
            >
            >


            [Non-text portions of this message have been removed]
          • Lukasz Stafiniak
            ... The end of the list is always an S . ... I think you also need: an S is always the end of the list. ... polymorphic variants. Perhaps something like:
            Message 5 of 12 , Oct 31, 2012
            • 0 Attachment
              On Wed, Oct 31, 2012 at 7:24 PM, Max Hayden Chiz <max.chiz@...> wrote:

              > **
              >
              >
              > I have a structured list that I'm trying to figure out how to code in
              > a way that takes advantage of the type checker and pattern matching
              > exhaustiveness checks.
              >
              > The rules for structuring the list are that:
              >
              The "end" of the list is always an "S".
              >
              I think you also need: an S is always the end of the list.

              > The head of the list is either an "X" or an "O".
              > An "X" can be followed by either an "O" or an "S".
              > An "O" can be followed by either an "X" or an "S".
              >
              > My idea was to have two mutually recursive types and a wrapper type to
              > allow my functions to work on any list:
              >
              > type lu = X of (data * ld) | Su of data
              > and ld = O of (data * lu) | Sd of data
              >
              > type mylist = Ul of lu | Dl of ld
              >
              > This is already good. If you want to be "more clever", you need to use
              polymorphic variants. Perhaps something like:

              type lu = [`X of data * ld | `S of data]
              and ld = [`O of data * lu | `S of data]
              type l = [lu | ld]

              And for example:

              let f : l -> bool * data = function `X (d, _) -> true, d | `O (d, _) ->
              false, d | `S d -> false, d


              [Non-text portions of this message have been removed]
            • Max Hayden Chiz
              ... Well there s data attached to the Xs, Os, and Ss. The type constructors just specify which type of data goes in that spot of the list. A friend suggested
              Message 6 of 12 , Oct 31, 2012
              • 0 Attachment
                On Wed, Oct 31, 2012 at 2:18 PM, Jeff Massung <massung@...> wrote:
                > This seems pretty mathematical in nature (meaning everything can be
                > inferred from just a little information). Can't your data structure just be:
                >
                > type val = X | O

                Well there's data attached to the Xs, Os, and Ss. The type
                constructors just specify which type of data goes in that spot of the
                list.

                A friend suggested the following:
                type ('x, 'o, 's) alternatingList =
                Body of 'x * ('o, 'x, 's) alternatingList
                | Tail of 's

                So then you can do something like:
                let example = Body ({x = 1}, Body ({o = 1.0}, Tail {s = #"1"}))

                That seemed like the best solution I've seen, but the problem with
                this is that I need to do one thing if the data is an X and another
                thing if it is an O, and this data structure leaves me with no way to
                pattern match on whether the head of the list is an X, O, or S (since
                the functions manipulating the list have to be polymorphic). So I'm
                back to what I started with as the apparently reasonable solution.
              • Max Hayden Chiz
                ... No, X, O, and S are different types of data. So I need to be able to do something like: let myfun = fun (X, current)::(O,prev)::_ - doX current prev ...
                Message 7 of 12 , Oct 31, 2012
                • 0 Attachment
                  On Wed, Oct 31, 2012 at 1:46 PM, Dan Bensen <danbensen@...> wrote:

                  > **
                  >
                  >
                  > > "X" and "O" alternate until you reach the end of the
                  > list (where there is an "S").
                  >
                  > This will code all the information in the list:
                  >
                  > type xos_list = { bool starts_with_o; int num_xos; bool ends_with_x }
                  >
                  > Is that what you're trying to do?
                  >

                  No, X, O, and S are different types of data. So I need to be able to do
                  something like:

                  let myfun =
                  fun (X, current)::(O,prev)::_ -> doX current prev
                  | (X, current)::(S,prev)::_ -> doXS current prev
                  | (O, current)::(X, prev)::_ -> doO current prev
                  | (O, current)::(S, prev)::_ -> doOS current prev
                  | (S, sdata)::_ -> doS sdata

                  Similarly I need:
                  let myfun =
                  fun (X,current)::_::(X,prev)::_ -> doX current prev
                  | (O, current)::_::(O, prev)::_ -> doO current prev
                  | _::_::(S, prev)::_ -> doS prev 2
                  | _::(S, prev)::_ -> doS prev 1
                  | _::nil -> NONE

                  The problem with doing the above is that it is a simple list, and there's
                  no way to ensure that all of the different routines that build, manipulate,
                  and calculate things on the basis of this list are preserving
                  the invariants along the way. On top of this, the pattern matching is
                  non-exhaustive so it generates a ton of warnings.

                  What I suggested in my initial email would solve both of these problems,
                  but it seems unnecessarily messy.


                  [Non-text portions of this message have been removed]
                • Dan Bensen
                  ... That s what I wrote in my second response. You can ignore the first one. ... Your solution is okay. The problem is messy. [Non-text portions of this
                  Message 8 of 12 , Oct 31, 2012
                  • 0 Attachment
                    > No, X, O, and S are different types of data.

                    That's what I wrote in my second response.
                    You can ignore the first one.

                    > What I suggested in my initial email
                    > ... seems unnecessarily messy.

                    Your solution is okay. The problem is messy.


                    [Non-text portions of this message have been removed]
                  • Gabriel Scherer
                    I m fine with your solution (sometimes it s a good choice to get weaker static guarantees for less conceptual complexity), but I would like to point out that
                    Message 9 of 12 , Nov 1, 2012
                    • 0 Attachment
                      I'm fine with your solution (sometimes it's a good choice to get
                      weaker static guarantees for less conceptual complexity), but I would
                      like to point out that the polymorphic variant solution suggested by
                      Lukasz should work as well, and propose a simpler solution based on
                      irregular recursion:

                      Lukasz-inspired solution:

                      type ('x, 'o) lu = [ `X of 'x * ('x, 'o) ld | `S ]
                      and ('x, 'o) ld = [ `O of 'o * ('x, 'o) ld | `S ]

                      let rec iter doX doO doXS doOS = function
                      | `S -> ()
                      | `X (x, `S) -> doXS x
                      | `O (o, `S) -> doOS o
                      | `X (x, rest) -> doX x; iter doX doO doXS doOS rest
                      | `O (o, rest) -> doO o; iter doX doO doXS doOS rest

                      let () = iter print_int print_string (Printf.printf "%d\n") print_endline
                      (`X (1, `O ("o", `X (2, `O ("oo", `S)))))


                      Irregular recursive datatype:

                      type ('x, 'o) alterning =
                      | Stop
                      | One of 'x * ('o, 'x) alterning

                      (* abbreviation just to have a shorter signature *)
                      type 't u = 't -> unit

                      let rec iter : 'x 'o. 'x u -> 'o u -> 'x u -> 'o u -> ('x, 'o) alterning u =
                      fun do1 do2 do1S do2S ->
                      function
                      | Stop -> ()
                      | One (x, Stop) -> do1S x
                      | One (x, rest) -> do1 x; iter do2 do1 do2S do1S rest

                      let () = iter print_int print_string (Printf.printf "%d\n") print_endline
                      (One (1, One ("o", One (2, One ("oo", Stop)))))

                      The recursion pattern here is "irregular" because the recursive
                      occurence of "alternating" is done at a different type from its
                      definition (('o, 'x) vs. ('x, 'o)). The corresponding "iter" function
                      is therefore a so-called "polymorphic recursion", which does something
                      non-obvious with typing, and needs an explicit polymorphic annotation
                      (added since OCaml 3.12) to proceed.
                      (In slightly more details, the way recursive function are inferred in
                      ML is to pick a monomorphic type for them, type their body, and then
                      generalize what can be generalize; doing this on a function that calls
                      itself at a type different than its declaration will result in too
                      much unification and a weaker type than you want.)

                      On Thu, Nov 1, 2012 at 1:28 AM, Dan Bensen <danbensen@...> wrote:
                      >> No, X, O, and S are different types of data.
                      >
                      > That's what I wrote in my second response.
                      > You can ignore the first one.
                      >
                      >> What I suggested in my initial email
                      >> ... seems unnecessarily messy.
                      >
                      > Your solution is okay. The problem is messy.
                      >
                      >
                      > [Non-text portions of this message have been removed]
                      >
                      >
                      >
                      > ------------------------------------
                      >
                      > Archives up to December 31, 2011 are also downloadable at http://www.connettivo.net/cntprojects/ocaml_beginners
                      > The archives of the very official ocaml list (the seniors' one) can be found at http://caml.inria.fr
                      > Attachments are banned and you're asked to be polite, avoid flames etc.Yahoo! Groups Links
                      >
                      >
                      >
                    • Max Hayden Chiz
                      On Thu, Nov 1, 2012 at 3:04 AM, Gabriel Scherer ... Thank you for your help. I think this is probably the best way to go, but I m not sure how to figure out
                      Message 10 of 12 , Nov 1, 2012
                      • 0 Attachment
                        On Thu, Nov 1, 2012 at 3:04 AM, Gabriel Scherer
                        <gabriel.scherer@...>wrote:

                        > **
                        >
                        >
                        > I'm fine with your solution (sometimes it's a good choice to get
                        > weaker static guarantees for less conceptual complexity), but I would
                        > like to point out that the polymorphic variant solution suggested by
                        > Lukasz should work as well, and propose a simpler solution based on
                        > irregular recursion:
                        >
                        > Lukasz-inspired solution:
                        >
                        > type ('x, 'o) lu = [ `X of 'x * ('x, 'o) ld | `S ]
                        > and ('x, 'o) ld = [ `O of 'o * ('x, 'o) ld | `S ]
                        >
                        > let rec iter doX doO doXS doOS = function
                        > | `S -> ()
                        > | `X (x, `S) -> doXS x
                        > | `O (o, `S) -> doOS o
                        > | `X (x, rest) -> doX x; iter doX doO doXS doOS rest
                        > | `O (o, rest) -> doO o; iter doX doO doXS doOS rest
                        >
                        > let () = iter print_int print_string (Printf.printf "%d\n") print_endline
                        > (`X (1, `O ("o", `X (2, `O ("oo", `S)))))
                        >
                        > Irregular recursive datatype:
                        >
                        > type ('x, 'o) alterning =
                        > | Stop
                        > | One of 'x * ('o, 'x) alterning
                        >
                        > (* abbreviation just to have a shorter signature *)
                        > type 't u = 't -> unit
                        >
                        > let rec iter : 'x 'o. 'x u -> 'o u -> 'x u -> 'o u -> ('x, 'o) alterning u
                        > =
                        > fun do1 do2 do1S do2S ->
                        > function
                        > | Stop -> ()
                        > | One (x, Stop) -> do1S x
                        > | One (x, rest) -> do1 x; iter do2 do1 do2S do1S rest
                        >
                        Thank you for your help. I think this is probably the best way to go, but
                        I'm not sure how to figure out if the head of the list is 'x or 'o so that
                        I can pass the right do functions to iter.

                        Someone else had suggested something similar to this, but in order to be
                        able to start the recursion, it seemed like I had to have an extra type:

                        type mylist = X of (xdata,odata) alternating | O of (odata, xdata)
                        alternating

                        Is that right or is there a way to do away with this extra step?


                        [Non-text portions of this message have been removed]
                      • Lukasz Stafiniak
                        ... Thanks, nice example of polymorphic recursion! Thank you for your help. I think this is probably the best way to go, but ... first. The unfortunate thing
                        Message 11 of 12 , Nov 1, 2012
                        • 0 Attachment
                          On Thu, Nov 1, 2012 at 5:22 PM, Max Hayden Chiz <max.chiz@...> wrote:

                          > **
                          >
                          >
                          > On Thu, Nov 1, 2012 at 3:04 AM, Gabriel Scherer
                          > <gabriel.scherer@...>wrote:
                          > >
                          > > Irregular recursive datatype:
                          > >
                          > > type ('x, 'o) alterning =
                          > > | Stop
                          > > | One of 'x * ('o, 'x) alterning
                          > >
                          > > (* abbreviation just to have a shorter signature *)
                          > > type 't u = 't -> unit
                          > >
                          > > let rec iter : 'x 'o. 'x u -> 'o u -> 'x u -> 'o u -> ('x, 'o) alterning
                          > u
                          > > =
                          > > fun do1 do2 do1S do2S ->
                          > > function
                          > > | Stop -> ()
                          > > | One (x, Stop) -> do1S x
                          > > | One (x, rest) -> do1 x; iter do2 do1 do2S do1S rest
                          > >
                          >
                          Thanks, nice example of polymorphic recursion!

                          Thank you for your help. I think this is probably the best way to go, but
                          > I'm not sure how to figure out if the head of the list is 'x or 'o so that
                          > I can pass the right do functions to iter.
                          >
                          > The type system will keep reminding you which kind of elements comes
                          first. The unfortunate thing is you will not be able to neither "just not
                          care", nor "be prepared for both cases" -- the type system will require
                          that in a single function you only allow one type of elements to come first
                          in your list.



                          > Someone else had suggested something similar to this, but in order to be
                          > able to start the recursion, it seemed like I had to have an extra type:
                          >
                          > type mylist = X of (xdata,odata) alternating | O of (odata, xdata)
                          > alternating
                          >
                          > Is that right or is there a way to do away with this extra step?
                          >

                          This extra step remembers which kind of elements comes first, so that some
                          parts of your code can be agnostic about it, and other can handle both
                          cases by a single function. I think it's "right" (i.e. to avoid this step
                          and not introducing other complications you would need to go back to
                          polymorphic variant solution).


                          [Non-text portions of this message have been removed]
                        • danbensen@att.net
                          ... type x = int type o = string type s = string type xos_list = { o0o: o option; xos: (x*o) list; xfo: x option; s: s } let do_some f = function None - () |
                          Message 12 of 12 , Nov 1, 2012
                          • 0 Attachment
                            > I'm not sure how to figure out if the head of the list is 'x or 'o

                            type x = int
                            type o = string
                            type s = string

                            type xos_list = {
                            o0o: o option;
                            xos: (x*o) list;
                            xfo: x option;
                            s: s }

                            let do_some f = function None -> () | Some x -> f x

                            let rec iter do_x do_o { o0o; xos; xfo; _ } =
                            do_some do_o o0o; List.iter (fun (x,o) -> do_x x; do_o o) xos;
                            do_some do_x xfo

                            let () = iter print_int print_string
                            { o0o=None; xos = [(1,"o");(2,"oo")]; xfo = Some 3; s="example" }
                          Your message has been successfully submitted and would be delivered to recipients shortly.