## What is the best way to code this data structure?

Expand Messages
• I have a structured list that I m trying to figure out how to code in a way that takes advantage of the type checker and pattern matching exhaustiveness
Message 1 of 12 , Oct 31, 2012
I have a structured list that I'm trying to figure out how to code in
a way that takes advantage of the type checker and pattern matching
exhaustiveness checks.

The rules for structuring the list are that:
The "end" of the list is always an "S".
The head of the list is either an "X" or an "O".
An "X" can be followed by either an "O" or an "S".
An "O" can be followed by either an "X" or an "S".

Put another way, "X" and "O" alternate until you reach the end of the
list (where there is an "S").

My first cut at coding this data structure just used a tuple to hold
the data and had a field in the tuple that indicated which of the
three types it was. The problem with this is that the type checker
can't confirm that my functions aren't building the structure in
disallowed ways, and that the exhaustiveness check for pattern
matching throws lots of spurious warnings.

So I'd like to alter my code to fix this, but the "obvious" way of
doing it seems like a kludge, so I'm wondering if I'm doing this
right.

My idea was to have two mutually recursive types and a wrapper type to
allow my functions to work on any list:

type lu = X of (data * ld) | Su of data
and ld = O of (data * lu) | Sd of data

type mylist = Ul of lu | Dl of ld

That ensures that you can't have two Xs or two Os in a row, but it
adds some verbosity to the pattern matching to undo the wrapper and
results in having two identical cases for "S" instead of just one.

So is what I'm thinking of the "right" way to tell OCaml about my data
structure's invariants, or is there a better way to do this?

--Max H. Chiz
• ... list (where there is an S ). This will code all the information in the list: type xos_list = { bool starts_with_o; int num_xos; bool ends_with_x } Is that
Message 2 of 12 , Oct 31, 2012
> "X" and "O" alternate until you reach the end of the
list (where there is an "S").

This will code all the information in the list:

type xos_list = { bool starts_with_o; int num_xos; bool ends_with_x }

Is that what you're trying to do?

[Non-text portions of this message have been removed]
• ... Oh. I thought those were strings. You don t need quotation marks around type names. type xos_list = { o_opt: O option; xos: (X*O) array; x_opt: X option;
Message 3 of 12 , Oct 31, 2012
> The "end" of the list is always an "S".
> The head of the list is either an "X" or an "O".
> An "X" can be followed by either an "O" or an "S".
> An "O" can be followed by either an "X" or an "S".

Oh. I thought those were strings.
You don't need quotation marks around type names.

type xos_list = {
o_opt: O option;
xos: (X*O) array;
x_opt: X option;
s: S }

o_opt holds the first O if the list starts with an O.
x_opt holds the last X if the list ends with an X.

[Non-text portions of this message have been removed]
• This seems pretty mathematical in nature (meaning everything can be inferred from just a little information). Can t your data structure just be: type val = X |
Message 4 of 12 , Oct 31, 2012
This seems pretty mathematical in nature (meaning everything can be
inferred from just a little information). Can't your data structure just be:

type val = X | O

type sequence =
{ start : val
; len : int
}

let inverse = function
| X -> O
| O -> X

let get seq i =
if i >= seq.len
then None
else
if odd i then inverse seq.start else seq.start

let seq_of_list = function ... (* can raise Invalid_sequence here *)

Anyway, I'm not sure you actually need a list or array, etc. You just need
a start and a length and if you want to validate anything, just validate in
the seq_of_list function. Unless I misunderstood something...

Jeff M.

On Wed, Oct 31, 2012 at 1:24 PM, Max Hayden Chiz <max.chiz@...> wrote:

> **
>
>
> I have a structured list that I'm trying to figure out how to code in
> a way that takes advantage of the type checker and pattern matching
> exhaustiveness checks.
>
> The rules for structuring the list are that:
> The "end" of the list is always an "S".
> The head of the list is either an "X" or an "O".
> An "X" can be followed by either an "O" or an "S".
> An "O" can be followed by either an "X" or an "S".
>
> Put another way, "X" and "O" alternate until you reach the end of the
> list (where there is an "S").
>
> My first cut at coding this data structure just used a tuple to hold
> the data and had a field in the tuple that indicated which of the
> three types it was. The problem with this is that the type checker
> can't confirm that my functions aren't building the structure in
> disallowed ways, and that the exhaustiveness check for pattern
> matching throws lots of spurious warnings.
>
> So I'd like to alter my code to fix this, but the "obvious" way of
> doing it seems like a kludge, so I'm wondering if I'm doing this
> right.
>
> My idea was to have two mutually recursive types and a wrapper type to
> allow my functions to work on any list:
>
> type lu = X of (data * ld) | Su of data
> and ld = O of (data * lu) | Sd of data
>
> type mylist = Ul of lu | Dl of ld
>
> That ensures that you can't have two Xs or two Os in a row, but it
> adds some verbosity to the pattern matching to undo the wrapper and
> results in having two identical cases for "S" instead of just one.
>
> So is what I'm thinking of the "right" way to tell OCaml about my data
> structure's invariants, or is there a better way to do this?
>
>
> --Max H. Chiz
>
>

[Non-text portions of this message have been removed]
• ... The end of the list is always an S . ... I think you also need: an S is always the end of the list. ... polymorphic variants. Perhaps something like:
Message 5 of 12 , Oct 31, 2012
On Wed, Oct 31, 2012 at 7:24 PM, Max Hayden Chiz <max.chiz@...> wrote:

> **
>
>
> I have a structured list that I'm trying to figure out how to code in
> a way that takes advantage of the type checker and pattern matching
> exhaustiveness checks.
>
> The rules for structuring the list are that:
>
The "end" of the list is always an "S".
>
I think you also need: an S is always the end of the list.

> The head of the list is either an "X" or an "O".
> An "X" can be followed by either an "O" or an "S".
> An "O" can be followed by either an "X" or an "S".
>
> My idea was to have two mutually recursive types and a wrapper type to
> allow my functions to work on any list:
>
> type lu = X of (data * ld) | Su of data
> and ld = O of (data * lu) | Sd of data
>
> type mylist = Ul of lu | Dl of ld
>
> This is already good. If you want to be "more clever", you need to use
polymorphic variants. Perhaps something like:

type lu = [`X of data * ld | `S of data]
and ld = [`O of data * lu | `S of data]
type l = [lu | ld]

And for example:

let f : l -> bool * data = function `X (d, _) -> true, d | `O (d, _) ->
false, d | `S d -> false, d

[Non-text portions of this message have been removed]
• ... Well there s data attached to the Xs, Os, and Ss. The type constructors just specify which type of data goes in that spot of the list. A friend suggested
Message 6 of 12 , Oct 31, 2012
On Wed, Oct 31, 2012 at 2:18 PM, Jeff Massung <massung@...> wrote:
> This seems pretty mathematical in nature (meaning everything can be
> inferred from just a little information). Can't your data structure just be:
>
> type val = X | O

Well there's data attached to the Xs, Os, and Ss. The type
constructors just specify which type of data goes in that spot of the
list.

A friend suggested the following:
type ('x, 'o, 's) alternatingList =
Body of 'x * ('o, 'x, 's) alternatingList
| Tail of 's

So then you can do something like:
let example = Body ({x = 1}, Body ({o = 1.0}, Tail {s = #"1"}))

That seemed like the best solution I've seen, but the problem with
this is that I need to do one thing if the data is an X and another
thing if it is an O, and this data structure leaves me with no way to
pattern match on whether the head of the list is an X, O, or S (since
the functions manipulating the list have to be polymorphic). So I'm
back to what I started with as the apparently reasonable solution.
• ... No, X, O, and S are different types of data. So I need to be able to do something like: let myfun = fun (X, current)::(O,prev)::_ - doX current prev ...
Message 7 of 12 , Oct 31, 2012
On Wed, Oct 31, 2012 at 1:46 PM, Dan Bensen <danbensen@...> wrote:

> **
>
>
> > "X" and "O" alternate until you reach the end of the
> list (where there is an "S").
>
> This will code all the information in the list:
>
> type xos_list = { bool starts_with_o; int num_xos; bool ends_with_x }
>
> Is that what you're trying to do?
>

No, X, O, and S are different types of data. So I need to be able to do
something like:

let myfun =
fun (X, current)::(O,prev)::_ -> doX current prev
| (X, current)::(S,prev)::_ -> doXS current prev
| (O, current)::(X, prev)::_ -> doO current prev
| (O, current)::(S, prev)::_ -> doOS current prev
| (S, sdata)::_ -> doS sdata

Similarly I need:
let myfun =
fun (X,current)::_::(X,prev)::_ -> doX current prev
| (O, current)::_::(O, prev)::_ -> doO current prev
| _::_::(S, prev)::_ -> doS prev 2
| _::(S, prev)::_ -> doS prev 1
| _::nil -> NONE

The problem with doing the above is that it is a simple list, and there's
no way to ensure that all of the different routines that build, manipulate,
and calculate things on the basis of this list are preserving
the invariants along the way. On top of this, the pattern matching is
non-exhaustive so it generates a ton of warnings.

What I suggested in my initial email would solve both of these problems,
but it seems unnecessarily messy.

[Non-text portions of this message have been removed]
• ... That s what I wrote in my second response. You can ignore the first one. ... Your solution is okay. The problem is messy. [Non-text portions of this
Message 8 of 12 , Oct 31, 2012
> No, X, O, and S are different types of data.

That's what I wrote in my second response.
You can ignore the first one.

> What I suggested in my initial email
> ... seems unnecessarily messy.

Your solution is okay. The problem is messy.

[Non-text portions of this message have been removed]
• I m fine with your solution (sometimes it s a good choice to get weaker static guarantees for less conceptual complexity), but I would like to point out that
Message 9 of 12 , Nov 1, 2012
I'm fine with your solution (sometimes it's a good choice to get
weaker static guarantees for less conceptual complexity), but I would
like to point out that the polymorphic variant solution suggested by
Lukasz should work as well, and propose a simpler solution based on
irregular recursion:

Lukasz-inspired solution:

type ('x, 'o) lu = [ `X of 'x * ('x, 'o) ld | `S ]
and ('x, 'o) ld = [ `O of 'o * ('x, 'o) ld | `S ]

let rec iter doX doO doXS doOS = function
| `S -> ()
| `X (x, `S) -> doXS x
| `O (o, `S) -> doOS o
| `X (x, rest) -> doX x; iter doX doO doXS doOS rest
| `O (o, rest) -> doO o; iter doX doO doXS doOS rest

let () = iter print_int print_string (Printf.printf "%d\n") print_endline
(`X (1, `O ("o", `X (2, `O ("oo", `S)))))

Irregular recursive datatype:

type ('x, 'o) alterning =
| Stop
| One of 'x * ('o, 'x) alterning

(* abbreviation just to have a shorter signature *)
type 't u = 't -> unit

let rec iter : 'x 'o. 'x u -> 'o u -> 'x u -> 'o u -> ('x, 'o) alterning u =
fun do1 do2 do1S do2S ->
function
| Stop -> ()
| One (x, Stop) -> do1S x
| One (x, rest) -> do1 x; iter do2 do1 do2S do1S rest

let () = iter print_int print_string (Printf.printf "%d\n") print_endline
(One (1, One ("o", One (2, One ("oo", Stop)))))

The recursion pattern here is "irregular" because the recursive
occurence of "alternating" is done at a different type from its
definition (('o, 'x) vs. ('x, 'o)). The corresponding "iter" function
is therefore a so-called "polymorphic recursion", which does something
non-obvious with typing, and needs an explicit polymorphic annotation
(added since OCaml 3.12) to proceed.
(In slightly more details, the way recursive function are inferred in
ML is to pick a monomorphic type for them, type their body, and then
generalize what can be generalize; doing this on a function that calls
itself at a type different than its declaration will result in too
much unification and a weaker type than you want.)

On Thu, Nov 1, 2012 at 1:28 AM, Dan Bensen <danbensen@...> wrote:
>> No, X, O, and S are different types of data.
>
> That's what I wrote in my second response.
> You can ignore the first one.
>
>> What I suggested in my initial email
>> ... seems unnecessarily messy.
>
> Your solution is okay. The problem is messy.
>
>
> [Non-text portions of this message have been removed]
>
>
>
> ------------------------------------
>
> Archives up to December 31, 2011 are also downloadable at http://www.connettivo.net/cntprojects/ocaml_beginners
> The archives of the very official ocaml list (the seniors' one) can be found at http://caml.inria.fr
> Attachments are banned and you're asked to be polite, avoid flames etc.Yahoo! Groups Links
>
>
>
• On Thu, Nov 1, 2012 at 3:04 AM, Gabriel Scherer ... Thank you for your help. I think this is probably the best way to go, but I m not sure how to figure out
Message 10 of 12 , Nov 1, 2012
On Thu, Nov 1, 2012 at 3:04 AM, Gabriel Scherer
<gabriel.scherer@...>wrote:

> **
>
>
> I'm fine with your solution (sometimes it's a good choice to get
> weaker static guarantees for less conceptual complexity), but I would
> like to point out that the polymorphic variant solution suggested by
> Lukasz should work as well, and propose a simpler solution based on
> irregular recursion:
>
> Lukasz-inspired solution:
>
> type ('x, 'o) lu = [ `X of 'x * ('x, 'o) ld | `S ]
> and ('x, 'o) ld = [ `O of 'o * ('x, 'o) ld | `S ]
>
> let rec iter doX doO doXS doOS = function
> | `S -> ()
> | `X (x, `S) -> doXS x
> | `O (o, `S) -> doOS o
> | `X (x, rest) -> doX x; iter doX doO doXS doOS rest
> | `O (o, rest) -> doO o; iter doX doO doXS doOS rest
>
> let () = iter print_int print_string (Printf.printf "%d\n") print_endline
> (`X (1, `O ("o", `X (2, `O ("oo", `S)))))
>
> Irregular recursive datatype:
>
> type ('x, 'o) alterning =
> | Stop
> | One of 'x * ('o, 'x) alterning
>
> (* abbreviation just to have a shorter signature *)
> type 't u = 't -> unit
>
> let rec iter : 'x 'o. 'x u -> 'o u -> 'x u -> 'o u -> ('x, 'o) alterning u
> =
> fun do1 do2 do1S do2S ->
> function
> | Stop -> ()
> | One (x, Stop) -> do1S x
> | One (x, rest) -> do1 x; iter do2 do1 do2S do1S rest
>
Thank you for your help. I think this is probably the best way to go, but
I'm not sure how to figure out if the head of the list is 'x or 'o so that
I can pass the right do functions to iter.

Someone else had suggested something similar to this, but in order to be
able to start the recursion, it seemed like I had to have an extra type:

type mylist = X of (xdata,odata) alternating | O of (odata, xdata)
alternating

Is that right or is there a way to do away with this extra step?

[Non-text portions of this message have been removed]
• ... Thanks, nice example of polymorphic recursion! Thank you for your help. I think this is probably the best way to go, but ... first. The unfortunate thing
Message 11 of 12 , Nov 1, 2012
On Thu, Nov 1, 2012 at 5:22 PM, Max Hayden Chiz <max.chiz@...> wrote:

> **
>
>
> On Thu, Nov 1, 2012 at 3:04 AM, Gabriel Scherer
> <gabriel.scherer@...>wrote:
> >
> > Irregular recursive datatype:
> >
> > type ('x, 'o) alterning =
> > | Stop
> > | One of 'x * ('o, 'x) alterning
> >
> > (* abbreviation just to have a shorter signature *)
> > type 't u = 't -> unit
> >
> > let rec iter : 'x 'o. 'x u -> 'o u -> 'x u -> 'o u -> ('x, 'o) alterning
> u
> > =
> > fun do1 do2 do1S do2S ->
> > function
> > | Stop -> ()
> > | One (x, Stop) -> do1S x
> > | One (x, rest) -> do1 x; iter do2 do1 do2S do1S rest
> >
>
Thanks, nice example of polymorphic recursion!

Thank you for your help. I think this is probably the best way to go, but
> I'm not sure how to figure out if the head of the list is 'x or 'o so that
> I can pass the right do functions to iter.
>
> The type system will keep reminding you which kind of elements comes
first. The unfortunate thing is you will not be able to neither "just not
care", nor "be prepared for both cases" -- the type system will require
that in a single function you only allow one type of elements to come first

> Someone else had suggested something similar to this, but in order to be
> able to start the recursion, it seemed like I had to have an extra type:
>
> type mylist = X of (xdata,odata) alternating | O of (odata, xdata)
> alternating
>
> Is that right or is there a way to do away with this extra step?
>

This extra step remembers which kind of elements comes first, so that some
parts of your code can be agnostic about it, and other can handle both
cases by a single function. I think it's "right" (i.e. to avoid this step
and not introducing other complications you would need to go back to
polymorphic variant solution).

[Non-text portions of this message have been removed]
• ... type x = int type o = string type s = string type xos_list = { o0o: o option; xos: (x*o) list; xfo: x option; s: s } let do_some f = function None - () |
Message 12 of 12 , Nov 1, 2012
> I'm not sure how to figure out if the head of the list is 'x or 'o

type x = int
type o = string
type s = string

type xos_list = {
o0o: o option;
xos: (x*o) list;
xfo: x option;
s: s }

let do_some f = function None -> () | Some x -> f x

let rec iter do_x do_o { o0o; xos; xfo; _ } =
do_some do_o o0o; List.iter (fun (x,o) -> do_x x; do_o o) xos;
do_some do_x xfo

let () = iter print_int print_string
{ o0o=None; xos = [(1,"o");(2,"oo")]; xfo = Some 3; s="example" }
Your message has been successfully submitted and would be delivered to recipients shortly.