Loading ...
Sorry, an error occurred while loading the content.

Re: "ocaml_beginners"::[] Fwd: [Caml - list] Using zippers to handle huge trees

Expand Messages
  • Brian Hurt
    ... 1) They re easy to implement 2) They have very nice properties. In addition to fast searching, they also have reasonably fast inserts and deletes. 3)
    Message 1 of 7 , May 9, 2003
    • 0 Attachment
      On Fri, 9 May 2003, Stalkern 2 wrote:
      >
      > In Computer Science, trees are everywhere. Filesystems, LDAP, Document Bases,
      > XML, and so on and so forth.
      > Maybe it's one of the easiest recursive structures around.

      1) They're easy to implement

      2) They have very nice properties. In addition to fast searching, they
      also have reasonably fast inserts and deletes.

      3) Trees are a special case of the more general concept of graphs- which
      have their own branch of mathematical theory (called, coincidentally
      enough, graph theory) dedicated to them.

      > Now if I go and
      > look for trees in the Dictionary of Algorithms and Data Structures at
      > http://www.nist.gov/dads/ , or in Structure And Interpretation of Computer
      > Program by Abelson and Sussman, I get a... forest. But this forest seems to
      > be generally aimed at searching... on made-to-measure trees. So I'm in
      > trouble for recognizing these algorithms in real -life example. I wonder:

      You listed a whole slew of real-life uses of trees above :-). And you
      forgot/didn't notice hundreds of other examples.

      >
      > 1) What of the known tree structures is a plain XML file? I mean, the
      > structure is the one of the DTD, but does it have a specific name and theory
      > other than the one of Aristoteles, or is it "just a tree"?

      XML is just text. Now, once you *parse* the text, one valid way to view
      the text is as a tree. But it isn't the only way (for example, SAX views
      XML as a series of events- the events being SAX recognizing various tags).
      To be pendatic, it's DOM (or it's equivelents) which are trees, XML is
      just formatted text.

      >
      > 2) What of the known tree structures is its DOM? Is its
      > structure a footprint of teh XML's one, or does it come from some
      > other tradition or school of thought?

      DOM, the tree representation of an XML file, is a tree with different
      types of data of the nodes. You could implement it something like:

      type domData = Tag of string * (string * string) list * ...
      (* tag id, attributes, etc. *)
      | Cdata of string
      | ...

      type domNode = Leaf
      | Node of domData * domNode * domNode

      But note the similiarities between this definition and the definition from
      my previous post:

      type 'a tree = Leaf
      | Node of 'a * 'a tree * 'a tree

      I could have defined domNode just as:

      type domNode = domData tree

      The computer scientists just sort of assume you understand that
      generalization is going on. This allows them to think about trees in the
      abstract (and over the last 50 years, a lot of thinking about trees has
      gone on) and assume the programming will ungeneralize it as needed.

      >
      > 3) What of the known tree structures is a XML file with extended
      > links? Is it just a set of interwoven trees or is there a theory for
      > that, such for matrixes?

      Actually, how it works depends upon how the programmer decides to
      implement it. You can do it one of three obvious ways. First off, you
      can simply replace the link node with the root node of the tree the link
      node references. Or, the link node can simply contain a pointer to the
      root node of the tree it references, but still exist. Or the tree it
      references may or may not even exist in the program's memory, and instead
      the link node could contain data on how to fetch the tree (the URI/URL,
      for example). Depends upon what you want to do.

      >
      > 4) Trees can be ordered sets and for instance distinguish
      > left-branching and right-branching. Can they also implement n
      > dimension apart from ordered parenting and ordered sibling?

      Yes. A common data structure in graphs is the quad tree. Each node of a
      quad tree has four children. So the whole screeen (or window) would be
      the root node, and it's four children are the upper left, upper right,
      lower left, and lower right quadrants of the screen. Each quadrant in
      turn has four subquadrants, and so on. Where a particular graphics object
      is on the screen controls where it is in the quad tree. A variant of the
      quad tree for 3D is the oct tree. So these are examples of
      two-dimensional (or three-dimensional) partitioning of a tree.

      >
      > 5) Does linking between separate trees always require that a
      > supertree exists or can be thought of? (something like PXP's
      > superroots)

      No. See above.

      Brian
    • Stalkern 2
      ... You have been enlighting, Brian. I m touched and I have been thinking about your words until late night. Thank you so much... I m very curious about this
      Message 2 of 7 , May 11, 2003
      • 0 Attachment
        Il Friday 09 May 2003 17:50, Brian Hurt ha scritto:
        > On Fri, 9 May 2003, Stalkern 2 wrote:
        > > In Computer Science, trees are everywhere. Filesystems, LDAP, Document
        > > Bases, XML, and so on and so forth.
        > > Maybe it's one of the easiest recursive structures around.
        >
        > 1) They're easy to implement
        >
        > 2) They have very nice properties. In addition to fast searching, they
        > also have reasonably fast inserts and deletes.
        >
        > 3) Trees are a special case of the more general concept of graphs- which
        > have their own branch of mathematical theory (called, coincidentally
        > enough, graph theory) dedicated to them.

        You have been enlighting, Brian. I'm touched and I have been thinking about
        your words until late night.

        Thank you so much... I'm very curious about this knowledge.

        Ernesto
      • Stalkern 2
        ... [...] Following the words Nodes are doubly linked trees at http://www.ocaml-programming.de/packages/documentation/pxp/manual/x941.html I ve tried out
        Message 3 of 7 , May 13, 2003
        • 0 Attachment
          Il Monday 12 May 2003 00:59, Stalkern 2 ha scritto:
          > <html><body>
          >
          [...]

          Following the words "Nodes are doubly linked trees" at
          http://www.ocaml-programming.de/packages/documentation/pxp/manual/x941.html
          I've tried out trees as doubled-linked lists.
          I've been starting from
          http://caml.inria.fr/oreilly-book/html/book-ora031.html#toc43

          and wrote a type like

          type 'a node = {
          name : string ;
          mutable data : 'a ;
          mutable parent : 'a dll ;
          mutable level : int;
          mutable rank : int;
          mutable children : 'a dll list }
          and 'a dll = EmptyElement | Leaf of 'a | Node of 'a node ;;

          and a few functions:


          let add_child_by_name childName childData aNode =
          match aNode with
          EmptyElement ->
          (* when adding to a empty node, no parent and no children*)
          Node {
          name=childName ;
          data=childData ;
          parent=EmptyElement ;
          level=0 ;
          rank=0 ;
          children=[] }
          | Leaf s -> assert false
          | Node n as targetParent ->
          (* when adding a new Cell before a addressee Cell, we take as prev of
          the new Cell the former prev of addressee Cell and as next of the new Cell
          the addressee Cell itself.
          Then, we set the new Cell as previous of the adressee Cell and
          if the prev of the new Cell is not Empty, that could happen if the previous
          of the addressee cell was empty, we set as the new Cell as next of it*)
          let newNodeRecord = {
          name=childName ;
          data=childData ;
          parent=targetParent;
          level=(n.level + 1);
          rank=(List.length n.children) ;
          children=[] } in
          let newNode = Node newNodeRecord in
          n.children <- (n.children@[newNode]);
          targetParent;;



          let rec walk_vertically func aNode =

          let rec treat_nodeList aNodeList =
          match aNodeList with
          [] -> []
          | hd::tail ->
          let newHd =
          match hd with
          EmptyElement -> hd
          | Leaf s -> hd
          | Node n as aNode ->
          Node {
          name = n.name ;
          data = func n.data ;
          parent=n.parent ;
          level=n.level ;
          rank=n.rank ;
          children= (treat_nodeList n.children)
          } in

          newHd::(treat_nodeList tail)
          in
          treat_nodeList [aNode];;


          let rec show_structure data_to_string_function aNode =
          match aNode with
          EmptyElement -> ()
          | Leaf s -> ()
          | Node n ->
          let () = Printf.printf "%d %d %s %s\n" n.level n.rank n.name
          (data_to_string_function n.data) in
          let () = flush Pervasives.stdout in

          (List.iter (fun n -> show_structure data_to_string_function n)
          n.children);;


          And so:

          # let aT1 = add_child_by_name "twin child" 500 (
          add_child_by_name "twin child" 400 (
          add_child_by_name "second child" 300 (
          add_child_by_name "first child" 200 (
          add_child_by_name "root" 100 EmptyElement))));;

          # show_structure string_of_int aT1;;
          0 0 root 100
          1 0 first child 200
          1 1 second child 300
          1 2 twin child 400
          1 3 twin child 500
          - : unit = ()


          Apparently, this is very simple and can be switched to OOP very easily and so
          become as easy to use as PXP DOM.

          Is there any main drawback of this approach that I should know? For instance,
          I can't "see" this structure, the printout of ocaml loops; I can only set it.

          T I A
          Ernesto
        • Stalkern 2
          Il Tuesday 13 May 2003 12:04, Stalkern 2 ha scritto: [...] ... [...] About some kind of bad relationship between FP and circular structures... I have just
          Message 4 of 7 , May 13, 2003
          • 0 Attachment
            Il Tuesday 13 May 2003 12:04, Stalkern 2 ha scritto:
            [...]
            > Is there any main drawback of this approach that I should know? For
            > instance, I can't "see" this structure, the printout of ocaml loops; I can
            > only set it.
            [...]

            About some kind of bad relationship between FP and circular structures...

            I have just noticed that walking HORIZONTALLY a VERTICALLY double-linked tree
            is a kind of big mess if one goes the functional way (at least it was for me
            and I stopped my search) . In-place modification is quite easy.

            Cheers
            Ernesto
          • Stalkern 2
            ... Just to say that this function doesn t work. Now, I ve been converting the tree type to type a node = { name : string ; mutable data : a ; mutable parent
            Message 5 of 7 , May 14, 2003
            • 0 Attachment
              Il Tuesday 13 May 2003 12:04, Stalkern 2 ha scritto:
              > let add_child_by_name childName childData aNode =
              > match aNode with
              > EmptyElement ->
              > (* when adding to a empty node, no parent and no children*)
              > Node {
              > name=childName ;
              > data=childData ;
              > parent=EmptyElement ;
              > level=0 ;
              > rank=0 ;
              > children=[] }
              > | Leaf s -> assert false
              > | Node n as targetParent ->
              > (* when adding a new Cell before a addressee Cell, we take as prev
              > of the new Cell the former prev of addressee Cell and as next of the new
              > Cell the addressee Cell itself.
              > Then, we set the new Cell as previous of the adressee Cell and
              > if the prev of the new Cell is not Empty, that could happen if the
              > previous of the addressee cell was empty, we set as the new Cell as next of
              > it*) let newNodeRecord = {
              > name=childName ;
              > data=childData ;
              > parent=targetParent;
              > level=(n.level + 1);
              > rank=(List.length n.children) ;
              > children=[] } in
              > let newNode = Node newNodeRecord in
              > n.children <- (n.children@[newNode]);
              > targetParent;;

              Just to say that this function doesn't work.
              Now, I've been converting the tree type to

              type 'a node = {
              name : string ;
              mutable data : 'a ;
              mutable parent : 'a dll ;
              mutable level : int;
              mutable path : int list;
              mutable rank : int;
              mutable children : 'a dll list }
              and 'a dll = EmptyElement | Leaf of 'a | Node of 'a node ;;

              exception Inconsistent of string;;

              but I'm stuck in a sad problem, i.e. reparenting of added child nodes.
              The problem is that the tree is contained (in maximum n=depth passages) in
              every node (one can take any node and ask for parent, then for children of
              parent, then for parents of children, then for parents of parents of children
              and so on until all the tree has been walked) but the record structure
              implies that every node records its tree at least (1 = number of this node
              +(0 or 1 = number of parent nodes) +(number of children nodes)) times. The
              problem is every recording changes the tree itself! =8-O

              One technique that come sto my mind would be to have not a parent record, but
              something like a parent method, retracing the parent by its path (if I'm at
              node 0-4-2-1, parent will of course be node 0-4-2 and I could retrieve it
              when needed). But I don't know enough samples and I don't understand PXP
              enough to see what did its creators call "double-linked-trees" .

              Does anybody know the solution?

              T I A
              Ernesto
            • Stalkern 2
              ... Yes, somebody knew it! I have just been told that the mutable fields in a record pass values by reference, so if I duplicate something, it s my fault. I
              Message 6 of 7 , May 15, 2003
              • 0 Attachment
                Il Wednesday 14 May 2003 13:30, Stalkern 2 ha scritto:
                > but I'm stuck in a sad problem, i.e. reparenting of added child nodes.
                > The problem is that the tree is contained (in maximum n=depth passages) in
                > every node (one can take any node and ask for parent, then for children of
                > parent, then for parents of children, then for parents of parents of
                > children and so on until all the tree has been walked) but the record
                > structure implies that every node records its tree at least (1 = number of
                > this node +(0 or 1 = number of parent nodes) +(number of children nodes))
                > times. The problem is every recording changes the tree itself! =8-O
                >
                > One technique that come sto my mind would be to have not a parent record,
                > but something like a parent method, retracing the parent by its path (if
                > I'm at node 0-4-2-1, parent will of course be node 0-4-2 and I could
                > retrieve it when needed). But I don't know enough samples and I don't
                > understand PXP enough to see what did its creators call
                > "double-linked-trees" .
                >
                > Does anybody know the solution?

                Yes, somebody knew it! I have just been told that the mutable fields in a
                record pass values by reference, so if I duplicate something, it's my fault.
                I should have passed records as such, without opening and rebuilding them.

                Cheers
                Ernesto
              Your message has been successfully submitted and would be delivered to recipients shortly.