Loading ...
Sorry, an error occurred while loading the content.

Re: "ocaml_beginners"::[] Is ocaml smart on memory usage here?

Expand Messages
  • Remi Vanicat
    ... the {md3 with surfaces=new_surfaces} contruct make a copy of the old md3. Then all a record really contain in the C meaning of contain, is integer and
    Message 1 of 11 , Feb 13, 2007
    • 0 Attachment
      2007/2/13, Grant Olson <olsongt@...>:
      > I've got some rather large records that contain a bunch of OpenGL vertex
      > information. These also have associated textures. Sometimes these get
      > 're-skinned' where I replace the OpenGL textures with something like:
      >
      > {md3 with surfaces=new_surfaces}
      >
      > Quick spot checking of the memory seems to be okay, but I just wanted to
      > confirm if this copy operation is or isn't copying the contents of other
      > immutable fields within the record, in particular the extremely large vertex
      > data that is maintained in a different field. I can change my approach if
      > it is.

      the {md3 with surfaces=new_surfaces} contruct make a copy of the old
      md3. Then all a record really contain in the "C" meaning of contain,
      is integer and pointer. So your extremely large data is not copied,
      only the pointer to it is.
    • Joel Reymont
      Does it mean it s not necessary to make fields mutable? Is there a situation where mutable fields would be preferred? ... -- http://wagerlabs.com/
      Message 2 of 11 , Feb 13, 2007
      • 0 Attachment
        Does it mean it's not necessary to make fields mutable?

        Is there a situation where mutable fields would be preferred?

        On Feb 13, 2007, at 10:05 PM, Remi Vanicat wrote:

        > the {md3 with surfaces=new_surfaces} contruct make a copy of the old
        > md3. Then all a record really contain in the "C" meaning of contain,
        > is integer and pointer. So your extremely large data is not copied,
        > only the pointer to it is.

        --
        http://wagerlabs.com/
      • Karl Zilles
        ... Right. They don t need to be mutable. ... The construct you re using does make a copy of the record (but the record only contains pointers, so it s not
        Message 3 of 11 , Feb 13, 2007
        • 0 Attachment
          Joel Reymont wrote:
          > On Feb 13, 2007, at 10:05 PM, Remi Vanicat wrote:
          >> the {md3 with surfaces=new_surfaces} contruct make a copy of the old
          >> md3. Then all a record really contain in the "C" meaning of contain,
          >> is integer and pointer. So your extremely large data is not copied,
          >> only the pointer to it is.

          > Does it mean it's not necessary to make fields mutable?

          Right. They don't need to be mutable.

          > Is there a situation where mutable fields would be preferred?

          The construct you're using does make a copy of the record (but the
          record only contains pointers, so it's not too big). It also leaves
          the original record unchanged, which can be an advantage.

          Basically if you're programming in an imperative way with side effects,
          you have mutable fields and modify them directly.

          If you're programming in a functional way, without side effects then you
          don't want mutable fields and the {... with ...} syntax will be the way
          to go.
        • Jon Harrop
          ... This boils down to an absolutely pivotal concept in functional programming called referential transparency. This concept is one of the main stumbling
          Message 4 of 11 , Feb 13, 2007
          • 0 Attachment
            On Tuesday 13 February 2007 22:47, Joel Reymont wrote:
            > Does it mean it's not necessary to make fields mutable?

            This boils down to an absolutely pivotal concept in functional programming
            called referential transparency. This concept is one of the main stumbling
            blocks for imperative programmers learning FPLs like OCaml and F#. I
            certainly had trouble grokking this when I ditched C++.

            In essence, when a new immutable data structure is created from an existing
            data structure, the new data structure has a lot in common with the original.
            An imperative programmer immediately assumes (incorrectly) that the original
            data structure must have been copied. In fact, there is never any need to
            copy immutable data because you can simply refer back to the original, i.e.
            referencing is transparent.

            So FPLs effectively handle everything by pointer (mutable or immutable) and
            the only copying you'll ever incur is copying pointers. In this case, the
            pointers inside the record will be copied, which is nothing to worry about.

            In the vast majority of cases, succinct OCaml code is efficient OCaml code. As
            long as you don't explicitly copy lots of data, there won't be much copying
            going on "under the hood".

            The only notable exception to this rule crops up when you're traversing trees.
            Consider a function to substitute a value for a variable name in a symbolic
            expression:

            # let rec subst var value = function
            | `Add(f, g) -> `Add(subst var value f, subst var value g)
            | `Mul(f, g) -> `Mul(subst var value f, subst var value g)
            | `Int n -> `Int n
            | `Var v when v=var -> value
            | `Var v -> `Var v;;
            val subst :
            'a ->
            ([> `Add of 'b * 'b | `Int of 'c | `Mul of 'b * 'b | `Var of 'a ] as 'b) ->
            ([< `Add of 'd * 'd | `Int of 'c | `Mul of 'd * 'd | `Var of 'a ] as 'd) ->
            'b = <fun>

            This function might look ok but it is actually hugely inefficient because it
            copies the entire tree, even if no substitution is made.

            --
            Dr Jon D Harrop, Flying Frog Consultancy Ltd.
            OCaml for Scientists
            http://www.ffconsultancy.com/products/ocaml_for_scientists
          • Joel Reymont
            So what is the efficient alternative (thinking of Zippers in Haskell)? ... -- http://wagerlabs.com/
            Message 5 of 11 , Feb 13, 2007
            • 0 Attachment
              So what is the efficient alternative (thinking of Zippers in Haskell)?

              On Feb 14, 2007, at 12:07 AM, Jon Harrop wrote:

              > # let rec subst var value = function
              > | `Add(f, g) -> `Add(subst var value f, subst var value g)
              > | `Mul(f, g) -> `Mul(subst var value f, subst var value g)
              > | `Int n -> `Int n
              > | `Var v when v=var -> value
              > | `Var v -> `Var v;;
              > val subst :
              > 'a ->
              > ([> `Add of 'b * 'b | `Int of 'c | `Mul of 'b * 'b | `Var of 'a ]
              > as 'b) ->
              > ([< `Add of 'd * 'd | `Int of 'c | `Mul of 'd * 'd | `Var of 'a ]
              > as 'd) ->
              > 'b = <fun>
              >
              > This function might look ok but it is actually hugely inefficient
              > because it
              > copies the entire tree, even if no substitution is made.

              --
              http://wagerlabs.com/
            • roparzhhemon
              ... But is there a more efficient implementation in this case ? Ewan
              Message 6 of 11 , Feb 13, 2007
              • 0 Attachment
                --- In ocaml_beginners@yahoogroups.com, Jon Harrop <jon@...> wrote:
                >
                > The only notable exception to this rule crops up when you're traversing trees.
                > Consider a function to substitute a value for a variable name in a symbolic
                > expression:
                >
                > # let rec subst var value = function
                > | `Add(f, g) -> `Add(subst var value f, subst var value g)
                > | `Mul(f, g) -> `Mul(subst var value f, subst var value g)
                > | `Int n -> `Int n
                > | `Var v when v=var -> value
                > | `Var v -> `Var v;;
                > val subst :
                > 'a ->
                > ([> `Add of 'b * 'b | `Int of 'c | `Mul of 'b * 'b | `Var of 'a ] as 'b) ->
                > ([< `Add of 'd * 'd | `Int of 'c | `Mul of 'd * 'd | `Var of 'a ] as 'd) ->
                > 'b = <fun>
                >
                > This function might look ok but it is actually hugely inefficient because it
                > copies the entire tree, even if no substitution is made.

                But is there a more efficient implementation in this case ?

                Ewan
              • Frédéric van der Plancke
                ... Yes, for instance the line ... reconstructs a value from the existing one (and thus makes a copy), a cost you can avoir by writing ... so that it should be
                Message 7 of 11 , Feb 14, 2007
                • 0 Attachment
                  roparzhhemon wrote:
                  > --- In ocaml_beginners@yahoogroups.com, Jon Harrop <jon@...> wrote:
                  >
                  >> The only notable exception to this rule crops up when you're traversing trees.
                  >> Consider a function to substitute a value for a variable name in a symbolic
                  >> expression:
                  >>
                  >> # let rec subst var value = function
                  >> | `Add(f, g) -> `Add(subst var value f, subst var value g)
                  >> | `Mul(f, g) -> `Mul(subst var value f, subst var value g)
                  >> | `Int n -> `Int n
                  >> | `Var v when v=var -> value
                  >> | `Var v -> `Var v;;
                  >> val subst :
                  >> 'a ->
                  >> ([> `Add of 'b * 'b | `Int of 'c | `Mul of 'b * 'b | `Var of 'a ] as 'b) ->
                  >> ([< `Add of 'd * 'd | `Int of 'c | `Mul of 'd * 'd | `Var of 'a ] as 'd) ->
                  >> 'b = <fun>
                  >>
                  >> This function might look ok but it is actually hugely inefficient because it
                  >> copies the entire tree, even if no substitution is made.
                  >>
                  >
                  > But is there a more efficient implementation in this case ?
                  >
                  > Ewan
                  >
                  >
                  Yes, for instance the line

                  | `Var v -> `Var v;;

                  reconstructs a value from the existing one (and thus makes a copy), a cost you can avoir by writing

                  | `Var v as x -> x;;

                  so that it should be possible to return the original value when no substitution is made. OTOH, if substitutions are made, the part of the tree above the substitutions has to be copied, if you are working with immutable fields.

                  Frédéric vdP
                • Lukasz Stafiniak
                  ... To expand on this, you need to communicate whether a substitution was made, for example with a variant: let rec subst var value = function ... (match subst
                  Message 8 of 11 , Feb 15, 2007
                  • 0 Attachment
                    On 2/14/07, Frédéric van der Plancke <fvdp@...> wrote:
                    > roparzhhemon wrote:
                    > > --- In ocaml_beginners@yahoogroups.com, Jon Harrop <jon@...> wrote:
                    > >
                    > >> The only notable exception to this rule crops up when you're traversing trees.
                    > >> Consider a function to substitute a value for a variable name in a symbolic
                    > >> expression:
                    > >>
                    > >> # let rec subst var value = function
                    > >> | `Add(f, g) -> `Add(subst var value f, subst var value g)
                    > >> | `Mul(f, g) -> `Mul(subst var value f, subst var value g)
                    > >> | `Int n -> `Int n
                    > >> | `Var v when v=var -> value
                    > >> | `Var v -> `Var v;;
                    > >> val subst :
                    > >> 'a ->
                    > >> ([> `Add of 'b * 'b | `Int of 'c | `Mul of 'b * 'b | `Var of 'a ] as 'b) ->
                    > >> ([< `Add of 'd * 'd | `Int of 'c | `Mul of 'd * 'd | `Var of 'a ] as 'd) ->
                    > >> 'b = <fun>
                    > >>
                    > >> This function might look ok but it is actually hugely inefficient because it
                    > >> copies the entire tree, even if no substitution is made.
                    > >>
                    > >
                    > > But is there a more efficient implementation in this case ?
                    > >
                    > >
                    > Yes, for instance the line
                    >
                    > | `Var v -> `Var v;;
                    >
                    > reconstructs a value from the existing one (and thus makes a copy), a cost you can avoir by writing
                    >
                    > | `Var v as x -> x;;
                    >
                    > so that it should be possible to return the original value when no substitution is made. OTOH, if substitutions are made, the part of the tree above the substitutions has to be copied, if you are working with immutable fields.
                    >
                    To expand on this, you need to communicate whether a substitution was
                    made, for example with a variant:

                    let rec subst var value = function
                    | `Add(f, g) ->
                    (match subst var value f, subst var value g with
                    None, None -> None
                    | Some a1, None -> Some (`Add (a1, g))
                    | None, Some a2 -> Some (`Add (f, a2))
                    | Some a1, Some a2 -> Some (`Add (a1, a2))
                    )
                    | `Mult(f, g) ->
                    (match subst var value f, subst var value g with
                    None, None -> None
                    | Some a1, None -> Some (`Mult (a1, g))
                    | None, Some a2 -> Some (`Mult (f, a2))
                    | Some a1, Some a2 -> Some (`Mult (a1, a2))
                    )
                    | `Int n -> None
                    | `Var v when v=var -> Some value
                    | `Var v -> None;;

                    This way only substitution paths are copied.

                    Best wishes,
                    Lukasz
                  • roparzhhemon
                    ... Yes. So in fact, this is not really an example where short, recursive coding is inefficient : the point is just that you have to do some extra thinking
                    Message 9 of 11 , Feb 15, 2007
                    • 0 Attachment
                      --- In ocaml_beginners@yahoogroups.com, "Lukasz Stafiniak" <lukstafi@...> wrote:
                      > To expand on this, you need to communicate whether a substitution was
                      > made, for example with a variant:
                      >
                      > let rec subst var value = function
                      > | `Add(f, g) ->
                      > (match subst var value f, subst var value g with
                      > None, None -> None
                      > | Some a1, None -> Some (`Add (a1, g))
                      > | None, Some a2 -> Some (`Add (f, a2))
                      > | Some a1, Some a2 -> Some (`Add (a1, a2))
                      > )
                      > | `Mult(f, g) ->
                      > (match subst var value f, subst var value g with
                      > None, None -> None
                      > | Some a1, None -> Some (`Mult (a1, g))
                      > | None, Some a2 -> Some (`Mult (f, a2))
                      > | Some a1, Some a2 -> Some (`Mult (a1, a2))
                      > )
                      > | `Int n -> None
                      > | `Var v when v=var -> Some value
                      > | `Var v -> None;;
                      >
                      > This way only substitution paths are copied.

                      Yes. So in fact, this is not really an example where "short, recursive coding"
                      is inefficient : the point is just that you have to do some extra thinking before writing out
                      the code.

                      Ewan
                    • Jon Harrop
                      ... Exactly as Lukasz said, you must convey whether or not a substitution was made and, when no substitutions are made, you must return the input expression.
                      Message 10 of 11 , Feb 15, 2007
                      • 0 Attachment
                        On Wednesday 14 February 2007 07:17, Joel Reymont wrote:
                        > So what is the efficient alternative (thinking of Zippers in Haskell)?

                        Exactly as Lukasz said, you must convey whether or not a substitution was made
                        and, when no substitutions are made, you must return the input expression.

                        Physical equality is perfect for this. If no substitution is made then you
                        return the input unaltered and the caller can verify this by noticing that
                        the return value is physically equal to the original. Many other languages
                        (e.g. SML) do not provide physical equality AFAIK.

                        As Lukasz suggested, you can use other approaches like boxing the result with
                        None indicating that no substitution was made, but boxing is slow, or you can
                        raise an exception (e.g. Exit) if no substitution was made. Physical equality
                        is probably faster though, as it just compares pointers and the "exceptional"
                        route is the most common route in term rewriting.

                        In terms of software engineering, factor out the recursion into a separate
                        function, so you have a recursive higher-order rewrite function that accepts
                        a rule function.

                        The naive implementation is:

                        let rec rewrite rule e = match e with
                        | `Int _ | `Var _ -> rule e
                        | `Add(f, g) -> rule(`Add(rewrite rule f, rewrite rule g))
                        | `Mul(f, g) -> rule(`Mul(rewrite rule f, rewrite rule g));;

                        Using physical equality we can write:

                        let compose2 f g e f' op g' = if f==f' && g==g' then e else op f' g'
                        let ( +: ) f g = `Add(f, g)
                        let ( *: ) f g = `Mul(f, g)

                        let rec rewrite rule e = match e with
                        | `Int _ | `Var _ -> rule e
                        | `Add(f, g) -> compose f g e (rewrite rule f) ( +: ) (rewrite rule g)
                        | `Mul(f, g) -> compose f g e (rewrite rule f) ( *: ) (rewrite rule g)

                        The subst function can be written in terms of this optimised rewrite function:

                        let subst var value = rewrite (function `Var v when v=var -> value | e -> e)

                        This does only minimal copying. When implementing a term rewriter, like
                        Mathematica, this is one of the single most important optimisations. You can
                        even improve this optimisation by tagging expressions to avoid traversing
                        them again when they have already been completely rewritten according to a
                        given set of rules.

                        This optimisation applies to lots more than just term rewriting. Consider a
                        simple insertion sort designed to be fast for almost-sorted lists (one of my
                        entries on Code Codex):

                        # let rec sort = function
                        | [] -> []
                        | h1::t as list -> match sort t with
                        | h2::t when h1>h2 -> h2::sort(h1::t)
                        | t' -> if t==t' then list else h1::t';;
                        val sort : 'a list -> 'a list = <fun>

                        Thanks to physical equality, this returns as much of the original list as
                        possible, doing no allocation when applied to a sorted list. (If you
                        replace "if t==t' then list else h1::t'" with "h1::t'" then this function
                        will always copy the entire list)

                        When using immutable data structures in performance-critical code, you must
                        leverage this technique to avoid copying at all costs. This is particularly
                        important to remember because performance profiling does not explain the
                        problem but, rather, simply indicates a large amount of time spent in the GC
                        with no indication of which function is generating all the garbage.

                        --
                        Dr Jon D Harrop, Flying Frog Consultancy Ltd.
                        OCaml for Scientists
                        http://www.ffconsultancy.com/products/ocaml_for_scientists
                      Your message has been successfully submitted and would be delivered to recipients shortly.