Re: "ocaml_beginners"::[] Monitor memory usage from executing ocaml code

  • Johan Mazel
    Message 1 of 11 , Aug 1, 2011
      Thanks for the advice.
      An example from the ocaml-benchmark project (
      https://forge.ocamlcore.org/projects/ocaml-benchmark/) shows that I was
      wrong: arrays are faster than Array1, especially when the unsafe functions
      are used.

      I therefore tried to use arrays.
      Here are the benchmark results in terms of memory usage, as reported by
      pmap:
      Array: 2148,352MB
      Array1: 613,832MB
      The theoretical size should be 570MB (cf. the end of my mail), which is
      consistent with the value for Array1.
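
For readers who want to take this kind of measurement from inside the program rather than by running pmap by hand, here is a minimal sketch. It assumes Linux (it reads the VmRSS line of /proc/self/status) and the helper names gc_heap_bytes, parse_kb_line and rss_kb are made up for illustration. It prints the heap size known to the GC next to the resident size reported by the kernel, which is the same comparison discussed in this thread.

```ocaml
(* Size of the major heap as seen by the OCaml GC, in bytes. *)
let gc_heap_bytes () =
  let st = Gc.quick_stat () in
  st.Gc.heap_words * (Sys.word_size / 8)

(* Parse a line such as "VmRSS:    211528 kB" and return the value in kB.
   Returns None when the line does not start with "<key>:". *)
let parse_kb_line ~key line =
  let prefix = key ^ ":" in
  let plen = String.length prefix in
  if String.length line >= plen && String.sub line 0 plen = prefix then
    let rest = String.trim (String.sub line plen (String.length line - plen)) in
    match String.split_on_char ' ' rest with
    | n :: _ -> int_of_string_opt n
    | [] -> None
  else None

(* Resident set size of the current process in kB (Linux only, assumption). *)
let rss_kb () =
  match open_in "/proc/self/status" with
  | exception Sys_error _ -> None
  | ic ->
    let rec loop () =
      match input_line ic with
      | exception End_of_file -> None
      | line ->
        (match parse_kb_line ~key:"VmRSS" line with
         | Some kb -> Some kb
         | None -> loop ())
    in
    let r = loop () in
    close_in ic;
    r

let () =
  let mb x = float_of_int x /. (1024. *. 1024.) in
  Printf.printf "GC heap: %.1f MB\n" (mb (gc_heap_bytes ()));
  match rss_kb () with
  | Some kb -> Printf.printf "RSS: %.1f MB\n" (float_of_int kb /. 1024.)
  | None -> print_endline "RSS: unavailable (no /proc/self/status)"
```

The two numbers are not expected to agree: the resident size also covers memory allocated outside the OCaml heap (Bigarray data, C bindings) as well as the runtime itself.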

      However, the amount of memory used by the plain array is far too large.
      A first explanation might be that OCaml's floats are double-precision,
      which means they occupy 64 bits each, twice the space of Bigarray's
      float32. Given the measured memory use, this cannot be the only reason:
      doubling the element size accounts for roughly 1141MB, not 2148MB.
      Another explanation might be the internal behavior/implementation of the
      Array module, but I could not find any information on this specific point.

      Regards.
      Johan Mazel


      I am using a one-dimensional array to store a triangular matrix, as I was
      advised in this thread:
      http://tech.groups.yahoo.com/group/ocaml_beginners/message/12840.
      The theoretical number of elements in such an array for 17296 points is
      (17296×(17296-1)÷2) = 149567160,
      which occupies 149567160×32÷(8×(1024^2)) = 570,553436279 MB.
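
The arithmetic above can be checked with a short sketch. The function names are hypothetical, and triangular_index is only one possible row-major layout for the strict upper triangle, not necessarily the one used in the program discussed here. The same size formula with 64-bit elements gives roughly 1141 MB for a plain float array.

```ocaml
(* Number of entries strictly above the diagonal of an n x n matrix. *)
let triangular_elements n = n * (n - 1) / 2

(* Size in MB (1 MB = 1024^2 bytes) of [count] elements of [bits] bits each. *)
let size_mb ~count ~bits =
  float_of_int count *. float_of_int bits /. (8. *. 1024. *. 1024.)

(* Hypothetical row-major index of entry (i, j), with 0 <= i < j < n,
   inside a one-dimensional array of triangular_elements n cells. *)
let triangular_index n i j =
  assert (0 <= i && i < j && j < n);
  (i * n) - (i * (i + 1) / 2) + (j - i - 1)

let () =
  let count = triangular_elements 17296 in
  Printf.printf "elements: %d\n" count;
  Printf.printf "float32 (Bigarray): %.2f MB\n" (size_mb ~count ~bits:32);
  Printf.printf "float64 (float array): %.2f MB\n" (size_mb ~count ~bits:64)
```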



      2011/8/1 Francois Berenger <berenger@...>

      > On 07/30/2011 11:41 PM, Johan Mazel wrote:
      > > My bad, this is THE "problem".
      > > I did not know that Bigarrays were allocated outside of the heap.
      > >
      > > I am going to keep using Bigarray and call pmap from OCaml in order to
      > > get the memory usage estimation.
      > > I am afraid that a switch to OCaml's arrays will slow down my code
      >
      > Benchmark, then you will really know.
      >
      > > that heavily depends on accessing these arrays.
      > >
      > > Thanks a lot for your help.
      > > Regards.
      > > Johan Mazel
      > >
      > > 2011/7/30 Gabriel Scherer<gabriel.scherer@...>
      > >
      > >> When you say Array2, Array1, are you talking about the Bigarray.Array1
      > >> and Bigarray.Array2 modules? If yes, they do live outside the OCaml heap.
      > >> If you mean plain arrays ('a array, float array array; but I don't know
      > >> any module named Array1 or Array2 for these arrays) then there is
      > >> something else going on.
      > >>
      > >>
      > >> On Sat, Jul 30, 2011 at 3:57 PM, Johan Mazel<johan.mazel@...>
      > >> wrote:
      > >>
      > >>> I do not think it is the case.
      > >>> The main part of this memory use is caused by either Array2 or Array1,
      > >>> and as far as I know these types are not used by OCaml through C
      > >>> bindings.
      > >>> Regards.
      > >>> Johan Mazel
      > >>>
      > >>> 2011/7/30 Gabriel Scherer<gabriel.scherer@...>
      > >>>
      > >>>> The Gc will only count the memory allocated on the OCaml heap. If you
      > >>>> use some OCaml/C binding, or Bigarray, you may well have large amounts
      > >>>> of memory allocated outside the OCaml heap.
      > >>>>
      > >>>>
      > >>>> On Sat, Jul 30, 2011 at 2:16 PM, Johan Mazel<johan.mazel@...>
      > >>>> wrote:
      > >>>>
      > >>>>> Hi
      > >>>>> I am trying to assess the memory usage of an OCaml program in order to
      > >>>>> benchmark its execution.
      > >>>>>
      > >>>>> I tried to use the heap_words field from Gc.stat as stated here
      > >>>>> http://tech.groups.yahoo.com/group/ocaml_beginners/message/3378 but
      > >>>>> there is a discrepancy between this value and the one that pmap gives
      > >>>>> me for my process.
      > >>>>> In fact, heap_words gives me a value of 507904 words, which is
      > >>>>> 507904×64÷8=4063232B or 3.875MB (64-bit architecture), whereas pmap
      > >>>>> gives me a value of 211528kB×1024=216604672B or 206MB.
      > >>>>> Pmap's value is much more consistent with top's value, which is 4.7%
      > >>>>> of total RAM => 4058692×1024×0.047=195336728,576B or 186MB (the error
      > >>>>> can partially be explained by the lack of precision of the 4.7%).
      > >>>>>
      > >>>>> One of the first explanations that I found was that maybe this error
      > >>>>> was linked to the fact that heap_words only concerns words in the major
      > >>>>> heap and not in the minor heap. However, the error is so big that, for
      > >>>>> this to be the real cause of my problem, the minor heap would have to
      > >>>>> be bigger than the major one, which is impossible if I understood
      > >>>>> correctly how the GC works.
      > >>>>> I think that I am missing something but I don't see what it is.
      > >>>>>
      > >>>>> Thanks in advance for your time.
      > >>>>> Regards.
      > >>>>> Johan Mazel
      > >>>>>
      > >>>>>
      > >>>>> [Non-text portions of this message have been removed]
      > >>>>>
      > >>>>>
      > >>>>>
      > >>>>> ------------------------------------
      > >>>>
      > >>>>>
      > >>>>> Archives up to December 31, 2010 are also downloadable at
      > >>>>> http://www.connettivo.net/cntprojects/ocaml_beginners
      > >>>>> The archives of the very official ocaml list (the seniors' one) can
      > >> be
      > >>>>> found at http://caml.inria.fr
      > >>>>> Attachments are banned and you're asked to be polite, avoid flames
      > >>>>> etc.Yahoo! Groups Links
      > >>>>
      > >>>>>
      > >>>>>
      > >>>>>
      > >>>>>
      > >>>>
    • r.schmitt@ocaml.de
      Message 2 of 11 , Aug 1, 2011
        On Mon, 1 Aug 2011 20:27:34 +0200
        Johan Mazel <johan.mazel@...> wrote:
        > However, the amount of memory used by the plain array is far too large.
        > A first explanation might be that OCaml's floats are double-precision,
        > which means they occupy 64 bits each, twice the space of Bigarray's
        > float32. Given the measured memory use, this cannot be the only
        > reason.

        It's OCaml's memory manager.
        It always requests more memory from the operating system than you need
        at the moment. It tries to allocate large contiguous chunks at once, so
        it doesn't need to call malloc/mmap again and again for every small
        allocation. Instead, it will (re-)use memory from this internal buffer.

        Bigarray, however, uses plain malloc without such tricks.
      • rixed@happyleptic.org
        Message 3 of 11 , Aug 2, 2011
          > It always requests more memory from the operating system than you need
          > at the moment. It tries to allocate huge sequential chunks at once, so
          > it doesn't need to call malloc/mmap again and again for every small
          > allocation.

          Which is also how malloc works.
          IIRC from the runtime code, the main idea is to have all major heap
          values grouped together so that it's faster to tell if a value is in the major
          heap (thus must be traversed by the GC) by looking at its address (while a mere
          malloc would scatter objects here and there).

          Bigarray contents do not need to be traversed by the GC (since they can
          contain only immediate values) and so are free to be malloced.

          Please correct me if I'm wrong.
        • r.schmitt@ocaml.de
          Message 4 of 11 , Aug 2, 2011
            On Tue, 2 Aug 2011 09:09:19 +0200
            rixed@... wrote:
            > IIRC from the runtime code, the main idea is to have all major heap
            > values grouped together so that it's faster to tell if a value is in
            > the major heap (thus must be traversed by the GC) by looking at its
            > address (while a mere malloc would scatter objects here and there).

            I don't think this is the reason why so much additional space is requested.
            The memory manager always over-requests new memory according to
            Gc.control.space_overhead.

            You can temporarily lower it if you only allocate a big array once
            and you know that you don’t need that much memory after that:
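            ---
            (* Allocate one large float array with a temporarily lowered
               space_overhead, then restore the previous GC settings. *)
            let make_big_array big_i =
              let cur = Gc.get () in
              Gc.set { cur with Gc.space_overhead = 3 };
              let a = Array.make big_i 0. in
              Gc.set cur;
              a
            ---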

            This way, only the needed memory is allocated (with a comparatively
            small overhead).

            The manual (http://caml.inria.fr/pub/docs/manual-ocaml/libref/Gc.html)
            doesn't say much about space_overhead, but I would read it in the way
            I've outlined above. (I took a very quick look at the source code and
            couldn't find anything that contradicts this interpretation, but
            perhaps somebody who is familiar with the runtime/source code could
            clarify this point.)


            > Which is also how malloc works.

            Yes and no. The additional requested space (as a percentage) is
            negligible if you allocate large amounts of memory with malloc/C (at
            least if you don’t tweak it). It’s only relevant for very small
            requests.
          • rixed@happyleptic.org
            Message 5 of 11 , Aug 2, 2011
              -[ Tue, Aug 02, 2011 at 04:41:56PM +0200, r.schmitt@... ]----
              > On Tue, 2 Aug 2011 09:09:19 +0200
              > rixed@... wrote:
              > > IIRC from the runtime code, the main idea is to have all major heap
              > > values grouped together so that it's faster to tell if a value is in
              > > the major heap (thus must be traversed by the GC) by looking at its
              > > address (while a mere malloc would scatter objects here and there).
              >
              > I don't think this is the reason why so much additional space is requested.

              I was only explaining why the OCaml runtime does not call malloc for every
              value, not why the OP's program required so much memory.