Re: OT - mail archive

  • Wietse Venema
    Message 1 of 17 , Apr 26, 2013
      postfix@...:
      > On 26/04/2013 00:15, grarpamp wrote:
      > >> maildir format scale[s] quite well; pretty much the only
      > >> limitation is storage I/O.
      > > Depending on your FS and horsepower, anything over
      > > 1000 x (n * 10) files in a directory can start to sink you
      > > pretty quick. I've always wondered if there's a maildir split
      > > specified out there that applications could utilize...
      > > where n is your split width... tmp/n, new/n, cur/n.
      > what about shifting this problem to the storage layer?

      Faster disks don't solve algorithmic problems (problems related
      to the number of files per directory).

      Wietse
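The "split width" idea quoted above (tmp/n, new/n, cur/n) can be sketched as follows. This is only an illustration under stated assumptions, not an existing maildir extension: the helper names `bucket_for` and `split_path` are hypothetical, and CRC32 is just one possible stable hash. The part of the filename after ":" is ignored because maildir flags there change over a message's life, and the bucket should not.

```python
import os
import zlib
from collections import Counter


def bucket_for(filename: str, width: int) -> int:
    """Map a maildir filename to a bucket number in range(width).

    Hypothetical helper: hashes only the unique part of the name
    (everything before ':'), so a flag change does not move the file.
    """
    unique = filename.split(":", 1)[0]
    return zlib.crc32(unique.encode("utf-8")) % width


def split_path(maildir: str, subdir: str, filename: str, width: int) -> str:
    """Build a path such as Maildir/new/<n>/<filename> (assumed layout)."""
    bucket = str(bucket_for(filename, width))
    return os.path.join(maildir, subdir, bucket, filename)


def bucket_counts(filenames, width: int) -> Counter:
    """Count how many filenames land in each bucket."""
    return Counter(bucket_for(name, width) for name in filenames)


if __name__ == "__main__":
    # Made-up filenames in the usual time.unique.host shape.
    names = [f"1366934400.M{i}P1234.host.example" for i in range(10000)]
    for bucket, count in sorted(bucket_counts(names, 16).items()):
        print(bucket, count)
```

With a split width of 16, each directory holds roughly one sixteenth of the messages, so a lookup in any one directory scans a correspondingly shorter list on filesystems where directory lookup cost grows with entry count.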
    • Robert Schetterer
      Message 2 of 17 , Apr 26, 2013
        On 26.04.2013 13:20, Wietse Venema wrote:
        > postfix@...:
        >> On 26/04/2013 00:15, grarpamp wrote:
        >>>> maildir format scale[s] quite well; pretty much the only
        >>>> limitation is storage I/O.
        >>> Depending on your FS and horsepower, anything over
        >>> 1000 x (n * 10) files in a directory can start to sink you
        >>> pretty quick. I've always wondered if there's a maildir split
        >>> specified out there that applications could utilize...
        >>> where n is your split width... tmp/n, new/n, cur/n.
        >> what about shifting this problem to the storage layer?
        >
        > Faster disks don't solve algorithmic problems (problems related
        > to the number of files per directory).

        Alternatively, you may use mdbox

        http://wiki2.dovecot.org/MailboxFormat/dbox

        >
        > Wietse
        >



        Best Regards
        MfG Robert Schetterer

        --
        [*] sys4 AG

        http://sys4.de, +49 (89) 30 90 46 64
        Franziskanerstraße 15, 81669 München

        Registered office: München, Amtsgericht München: HRB 199263
        Executive board: Patrick Ben Koetter, Axel von der Ohe, Marc Schiffbauer
        Chairman of the supervisory board: Florian Kirstein
      • grarpamp
        Message 3 of 17 , Apr 26, 2013
          >>>> specified out there that applications could utilize...
          >>>> where n is your split width... tmp/n, new/n, cur/n.

          > alternate you may use mdbox
          > http://wiki2.dovecot.org/MailboxFormat/dbox

          Both of these hold all messages in a single directory.
          So sdbox would be no advantage there.
          And mdbox does not support one message per file, nor storing
          messages without metadata added to them, so those needing that
          for other purposes would have no advantage.
          [There is a per file limit specified in bytes, not count. It's
          not clear what the behavior would be if a proposed new msg in
          a new file would exceed a lesser byte limit. Perhaps a safe
          bounce or queue.]

          I do like that they are reasonably well specified, publicly
          on a wiki as opposed to only in source, and have a comparison
          table of support on the parent page. All of which lead to easier
          review and adoption by interested parties.
        • Reindl Harald
          Message 4 of 17 , Apr 26, 2013
            On 26.04.2013 21:24, grarpamp wrote:
            >>>>> specified out there that applications could utilize...
            >>>>> where n is your split width... tmp/n, new/n, cur/n.
            >
            >> alternate you may use mdbox
            >> http://wiki2.dovecot.org/MailboxFormat/dbox
            >
            > Both of these hold all messages in a single directory.
            > So sdbox would be no advantage there.
            > And mdbox does not support one message per file

            pff, and did you realize that "not a file per message" is
            exactly the solution for problems with tens of thousands of
            files in a folder?
          • Patrick Domack
            Message 5 of 17 , Apr 26, 2013
              Quoting grarpamp <grarpamp@...>:

              >>>>> specified out there that applications could utilize...
              >>>>> where n is your split width... tmp/n, new/n, cur/n.
              >
              >> alternate you may use mdbox
              >> http://wiki2.dovecot.org/MailboxFormat/dbox
              >
              > Both of these hold all messages in a single directory.
              > So sdbox would be no advantage there.
              > And mdbox does not support one message per file, nor
              > without metadata added to it, so those needing that for
              > other purposes would have no advantage.
              > [There is a per file limit specified in bytes, not count. It's
              > not clear what the behavior would be if a proposed new msg in
              > a new file would exceed a lesser byte limit. Perhaps a safe
              > bounce or queue.]
              >
              > I do like that they are reasonably well specified, publicly
              > on a wiki as opposed to only in source, and have a comparison
              > table of support on the parent page. All of which lead to easier
              > review and adoption by interested parties.

              It used to take me well over 2 days to back up my email.

              I switched to using mdbox 2.2 years ago, and I have been extremely
              happy with it.

              Current stats on my personal mailbox are 4.8 gigs compressed, holding
              8.3 gigs of email.
              470k emails in 300 mdbox files (exactly 300 currently); the current
              mdbox file number is 850.

              I purge emails weekly out of my mdbox files.

              So with my average mail folder holding 10k+ emails with maildir, or 300
              files in a folder for mdbox, mdbox wins for me.

              I never switched to dbox, but the gains from using it are also
              available with mdbox: you can use single instance storage and
              archival storage. I think the limit is using 3 different paths to hold
              your dbox/mdbox files.
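The figures in the message above can be turned into per-file averages with a little arithmetic. This is a rough sketch only: the function name `mdbox_stats` is made up for illustration, "gigs" is assumed to mean GiB, and the results are plain averages that say nothing about how any individual mdbox file is actually filled.

```python
def mdbox_stats(emails: int, files: int, compressed_gb: float, raw_gb: float) -> dict:
    """Derive simple averages from the mailbox figures quoted in the post.

    Assumes 1 gig = 1024 MB; all values are averages across files.
    """
    return {
        "emails_per_file": emails / files,
        "compressed_mb_per_file": compressed_gb * 1024 / files,
        "compression_ratio": compressed_gb / raw_gb,
    }


# Figures from the post: 470k emails, 300 files, 4.8 gigs compressed, 8.3 gigs raw.
stats = mdbox_stats(470_000, 300, 4.8, 8.3)

if __name__ == "__main__":
    for name, value in stats.items():
        print(f"{name}: {value:.2f}")
```

On these numbers the directory holds about 1,567 messages per file on average, which is the reason it contains hundreds of entries where a maildir holding the same mail would contain hundreds of thousands.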
            • grarpamp
              Message 6 of 17 , Apr 26, 2013
                >>>>>> specified out there that applications could utilize...
                >>>>>> where n is your split width... tmp/n, new/n, cur/n.

                > pff and you realized that the "not a file per message" is
                > exactly the solution for problems with tens thousands of

                It is *a* solution, not *the* solution, and obviously not one
                of the type I described. And a fine pff to you my friend.
              • Robert Schetterer
                Message 7 of 17 , Apr 26, 2013
                  On 26.04.2013 21:24, grarpamp wrote:
                  >>>>> specified out there that applications could utilize...
                  >>>>> where n is your split width... tmp/n, new/n, cur/n.
                  >
                  >> alternate you may use mdbox
                  >> http://wiki2.dovecot.org/MailboxFormat/dbox
                  >
                  > Both of these hold all messages in a single directory.
                  > So sdbox would be no advantage there.
                  > And mdbox does not support one message per file, nor
                  > without metadata added to it, so those needing that for
                  > other purposes would have no advantage.

                  I've done a mailbox archive test with postfix bcc, filtering
                  with sieve to sort by domain, user, date, mail-in, mail-out into subfolders.

                  I don't see an urgent need for having each mail as a file, as you may connect
                  to the archive via an imap client to restore one mail.
                  mdbox, to my knowledge, is the best open compromise among mailbox
                  formats for filesystems. However, maildir should work too if used in
                  a well designed setup; you might need scripting, tarring, and partitioning too
                  in the long run.
                  Other solutions store the archive in databases etc.

                  > [There is a per file limit specified in bytes, not count. It's
                  > not clear what the behavior would be if a proposed new msg in
                  > a new file would exceed a lesser byte limit. Perhaps a safe
                  > bounce or queue.]
                  >
                  > I do like that they are reasonably well specified, publicly
                  > on a wiki as opposed to only in source, and have a comparison
                  > table of support on the parent page. All of which lead to easier
                  > review and adoption by interested parties.
                  >

                  In Germany, a 10 year archive for finance mails is required of companies by
                  law, so there are several certified professional solutions you may
                  have a look at.

                  Dovecot also offers some archive solution, and an object store
                  you might have a look at.

                  In the end this isn't really a postfix topic, it's more about storage etc.


                  Best Regards
                  MfG Robert Schetterer
                • Stan Hoeppner
                  Message 8 of 17 , Apr 26, 2013
                    On 4/26/2013 9:32 PM, grarpamp wrote:
                    >>>>>>> specified out there that applications could utilize...
                    >>>>>>> where n is your split width... tmp/n, new/n, cur/n.
                    >
                    >> pff and you realized that the "not a file per message" is
                    >> exactly the solution for problems with tens thousands of
                    >
                    > It is *a* solution, not *the* solution,

                    True. mbox solved this problem before it really began, back before
                    people started mass archiving thousands or tens of thousands of emails.
                    When used with modern MUAs and multiple name spaces you can mitigate or
                    eliminate the locking problems, especially if using sieve to sort to
                    these folders/files during delivery.

                    I've been using such a setup with Dovecot, Thunderbird, and Roundcube,
                    for many years. The largest of my mbox files, XFS list mail, is only
                    19K+ emails. Full text searching it is relatively quick even if the FTS
                    index isn't primed, and especially given the age of the hardware. If I
                    was using maildir storage I can only assume FTS would take quite a while
                    longer, as well as backup.

                    --
                    Stan
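The file-count difference discussed in this thread can be observed with Python's standard `mailbox` module. This is a small sketch, assuming throwaway test messages in a temporary directory; the function name `count_storage_files` is made up for illustration, and it only counts files, it does not measure search or backup speed.

```python
import mailbox
import os
import tempfile


def count_storage_files(num_messages: int):
    """Store the same messages as Maildir and as mbox.

    Returns (files in Maildir new/, files in the mbox directory).
    """
    with tempfile.TemporaryDirectory() as root:
        maildir = mailbox.Maildir(os.path.join(root, "md"), create=True)

        mbox_dir = os.path.join(root, "mb")
        os.mkdir(mbox_dir)
        mbox = mailbox.mbox(os.path.join(mbox_dir, "archive"), create=True)

        for i in range(num_messages):
            text = f"From: a@example.com\nSubject: test {i}\n\nbody {i}\n"
            maildir.add(text)  # one new file per message, placed in new/
            mbox.add(text)     # appended to the single mbox file

        maildir.close()
        mbox.close()  # flushes pending changes to disk

        maildir_files = len(os.listdir(os.path.join(root, "md", "new")))
        mbox_files = len(os.listdir(mbox_dir))
        return maildir_files, mbox_files


if __name__ == "__main__":
    print(count_storage_files(50))
```

The Maildir count grows with every message while the mbox count stays at one, which is the trade-off between per-message file access and directory size that the posts above argue about.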
                  • grarpamp
                    Message 9 of 17 , Apr 26, 2013
                      > re: the last two posts

                      I must admit giving yourself the local equivalent
                      of your own lifetime email account is an interesting
                      approach if you don't really need access to the raw
                      message files on disk.
                    • Reindl Harald
                      Message 10 of 17 , Apr 27, 2013
                        On 27.04.2013 04:32, grarpamp wrote:
                        >>>>>>> specified out there that applications could utilize...
                        >>>>>>> where n is your split width... tmp/n, new/n, cur/n.
                        >
                        >> pff and you realized that the "not a file per message" is
                        >> exactly the solution for problems with tens thousands of
                        >
                        > It is *a* solution, not *the* solution, and obviously not one
                        > of the type I describes. And a fine pff to you my friend.

                        boy you replied to "Faster disks don't solve algorithmic problems
                        (problems related to the number of files per directory)" with
                        "And mdbox does not support one message per file"

                        no it is not *the* solution, but "does not support one message
                        per file" is pure bullshit in this context because it is what
                        you want
                      • grarpamp
                        Message 11 of 17 , Apr 27, 2013
                          >> specified out there that applications could utilize...
                          >> where n is your split width... tmp/n, new/n, cur/n.

                          > it is what you want

                          No, actually right up there is what I was surveying.
                          But you failed to grok that in your search for more pfft.
                          I'm sure it's a nice day, go outside :)
                        • Reindl Harald
                          Message 12 of 17 , Apr 27, 2013
                            On 27.04.2013 23:03, grarpamp wrote:
                            >>> specified out there that applications could utilize...
                            >>> where n is your split width... tmp/n, new/n, cur/n.
                            >
                            >> it is what you want
                            >
                            > No, actually right up there is what I was surveying.
                            > But you failed to grok that in your search for more pfft.
                            > I'm sure it's a nice day, go outside :)

                            maybe you should learn how to use a mail-client and quote
                            before you post to a mail-server list - your answer above
                            makes no sense at all in context of the thread