Re: OT - mail archive

  • Jeroen Geilman
    Message 1 of 17 , Apr 25 12:21 PM
      On 04/25/2013 08:56 PM, John Allen wrote:
      I realize that this is off topic, but as there are more email experts assembled here than anywhere else I know of ....

      I have a couple of users who are using their maildir as online storage for emails (current and archival).
      They have done this on their own and are prepared to live with some of the limitations.

      What perceived limitations ?

      IMAP stores in maildir format scale quite well; pretty much the only limitation is storage I/O.

      If you mean you want a more efficient mailstore, you can look into DBMail or dbox storage (the former is a mailstore and IMAP server backed by an SQL database such as MySQL; the latter is a newer mailstore format supported by Dovecot.)

      However, I think there must be a better way of doing this,

      A better way of doing *what* ? What "problem" do you want to solve?

      preferably one that could also be used to store non-mail documents, provides search  etc, WAN accessible.

      How is that related to users who use their IMAP mailstore as a.. mail store ?
      IMAP tends to be accessible from the outside in any case, and any MUA worth its salt can search.


      I have looked at a few things, but all of them seem to have problems.

      What things would those be ? You're not giving us much to go on, here.


      -- 
      J.
      
    • grarpamp
      Message 2 of 17 , Apr 25 3:15 PM
        > maildir format scale[s] quite well; pretty much the only
        > limitation is storage I/O.

        Depending on your FS and horsepower, anything over
        1000 x (n * 10) files in a directory can start to sink you
        pretty quick. I've always wondered if there's a maildir split
        specified out there that applications could utilize...
        where n is your split width... tmp/n, new/n, cur/n.
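The split idea above can be sketched in a few lines. This is only an illustration: as far as I know there is no such layout in the maildir specification, and the function name `split_path`, the choice of SHA-256 and the bucket naming are all assumptions made for this example. It hashes the unique part of the filename (everything before the `:2,` flags), so a message keeps its bucket when its flags change.

```python
import hashlib
from pathlib import PurePosixPath


def split_path(maildir: str, subdir: str, filename: str, width: int) -> PurePosixPath:
    """Return maildir/subdir/<bucket>/filename with bucket in [0, width).

    Hypothetical layout (tmp/n, new/n, cur/n); not part of standard maildir.
    """
    if subdir not in ("tmp", "new", "cur"):
        raise ValueError("subdir must be tmp, new or cur")
    if width < 1:
        raise ValueError("width must be at least 1")
    # Hash only the unique name, not the ":2,<flags>" suffix, so that
    # marking a message as seen does not move it to another bucket.
    unique = filename.split(":", 1)[0]
    digest = hashlib.sha256(unique.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % width
    return PurePosixPath(maildir) / subdir / str(bucket) / filename


if __name__ == "__main__":
    print(split_path("Maildir", "cur", "1366920000.M1P2.host:2,S", 16))
```

With N messages and a split width of n, each bucket holds roughly N/n files, which is the point of the exercise: the per-directory count shrinks without giving up one message per file.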
      • postfix@netorbit.it
        Message 3 of 17 , Apr 26 3:25 AM
          On 26/04/2013 00:15, grarpamp wrote:
          >> maildir format scale[s] quite well; pretty much the only
          >> limitation is storage I/O.
          > Depending on your FS and horsepower, anything over
          > 1000 x (n * 10) files in a directory can start to sink you
          > pretty quick. I've always wondered if there's a maildir split
          > specified out there that applications could utilize...
          > where n is your split width... tmp/n, new/n, cur/n.
          What about shifting this problem to the storage layer?
          Apart from using SSDs, what about using a striped array in a RAID 1+0
          layout, in order to have some redundancy as well?
        • grarpamp
          Message 4 of 17 , Apr 26 3:56 AM
            >> I've always wondered if there's a maildir split
            >> specified out there that applications could utilize...
            >> where n is your split width... tmp/n, new/n, cur/n.
            >
            > What about shifting this problem to the storage layer?
            > Apart from using SSDs, what about using a striped array in a RAID 1+0
            > layout, in order to have some redundancy as well?

            Adding IOPS, tuning dirhash and whatnot definitely work up to a
            certain point. Sometimes you'll blow out your backup programs and
            other tools at various points along the way till you're again bound
            by too many files in one dir or have hit the cpu/ram wall for it all.
          • Wietse Venema
            Message 5 of 17 , Apr 26 4:20 AM
              postfix@...:
              > On 26/04/2013 00:15, grarpamp wrote:
              > >> maildir format scale[s] quite well; pretty much the only
              > >> limitation is storage I/O.
              > > Depending on your FS and horsepower, anything over
              > > 1000 x (n * 10) files in a directory can start to sink you
              > > pretty quick. I've always wondered if there's a maildir split
              > > specified out there that applications could utilize...
              > > where n is your split width... tmp/n, new/n, cur/n.
              > what about shifting this problem to the storage layer?

              Faster disks don't solve algorithmic problems (problems related
              to the number of files per directory).

              Wietse
            • Robert Schetterer
              Message 6 of 17 , Apr 26 4:34 AM
                On 26.04.2013 13:20, Wietse Venema wrote:
                > postfix@...:
                >> On 26/04/2013 00:15, grarpamp wrote:
                >>>> maildir format scale[s] quite well; pretty much the only
                >>>> limitation is storage I/O.
                >>> Depending on your FS and horsepower, anything over
                >>> 1000 x (n * 10) files in a directory can start to sink you
                >>> pretty quick. I've always wondered if there's a maildir split
                >>> specified out there that applications could utilize...
                >>> where n is your split width... tmp/n, new/n, cur/n.
                >> what about shifting this problem to the storage layer?
                >
                > Faster disks don't solve algorithmic problems (problems related
                > to the number of files per directory).

                Alternatively, you may use mdbox:

                http://wiki2.dovecot.org/MailboxFormat/dbox

                >
                > Wietse
                >



                Best Regards
                MfG Robert Schetterer

                --
                [*] sys4 AG

                http://sys4.de, +49 (89) 30 90 46 64
                Franziskanerstraße 15, 81669 München

                Registered office: München, District Court München: HRB 199263
                Executive Board: Patrick Ben Koetter, Axel von der Ohe, Marc Schiffbauer
                Chairman of the Supervisory Board: Florian Kirstein
              • grarpamp
                Message 7 of 17 , Apr 26 12:24 PM
                  >>>> specified out there that applications could utilize...
                  >>>> where n is your split width... tmp/n, new/n, cur/n.

                  > alternate you may use mdbox
                  > http://wiki2.dovecot.org/MailboxFormat/dbox

                  Both of these hold all messages in a single directory.
                  So sdbox would be no advantage there.
                  And mdbox supports neither one message per file nor storing
                  messages without added metadata, so those needing that for
                  other purposes would have no advantage.
                  [There is a per file limit specified in bytes, not count. It's
                  not clear what the behavior would be if a proposed new msg in
                  a new file would exceed a lesser byte limit. Perhaps a safe
                  bounce or queue.]

                  I do like that they are reasonably well specified, publicly
                  on a wiki as opposed to only in source, and have a comparison
                  table of support on the parent page. All of which lead to easier
                  review and adoption by interested parties.
                • Reindl Harald
                  Message 8 of 17 , Apr 26 12:29 PM
                    On 26.04.2013 21:24, grarpamp wrote:
                    >>>>> specified out there that applications could utilize...
                    >>>>> where n is your split width... tmp/n, new/n, cur/n.
                    >
                    >> alternate you may use mdbox
                    >> http://wiki2.dovecot.org/MailboxFormat/dbox
                    >
                    > Both of these hold all messages in a single directory.
                    > So sdbox would be no advantage there.
                    > And mdbox does not support one message per file

                    Pff, and have you realized that "not a file per message" is
                    exactly the solution for problems with tens of thousands of
                    files in a folder?
                  • Patrick Domack
                    Message 9 of 17 , Apr 26 3:37 PM
                      Quoting grarpamp <grarpamp@...>:

                      >>>>> specified out there that applications could utilize...
                      >>>>> where n is your split width... tmp/n, new/n, cur/n.
                      >
                      >> alternate you may use mdbox
                      >> http://wiki2.dovecot.org/MailboxFormat/dbox
                      >
                      > Both of these hold all messages in a single directory.
                      > So sdbox would be no advantage there.
                      > And mdbox does not support one message per file, nor
                      > without metadata added to it, so those needing that for
                      > other purposes would have no advantage.
                      > [There is a per file limit specified in bytes, not count. It's
                      > not clear what the behavior would be if a proposed new msg in
                      > a new file would exceed a lesser byte limit. Perhaps a safe
                      > bounce or queue.]
                      >
                      > I do like that they are reasonably well specified, publicly
                      > on a wiki as opposed to only in source, and have a comparison
                      > table of support on the parent page. All of which lead to easier
                      > review and adoption by interested parties.

                      It used to take me well over 2 days to back up my email.

                      I switched to mdbox 2.2 years ago, and I have been extremely
                      happy with it.

                      Current stats on my personal mailbox are 4.8 GB compressed, holding
                      8.3 GB of email:
                      470k emails in 300 mdbox files (exactly 300 currently); the current
                      mdbox file number is 850.

                      I purge emails weekly out of my mdbox files.

                      So with my average mailfolder holding 10k+ emails with maildir, or 300
                      files in a folder for mdbox, mdbox wins for me.
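To put the figures above side by side, here is a small worked calculation. It is only arithmetic on the quoted totals; the function name `mailbox_stats` is made up for this example, decimal units are assumed (1 GB = 1000 MB), and the per-file average assumes the compressed total is spread over exactly those 300 files.

```python
def mailbox_stats(emails: int, files: int, raw_gb: float, compressed_gb: float) -> dict:
    """Derive per-file and per-message averages from the quoted totals."""
    return {
        # How many messages share one mdbox file on average.
        "messages_per_file": emails / files,
        # Fraction of the original size left after compression.
        "compressed_fraction": compressed_gb / raw_gb,
        # Assumes decimal units: 1 GB = 1_000_000 kB.
        "avg_message_kb": raw_gb * 1_000_000 / emails,
        # Assumes decimal units: 1 GB = 1000 MB.
        "avg_file_mb_on_disk": compressed_gb * 1000 / files,
    }


stats = mailbox_stats(emails=470_000, files=300, raw_gb=8.3, compressed_gb=4.8)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

That works out to roughly 1,570 messages per mdbox file and about 58% of the original size on disk, compared with 10k+ files in a single maildir folder.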

                      I never switched to dbox, but the gains from using it are also
                      available with mdbox: you can use single instance storage and
                      archival storage. I think the limit is 3 different paths to hold
                      your dbox/mdbox files.
                    • grarpamp
                      Message 10 of 17 , Apr 26 7:32 PM
                        >>>>>> specified out there that applications could utilize...
                        >>>>>> where n is your split width... tmp/n, new/n, cur/n.

                        > pff and you realized that the "not a file per message" is
                        > exactly the solution for problems with tens thousands of

                        It is *a* solution, not *the* solution, and obviously not one
                        of the type I described. And a fine pff to you my friend.
                      • Robert Schetterer
                        Message 11 of 17 , Apr 26 10:37 PM
                          On 26.04.2013 21:24, grarpamp wrote:
                          >>>>> specified out there that applications could utilize...
                          >>>>> where n is your split width... tmp/n, new/n, cur/n.
                          >
                          >> alternate you may use mdbox
                          >> http://wiki2.dovecot.org/MailboxFormat/dbox
                          >
                          > Both of these hold all messages in a single directory.
                          > So sdbox would be no advantage there.
                          > And mdbox does not support one message per file, nor
                          > without metadata added to it, so those needing that for
                          > other purposes would have no advantage.

                          I've done a mailbox archive test with Postfix bcc, filtering
                          with Sieve to sort into subfolders by domain, user, date, mail-in and mail-out.

                          I don't see an urgent need for having each mail in its own file, as you can
                          connect to the archive with an IMAP client to restore a single mail.
                          To my knowledge, mdbox is the best open compromise among filesystem-based
                          mailbox formats, but maildir should work too if used in a well-designed
                          setup; in the long run you might also need scripting, tarring and partitioning.
                          Other solutions store the archive in databases etc.
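The sorting rule described above (domain, user, date, mail-in/mail-out) can be sketched as follows. This is a Python stand-in, not Sieve, and not the setup actually tested: the function name `archive_folder`, the folder layout and the `local_domains` parameter are assumptions for illustration, and only the first From/To address is considered.

```python
from email.message import EmailMessage
from email.utils import parseaddr, parsedate_to_datetime
from typing import Iterable


def archive_folder(msg: EmailMessage, local_domains: Iterable[str]) -> str:
    """Map a message to a hypothetical folder: domain/user/YYYY/MM/direction."""
    local = {d.lower() for d in local_domains}
    sender = parseaddr(msg.get("From", ""))[1].lower()
    rcpt = parseaddr(msg.get("To", ""))[1].lower()

    # Mail from one of our own domains is outgoing and filed under the sender;
    # everything else is incoming and filed under the recipient.
    if sender.rpartition("@")[2] in local:
        direction, owner = "mail-out", sender
    else:
        direction, owner = "mail-in", rcpt
    user, _, domain = owner.rpartition("@")

    # Fall back to a fixed folder name when the Date header is missing or broken.
    try:
        date = parsedate_to_datetime(msg.get("Date"))
        when = f"{date:%Y}/{date:%m}"
    except (TypeError, ValueError):
        when = "undated"

    return f"{domain or 'unknown'}/{user or 'unknown'}/{when}/{direction}"
```

Keeping one folder per user and month bounds how large any single folder can grow, whichever mailbox format sits underneath.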

                          > [There is a per file limit specified in bytes, not count. It's
                          > not clear what the behavior would be if a proposed new msg in
                          > a new file would exceed a lesser byte limit. Perhaps a safe
                          > bounce or queue.]
                          >
                          > I do like that they are reasonably well specified, publicly
                          > on a wiki as opposed to only in source, and have a comparison
                          > table of support on the parent page. All of which lead to easier
                          > review and adoption by interested parties.
                          >

                          In Germany, companies are required by law to archive finance-related mail
                          for 10 years, so there are several certified professional solutions you may
                          want to have a look at.

                          Dovecot also offers an archive solution and an object store that
                          you might have a look at.

                          In the end this isn't really a Postfix topic; it's more about storage etc.


                          Best Regards
                          MfG Robert Schetterer

                        • Stan Hoeppner
                          Message 12 of 17 , Apr 26 10:56 PM
                            On 4/26/2013 9:32 PM, grarpamp wrote:
                            >>>>>>> specified out there that applications could utilize...
                            >>>>>>> where n is your split width... tmp/n, new/n, cur/n.
                            >
                            >> pff and you realized that the "not a file per message" is
                            >> exactly the solution for problems with tens thousands of
                            >
                            > It is *a* solution, not *the* solution,

                            True. mbox solved this problem before it really began, back before
                            people started mass archiving thousands or tens of thousands of emails.
                            When used with modern MUAs and multiple name spaces you can mitigate or
                            eliminate the locking problems, especially if using sieve to sort to
                            these folders/files during delivery.

                            I've been using such a setup with Dovecot, Thunderbird, and Roundcube,
                            for many years. The largest of my mbox files, XFS list mail, is only
                            19K+ emails. Full text searching it is relatively quick even if the FTS
                            index isn't primed, and especially given the age of the hardware. If I
                            was using maildir storage I can only assume FTS would take quite a while
                            longer, as well as backup.
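For readers who have not worked with mbox, here is a minimal sketch using Python's standard `mailbox` module: one file per folder, messages appended under a lock. It is not Dovecot's FTS; the search is a plain linear scan over subjects, and the helper names `append_message` and `search_subjects` and the sender address are made up for this example.

```python
import mailbox
from email.message import EmailMessage


def append_message(path: str, subject: str, body: str) -> None:
    """Append one message to the mbox file at path, holding the lock while writing."""
    box = mailbox.mbox(path)  # creates the file if it does not exist
    box.lock()  # a single shared file is why mbox needs locking
    try:
        msg = EmailMessage()
        msg["From"] = "list@example.org"  # placeholder address
        msg["Subject"] = subject
        msg.set_content(body)
        box.add(msg)
        box.flush()
    finally:
        box.unlock()
        box.close()


def search_subjects(path: str, needle: str) -> list:
    """Return the subjects containing needle (case-insensitive linear scan)."""
    box = mailbox.mbox(path)
    try:
        return [
            m["Subject"]
            for m in box
            if needle.lower() in (m["Subject"] or "").lower()
        ]
    finally:
        box.close()
```

Sorting list mail into one such file per folder at delivery time, as described above, keeps the number of files small no matter how many messages accumulate.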

                            --
                            Stan
                          • grarpamp
                            Message 13 of 17 , Apr 26 11:36 PM
                              > re: the last two posts

                              I must admit giving yourself the local equivalent
                              of your own lifetime email account is an interesting
                              approach if you don't really need access to the raw
                              message files on disk.
                            • Reindl Harald
                              Message 14 of 17 , Apr 27 3:49 AM
                                On 27.04.2013 04:32, grarpamp wrote:
                                >>>>>>> specified out there that applications could utilize...
                                >>>>>>> where n is your split width... tmp/n, new/n, cur/n.
                                >
                                >> pff and you realized that the "not a file per message" is
                                >> exactly the solution for problems with tens thousands of
                                >
                                > It is *a* solution, not *the* solution, and obviously not one
                                > of the type I described. And a fine pff to you my friend.

                                Boy, you replied to "Faster disks don't solve algorithmic problems
                                (problems related to the number of files per directory)" with
                                "And mdbox does not support one message per file".

                                No, it is not *the* solution, but "does not support one message
                                per file" is pure bullshit in this context, because it is what
                                you want.
                              • grarpamp
                                Message 15 of 17 , Apr 27 2:03 PM
                                  >> specified out there that applications could utilize...
                                  >> where n is your split width... tmp/n, new/n, cur/n.

                                  > it is what you want

                                  No, actually right up there is what I was surveying.
                                  But you failed to grok that in your search for more pfft.
                                  I'm sure it's a nice day, go outside :)
                                • Reindl Harald
                                  Message 16 of 17 , Apr 27 2:10 PM
                                    On 27.04.2013 23:03, grarpamp wrote:
                                    >>> specified out there that applications could utilize...
                                    >>> where n is your split width... tmp/n, new/n, cur/n.
                                    >
                                    >> it is what you want
                                    >
                                    > No, actually right up there is what I was surveying.
                                    > But you failed to grok that in your search for more pfft.
                                    > I'm sure it's a nice day, go outside :)

                                    maybe you should learn how to use a mail-client and quote
                                    before you post to a mail-server list - your answer above
                                    makes no sense at all in context of the thread